EPA/625/3-90/017
                                                       September 1989
         Workshop Report on EPA Guidelines  for
Carcinogen Risk Assessment:   Use of Human Evidence
                              Assembled by:
                         Eastern Research Group, Inc.
                            6 Whittemore Street
                           Arlington, MA 02174
                         EPA Contract No. 68-02-4404

                                for the

                           Risk Assessment Forum
                    Technical Panel on Carcinogen Guidelines
                     U.S. Environmental Protection Agency
                          Washington, DC  20460

                                    NOTICE
    Mention of trade names or commercial products does not constitute
endorsement or recommendation for use.

    This workshop was organized by Eastern Research Group, Inc., Arlington,
Massachusetts, for the EPA Risk Assessment Forum.  ERG also assembled and
produced this workshop report.  Sections from individual contributors were
edited somewhat for clarity, but contributors were not asked to follow a
single format.  Relevant portions were reviewed by each workshop chairperson
and speaker.  Their time and contributions are gratefully acknowledged.  The
views presented are those of each contributor, not the U.S. Environmental
Protection Agency.
                                    -ii-

                                   CONTENTS


                                                                         PAGE

INTRODUCTION                                                                1

MEETING AGENDA                                                              5

COLLECTED WORKSHOP MATERIALS                                                7

•   Study Design and Interpretation                                         9
    Chair Summary                                                          15

•   EPA Classification System for Categorizing Weight of Evidence
    for Carcinogenicity from Human Studies                                 23
    Chair Summary                                                          27

•   Dose-Response Assessment                                               39
    Chair Summary                                                          42

APPENDICES

Appendix A  EPA Risk Assessment Forum Technical Panel on
            Carcinogen Guidelines and Associates                           47

Appendix B  List of Participants                                           51

Appendix C  List of Observers                                              57

Appendix D  Introductory Plenary Session Comments
            (Drs. Philip Enterline, Raymond R. Neutra, and Gerald Ott)     61

Appendix E  1986 Guidelines for Carcinogen Risk Assessment                 81
                                    -iii-

      WORKSHOP REPORT ON EPA GUIDELINES FOR CARCINOGEN RISK ASSESSMENT:
                            USE OF HUMAN EVIDENCE
                               June 26-27,  1989
                                Washington,  DC

                                 INTRODUCTION

1.  Guidelines Development Program

    On September 24, 1986, the U.S. Environmental Protection Agency (EPA)
issued guidelines for assessing human risk from exposure to environmental
carcinogens (51 Federal Register 33992-34003).  The guidelines set forth
principles and procedures to guide EPA scientists in the conduct of Agency
risk assessments, to promote high scientific quality and Agency-wide
consistency, and to inform Agency decision-makers and the public about these
scientific procedures.  In publishing this guidance, EPA emphasized that one
purpose of the guidelines was to "encourage research and analysis that will
lead to new risk assessment methods and data," which in turn would be used to
revise and improve  the guidelines.  Thus, the guidelines were developed and
published with the  understanding that risk assessment is an evolving
scientific undertaking and that continued study would lead to changes.

    As expected, new  information and thinking in several areas of carcinogen
risk assessment, as well as accumulated experience  in using the guidelines,
have led to an EPA review to assess the need for revisions in the guidelines.
On August 26, 1988, EPA asked the public to provide information to assist this
review  (53 Federal  Register 52656-52658).  In addition, EPA conducted  two
workshops to  collect  further information.  The first workshop for analysis and
review  of these  issues was held in Virginia Beach,  Virginia, on January  11-13,
1989  (53 Federal Register 49919-20).  That workshop brought together experts
in various  areas of carcinogen  risk assessment to study and comment on the use
of animal evidence  in considering  both qualitative  issues in classifying
potential carcinogens and quantitative issues  in dose-response and
                                      -1-

extrapolation.  The report from this workshop was made available to the public
on April 24, 1989 (54 Federal Register 16403).

    On June 16, 1989, the Agency announced that a workshop for the study and
review of the use of human evidence in risk assessment would be held in
Washington, D.C., on June 26 and 27, 1989 (54 Federal Register 25619).  This
report is a compilation of the discussions and presentations from that
meeting.  As with the Virginia Beach meeting, the Agency's intention was not
to achieve consensus on or resolution of all issues.  It was hoped instead
that these workshops would provide a scientific forum for objective discussion
and analysis.

    These workshops are part of a three-stage process for reviewing and, as
appropriate, revising EPA's cancer risk assessment guidelines.  The first
stage began with several information-gathering activities to identify and
define scientific issues relating to the guidelines.  For example, EPA
scientists and program offices were invited to comment on their experiences
with the 1986 cancer guidelines.  Also, the August 1988 Federal Register
notice asked for public comment on the use of these guidelines.  Other
information was obtained in meetings with individual scientists who regularly
use the guidelines.  Information from the workshops and these other sources
will be used to decide when and how the guidelines should be revised.

    In the second stage of the guidelines review process, EPA is analyzing the
information described above to make decisions about changing the guidelines,
to determine the nature of any such changes and, if appropriate, to develop a
formal proposal for peer review and public comment.  EPA's analysis of  the
information collected so far suggests several possible outcomes, ranging from
no changes at this time to substantial changes for certain aspects of the
guidelines.

    In the third stage of this Agency review, any proposed changes would be
submitted to scientific experts for preliminary peer review, and then to the
general public, other federal agencies, and EPA's Science Advisory Board for

comment.  All of these comments would be evaluated in developing final
guidance.

2.  Human Evidence Workshop

    On June 26 and 27, 1989, epidemiologists and others met in Washington,
D.C. to study and comment on the scientific foundation for possible changes in
the human evidence sections of the 1986 carcinogen guidelines.  In general,
although these guidelines emphasize that reliable human evidence takes
precedence over animal data, guidance on the use of human evidence is
considerably less detailed than that for animal data.  Thus, workshop
discussions focused on the possibilities of expanding and clarifying the
guidelines by adding new language for (1) study design and interpretation,  (2)
quantification of human data, and (3) weight-of-evidence analyses for human
data.

    The workshop participants met both  in plenary sessions and  in separate
work groups to consider "strawman" language for potential inclusion in  revised
guidelines for carcinogen risk assessment.  They also addressed related
questions posed by the EPA Technical Panel.  The work group on  study design
issues was chaired by Dr. Marilyn Fingerhut, Chief,  Industrywide Studies
Branch, at the National Institute for Occupational Safety and Health  (NIOSH).
Dr. Philip Enterline, Emeritus Professor of Biostatistics at  the University of
Pittsburgh School of  Public Health, chaired the work group  on dose-response
issues.  The work group on weight of evidence  classification  issues was
chaired by Dr. Raymond Neutra, Chief of the Epidemiologic Studies Section of
the California Department of  Health Services.  Dr. Enterline  was the  overall
chair  for  the workshop.  The  strawman  language and questions  were developed by
a subcommittee of  the EPA Technical Panel.  These documents were intended to
 initiate  and guide work  group discussions  rather  than  to  formally propose
 specific  language  or  policy.   Members  of  the  EPA  Technical  Panel also
participated in  each  work  group.   Other EPA scientific  staff  and the  public
 attended the workshop as  observers.
                                       -3-

    As a scientific forum for objective discussion and analysis among the
invited panelists, the workshop was designed to assist EPA epidemiologists and
scientists in developing the scientific foundation for proposed guidance on
the use of human evidence in risk assessment.  Broader policy issues will
become important later in the process when the public is invited to review any
proposed changes in the guidelines.
                                       -4-

                         EPA CANCER GUIDELINES REVIEW

                          WORKSHOP ON HUMAN EVIDENCE

                               June 26-27, 1989

                       AGENDA AND WORKGROUP ASSIGNMENTS

                        Chairman, Dr. Philip Enterline


Monday, June 26

Time                 Topic                           Principals

7:30 a.m.            Registration Check-in           All

8:30 a.m.            Welcome                         Dr. Patton

8:35 a.m.            Opening Comments                Dr. Enterline

8:50 a.m.            Public Interest Views           Dr. Neutra and Panelists

9:20 a.m.            Private Sector Views            Dr. Ott and Panelists

9:50 a.m.            Administrative Announcements    Ms. Schalk

10-10:20 a.m.        COFFEE BREAK

10:20 a.m.           Observer Comments

11:30 a.m.           Charge to Work Groups           Dr. Farland

12:00-1:15 p.m.      LUNCH

1:15 p.m.            Workgroups
                     A:  Study Design and            Dr. Fingerhut, Chair
                         Interpretation
                     B:  Weight of Evidence          Dr. Neutra, Chair
                     C:  Dose Response               Dr. Enterline, Chair

3:15-3:30 p.m.       COFFEE BREAK

3:30 p.m.            Workgroups
                     A & B                           Dr. Neutra & Dr. Fingerhut
                     C                               Dr. Enterline

5:30 p.m.            Adjourn

5:30-7:00 p.m.       Cash Bar Reception
                                        -5-

Tuesday, June 27

Time                 Topic                           Principals

8:00 a.m.            Workgroup Reports and           Workgroup Chairs
                     Discussion                      (Drs. Enterline, Fingerhut
                                                     & Neutra); Panelists

10:15-10:30 a.m.     BREAK

10:30 a.m.           Observer Comments and
                     Discussion

11:30 a.m.           Workgroup Recommendations       Drs. Fingerhut, Neutra,
                                                     Enterline

12:15 p.m.           Wrap-up                         Dr. Enterline

12:30 p.m.           ADJOURNMENT


                            WORKGROUP ASSIGNMENTS

Workgroup A:  Study Design and Interpretation

Chair:        Dr. Fingerhut

Members:      Drs. Matanoski, Hulka, Buffler, Cantor, Friedlander,
              D. Hill, Halperin, Hogan, Koppikar

Workgroup B:  Weight of Evidence

Chair:        Dr. Neutra

Members:      Drs. Cole, Ott, Blair, Falk, Infante, M. Chu, Bayliss,
              Blondell, Margosches

Workgroup C:  Dose-Response

Chair:        Dr. Enterline

Members:      Drs. Crump, Checkoway, Gibb, Raabe, Smith, Krewski, Chen,
              Nelson, K. Chu, Farland, Scott

EPA Technical Panel:  Drs. Farland, R. Hill, Patton, M. Chu, Rhomberg, Wiltse,
                      Rees, Gibb, Bayliss, Blondell, Chen, D. Hill, Hogan,
                      Margosches, Nelson, Scott

Risk Assessment Forum Staff:  Drs. Patton, Rees
                                  -6-

   COLLECTED WORKSHOP MATERIALS


Study Design and Interpretation

    Strawman Language and Related Questions

    Chair Summary of Work Group Session

EPA Classification System  for  Categorizing Weight
of Evidence for Carcinogenicity from Human Studies

    Strawman Language and Related Questions

    Chair Summary of Work Group Session

Dose-Response Assessment

    Strawman Language and Related Questions

    Chair Summary of Work Group Session

                        STUDY DESIGN AND  INTERPRETATION

                    Strawman Language and Related Questions

Introduction and Study Types1

      Epidemiologic studies provide unique information about the response of
humans who have been exposed to suspect carcinogens.  These studies allow the
possible evaluation of the consequences of an environmental exposure in the
precise manner in which it occurs and will continue to occur in human
populations (OSTP, 1985).  There are various types of studies or study designs
that are well-described and defined in various textbooks and other documents
(e.g., Breslow et al., 1980, 1987; Kelsey et al.,  1986; Lilienfeld et al.,
1979; Mausner et al., 1985; Rothman, 1986).  The more common types are
described below.

      A variety of study designs are considered to be hypothesis-generating.
In general, these studies utilize already existing collections of data (e.g.,
vital statistics, census data), but only produce indirect associations,
because they are based on broadly defined group or population characteristics.
Studies depending on case reports typically are also considered hypothesis-
generating, because  the relatively limited numbers of cases and the absence of
comparison groups generally do not permit causal inferences.  Generally cross-
sectional studies are hypothesis-generating.  Sometimes a population may be
well enough followed or restricted that bias is unlikely to arise from
migration, mortality or similar removal  from the observed group; a cross-
sectional or prevalence study may then offer some sort of risk estimate.

      Epidemiologic  studies designed to  test a  specific hypothesis, such as
case-control and  cohort studies, are more useful in assessing risks to exposed
humans.  These  studies  examine the characteristics of  individuals within a
      1 Editor's Note:  Unless otherwise noted, paragraph numbers refer to the
 Human Studies section of the 1986  Guidelines  for  Carcinogen Risk Assessment in
 Appendix E of this document.
                                      -9-

population.  Case-control studies can provide reasonable estimates of
population-based risk when controls are properly chosen, while cohort designs
have the best capability to provide accurate estimates of population-based
risk.  Under certain circumstances, case-report studies may support causal
associations, and prevalence studies may provide population-based risks.
      Issues:

      a.    We have noted that use of the "descriptive/analytical"
            characterization of the array of study designs may provoke
            classification disagreements, detracting from the desired focus on
            which designs have what utility for use in risk assessment.  For
            that reason, two sentences previously in the Guidelines have been
            omitted.  Does the Panel believe they should be restored or
            supplied in some other fashion, or does the current text provide
            sufficient discussion?

      b.    Should PMR studies or clusters be specifically addressed?  Where
            do they fit in?

      c.    Should the guidelines point out that studies designed specifically
            to test hypotheses also can generate other hypotheses, but that
            the distinction between these types of information should be
            maintained?
Adequacy (to replace current paragraph 2)


      Criteria for the adequacy of epidemiologic studies for risk assessment
purposes include, but are not limited to, factors which depend on the study
design and conduct:


      1.    The proper selection and characterization of study and comparison
            cases or groups.

      2.    The adequacy of response rates and methodology for handling
            missing data.

      3.    Clear and appropriate methodology for data collection and
            analysis.

      4.    The proper identification and characterization of confounding
            factors and bias.
                                     -10-

      5.     The appropriate consideration of latency effects.
      6.     The valid ascertainment of the causes of morbidity and death.
      7.     Complete and clear documentation of results.

      For studies claiming to show no evidence of human carcinogenicity
associated with an exposure, the statistical power to detect an appropriate
outcome should be included in the assessment, if it can be calculated.  It
should be noted that sufficient statistical power alone does not determine the
adequacy of a study.
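
      Editor's note:  The short Python sketch below is offered only as an
illustration of the kind of power calculation mentioned above; the Poisson
model, the function name, and the example numbers are assumptions made for
this illustration and are not part of the strawman language.

      from scipy.stats import poisson

      def smr_study_power(expected_cases, true_rr, alpha=0.05):
          """One-sided power of a cohort or SMR study to detect a true
          relative risk 'true_rr' when 'expected_cases' cases are expected
          under background rates."""
          # Smallest observed count that would be declared significant at
          # level alpha under the null hypothesis (Poisson mean = expected).
          critical = poisson.ppf(1.0 - alpha, expected_cases) + 1
          # Power: probability of reaching that count when the true mean is
          # true_rr * expected_cases.
          return float(poisson.sf(critical - 1, true_rr * expected_cases))

      # Example: a cohort expecting 10 cases under background rates, evaluated
      # for its ability to detect a doubling of risk.
      print(smr_study_power(expected_cases=10, true_rr=2.0))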

      Although not unique to human studies, it is important to reiterate that
sufficient and thorough evaluation of suspect carcinogens requires that
evidence be available in a  form, quality, and quantity suitable for
assessment.  In  some cases, the availability of  and  access to raw data may be
important.  Guidelines for reporting epidemiological research results have
been previously published (IRLG, 1981; others?).

       Issues:
       a.    Application of various criteria is dependent on study type; most
            listed here usually apply to case-control  and cohort  studies.   Is
            this a problem?
       b.    Should we  address rationales for combining sites  and  tumor  types
             in this section,  as done in the animal  section?
       c.    Given the  discussion in the weight-of-evidence  section,  do  we  need
            much more  here on what constitutes an adequate  study?
       d.    Any suggestions for appropriate citations?
       e.     Is there too much emphasis on statistical power?


 Criteria for Causality (paragraphs 3 and 4 deleted: new text)

      Epidemiologic data are often used to infer causal relationships.  Many
forms of cancer are considered causally related to exposure to agents for which
there is no direct biological evidence, most notably cigarette smoking and
                                       -11-

lung cancer.  Because insufficient knowledge about the biological basis for
disease in humans makes it difficult to classify exposure to an agent as
causal, epidemiologists and biologists have developed a set of criteria for
judging whether an observed association is likely to be causal.  A causal
interpretation is enhanced for
studies that meet the following criteria.  None of these criteria actually
proves causality; actual proof is rarely attainable when dealing with
environmental carcinogens.  The absence of any one or even several of these
criteria does not prevent a causal interpretation; none of these criteria
should be considered either necessary or sufficient in itself.

      Criteria for causality are:

1.  Consistency:  Several independent studies of the same exposure in
different populations, all demonstrating an association which persists despite
differing circumstances, usually constitute strong evidence for a causal
interpretation (assuming the same bias or confounding is not also duplicated
across studies).  This criterion also applies if the association occurs
consistently for  different subgroups in the same study.

      Issue: Diverse responses from similar populations (races or species)
      lend weight to human conclusions but seem to detract from animal ones.
      How should we address this inconsistency?
2.  Strength (magnitude) of association:   The greater the estimate of the
risk of cancer due to exposure to the agent, the more credible will be a
causal interpretation.  It is less likely that nonrandom error (e.g.,  bias or
some confounding variable) or chance can explain the association because these
factors themselves have to be highly associated with the disease.  A weak
association might be more readily explained by the presence of chance or bias.
      Issues:
      a.    Should we provide a guideline value, such as a relative risk of
            5.0, since magnitude of association can also depend on, for
            instance, the variety and range of magnitudes of exposure present
            and the rarity of the cancer?
                                    -12-

      b.    Should the example of smoking-alcohol-esophageal cancer be
            considered here?

3.  Temporal relationship:   The disease occurs within a reasonable enough
time frame after the initial exposure to account for the health effect.
Cancer requires a latent period during which transformation of neoplasia into
malignancy occurs and a period of time passes before discovery.  While latency
periods vary, existence of the period is acknowledged.  Since the time of
transformation is seldom known, however, the initial period of exposure to the
agent is the accepted starting point in most epidemiologic studies.

4.  Dose-response or biologic gradient:   An increase in the measure of effect
is correlated positively with an increase in the exposure or estimated dose.
A strong dose-response relationship across several categories of exposure can
be considered to be evidence for causality if confounding effects are unlikely
to be correlated with exposure levels.  The absence of a dose-response
gradient, however, may mean only that the maximum effect had already occurred
at the lowest dose or perhaps all gradients of exposure were too low to
produce a measurable effect.  The absence of a dose-response relationship
should not be construed as evidence of a lack of a causal relationship.

5.  Specificity of the association:   If a single, clearly-defined exposure is
associated with an excess risk of one or more site-specific cancers, while
other sites show no association, it increases the likelihood of a causal
interpretation.  Different agents, however, may be responsible for more than
one site-specific cancer.  Replication of the specific association(s) in
different population groups (cf. consistency) would then be needed to provide
strong support for a causal interpretation.

      Issue: Shall we retain this last sentence?  A comment has been made that
      specific locations  (i.e., microenvironments) influence expression of
      cancer.

      In some cases, conclusions regarding an association may be based on a
mixture of chemicals rather than the specific chemical in question.  In these
cases, judgment on the causal relationship to the specific chemical will
                                      -13-

depend on such other information as the pharmacokinetics of the chemical or
other biologic or epidemiologic data.


      Issue:  Shall we include: In some instances, it may be concluded that
      only the mixture can be held culpable, e.g., in the process to produce
      benzyl chloride.

6.  Biological plausibility:   The association makes sense in terms of what is
known about the biologic mechanisms of the disease or other epidemiologic
knowledge.  It is not inconsistent with biological knowledge about how the
exposure under study could produce the cancer.


7.  Collateral evidence:   A cause-and-effect interpretation is consistent
with what is known about the natural history and biology of the disease.  A
proposed association that conflicted with existing knowledge would have to be
examined with particular care.

References

Breslow, N.E. and Day, N.E.  Statistical Methods in Cancer Research, Vol.  1.
      The Analysis of Case-Control Data. 1980.

Breslow, N.E. and Day, N.E.  Statistical Methods in Cancer Research, Vol.  2.
      The Design and Analysis of Cohort Studies. 1987.

Interagency Regulatory Liaison Group (IRLG).  Guidelines for documentation of
      epidemiologic studies.  American Journal of Epidemiology. 114:609-613.
      1981.

Kelsey, J.L., Thompson, W.D., and Evans, A.S.  Methods in Observational
      Epidemiology.  1986.

Lilienfeld, A.M. and Lilienfeld, D.  Foundations of Epidemiology, 2nd ed.
      1979.

Mausner, J.S. and Kramer, S. Epidemiology, 2nd ed. 1985.

Office of Science and Technology Policy (OSTP).  Chemical carcinogens:  review
      of the science and its associated principles. Federal Register 50:10372-
      10442.  1985.

Rothman, K.J.  Modern Epidemiology. 1986.
                                      -14-

                    Chair Summary of Work Group Session on
                        Study Design and Interpretation
                         Chair:  Dr. Marilyn Fingerhut

                                 Introduction

    The Study Design Work Group focused on three questions contained in the
strawman language suggested by the EPA staff as a preliminary revision of
Section II.B.7:  1)  What types of epidemiologic studies are acceptable to the
EPA for risk assessment?  2) What characteristics are desirable in a study to
be used for risk assessment?  and 3) What criteria strengthen the view that an
epidemiologic association may reflect a causal relationship?

    The Study Design Work Group and the Weight-of-Evidence Work Group met
together to consider the first question and to agree upon the types of studies
to be discussed in each Work Group.  The members of the Study Design Group
discussed Questions 2 and 3.

    The sections below briefly describe the discussions pertaining to the
three questions, and identify the recommendations and suggestions made to EPA.
The sections contain revised strawman language for Section II.B.7, which
reflects the ideas suggested by  the  Study Design Work Group.  This chair's
summary presents the work group's views within the context of the strawman
document.

           Question 1:  What types of human studies are  acceptable to
                   the EPA  for purposes  of risk assessment?

    The members  of both  the Study Design and  Weight-of-Evidence groups
discussed  this question  in  some  detail  and concluded  that all valid
epidemiologic  studies can contribute information  to an EPA risk assessment.
Consequently,  the  panelists rejected suggestions  by a few members to weight
certain study  types  more heavily than others.   There  was  general  agreement
that  the various types  of epidemiologic studies,  properly conducted, could be
                                       -15-

useful.  These include cohort, case-control, cross-sectional,  proportional
mortality (incidence) ratios, clusters, clinical trials, and correlational
studies.  Each type has strengths and limitations.  It was agreed that "case
reports" do not constitute studies, but that some of these should be reviewed
by EPA during a risk assessment effort, because series of case reports have
provided key information about human risk for several chemicals.   Vinyl
chloride was one example cited.

    Both groups strongly recommended that EPA obtain additional experienced
epidemiologists to evaluate epidemiologic data and to assist in risk
assessments, because professionally sophisticated judgments are required when
evaluating the studies.

    The Study Design Group reviewed the EPA strawman language suggested as a
replacement for the current paragraph 1 of Section II.B.7 Human Studies.   The
group generally agreed that the proposed revision not be used.  The group
suggested a brief replacement paragraph:
    Introduction and Study Types (to replace the current paragraph 1)

    Epidemiologic studies with various study designs can provide unique
    information about the response of humans who have been exposed to suspect
    carcinogens.  Each study must be evaluated for its individual strengths
    and limitations.  Conclusions about causal associations usually also
    include consideration of the entire body of literature, including
    toxicology and biologic mechanisms.

    The Study Design Work Group suggested that guidelines be written for use
by experienced epidemiologists.  Therefore, the following responses were given
to the questions posed by the EPA on page 2 of the strawman text (p. 9 of this
document).  There is no need to distinguish studies as analytical vs.
descriptive, or hypothesis-generating vs. hypothesis-testing, or complete vs.
incomplete (as suggested by one member of the group) because experienced
                                     -16-

epidemiologists,  who are aware of strengths and limitations of the various
study designs, will judge studies by their inherent validity and applicability
to the particular risk assessment.  For this reason, there is no need to
specifically address proportional mortality ratios (PMRs) or clusters,  or to
address the distinction between hypothesis-generating and hypothesis-testing
in the guidelines.

    The Study Design Work Group recognized that it may be desirable to provide
in the guidelines an overview of epidemiologic principles and study types.
The information would be useful for professionals trained in other
disciplines.  The information could also explain to the public how the EPA
uses human studies in risk assessment.  The group suggested that this overview
of epidemiology might be placed in an appendix.

    The group suggested that EPA continue to provide epidemiologic training to
nonepidemiologists in the Agency who are involved with the risk assessment
activities.  However, the key judgments on epidemiologic studies should be
made by experienced epidemiologists.  Upon learning that EPA has very few
epidemiologists on staff, the group recommended expanding this expertise in
the Agency.  A few of the members of the combined Study Design and
Weight-of-Evidence Groups suggested that EPA might wish to consider using, for
risk assessment,  the approach of the International Agency for Research on
Cancer (IARC) in which the entire evaluation of human data is conducted by
expert epidemiologists, and is thus free from political interference.  Other
participants observed that political concerns may influence any such group.
They suggested that the regular use of EPA staff provides an objective
approach to risk assessment.  There was some discussion but no agreement in
the groups about this point.
                                      -17-

          Question 2:  What characteristics are desirable in a study
                           used for  risk  assessment?

    The EPA had provided strawman language to replace paragraph 2 of Section
II.B.7 Human Studies, which focused on the question of "criteria for adequacy
of epidemiologic studies for risk assessment purposes."  Because the members
of the group generally agreed with the view that all valid epidemiologic
studies may contribute information to a risk assessment,  discussion by the
Study Design Work Group led to substitution of a different question:  "What
characteristics are desirable in a study used for risk assessment?"  Since
each type of study has particular characteristics, strengths, and limitations,
the group suggested revising the EPA strawman language for paragraph 2 to
describe characteristics desirable (rather than required) for risk assessment.
Several new characteristics were added to those identified in the EPA version.

    The following suggested revision is a restatement of ideas from the Group
and should not be considered a polished or finished revision.

    Adequacy (to replace current paragraph 2)

    Criteria for the adequacy of epidemiologic studies are well recognized.
    Considerations made for risk assessment should recognize the
    characteristics, strengths, and limitations of the various epidemiologic
    study designs.  Characteristics which are desirable in the epidemiologic
    studies are listed here.

    1.   Relevance
        - The study deals with the exposure-response relationship central to
          the risk assessment.

    2.   Adequate Exposure Assessment
        - Study subjects have exposure.
        - Analysis deals with time-related measures as far as study type
          permits, e.g., duration, intensity, age at first exposure, etc.
                                      -18-

   3.  Proper Selection and Characterization of Study and Comparison groups
       - Selection and characterization are carefully described.
       - Source population is appropriate.
       - Results are generalizable to populations to be protected by the
         risk assessment.

   4.  Identification of a Priori Hypotheses

   5.  Adequate Sample Size

   6.  Adequate Response Rates  and Methodology for Handling Missing Data

   7.   Clear and Appropriate Methodology  for Data Collection and Analysis

   8.   Proper  Identification and Characterization of  Confounding and Bias

   9.   Appropriate Consideration of  Latency Effects

   10.   Valid Ascertainment of  Causes of Morbidity and Death

   11.   Complete and Clear Documentation of Results

    The panelists recommended that EPA continue to actively seek available
unpublished studies, if an unpublished report  (or the documentation for a
published report) might contribute to the risk assessment process.

            Question 3:  What criteria strengthen the view that an
         epidemiologic association may reflect  a causal  relationship?

    Strawman language had been provided by EPA to substitute for paragraphs 3
and 4 of Section II.B.7 Human Studies.  The Study Design Work Group suggested
that the EPA staff  consider  rewriting  this text to express an historical
                                      -19-

approach, indicating that Koch's postulates were modified by Bradford Hill for
use in environmental studies, and that his criteria have been modified by EPA
for considerations relevant to risk assessment.

    The panelists were in general agreement on most points that are contained
in the suggested text below.  Some members suggested deleting "specificity."
They viewed it as misleading or incorrect, based upon the view that most
agents are observed to cause several effects.  However,  all agreed that as
expressed below, it is a useful criterion when it is present.

    The panelists agreed that only one criterion (temporal relationship) was
essential for causality.   The presence of other criteria may increase the
credibility of a causal association,  but their absence does not prevent a
causal interpretation.   The panelists viewed all but specificity and coherence
as applicable to an individual study.

    The panelists'  ideas  for a suggested revision follow:

    Criteria for Causality (paragraphs 3 and 4 deleted:  new text).

    Epidemiologic data are often used to infer causal relationship.   A causal
    interpretation is enhanced for studies to the extent that  they  meet the
    criteria described below.   None of these actually establishes causality;
    actual proof is rarely attainable when dealing with  environmental
    carcinogens.   The absence of any  one or even several of the  others does
    not prevent a causal  interpretation.   Only the first criterion  (temporal
    relationship)  is essential to a causal relationship:   with that exception,
    none of the criteria  should be considered as  either  necessary or
    sufficient in itself.   The first  six criteria apply  to an  individual
    study.   The last criterion (coherence)  applies  to a  consideration of all
    evidence in the entire body of knowledge.

    1.  Temporal relationship:  This is the single absolute requirement, which
       itself does not prove causality, but which must be present if causality
                                     -20-

   is to be considered.   The disease occurs within a biologically
   reasonable time frame after the initial exposure to account for the
   specific health effect.  Cancers require certain latency periods.
   While latency periods vary, existence of the period is acknowledged.
   The initial period of exposure to the agent is the accepted starting
   point in most epidemiologic studies.

2. Consistency:  When compared to several independent studies of a similar
   exposure in different populations, the study in question demonstrates a
   similar association which persists despite differing circumstances.
   This usually constitutes strong evidence for a causal interpretation
   (assuming that the same bias or confounding is not also duplicated
   across studies).  This criterion also applies if the association occurs
   consistently for different subgroups in the same study.

3. Strength (magnitude) of association:  The greater the estimate of risk
   and the more precise (narrow confidence limits), the more credible the
   causal association.

4. Dose-response or biologic  gradient:  An increase, in the measure of
   effect is correlated positively with an increase in the exposure or
   estimated dose.  A strong  dose-response relationship across several
    categories of exposure, latency, and duration is supportive although
   not conclusive for causality, assuming confounding effects are unlikely
   to be correlated with exposure levels.  The absence of a dose-response
   gradient, however, may be  explained in many ways.  For example, it may
   mean only that the maximum effect had already occurred at the lowest
   dose, or perhaps all gradients of exposure were too low to produce a
   measurable effect.  If present, this characteristic should be weighted
   heavily in considering causality.  However, the absence of a
   dose-response relationship should not be construed by itself as
   evidence of a lack of a causal relationship.
                                 -21-

    5. Specificity of the association:  In the study in question,  if a single
       exposure is associated with an excess risk of one or more cancers also
       found in other studies,  it increases the likelihood of a causal
        interpretation.  Most known agents, however, are responsible for more
       than one site-specific cancer.  Therefore, if this characteristic is
       present, it is useful.  However, its absence is uninformative.

    6. Biological plausibility:  The association makes sense in terms of
       biological knowledge.  Information from toxicology, pharmacokinetics,
       genotoxicity,  and in vitro studies should be considered.

    7. Coherence:  This characteristic is used to evaluate the entire body of
       knowledge about the chemical in question.  Coherence exists when a
       cause-and-effect interpretation is in logical agreement with what is
       known about the natural history and biology of the disease.   A proposed
       association that conflicted with existing knowledge would have to be
       examined with particular care.

    In a joint session of the Study Design and Weight-of-Evidence Groups at
the end of the meeting, some panelists noted the desirability of having
epidemiologic data available for use in risk assessment at the time that
animal studies are completed by the National Toxicology Program (NTP).   A
suggestion was made by some that EPA consider undertaking an effort to assess
the feasibility of conducting a human epidemiologic study at the same time the
Agency recommends that NTP undertake an animal study.  There was only limited
discussion of this point.  Some panelists objected to it, mainly because of
logistic difficulties.
                                     -22-

       EPA CLASSIFICATION  SYSTEM  FOR  CATEGORIZING WEIGHT OF EVIDENCE FOR
                       CARCINOGENICITY FROM HUMAN STUDIES

                    Strawman Language and Related Questions

Assessment of Weight of Evidence for Carcinogenicity from Studies in Humans

    There are a variety of sources of human data.   When the totality of human
evidence  is considered, the conditions under which the information has been
collected are of importance in defining the limits of its inference.   These
limits are particularly critical for studies where no positive results have
been seen, although they contribute  to conclusions in all circumstances.

     In the evaluation of carcinogenicity based on epidemiologic studies it is
necessary to consider the roles  of extraneous factors such as bias and other
nonrandom error and chance  (random error) and how they might affect evaluation
and estimates of an agent's effects.  Some extraneous factors of concern are
selection bias, information bias, and confounding.  Five classifications of
human  evidence are established in this section.  The following discussion
includes  some interpretation and illustration of their use.

1.  The category of sufficient implies the existence of a causal relationship
between the exposure in question and an elevation of cancer risk.  Most if not
all of the criteria for causality as defined in Section II.B.7 should be
met.   Most agents or mixtures falling into this category would require at
least  one methodologically  sound epidemiologic study meeting most of the
criteria  for causality and whose results  cannot be explained by chance, bias,
or confounding.
                                       -23-

    Issues:
    a.   If one such study is available,  would others be needed as
         confirmatory?

    b.   Is it necessary to specify what are "most criteria"?

Sometimes a case series will present data that drive a causal conclusion.

    Issues:
         Are supporting studies needed?  Language that might serve is:  One or
         more supporting epidemiologic studies that also demonstrate a
         relationship between the exposure and cancer should be available.
         The latter studies need not be definitive by themselves although the
         stronger they are in terms of their validity the more credible will
         be a "sufficient" categorization of the epidemiologic data.

         Sometimes studied populations will differ only in cumulative dose or
         in dose rate.  Should a conclusion of carcinogenicity be limited to
         circumstances of exposure?

         Corollary:  Shall we include discussion of the evaluation of a body
         of studies where some show effects at one site and some show them at
         another or where some are of different ethnic or geographic groups or
         where there are age and sex differences?

         The Agency receives studies, some by statute, from a variety of
         sources, that may not have appeared in the open or peer-reviewed
         literature.  Should comment be made regarding our intent to use such
         studies?
2.  The category of limited implies that a causal interpretation is more
credible than nonrandom error, although it cannot be entirely ruled out as an
explanation for the statistically significant positive association found in
one or more epidemiologic studies.  Such studies would typically include
a vigorous effort by the author or be carefully reviewed by the Agency to
explain why nonrandom error (confounding, information bias, etc.) is unlikely
to account for the association.

    Also included in the limited category are agents for which the evidence
consists of some number of independent studies exhibiting statistically
significant positive associations between the exposure and the same site-
                                      -24-

specific cancer but for which nonrandom error could not be ruled out entirely
as the explanation for the association in each study.  This category may also
include substances for which a series of epidemiologic studies (some number of
which must be considered valid) exhibit apparent but not significant positive
associations for the same site-specific cancer without any series of valid
studies in which there is apparent lack of association to counter the observed
association.

    Issues:
    a.   How many studies would support each conclusion?
    b.   Is it necessary for responses in a series of studies to be specific
         to site in order to fall into the limited category?

3.  The category of inadequate implies that the data, although perhaps
suggestive, do not meet the criteria for a limited categorization of the
evidence.   This would include studies that demonstrate statistically
significant positive associations that could be explained by the presence of
nonrandom error and which are not specific with respect to site.   Also
included in this category are studies deemed of insufficient quality or
statistical power, and where there is no confidence in any particular
interpretation.  For example, results may be consistent with a chance effect,
or exposure may not clearly be tied to the agent in question.  Alternatively,
the manner in which a study is reported may render it incapable of being
evaluated owing to insufficient documentation.
    Issue:  Is it proper to modify or downgrade a category by such language as
    "Studies showing no positive results can be used to lower the
    classification from limited to inadequate . .  .  " ?
    Possible Choices to Complete This Statement Might Be:
       "only if the exact same conditions (including sensitivity) have been
       replicated in a statistically significantly positive study of the kind
       described under this category.  The results would thus be contradictory
       and the net effect of the latter study would be to  negate the findings
       of the former."
                                      -25-

              or

       "if they are at least as likely to detect an effect as an already-
       completed study providing limited evidence."

       Corollary:  How explicit should we be?


4.  The category of no data indicates no data are available directly regarding
humans.

5.  The category of evidence of not being a carcinogen in humans is reserved
for circumstances in which the body of evidence indicates that no association
exists between the suspected agent and an increased cancer risk.  It should be
recognized that alterations in the conditions under which a study is done may
lead to statistically significant risk estimates where they did not exist
before.  Studies of uncertain quality with no positive results should not be
used to reduce the weight of evidence.


    Issues:
    a.   We are reluctant to use the word "negative" because it has come to
         mean a variety  of things including  (1) a  study with no cases of
         cancer,  (2) a study judged statistically  to have no excess cases of
         cancer, and (3) a study that leads  readers to believe there is no
         need for concern about carcinogenicity.   We do not wish to perpetuate
         the misuse of the term "negative" when referring to certain
         epidemiologic studies.   Have we adequately described the
         circumstances under which we would  conclude that the body of
         epidemiologic evidence suggests an  agent  is not a carcinogen?

    b.   Do we need this category?  Will it  ever be used?
                                      -26-

         Chair Summary of Work Group Session on Classification System
            for Categorizing Weight of Evidence for Carcinogenicity
                              from Human Studies
                         Chair:   Dr.  Raymond R.  Neutra
I.   INTRODUCTORY DISCUSSIONS
     Weighing evidence refers to the act of reviewing and summarizing human
evidence ranging from case studies to randomized trials.  Evidence suggesting
positive, null, or even protective effects is considered while taking note of
the quality of each piece of information.  The group seemed to advocate a
procedure that considers all study results regardless of direction, rather
than only positive studies; in other words, a "weight of evidence" approach
rather than a "strength of evidence" approach.

     One workshop participant pointed out that those who review human evidence
assign some informal prior probability to the hypothesis that the substance
under investigation causes cancer in humans.  Without advocating formal
Bayesian statistical procedures, it should be noted that this "prior
probability" is influenced by the nature of the substance, information on its
metabolism, and behavior in short-term tests.  The group decided that results
of animal bioassays or subchronic tests should not influence judgments on the
prior probability or in the interpretation of the human studies, since a
separate process in EPA deals with the weight of animal evidence.  In a
subsequent process, the two streams of evidence will be combined by scientists
to give a final "posterior" characterization of the evidence.

     There was a discussion of nomenclature.   It was agreed that the adjective
"negative" should be avoided, as it is ambiguous.  It has been used to mean
"bad," "protective effect," "no effect," or "absent."  -For the purposes of the
work group it was agreed that "null" would be used for a study that had a
relative risk close to 1.0 with confidence limits which included 1,. 0, and that
"inverse association" is the appropriate terminology for a study that showed a
                                      -27-

relative risk less than 1.0 and confidence limits which did not include 1.0.
A positive study is one with a relative risk greater than 1.0 with confidence
limits which do not include one.
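
     Editor's note:  As a purely illustrative restatement of the working
definitions above (the function name and example values are assumptions made
for this illustration, not part of the work group's discussion), a single
study result could be labeled as follows:

      def classify_study(ci_lower, ci_upper):
          """Label a single study by the confidence limits of its relative
          risk, using the work group's working terminology."""
          if ci_lower > 1.0:
              return "positive"              # RR > 1.0, interval excludes 1.0
          if ci_upper < 1.0:
              return "inverse association"   # RR < 1.0, interval excludes 1.0
          return "null"                      # interval includes 1.0

      print(classify_study(1.2, 2.7))   # positive
      print(classify_study(0.8, 1.5))   # null
      print(classify_study(0.4, 0.9))   # inverse association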

     The group recognized that the evaluation of a body of evidence depends
on, but is different from, the process of evaluating individual studies.  The
latter process was discussed by the Study Design and Evaluation work group,
which listed a series of study characteristics and criteria for likely
causality that should be considered in characterizing a single study.  In the
process of evaluating the evidence from a single study, one of the following
categories might be designated:  clear, some, equivocal, or no evidence of
human carcinogenicity.  The study design group felt that it was not possible
or desirable to have a rigid algorithm for making this categorical
determination on a particular study.  For the purposes of discussion in the
weight-of-evidence work group, these terms were used even though definitions
were not developed by that group.  Similarly the weight-of-evidence group was
not in favor of a rigid algorithm for combining evidence among individual
studies.  Instead, it was suggested that a review of all the studies by a
qualified group could lead to an ordinal classification.  It was agreed that
assigning a ratio scale numerical score of evidentiary sufficiency would not
be helpful since the Agency would need to categorize the score anyway for
action purposes.  An artificial numerical score might well complicate rather
than simplify the regulatory process.

     The general consensus of the workshop was to avoid too narrowly defined
guidelines, e.g., the use of specific numerical standards in defining "tight
confidence limits."

     It was further noted that overly specific guidelines had a number of
drawbacks.  First, they could never capture all conceivable contingencies  and
thus would be a Procrustean bed.  Second, a cookbook could be inappropriately
used.  This is a particular problem in an agency such as EPA where the
prevalence of epidemiologists is below 1/1000  (15 total in this organization
of 16,000 employees).  The group encouraged EPA to increase the number  of
                                      -28-

epidemiologists on their staff and to have continuing education for
epidemiologists and others to foster interdisciplinary work.

     One participant urged that an IARC-like process using external experts
should be employed for weighing evidence.  A number of drawbacks were pointed
out.

     There was discussion about the methods that EPA should employ in
summarizing bodies of evidence.  For example, should there be an appendix
presenting techniques for meta analysis and its graphical presentation?  No
consensus emerged because of concern that these techniques could be misused.
II.  PROPOSED MODIFICATION OF THE STRAWMAN CATEGORIES FOR WEIGHED EVIDENCE

     The strawman language suggested the categories of Sufficient, Limited,
Inadequate, No Human Data, Evidence of Not Being a Carcinogen in Humans.

     The work group suggested the categories:  Sufficient Evidence for Human
Carcinogenicity, Limited Evidence for Human Carcinogenicity, Inconclusive
Evidence for Human Carcinogenicity, No Human Data.  There was no consensus and
considerable argumentation about two additional possible categories:  Human
Evidence Not Suggestive of Carcinogenicity and Sufficient Evidence for Lack of
Human  Carcinogenicity.  The work group's understanding of the categories and
its  responses to specific strawman issues raised by EPA staff in each
respective  section are dealt with below.
                                      -29-

III.   SPECIFIC  COMMENTS ON EACH EVIDENTIARY CATEGORY

1.  Sufficient Evidence for Human Carcinogenicity

Issues A and B.  Required Number and Quality of Studies

     In some circumstances where the informal "prior probability" was high
and the study was particularly strong, the work group felt that a single study
with clear evidence could provide sufficient evidence for a substance.  In
most cases, there would probably be more than one study with clear evidence.
The group did not want to provide a cookbook to define the criteria for
sufficiency.

Issues A, B and C.  Need for Supportive Information and Peer Review

     The work group felt that only in the rarest circumstances would a case
series provide  sufficient evidence for human carcinogenicity,  e.g., vinyl
chloride.  They were reluctant, however, to provide a rigid algorithm which
always requires supporting studies.

     The work group did not wish to limit hazard identification to the dose
scenario covered in the epidemiological study.  That is dealt with during
dose-response assessment.  It also advised that a series of studies which show
an increased risk of cancer at a particular site should be given more weight
than a series showing an increased risk at various sites.  In the latter case,
the mechanism of causation should enter into the weighing process.  Also,
hazard identification should not be limited to the particular race, sex, or
age group covered by the epidemiological studies.

     Most work group members thought that unpublished studies  should be
considered in weighing evidence.  Two strong caveats were voiced.  There
should be deadlines for submission to prevent a continual stream of last-
minute submissions which delay the regulatory process indefinitely.  There
                                     -30-

should be regulatory peer review and perhaps a requirement that any journal
acceptance or rejection correspondence be submitted to the Agency.

2.  Limited Evidence For Human Carcinogenicity

Issues A and B.  Number of Required Studies and Site Specificity

     One or more studies providing "some evidence," even if there are some
"null" studies, will qualify for this classification.  Alternatively, a series
of positive equivocal evidence studies in the absence of any null studies
would qualify as well.  The work group did not have suggestions for an
algorithm to deal with these issues.
3.  Inconclusive Evidence For Human Carcinogenicity

     The work group preferred the term "inconclusive" to the term "inadequate"
because the latter implies poor quality evidence when in fact the studies may
be of good quality but contradictory.

     The evidence may gain this characterization under three contingencies:  1)
the evidence is a mixture of equivocal and null studies; 2) the evidence is a
mixture of imprecise null studies which do not add up to a precise null study;
or 3) the evidence does not meet the criteria for the other categories.

Issue A.  Modifying and Downgrading Categories

     The work group did not discuss exact wording to cover the situation in
which a positive study is followed by a null study so that a substance would
fall from the limited category into the inconclusive category.
                                     -31-

4.  No Human Data

     The category of No Data indicates no data are available directly
regarding humans.  The subcommittee had no comments on this self-evident
category.
5.   No Evidence for Human Carcinogenicity (Human Evidence Not
     Suggestive of Carcinogenicity)
     There was considerable discussion about the concept of this and the
following category and of the names which should properly apply to them.  It
should be kept in mind that this epidemiological categorization was to be
based on human studies and interpreted in the light of short-term studies and
mechanistic  insights, but not subchronic or chronic cancer animal bioassays.
The ultimate classification would weight the two streams of evidence.

     The sensitivity of this and the following category has to do with the
weight required for human studies to overcome sufficient animal evidence.  The
weight of evidence for a substance would fall into this category rather than
the "inconclusive" category if all the human studies had been null studies, yet
animal risk assessment would have predicted null studies at the human dose
delivered to the population size exposed, and there were possible mechanisms
of action which would predict a nonthreshold dose-response curve.  In this
case, a series of good null studies is simply not good enough to definitively
cancel out sufficient animal evidence.  Yet the evidence of a series of null
studies with individually tight confidence intervals, or tight intervals when
taken together, somehow warrants more than an "inconclusive" label.  The weight
of evidence regarding EDB is an example of this situation.  The substance is
genotoxic, and animal risk extrapolations would have predicted that the worker
studies carried out could not have detected the fairly small relative risks
expected from the doses received.
                                      -32-

-------
     There was some discussion about what it meant to be  "taken together."
Subjecting a series of small studies to a Mantel-Haenszel procedure was
suggested.  There were some technical objections,  but it  was agreed that
consensus might be found for some analogous procedure.
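     By way of illustration only, the following sketch (with hypothetical
counts that are not taken from any study discussed at the workshop) shows the
kind of "taken together" summary the group had in mind:  a Mantel-Haenszel
odds ratio pooled across several small case-control studies.

    # Illustrative sketch: Mantel-Haenszel pooled odds ratio across several
    # small hypothetical studies, each a 2x2 table of
    # [[exposed cases, unexposed cases], [exposed controls, unexposed controls]].
    import numpy as np

    studies = [                       # hypothetical counts, three small studies
        np.array([[12, 48], [15, 60]]),
        np.array([[ 8, 40], [10, 45]]),
        np.array([[20, 70], [22, 80]]),
    ]

    num = 0.0                         # sum over studies of a*d/n
    den = 0.0                         # sum over studies of b*c/n
    for table in studies:
        a, b = table[0]               # exposed cases, unexposed cases
        c, d = table[1]               # exposed controls, unexposed controls
        n = table.sum()
        num += a * d / n
        den += b * c / n

    print(f"Mantel-Haenszel pooled odds ratio: {num / den:.2f}")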
6.  Sufficient Evidence for Noncarcinogenicity in Humans

     There was no consensus on this classification.   Many members  felt that a
series of strong null studies with tight confidence limits would qualify if
coupled with a widely accepted mechanistic understanding that suggested that
the agent should not cause cancer at doses to which humans could be accidentally
exposed at work or in the environment.  This kind of mechanistic and
epidemiological evidence could cancel out a series of positive animal
bioassays for the purpose of hazard identification.

     A few others in the workshop pointed out that null studies could be used
to determine if humans were substantially less sensitive than animals in the
dose-response stage of risk assessment.  In the hazard identification stage,
however, only a much more stringent criterion was appropriate.  The confidence
limits around the null value needed to be so tight that they excluded the
possibility of added risk of public health and regulatory concern.  One
proposal supported by two of the Workshop participants was that "risk of
regulatory concern" be quantified. '  ''<

     The majority who disagreed with this proposal seemed to have two views.
First, that it was unwise to tie the Agency's hands with a number which might
change and which was an issue of risk management.  Second, it was perceived
that no study would be able to rule out all risks of potential regulatory
concern.  Such power is not practically achievable, and even if it were, one
would not trust epidemiology's ability to control confounding sufficiently to
accurately assess relative risks so close to the null.  This stringent
requirement might also create disincentives for government and corporate
sponsors who fund epidemiological studies in the hope that they would give the
                                      -33-

-------
candidate chemical a "clean regulatory bill of health" in the hazard
identification phase.  Everyone recognized that it is not possible to prove an
absolute zero risk.

     The advocates of the more stringent definition,  e.g.,  to require
exclusion of all risks of regulatory concern, responded that this was exactly
the point.  Sponsors should give up the vain hope that null epidemiological
evidence could be used in the hazard identification process to get their
substances "off the list"1 when there  is  sufficient evidence  from animal
studies.

     Although epidemiology can rarely get a substance off the hazard
identification list once an animal study has put it there,  a series of good
quality null studies would put the substance in the Human Evidence Not
Suggestive of Carcinogenicity category.  If the human response was
considerably lower than that predicted from animal studies, this may lead to
higher tolerated industrial emissions of the substance, which in turn, may, in
some cases, have important economic implications.  Thus incentives exist for
carrying out epidemiological investigations even if these studies can rarely
be used to justify delisting a substance.
     Saccharin was cited as an example of a substance that should not get a
"clean bill of health."  A series of strong null human studies was still
easily compatible with the animal predictions of 800 extra cases per year in
the United States.  Although this is equivalent to a relative risk of 1.01,
small by epidemiological standards, one is hard pressed to exonerate a
substance with a study which does not have the power to exclude the very
number predicted by animal risk assessment.  It is for this reason that some
members in the subcommittee demanded a null study with the power to exclude
the low added lifetime risks of interest to regulatory agencies.  The wording
used by IARC for a similar category was proposed with the addition of a
sentence dealing with the need for power to exclude risks of regulatory
     1Editor's note:  The concept of a list arose during the work group
discussion and was not referred to in the strawman language.
                                      -34-

-------
interest.  This line of argument was unfamiliar and even irritating to many of
the epidemiologists present.
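     The relative risk of 1.01 quoted above for saccharin follows from simple
arithmetic.  In the sketch below the baseline count is only the value implied
by the figures in the text, not a statistic reported at the workshop.

    # Back-of-the-envelope check (the baseline is an assumption implied by the
    # quoted numbers, not a figure from the report): a predicted excess of 800
    # cases per year against a baseline of roughly 80,000 cases per year gives
    # a relative risk of about 1.01.
    excess = 800
    baseline = 80_000
    print(f"Implied relative risk: {1 + excess / baseline:.2f}")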

     The proponents for the category,  Sufficient Evidence for
Noncarcinogenicity in Humans,  responded that saccharin was a good candidate
for the category since there was experimental and mechanistic evidence that
bladder cancer in rats should only occur at high doses and that downward
extrapolation of risk to dietary levels in humans was not warranted.  There
was a question as to whether there was scientific consensus on this.

     Although most of the discussants did not question the use of widely
accepted mechanistic arguments separating man from animal, there were a few
concerns that there could always be other carcinogenic mechanisms that were
shared between humans and rodents that might still operate.  This is something
which needs to be examined carefully.

     A consensus did seem to emerge against the more permissive strawman
language which suggested that a series of unopposed null studies constituted
Sufficient Evidence for Noncarcinogenicity in Humans.

     The work group agreed with EPA staff about not using the word "negative"
because  of the many different interpretations that can be given to  this word.

     The discussion in the strawman document about how to interpret null
studies was not adequate and prompted the arguments outlined above.  The work
group did not come to a consensus about categories that dealt with null
studies.
                                      -35-

-------
IV.  MISCELLANEOUS OBSERVATIONS

1.  Control of Smoking

     When a series of studies shows an effect but smoking has not been
controlled for, this should not automatically disqualify the  studies for
consideration.  It should not always be assumed that controlling  for smoking
would weaken an observed chemical effect.  Highly exposed individuals may
smoke less.

2.  Multiple Exposures

     There was a discussion of the problem of concomitant  exposure  to other
chemicals in a series of studies.  One should determine if all of the studies
were characterized by exposure to the same set of chemicals.  If not, some of
the other chemicals could be ruled out as confounders.  If so, the participants
recommended following the IARC policy of implicating the process  as a whole.
3.  Carcinogenic Metabolites

     Sometimes a substance is metabolized to another substance that has
achieved some degree of evidentiary sufficiency even though the parent
compound has not been studied.  If the target organ of main exposure is the
same, the workshop members agreed that the parent compound should be
classified similarly to the metabolite.  As circumstances deviate from this
paradigm, more judgment will be needed in the classification process.
4.  Proper Use of Power Calculations

     The work group came to a consensus that it made no sense to  calculate the
power of the study after the fact.  Instead, one should inspect the confidence
limits to see if they include the expected effect.
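     A minimal sketch of this recommendation, using hypothetical cohort results
and an exact Poisson interval for a standardized mortality ratio (SMR), is
given below; it illustrates the principle and is not a method endorsed by the
work group.

    # Instead of post-hoc power, compute the confidence interval for the
    # observed SMR and ask whether it includes the effect size the study was
    # intended to detect.  All numbers are hypothetical.
    from scipy.stats import chi2

    observed = 14        # observed cancer deaths in the cohort
    expected = 12.0      # expected deaths from reference rates
    predicted_smr = 1.3  # effect size predicted in advance (e.g., from animal data)

    # Exact Poisson limits for the observed count, converted to SMR limits.
    lower = chi2.ppf(0.025, 2 * observed) / 2 / expected
    upper = chi2.ppf(0.975, 2 * (observed + 1)) / 2 / expected

    print(f"SMR = {observed / expected:.2f}, 95% CI ({lower:.2f}, {upper:.2f})")
    if lower <= predicted_smr <= upper:
        print("The interval includes the predicted effect; the study cannot rule it out.")
    else:
        print("The interval excludes the predicted effect.")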
                                     -36-

-------
5.  Epidemiologic Research Needs

     A number of participants suggested that EPA fund an epidemiological study
every time a request was made to the National Toxicology Program (NTP) for an
animal study.  Others felt that this would be wasteful of scarce
epidemiological resources and that one should wait for positive animal results
before embarking on an epidemiological study.  Still others felt that we could
be missing human carcinogens that happened to have negative animal results.
There seemed to be some consensus that a search be initiated for an exposed
cohort and an exposure assessment be carried out, in parallel with each NTP
study.
                                       -37-

-------

-------
                           DOSE-RESPONSE ASSESSMENT

                   Strawman Language and Related Questions1

1.  Selection of data:  As indicated in Section II.D.,  guidance needs to be
given by the individuals doing the qualitative assessment (epidemiologists,
toxicologists, pathologists,  pharmacologists,  etc.) to those doing the
quantitative assessment as to the appropriate data to be used in the dose-
response assessment.   This is determined by the quality of the data, its
relevance to the likely human modes of exposure, and other technical details.

     A. Human studies.  Estimates based on adequate human epidemiologic data
are preferred over estimates based on animal data.  Intraindividual
differences, including age- and sex-related differences, should be considered
where possible.  If adequate exposure data exist in a well-designed and well-
conducted epidemiologic study that has shown no positive results for any
relevant endpoints, it may be possible to obtain an upper-bound estimate of
risk from that study.  Animal-based estimates, if available, also should be
presented when such upper bound estimates are calculated.   More carefully
executed dose-response assessments benefit from the availability of data that
permit the ages of exposure and onset of disease and the level of exposure and
duration of that exposure to be incorporated in the assessment.
    Issues:
    a.   Should an upper-bound risk estimate be made from a nonpositive human
         study if it is the only risk estimate that can be made?
     1The first two paragraphs of the strawman language provided are intended
 to be parallel to the first two paragraphs of III.A.1 of the current
 guidelines (see Appendix E).  The second three paragraphs of the strawman
 language are intended to be parallel to the first three paragraphs of III.A.2
 of the current guidelines.
                                      -39-

-------
    b.   Should we describe what is minimal and what is preferred data?  A
         discussion of "preferred data" could become so extensive that
         description in the guidelines would be cumbersome; such data may
         never be obtainable, and the discussion in the guidelines would
         probably never be exhaustive.  Alternatively, would it be possible to
         specify levels of preferred data?
    c.   In the absence of dose-rate information, is the use of cumulative
         dose an appropriate default position?  Should that be specified in
         the guidelines?
    d.   Should the guidelines be made to reflect possible differences in
         dose-response between children and adults because of differences in
         tissue growth, metabolism, food and fluid intake, etc.?
    e.   Should the use of person-years of observation be counted from the
         beginning or the end of exposure for dose-response assessment?
         Should this be discussed in the guidelines?

2.  Choice of mathematical extrapolation model:  Since risks at low-exposure
levels cannot be measured directly either by animal experiments or by
epidemiologic studies, a number of mathematical models have been developed to
extrapolate from high to low dose.  Models should make optimal use of biologic
data where possible.  Different extrapolation .  . .  (The language here would
be the same as that in the current guidelines.)  . .  . A rationale will be
included to justify the use of the chosen model.

    A.  Human data.   Dose-response assessments with human data should
consider absolute as well as relative risk models when the data are available.
Where possible, results from both models should be presented.  If selecting
one model over another, the rationale should be described.  In the absence of
information to the contrary, a dose-response model that is linear at low doses
will be employed with human data.  A point estimate from the model may be used
to estimate risk at doses below the observable range.

    B.  Animal data.  For animal data, the linearized multistage procedure
will be employed in the absence of information to the contrary.  The
linearized multistage model is a curve-fitting procedure.  It does not model
what is believed to be a multistage process of tumor development.  It is
appropriate as a default procedure, however, in that it is linear at low
                                    -40-

-------
doses.  Where appropriate, the results of different extrapolation models may
be presented for comparison with the linearized multistage procedure.  When
longitudinal data . . . (Continue discussion in current guidelines.)
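    For illustration, the sketch below fits a simple two-stage version of the
multistage model to hypothetical bioassay counts by least squares.  It is not
the Agency's linearized multistage software, which fits by maximum likelihood
and reports an upper confidence limit on the linear term, but it shows what
"linear at low doses" means in practice.

    # Two-stage multistage model P(d) = 1 - exp(-(q0 + q1*d + q2*d^2)) fitted
    # to hypothetical dose-group incidence; at low doses the added risk over
    # background is approximately q1*d.
    import numpy as np
    from scipy.optimize import curve_fit

    dose = np.array([0.0, 1.0, 5.0, 25.0])      # hypothetical dose levels
    tumors = np.array([2, 4, 10, 30])           # tumor-bearing animals per group
    animals = np.array([50, 50, 50, 50])        # animals per group
    incidence = tumors / animals

    def multistage(d, q0, q1, q2):
        return 1.0 - np.exp(-(q0 + q1 * d + q2 * d ** 2))

    # Constrain the q parameters to be nonnegative, as the model requires.
    (q0, q1, q2), _ = curve_fit(multistage, dose, incidence,
                                p0=[0.04, 0.01, 0.001], bounds=(0, np.inf))

    d_low = 0.001                               # a dose far below the tested range
    added_risk = multistage(d_low, q0, q1, q2) - multistage(0.0, q0, q1, q2)
    print(f"q1 = {q1:.4f}; added risk at d={d_low}: {added_risk:.2e}")
    print(f"linear approximation q1*d = {q1 * d_low:.2e}")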
    Issues:
    a.   Point estimates from models of human data have been used in the
         Agency in the past for dose-response assessment.  This differs from
         risk estimates made from animal data in that the estimates from
         animal data are statistical upper bounds.  The rationale for using a
         point estimate with human data is that (1) there is no cross-species
         extrapolation with the human data, (2) exposures to humans in the
         epidemiologic data sets used for modeling (usually occupational
         studies) are much closer than the doses used in animal studies to the
         environmental exposures of concern, and (3) the point estimate, though
         not a statistical upper bound, provides an upper bound in the sense
         that the response at lower doses is likely to be less than that
         predicted by a model with low-dose linearity.  Should statistical
         upper-bound dose-response estimates be used with human data for
         consistency with the dose-response estimates from animal data?

    b.   The linearized multistage model is recognized as a curve-fitting
         procedure.  It does not model stages of cancer.  Is it appropriate to
         recommend the linearized multistage procedure in the absence of
         information to the contrary for the dose-response assessment of
         animal data?  Would it be more appropriate to simply recommend a
         model that is linear at low doses in the absence of information to
         the contrary?

    c.   Are there examples of agents which have a supralinear dose response
         for humans at low doses?  If so, a model with low-dose linearity may
         not be protective of public health.
                                      -41-

-------
                     Chair  Summary  of Work Group Session on
                            Dose-Response Assessment
                         Chair:  Dr. Philip Enterline
    The Dose-Response Work  Group discussed two major questions posed by EPA in
 the strawman language.  These were:

    (1)  How should the most appropriate data be selected for use in dose-
         response  assessment?
    (2)  How should the most appropriate extrapolation model be selected for
         estimating risks from human data sets and should these models be
         consistent with those used for animal data?

      The sections below apply to Sections III.A.1 and 2 of the current
 guidelines  (see Appendix E).  Some suggestions are given for changes to the
 strawman language.

    1.  Selection of Data

    While estimates  based on adequate human epidemiologic data are preferred
 over estimates based on animal data, many issues need to be considered so that
 these estimates will be scientifically sound.  The following paragraphs
 address some  of EPA's critical issues in choosing data for dose-response
 assessment.

    Issue A.  Estimating Risk from Nonpositive Studies

    The group felt that when no positive evidence (either animal or human) is
 available,  an upper bound risk estimate should not be made from a nonpositive
human study.  In the presence of a good positive animal study,  however, it was
 felt that a human study could be used and that,  under appropriate conditions,
 the upper bound from the human study could be used rather than the upper bound
based on animal studies.
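     As an illustration of how such an upper bound might be obtained from a
nonpositive study, the sketch below uses hypothetical cohort figures, a one-
sided exact Poisson limit on the observed count, and an assumed linear
relative-risk model; none of the numbers come from the workshop.

    # Upper-bound potency from a nonpositive cohort study (all values hypothetical).
    from scipy.stats import chi2

    observed = 20        # observed cases in the exposed cohort
    expected = 21.5      # expected cases from reference rates
    mean_dose = 5.0      # mean cumulative exposure in the cohort, e.g., ppm-years

    # One-sided 95% upper limit on the underlying mean count, then on the rate ratio.
    upper_count = chi2.ppf(0.95, 2 * (observed + 1)) / 2
    upper_rr = upper_count / expected

    # Upper bound on the slope, assuming RR = 1 + beta * dose.
    beta_upper = (upper_rr - 1) / mean_dose
    print(f"Upper 95% bound on the rate ratio: {upper_rr:.2f}")
    print(f"Upper bound on excess relative risk per ppm-year: {beta_upper:.3f}")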
                                     -42-

-------
     Issue B.  Acceptable Quality of Data

     With regard to the kind of epidemiologic data needed, the committee felt
 that the new strawman language proposed, which appears on page 1  (p. 39 of
 this document), is adequate:  "more carefully executed dose-response
 assessments benefit from the availability of data that permit the ages of
 exposure and onset of disease and the level of exposure and duration of
 exposure to be incorporated in the assessment."

     Issue C.  Default Position for Dose-Rate Information

     The committee felt that cumulative dose is an appropriate default
 position.  It is assumed that the wording that now appears in the first full
 paragraph on page 91 of this document (Appendix E), applies to epidemiologic
 information.  Clearly, the use of daily average or cumulative dose is not
 always ideal and dose rate information by time would be desirable.
 Extrapolation from occupational studies, which deal with only part of a
 lifetime, to lifetime risk may not always be appropriate.  The committee felt
 that using lifetime daily averages will probably not grossly understate risk
 but  in some cases might cause an overstatement.

     Issue D.  Adjustments for Children as Compared to Adults

     It was felt that the assumption that there is no difference between
 children and adults is a default position.  After taking dosimetry into
 account, other factors such as remaining lifetime; tissue growth, metabolism,
 food and fluid intake, etc., should be taken into consideration wherever
 possible.

     Issue E.  Options for Counting Person Years  of Observation

    The committee couldn't comment directly on the issue of whether person
years of observation should be counted from the  beginning or the end of
 exposure for dose response assessment.   The committee did feel,  however,  that
                                    -43-

-------
latency should be considered in calculating dose for the purpose of examining
dose-response relationships.  This might take the form of lagging (5 years,  10
years, etc.) or of weighting dose by a time-to-tumor distribution with little
weight given to times distant from some average value.
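    A minimal sketch of lagging, applied to a hypothetical exposure history,
is given below; the lag lengths follow the examples mentioned above.

    # Cumulative dose with a lag: exposures in the most recent `lag` years are
    # not counted, a simple way of allowing for latency.  The history is
    # hypothetical (annual doses, oldest year first).
    def lagged_cumulative_dose(yearly_doses, lag):
        counted = yearly_doses[:-lag] if lag > 0 else yearly_doses
        return sum(counted)

    history = [1.2, 1.0, 0.8, 0.8, 0.5, 0.5, 0.3, 0.0, 0.0, 0.0]
    for lag in (0, 5, 10):
        total = lagged_cumulative_dose(history, lag)
        print(f"lag {lag:2d} years: cumulative dose = {total:.1f}")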

    The workshop group suggested the following changes in the strawman
language:

    Section IIIA.  Dose-Response Assessment, Paragraph 1. Selection of Data.
On page 1 of the strawman text  (pp. 39 and 89 of this document) delete "by the
individuals doing the qualitative assessment (epidemiologists, toxicologists,
pathologists, pharmacologists,  etc.)."

    Section IIIA.  Dose-Response Assessment, Paragraph 2. Selection of Data:
Human studies:  On page 1 of the strawman text  (pp. 39 and 89 of this
document) add to the first  line the word "positive" so that the first sentence
reads, "Estimates based on  adequate positive epidemiologic data are preferred
over estimates based on animal  data."

    2.  Choice of Mathematical  Extrapolation Model

    While mathematical models must be  relied upon,  since risks at  low exposure
levels cannot be measured directly, a  range of  choices exists regarding the
type  of model and  its assumptions.  The  following paragraphs provide guidance
based on the work  group discussions for  the critical  issues identified by EPA
in  the strawman  document.

    Issue A.  Use  of  Statistical Upper Bound Dose-Response Estimates

    The  committee  felt  that when dose-response  estimates are made  from
positive human  data,  statistical upper bounds  should  be  used  so  as to be
consistent  with  dose-response  estimates  from animal data.  While  it is  true
 that  there  is no cross-species  extrapolation with  the human data and that
exposures  are  closer  to those  actually experienced by humans  in risk
                                      -44-

-------
assessment calculations, it was felt that this might be offset by the fact
that the general population to which risk assessments apply may be more
heterogeneous in terms of susceptibility than the data sets (often based on
occupationally exposed groups) from which risk is estimated.  Moreover, the
committee was not certain that the true dose-response relationship for humans
was always concave upward, and thus linear extrapolation may not always
provide a margin of safety.  In addition to upper bound estimates, the
committee felt that point estimates as presently calculated by the EPA should
be shown.
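    The sketch below shows one simple way to produce both a point estimate and
a statistical upper bound for a dose-response slope from grouped human data,
using a log-linear Poisson model and hypothetical counts; it is illustrative
only and is not the Agency's procedure.

    # Poisson regression of grouped cohort data on mean cumulative dose,
    # reporting the slope and a one-sided 95% upper bound.  All data are
    # hypothetical.
    import numpy as np
    import statsmodels.api as sm

    dose = np.array([0.0, 2.0, 6.0, 15.0])          # mean dose by exposure group
    cases = np.array([30, 24, 21, 18])              # observed cancer cases
    person_years = np.array([60000., 40000., 30000., 20000.])

    X = sm.add_constant(dose)
    result = sm.GLM(cases, X, family=sm.families.Poisson(),
                    offset=np.log(person_years)).fit()

    slope = result.params[1]                        # change in log rate per unit dose
    upper = result.conf_int(alpha=0.10)[1, 1]       # one-sided 95% upper bound
    print(f"slope: {slope:.4f} per unit dose; upper bound: {upper:.4f}")
    print(f"rate ratio at dose 10: point {np.exp(10 * slope):.2f}, "
          f"upper {np.exp(10 * upper):.2f}")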

    Issue B.  Use of the Linearized Multistage Model

    The committee discussed the appropriateness of the linearized multistage
model as compared with simple linear models.  It was felt that the question of
appropriate models was a subject that might be better dealt with at a workshop
where this was the main focus and in a context where other models could be
presented and discussed.

    Issue C.  Modeling Nonlinear Dose Response

    The committee felt that where dose is environmentally determined, there
are agents (e.g., radiation,  arsenic) where response is concave downward.
Under these conditions an assumption of low-dose linearity would not be
protective of the public.  This was considered in the decision to recommend
the calculation of statistical upper bounds from human data.

    The workshop group suggested the following change in the strawman
language:

    Section IIIA.  Dose-Response Assessment.  Choice of Mathematical
Extrapolation Model:  Human data.  On page 2 of the strawman text (p. 40 of
this document) delete the following sentence:  "A point estimate from the
model may be used to estimate risk at doses below the observable range."
                                     -45-

-------

-------
                APPENDIX A

EPA RISK ASSESSMENT FORUM TECHNICAL PANEL
     AND SUBCOMMITTEE  ON EPIDEMIOLOGY
                    -47-

-------
                 WORKSHOP ON CARCINOGEN RISK ASSESSMENT

      EPA RISK ASSESSMENT FORUM TECHNICAL PANEL AND ASSOCIATES


                        Richard Hill, William Farland, Co-Chairs

                                    Margaret Chu
                                  Lorenz Rhomberg
                                   Jeanette Wiltse

                     Dorothy Patton, Chair, Risk Assessment Forum
                Cooper Rees, Science Coordinator, Risk Assessment Forum

                             Subcommittee on Epidemiology

                                    David Bayliss
                                    Jerry Blondell
                                     Chao Chen
                           Herman Gibb, Subcommittee Chair
                                     Doreen Hill
                                    Karen Hogan
                                  Aparna Koppikar
                                Elizabeth Margosches
                                     Neal Nelson
                                 Cheryl Siegel Scott


WORKSHOP PARTICIPANTS:
Study Design and Interpretation

Patricia Buffler
Kenneth Cantor
Marilyn Fingerhut, Chair
Barry Friedlander
William Halperin
Doreen Hill
Karen Hogan
Barbara Hulka
Renata Kimbrough
Aparna Koppikar
Genevieve Matanoski
Weight of Evidence

David Bayliss
Aaron Blair
Jerry Blondell
Margaret Chu
Philip Cole
Henry Falk
Elizabeth Margosches
Raymond Neutra, Chair
Gerald Ott
                                      -48-

-------
                                    Dose Response

                                  Harvey Checkoway
                                      Chao Chen
                                     Kenneth Chu
                                    Kenneth Crump
                                 Philip Enterline, Chair
                                    William Farland
                                     Herman Gibb
                                    Daniel Krewski
                                     Neal Nelson
                                    Gerhard Raabe
                                   Lorenz Rhomberg
                                  Cheryl Siegel Scott
                                      Allan Smith
CONTRACTOR ASSOCIATES:

Kate Schalk, Conference Services, Eastern Research Group
Trisha Hasch, Conference Services, Eastern Research Group
Elaine Krueger, Environmental Health Research, Eastern Research Group
Norbert Page, Scientific Consultant, Eastern Research Group
                                            -49-

-------

-------
     APPENDIX B




LIST OF PARTICIPANTS
           -51-

-------
                     U.S. Environmental Protection Agency

                       Cancer Risk Assessment Guidelines
                            Human Evidence Workshop

                               June 26-27, 1989
                                Washington, DC

                            FINAL LIST OF ATTENDEES
Mr. David Bayliss
Office of Health and Environmental
Assessment  (RD-689)
U.S. Environmental Protection Agency
401 M Street, S.W.
Washington, DC  20460
(202) 382-5726

Dr. Aaron Blair
National Cancer Institute
Executive Plaza North, Room 418
6130 Executive Blvd.
Rockville, MD  20892
(301) 496-9093

Mr. Jerry Blondell
Hazard Evaluation Division (TS-769C)
U.S. Environmental Protection Agency
401 M Street, S.W.
Washington, DC  20460
(202) 557-2564

Dr. Patricia Buffler
University of Texas
Health Science Center at Houston
School of Public Health
Epidemiology Research Unit
P.O. Box 20186
Houston, TX  77225
(713) 792-7458

Dr. Kenneth Cantor
National Cancer Institute
Environmental Studies Section
6130 Executive Blvd.
Rockville, MD  20892
(301) 496-1691
Dr. Harvey Checkoway
Department of Environmental Health
SC 34
University of Washington
Seattle, WA  98195
(206) 543-4383

Dr. Chao Chen
Office of Health and
  Environmental Assessment (RD-689)
U.S. Environmental Protection Agency
401 M Street, S.W.
Washington, DC  20460
(202) 382-5719

Dr. Kenneth Chu
National Cancer Institute
9000 Rockville Pike
Bethesda, MD  20892
301-496-8544

Dr. Margaret Chu
Office of Health and Environmental
  Assessment (RD-689)
U.S. Environmental Protection Agency
401 M Street, S.W.
Washington, DC  20460
(202) 382-7335

Dr. Philip Cole
School of Public Health
University of Alabama
203 TH
UAB Station
Birmingham, AL  35294
(205) 934-6707
                                     -52-

-------
Dr. Kenneth Crump
Clement Associates
1201 Gaines Street
Ruston, LA  71270
(318) 255-4800

Dr. Philip Enterline
University of Pittsburgh
School of Public Health
Room A410
130 DeSoto Street
Pittsburgh, PA  15261
(412) 624-1559
(412) 624-3032

Dr. Henry Falk
Center for Disease Control
EHHC/CEHIC
Mailstop F-28
1600 Clifton Road, N.E.
Atlanta, GA  30333
(404) 488-4772

Dr. William Farland
Office of Health and Environmental
  Assessment (RD-689)
Office of Research and Development
U.S. Environmental Protection Agency
401 M Street, S.W.
Washington, DC  20460
(202) 382-7315

Dr. Marilyn Fingerhut
National Institute for
  Occupational Safety and Health
4676 Columbia Parkway (R-13)
Cincinnati, OH  45226
(513) 841-4203

Dr. Barry Friedlander
Monsanto Company
800 North Lindbergh -A3NA
St. Louis, MO  63167
(314) 694-1000
Dr. Herman Gibb
Human Health Assessment Group
(RD-689)
U.S. Environmental Protection Agency
401 M Street, S.W.
Washington, DC  20460
(202) 382-5720
Dr. Bill Halperin
51 Jackson Street
Newton Center, MA  02159
(617) 732-1260

Dr.  Doreen Hill
Analysis and Support Division
(ANR-461)
U.S. Environmental Protection Agency
401 M Street, S.W.
Washington, DC. 20460
(202) 475-9640

Ms. Karen Hogan
Exposure Evaluation Division
(TS-798)
U.S. Environmental Protection Agency
401 M Street, S.W.
Washington, DC  20460
(202) 382-3895

Dr.  Barbara Hulka
Department of Epidemiology
Rosenau Hall
CB 7400
University of North Carolina,
Chapel Hill, NC  27514
(919) 966-5734
Dr. Peter Infante
Health Standards Programs
N3718
OSHA/DOL
200 Constitution Avenue, N.W.
Washington, DC  20210
(301) 523-7111
                                      -53-

-------
 Dr.  Renata Kimbrough
 Associate Administrator for Regional
   Operations  (A101)
 U.S.  Environmental Protection Agency
 401  M Street,  S.W.
 Washington, DC  20460
 (202)  382-4727

 Dr.  Aparna Koppikar
 Office of Health and
   Environmental  Assessment  (RD-689)
 U.S.  Environmental Protection Agency
 401  M Street,  S.W.
 Washington, DC  20460
 (202)  475-6765

 Dr.  Daniel Krewski
 Health and Welfare
 Environmental  Health Center
 Room 117
 Ottawa, Ontario
 CANADA K1A OL2
 (613)  954-0164

 Dr.  Elizabeth  Margosches
 Exposure  Evaluation  Division
   (TS-798)
U.S. Environmental Protection Agency
 401 M Street,  S.W.
 Washington, DC  20460
 (202)  382-3511

 Dr. Genevieve  Matanoski
 Johns  Hopkins  School of Hygiene
  and  Public Health
 615 North Wolfe  Street
 Baltimore, MD  21205
 (301)  955-8183
 (301)  955-3483 (main office)

Dr. Neal  Nelson
Analysis  and Support Division
 (ANR-461)
U.S. Environmental Protection Agency
401 M  Street,  S.W.
Washington, DC   20460
 (202) 475-9640
Dr. Raymond Neutra
California Department
  of Health Services
2151 Berkeley Way
Berkeley, CA  94704
(415) 540-2669

Dr. Gerald Ott
Arthur D. Little
25 Acorn Park
Cambridge, MA  02140
(617) 864-5770 (ext. 3136)

Dr. Dorothy Patton
Risk Assessment Forum (RD-689)
U.S. Environmental Protection Agency
401 M Street, S.W.
Washington, DC  20460
(202) 475-6743

Dr. Gerhard Raabe
Mobil Corporation
150 E. 142nd Street
New York, NY  10017
212-883-5368

Dr. David Cooper Rees
Risk Assessment Forum (RD-689)
U.S. Environmental Protection Agency
401 M Street, S.W.
Washington, DC  20460
(202) 475-6743

Dr. Lorenz Rhomberg
Office of Health and Environmental
  Assessment (RD-689)
U.S. Environmental Protection Agency
401 M Street, S.W.
Washington, DC  20460
(202) 382-5723

Ms. Cheryl Siegel Scott
Exposure Evaluation Division
  (TS-798)
U.S. Environmental Protection Agency
401 M Street, S.W.
Washington, DC  20460
(202) 382-3511
                                      -54-

-------
Dr. Allan Smith
University of California
  at Berkeley
315 Warren Hall
Berkeley, CA  94720
(415) 642-1517 (office)
(415) 843-1736 Health Risk
Associates
                                       -55-

-------

-------
    APPENDIX C




LIST  OF OBSERVERS
        -57-

-------
                      U.S.  Environmental  Protection Agency

                       Cancer Risk Assessment Guidelines
                            Human Evidence Workshop

                               June  26-27, 1989
                                Washington, DC

                               LIST OF OBSERVERS
 Steven Bayard
 U.S.  EPA (RD-689)
 401 M Street,  S.W.
 Washington,  DC  20460
 202-382-5722

 Judith Bellin
383 O Street, S.W.
 Washington,  DC  20024
 202-479-0664

 Greg  Beumel
 Combustion Engineering
 C.E.  Environmental
 1400  16th  Street, N.W.,  Suite 720
 Washington,  DC  20036
 202-797-6407

 Karen Creedon
 Chemical Manufacturers Assoc.
 2501  M Street, N.W.
 Washington,  DC  20037
 202-881-1384

 Maggie Dean
 Georgia-Pacific  Corp.
 1875  Eye Street, Suite 775
 Washington,  DC  20006
 202-659-3600

 R.J.   Dutton
 Risk  Science Institute
 1126  16th  Street N.W.
Washington,  DC   20036
 202-659-3306
Joel Fisher
International Joint Commission
2001 S. Street, N.W.,  Room 208
Washington, DC 20440
202-673-6222

Robert Gouph
Toxic Material News
951 Pershing Drive
Silver Spring, MD  20910
301-587-6300

Stanley Gross
U.S. EPA (H7509C)
401 M Street, S.W.
Washington, DC  20460
202-557-4382

Cheryl Hogue
Chemical Regulation Reporter
1231 25th Street, N.W.
Washington, DC  20037
202-452-4584

Allan Katz
Technical Assessment Systems
1000 Potomac Street, N.W.
Washington, DC  20007
202-337-2625

Bob Ku
Syntax Corporation
3401 Hillview Avenue
Palo Alto,  CA 94301
415-852-1981
                                      -58-

-------
Susan LeFevre
Grocery Manufacturers of America
1010 Wisconsin Avenue, N.W.
Washington, DC 20007
202-337-9400

Lisa Lefferts
Center for Science in the
Public Interest
1501 16th Street, N.W.
Washington, DC  20036
202-332-9110

George Lin
Xerox Corporation
Building 843-16S
800 Salt Road
Webster, NY  14580
716-422-2081

Bertram Litt
Litt Associates
3612 Veasey Street, N.W.
Washington, DC  20008
202-686-0191

Donna Martin
Putnam Environmental Services
2525 Meridian Parkway
P.O. Box 12763
Research Triangle Park, NC  27709
919-361-4657

Ray McAllister
Madison Building, Suite 900
1155 15th  Street, N.W.
Washington, DC  20005
202-296-1585

Robert E.  McGaughy
U.S. EPA (RD-689)
401 M Street, S.W.
Washington, DC  20460
202-382-5898
Mark Morrel
Front Royal Group
7900 W. Park Drive, Suite A 300
McLean, VA  22102
703-893-0900

Nancy Nickell
Right-to-Know News
1725 K Street, N.W.,  Suite 200
Washington, DC  20006
202-872-1766

Jacqueline Prater
Beveridge & Diamond
1350 Eye Street, N.W., Suite 700
Washington, DC 20005
202-789-6113
Charles Ris
U.S. EPA (RD 689)
401 M Street, S.W.
Washington, DC  20460
202-382-5898

Robert Schnatter
Exxon Biomedical Sciences
Mettlers Road CN-2350
East Millstone, NJ 08875
201-873-6016

Melanie Scott
Business Publishers, Inc.
951 Pershing Drive
Silver Spring, MD  20910
301-587-6300

Sherry Selevan
U.S. EPA (RD 689)
401 M Street, S.W.
Washington, DC  20460
202-382-2604

Tomiko Shimada
Shin Nippon Biomedical Laboratory
P.O. Box 856
Frederick, MD 21701
301-662-1023

Betsy Shirley
Styrene Information  & Research
Center
1275 K Street, N.W.
Washington, DC  20005
202-371-5314
                                       -59-

-------
Arthur Stock
Shea & Gardener
1800 Massachusetts Ave., N.W.
Washington, DC 20036
202-828-2147

Jane Teta
Union Carbide Corporation
Health, Safety and Environmental
Affairs
39 Old Ridgebury Road
Danbury, CT  06817-0001
203-794-5884

Sandra Tirey
Chemical Manufacturers Association
2501 M Street, N.W.
Washington, DC 20037
202-887-1274

Keith Vanderveen
U.S. EPA (TS 798)
401 M Street, S.W.
Washington, DC  20460
202-382-6383

Frank Vincent
James River Corp.
P.O. Box 899
Neenah, WI  54976
414-729-8152
                                      -60-

-------
             APPENDIX D



    INTRODUCTORY PLENARY SESSION








Opening Comments, Dr. Philip Enterline



Public Interest Views, Dr. Raymond Neutra



Private Sector Views, Dr. Gerald Ott
                -61-

-------
   OPENING REMARKS:   EPA CANCER GUIDELINES  REVIEW WORKSHOP ON HUMAN  EVIDENCE

                           Philip Enterline, Ph.D.
                           Professor Emeritus of Biostatistics
                           University of Pittsburgh
                           School of Public Health
    I was pleased to learn of EPA's decision to expand and clarify its
guidelines for the use of human evidence in quantitative risk assessment.  As
perhaps many of you are aware, of the fairly large number of risk assessments
that have thus far been made, only a handful are based upon human evidence.
Most are based on extrapolations from animal experimental data.

    In principle, there is no difference between epidemiologic evidence and
experimental evidence.  The problem lies in the design and analysis of these
studies.  When I first became interested in epidemiology, it was not
considered to be a hard science and many of the "best" scientists were quite
undecided as to how much faith to put in epidemiologic observations.  One of
the doubters was Bradford Hill, a then well-known British medical
statistician, who suggested that perhaps epidemiology could be useful if in
designing these studies "the experimental approach was kept firmly in mind."
I think that is truly the key to good epidemiologic investigations.  Somehow
we must conduct epidemiologic studies so as to approach the conditions of an
experiment as closely as possible.

    We have made much progress here with a great boost from advances in
statistical methodology and computers.  Perhaps the major problem with
epidemiologic studies as a tool in quantitative risk assessment is a lack of
firm environmental data, although as Allan Smith has pointed out, it is
difficult to imagine how the environmental data could be more in error than
animal to human extrapolations.  I also feel that producers of epidemiologic
data need more guidance from consumers as to what kind of data is needed.
Most of us are primarily concerned with answering the question, "Is there a
disease excess?" rather than "What is the potency of the agent?"
                                      -62-

-------
    It is my feeling that all well-designed epidemiologic studies involving
defined exposures provide some information that can be useful in risk
assessment.  This is true even if positive findings are not statistically
significant.  For a number of years I taught a course in Introductory
Biostatistics and in that course we covered measurements and tests of
significance, with the latter being particularly difficult for many of my
students.  As part of my final examination, I sometimes ask the following
question, "Suppose your grandmother has a cancer and your parents, wanting to
take full advantage of your place in the medical science field, ask you to see
what kind of treatment is currently in vogue.  You search the literature and
find two treatments that are being viewed favorably.  Results from a large
recent clinical trial show treatment A to give a 60 percent five-year survival
and treatment B to give a 75 percent five-year survival.  Numbers of subjects
studied were about the same in each of the two treatment groups and the
difference  in survival rates is not statistically significant.  Which
treatment would you select for your grandmother?"
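    Since the anecdote gives no sample sizes, the check below assumes about 60
patients per treatment arm; under that assumption the 15-percentage-point
difference is indeed not statistically significant at the conventional 0.05
level, which is the point of the example.

    # Two-proportion z-test for the grandmother example (sample size assumed).
    from math import sqrt
    from scipy.stats import norm

    n_a = n_b = 60                      # assumed patients per arm
    surv_a, surv_b = 0.60, 0.75         # five-year survival, treatments A and B

    pooled = (surv_a * n_a + surv_b * n_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (surv_b - surv_a) / se
    p_two_sided = 2 * norm.sf(abs(z))

    print(f"z = {z:.2f}, two-sided p = {p_two_sided:.3f}")   # about 0.08
    print("Treatment B still looks better; 'not significant' is not 'no difference'.")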

    Perhaps not surprisingly most of my students conclude that since the
difference  in treatment was not statistically significant, there was no
difference.  If pressed they would simply toss a coin to decide on a treatment
for their  grandmother.  There are, however, a few students who would notice
that one of the treatments actually gave better results than the other.

    Epidemiologic studies are not different from clinical trials.  All contain
some information.  Some studies are more positive or some more negative than
others and this fact alone may be important.  A relative risk of 1.2, even if
not statistically significant, may mean more in a particular setting than a
relative risk of  .8.  Perhaps the former might be called nonpositive and the
latter called negative.  EPA clearly recognizes the usefulness of such
[nonpositive] epidemiologic data when they evaluate animal data.  A very
typical  situation is one in which there is a positive animal study and a
nonpositive or negative human study.  While EPA might dismiss the human study
because  of a belief that there is no such  thing as negative epidemiology, they
do use the upper confidence interval of the human data  to set an upper limit
                                      -63-

-------
of risk calculated from the animal study.  I think that is a fair way to view
human evidence, since the confidence interval is both a function of the power
of the study, that is, the sample size, and of the actual results of the
study.  Incidentally, I don't think it is proper to calculate power after the
study has been completed, since it ignores what was found in the study and the
situation is clearly different than it was before the study was ever
undertaken.

    Some people seem to feel that the doses of toxic agents received by humans
are too small to cause disease excesses large enough to be detectable by
epidemiologic studies.  In fact, my students often comment to me that it must
have been great in the good old days when there were so many things to
discover.  I would point out here, however, that for the most part, in the
"good old days," there was little to guide us in terms of what to look for or
where to look.  I recall when the first U.S. epidemiologic study of cigarette
smoking and cancer was reported in the early 1950s, there was a great deal of
debate as to whether this could be in fact true.  Why did it take us so long
to find such a grand relationship?  Even Bill Hueper of NIH, who was probably
our greatest prophet as to environmental causes of cancer, had missed this
relationship, attributing only cancer of the tongue and cheek to the use of
tobacco.  Of course, it was the human evidence that led to what is now perhaps
our greatest effort in the field of preventive medicine - the anti-smoking
campaign.

    I feel that there are many discoveries yet to be made from epidemiologic
studies.  Some of these involve simply a careful review of existing
literature, while others will require some new investigations guided perhaps
by observations made in animal experiments as well as the new field of
structure-activity relationships (SAR).  In studies of working populations, we
really need to take a hard look at studies that show large numbers of
statistically significant deficits in various diseases.   Can these all be
attributed to worker selection or is it possible that something in the design
or execution of these types of studies is systematically causing understated
risks?
                                     -64-

-------
    In closing, let me assure you, based on a couple of years' experience as a
member of EPA's Science Advisory Board, that EPA is one federal agency that
listens to its consultants.  Your work during the next day and a half could
have an important impact on the quality of EPA's risk assessment activity in
the future.
                                      -65-

-------
                        EPIDEMIOLOGICAL RISK ASSESSMENT:
                    SOME OBSERVATIONS FROM THE PUBLIC SECTOR

                     Raymond Richard Neutra, M.D., Dr.P.H.
                     Chief  - Epidemiological Studies Section
                    California Department of Health Services
 I.   Should We Be Regulating Substance by Substance?

     It should be noted that there are 60,000  chemicals  in commercial  use  and
 that our regulatory scheme has  been to regulate  them one  by one  after years  of
 scientific debate.   This is analogous to regulating  fecal pathogens one by one
 instead of simply separating people from chemicals the  way we have separated
 them from feces.   This is worth pondering before we  plunge into  the
 difficulties  which this general approach presents us.
 II.   Epidemiologically Non-Detectable Risks May Be of  Societal Concern.

    Figure  1  gives  a schematized  example of diseases according to their
 baseline, lifetime,  cumulative rates, and the relative risks conveyed by the
hypothetical carcinogens in each case.  Common cancers, whose baseline risks
are multiplied many times by a carcinogen, are easy to detect and are of
societal concern.  Rare cancers affected by carcinogens that convey small
relative risks are neither important nor detectable.  But what of moderately
rare cancers exposed to agents that convey relative risks less than two?  They
may convey lifetime risks greater than one in a million or one in a hundred
thousand, and not be detectable with epidemiology.  Saccharin is an example of
a widely used agent for which the animal risk assessment suggested an added
burden of 800 bladder cancers a year.  Yet this was a relative risk of only
1.01!  Even the enormous case-control studies that were done could not rule
out this added risk.  Epidemiologists said this study showed there was no risk
of public health concern, but eight hundred cases a year (if identifiable)
would attract public and legal attention (compare it to the number of
Guillain-Barre (GB) cases in swine flu vaccine, or the number of rabies cases
a year).  The public is not calmed by the fact that only a small percentage of

                                      -66-

-------
[Figure 1 appeared here as a matrix:  rows give the lifetime risk of the cancer
in the unexposed (from roughly 10^-6 to 10^-3) and columns give the rate ratio
conveyed by the toxicant (small versus large); each cell is labeled by whether
the resulting risk is socially important and whether it is epidemiologically
detectable, and a note indicates the cell in which most EPA environmental risk
falls.]

                      Figure 1. Possible Environmental Risk Scenarios.
                                         -67-

-------
all GB cases were attributable to the vaccine nor would they be calmed by a
similar claim for diet drinks.  The null Hoover study was useful in ruling out
some of the outlier risk assessments, and in reassuring us that humans are not
dramatically more sensitive than animals.  Ultimately, mechanistic evidence
may lay the issue to rest.  The point here is that just because an
epidemiologist can't see it doesn't mean it is unimportant.  Hence,
epidemiology will rarely, if ever - by itself, be able to give a clean bill of
health during the Hazard Identification phase of risk assessment.
III.  Keep the Four Kinds of Evaluation Conceptually Separate.

    One evaluates individual studies for Hazard Identification (Clear
Evidence, Some Evidence, Equivocal Evidence, No Evidence).

    One weighs a body of evidence for Hazard Identification (Sufficient,
Limited, Inconclusive, Evidence Not Suggestive of Carcinogenicity, Sufficient
Evidence for Noncarcinogenicity).

    One evaluates individual studies for their usefulness in Dose-Response
Assessment (Very Useful, Somewhat Useful, Not Useful)

    One combines useful studies for the purposes of dose-response assessment.
(This has no nomenclature since a summary number comes out of the combined
dose-response assessment.)

    The strawman document was sometimes unclear about the distinction between
these various activities.
                                     -68-

-------
IV.  Systematically Anticipate What to Do When Dose-Response Assessment Is
     Based on Conflicting Human and Animal Data

    California health department staff have found unforeseen scenarios in
which animal and human dose-response assessment may or may not agree.  Figure
2 shows a simplified flow diagram spelling out the possible combinations.
This diagram could serve as a guide to generate "scenario queries" to the
participants, e.g., "How would you handle this one?"

    What do you do if you have "sufficient animal evidence" but human evidence
is "inconclusive" because the only human studies gave null results?  According
to this flow diagram one would choose the "best" null study to see if the
upper confidence level risk was lower than the risk predicted from animals.
If so, (as was the case with cadmium) the human upper confidence level risk
would be used.
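    The comparison described above can be stated compactly as follows; the
numbers are hypothetical, and only the decision logic comes from the flow
diagram.

    # With sufficient animal evidence but only null human studies, use the
    # human upper confidence limit on risk if it is below the animal-based
    # prediction (as in the cadmium example above); otherwise keep the animal
    # estimate.
    animal_predicted_risk = 2.0e-3      # hypothetical lifetime risk from animal data
    human_upper_bound_risk = 8.0e-4     # hypothetical upper limit from best null study

    if human_upper_bound_risk < animal_predicted_risk:
        chosen, source = human_upper_bound_risk, "human upper confidence limit"
    else:
        chosen, source = animal_predicted_risk, "animal-based estimate"
    print(f"Use {chosen:.1e} from the {source}.")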
                                     -69-

-------
[Figure 2 appeared here as a flow diagram.  It branches on whether the animal
evidence is inadequate or sufficient and on the state of the epidemiologic
evidence (inadequate, a positive association without dose data, or sufficient),
leads to dose-response assessments, asks whether the human and animal results
agree, and indicates when the human estimate is used; EDB and ETO are noted as
examples.]

             Figure 2. Flow Diagram of Possible Combinations of Associations
                       between Human and Animal Data.

                                        -70-

-------
    USE  OF  HUMAN EVIDENCE IN RISK ASSESSMENT  - A PRIVATE  SECTOR PERSPECTIVE

                              Gerald Ott, Ph.D.
                              Senior Consultant,  Epidemiology
                              Arthur D. Little,  Inc.

Background

    The views which I present  today1 undoubtedly reflect  my  experience as an
occupational and environmental epidemiologist working in the private sector;
however, they are my  own views and not necessarily those of any specific
organization.

    Our standard of living in  the United States has been achieved  through the
individual and collective efforts of people to convert resources,  some
renewable and others  not, into useful products.   This production of goods and
services almost inevitably leads to the generation of waste byproducts that
may be released to  the environment.  Wastes are any materials that are deemed
to be of no discernable value and to have no utility to individuals,
institutions,  or society in general.  Because of the costs of recovery,  the
unintended release  of valued products to the environment may also render these
products classifiable as waste.

    In  an  increasingly congested world,  there is ample reason to be concerned
about the release of  hazardous materials to the work and general environments.
Human health and environmental quality have been adversely affected by
hazardous wastes in the past when both the production of goods was at a lower
level and people were not residing in such close proximity to one another.   To
minimize adverse impacts on human health and the environment, waste control
problems must be recognized and addressed using all of the scientific
knowledge and appropriate resources available to us.
     The  impact of wastes may be controlled  through:
     •  Decreased production of goods and services.
                                    -71-

-------
     •  Waste minimization  (e.g., continuous production in enclosed systems
       versus batch production in open systems).
     •  Recycling  (extracting greater value from materials otherwise viewed as
       useless).
     •  Confinement in on-site and off-site disposal areas.
     •  Incineration (with  subsequent dispersion and/or confinement of
       residual materials).
     •  Intentional dilution or dispersion in various environmental media.

None of these approaches to waste management is free of risks.  With each
approach, there may be impacted populations that do not share proportionately
in the costs and benefits  of the enterprises producing the products.

     In Table 1, various waste control approaches are listed  together with  the
populations that may be impacted.  These include employee populations,
community populations, ecologic populations, and global populations.   By
ecologic populations, I mean the interacting biologic species that exist
within an impacted ecosystem.  Clearly,  there are tradeoffs in control
strategies that could differentially impact the various populations.  For
example,  venting used to reduce the likelihood of employee exposure may
subject the community population to greater exposure opportunities.

     The  United  States Congress and various  governmental  agencies have
recognized the need to assure that employees and communities are informed of
the potential risks and are afforded an opportunity to participate in risk
management decisions related to hazardous wastes.  This recognition has been
reflected in recent employee and community right-to-know laws and regulations
and in regulations related to the siting of hazardous waste facilities.
Attendant with right-to-know is an obligation to inform people of both the
potential health effects of substances to which they may be exposed and the
risks projected to result  from those exposures.  Quantitative risk assessment
has become an important tool for informing persons about the projected risks
associated with hazard control decisions.
                                     -72-

-------
 Epidemiology  and Quantitative Risk Assessment

     The epidemiologic approach to assessing health risks of environmental
 factors relies  on inductive  reasoning,  that  is, reasoning from a particular
 set  of  facts  to general principles.  Consequently, epidemiology is data-
 driven.  The epidemiologic approach requires (1) a characterization of both
 exposures and health  outcomes in  the selected human population of interest,
 and  (2) analyses  of the relationships between exposures  and the health
 outcomes in that  population.  There are, of  course, both strengths and
 weaknesses in the epidemiologic approach.  Among the strengths is its direct
 relevance to  the  subject at  hand, namely, determining the effects of exposures
 on human health.

 Major limitations  are:

     •  The size of the population available  for study may be too small to
       allow  detection of low level but important health risks.
     •  The observation period may not have been sufficiently long for chronic
       health effects to have occurred.
     •  The study design may not address all alternative explanations for the
       observed health findings.
     •  The occurrence of real adverse health effects can only be demonstrated
       after  the  fact.

     This latter limitation suggests that the toxicity endpoints evaluated
 should emphasize early indicators of reversible adverse effects.   An
 additional,  frequently cited, limitation is the lack of quantitative exposure
 assessments in  support of epidemiologic studies.   However,  with more extensive
use of modeling techniques to estimate exposures and with increasingly precise
measurement procedures,  it may be possible to minimize the practical
 importance of this limitation in both occupational and environmental settings.
    Quantitative  risk  assessment has emerged as the major scientific tool for
deductively determining the likelihood that harm will come to people as a
consequence of predictable exposure to hazardous substances.   The four steps

                                       -73-

-------
of the quantitative risk assessment process are hazard identification,
toxicity or dose-response assessment, exposure assessment, and risk
characterization.  Since quantitative risk assessment utilizes external
toxicity information to define a hazard profile for the environmental agents
of concern, it is appropriate to include the results of epidemiologic research
as well as animal bioassays and other toxicity tests in identifying specific
hazards (e.g., establishing the carcinogenicity of a particular substance), in
describing exposures and identifying sensitive populations, and in assessing
the dose-response relationship.

     There  are several  notable strengths  of the quantitative  risk  assessment
approach.  First, a quantitative risk assessment establishes what effects
could take place in the absence of intervention measures.  Thus, there may be
opportunities to initiate corrective actions before injury to health has
occurred.  Secondly, the quantitative risk assessment approach is "highly risk
sensitive".  This stems from  the use of models to predict risks that are far
below the risk levels  that could be detected in a health study of the subject
 population.  Through the use of across-species and low-dose extrapolations,
acceptable exposure concentrations can be calculated that would yield
virtually safe doses, provided the assumptions of the risk assessment are
valid.
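
     To make that arithmetic concrete, the following Python sketch (with a
purely hypothetical slope factor) shows how a virtually safe dose might be
back-calculated from an upper-bound dose-response slope under an assumed
linear low-dose model:

    # Illustrative sketch only: back-calculating a "virtually safe dose"
    # from an upper-bound cancer slope factor, assuming low-dose linearity
    # (risk = slope * dose).  The slope factor and target risk below are
    # hypothetical values chosen for the example.

    def virtually_safe_dose(slope_factor, target_risk=1e-6):
        """Dose (mg/kg-day) whose upper-bound lifetime excess risk
        equals target_risk under a linear low-dose model."""
        return target_risk / slope_factor

    q1_star = 0.05   # hypothetical upper-bound slope, per (mg/kg-day)
    print(f"Virtually safe dose: {virtually_safe_dose(q1_star):.1e} mg/kg-day")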

     For the remainder  of this presentation,  I  would like to  discuss  the role
of epidemiologic evidence in  several specific  aspects of the quantitative risk
assessment process.  These are  (1) the selection of relevant epidemiologic
studies to be included in the evaluation of risks,  (2) the methods by which
evidence for  or against a particular effect is combined across studies, and
(3)  the use of epidemiologic  evidence as a final check of the dose-response
assessment.
                                      -74-

-------
The Selection of Relevant Epidemiologic Studies

    Evaluating  the consequences  of exposure  to  an agent requires  a critical
review of the available toxicologic and epidemiologic data for that substance
and other interrelated substances.  The purpose of the review is to determine
appropriate toxicity endpoints and, in particular, to determine the evidence
for and against carcinogenicity.  In evaluating the available epidemiologic
data,  two important decisions need to be made.  The first decision is whether
or not a particular study is relevant (admissible) to the evaluation process.
The second decision relates to how the evidence is to be combined across the
relevant epidemiologic and toxicologic studies to assess the overall evidence
for human carcinogenicity.

    In addressing  the  first  decision, it  is  necessary to identify  and
characterize each candidate study on the basis of both relevance and
methodological strengths.  Studies under consideration may range from case
reports to cohort studies which specifically examine the exposure of interest.
To be admissible, each study should address a relevant biologic outcome; there
should be a reasonable basis for ascribing exposure to the agent of interest;
and the research should be methodologically sound within the context of its
intended purpose.

    An assessment  of the  internal  evidence for  or against a causal
relationship should not be part of the admissibility criteria.   In other
words, studies should not be selected based on their outcomes or conclusions.
From a methodologic viewpoint, the guidelines for evaluating a study should be
consistent with "good laboratory practices" and with guidelines developed by
the National Academy of Sciences and other professional organizations for
judging the quality of epidemiologic research.

    Based on  these  guidelines and  relevancy  criteria, studies can be
classified as (1) not relevant,  (2) relevant but methodologically unsound,  or
(3) admissible by virtue of both relevancy and soundness of methodology.  The
decision that a study has utilized sound methodology strengthens the basis of
                                     -75-

-------
its admissibility.  However, studies that have marginal or even important
methodologic deficiencies should not be excluded at this point except where
other clearly superior studies are available.

    While  it is  essential that peer  review takes place,  the available studies
should not be restricted  to those appearing only in the peer-reviewed
literature.  This is important for two reasons.  First, highly relevant and
admissible studies may otherwise be excluded from consideration while awaiting
publication.  This would  seem a harsh penalty to exact because of time
constraints.  Secondly, there are indications that publication bias may result
in a shift of the published literature toward positive findings, thus making
it difficult to combine evidence across studies without aggregating the bias
component.  The effort to review critically those few relevant studies not
already published would appear to be effort well spent.

Combining Evidence Across Studies

     A variety of approaches  have  been proposed  to assist  the  risk assessor  in
combining evidence across studies.  These include expert or judgment-based
approaches, categorical accept-reject analysis,  classical statistical
approaches, and meta-analysis.  These approaches to decision-making are
conceptually similar in assigning weights to each component study and
combining the weighted evidence in some fashion to arrive at an overall
judgment regarding causality.  They may differ considerably in the methods for
determining how much weight  to assign to each study and how explicit to be in
assigning weights.  The system used by the International Agency for Research
on Cancer in evaluating the  weight of evidence for human carcinogenicity is
well known and relies primarily on expert judgment to combine evidence across
studies.
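
     As one concrete, hedged illustration of explicit weighting (an
inverse-variance, fixed-effect pooling of log relative risks, which is only
one of the approaches listed above, not a method prescribed here), a Python
sketch with hypothetical study values might look like this:

    # Sketch of inverse-variance (fixed-effect) pooling of relative risks.
    # Each study contributes a log relative risk and its standard error;
    # the weight assigned to a study is 1/SE^2.  All inputs are hypothetical.
    import math

    studies = [
        {"rr": 1.4, "se_log_rr": 0.30},
        {"rr": 1.2, "se_log_rr": 0.25},
        {"rr": 1.8, "se_log_rr": 0.40},
    ]

    weights = [1.0 / s["se_log_rr"] ** 2 for s in studies]
    log_rrs = [math.log(s["rr"]) for s in studies]

    pooled_log_rr = sum(w * x for w, x in zip(weights, log_rrs)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))

    low, high = (math.exp(pooled_log_rr - 1.96 * pooled_se),
                 math.exp(pooled_log_rr + 1.96 * pooled_se))
    print(f"Pooled RR = {math.exp(pooled_log_rr):.2f} (95% CI {low:.2f}-{high:.2f})")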

     Methods  for  aggregating  biological evidence may  differ  fundamentally from
those  used to combine statistical evidence.  For example, the statistical
power  to detect a difference can be increased by combining results across
                                      -76-

-------
 comparably conducted parallel studies, yielding a different conclusion than
 any of the studies viewed in isolation.  It is certainly plausible that two
 studies, neither  of which provides  statistical evidence of an effect alone,
 could demonstrate statistical  significance when combined.  Biologic support
 for  a hypothesis  may come from dissimilar studies that demonstrate a
 connection  between  different aspects of a particular disease process.  Thus, a
 study demonstrating that  methylene  chloride is metabolized to carbon monoxide
 in the  liver and  a  separate study on the cardiovascular effects of carbon
 monoxide may convincingly link methylene chloride exposure to adverse
 cardiovascular effects.
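
     The point about combined statistical significance can be shown
numerically.  The sketch below (hypothetical p-values; Fisher's method is used
purely as an example of parallel aggregation) combines two individually
non-significant studies into a significant overall result:

    # Sketch: combining one-sided p-values from comparably conducted
    # parallel studies using Fisher's method.  Neither hypothetical
    # p-value reaches 0.05 on its own, but the combined result does.
    import math
    from scipy.stats import chi2

    p_values = [0.08, 0.09]
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    combined_p = chi2.sf(statistic, df=2 * len(p_values))
    print(f"Fisher chi-square = {statistic:.2f}, combined p = {combined_p:.3f}")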

     While  statistical attributes of a  study may be  judged on the basis of
 that study  alone, biological attributes include elements that are frequently
 external to the study.  This is  evidenced by the common organizational
 structure of most research reports.  When reporting findings,  researchers
 typically provide a summary of their own study results followed by an
 interpretation of their results  in  the light of existing studies and accepted
 biologic knowledge.

     This suggests that evidence for a causal relationship is primarily
 aggregated  on a biologic plane.  The biologic evidence may be combined either
 in serial or parallel fashion,  whereas statistical evidence is  only aggregated
 in parallel.  Various explanations may be invoked to describe a viable
 argument for causation, with the most direct argument consistent with biologic
 knowledge and observation  and requiring the fewest assumptions  being accorded
 the highest status.  Statistical evidence secures our confidence in the
validity of component statements within the various arguments but does not
make the case for causality in and of itself.   Broad statements that arise
 from ecologic or  correlational studies generally provide weak arguments since
 they typically require many assumptions about component statements  in the
argument.

     Evaluating the  strengths of various causal arguments may represent an
alternative approach to performing weight-of-evidence determinations for  human
                                    -77-

-------
carcinogenicity.  In other words, competing arguments would be put forward and
then evaluated to determine the most plausible arguments.   The plausible
arguments themselves would then be subjected to a weight-of-evidence
determination.  One potential advantage of this approach is that future
research may be more readily directed to address the weakest links in existing
causality arguments.
A Final Check of the Dose-Response Assessment
     In addition to their other contributions,  epidemiologic studies may also
serve as an overall validation check on the final proposed dose-response model
for a given substance.  The field of observation potentially open to
epidemiologic assessment includes chemical process employees in direct contact
with the substance of interest; employees whose assignments involve
intermittent direct contact with the substance; employees assigned to the same
production site but whose assignments result in indirect exposure to the
substance; members of communities surrounding the production sites; and perhaps
customers who purchase the substance for subsequent use or who purchase
products contaminated by the substance.  Quantitative exposure estimates would
be required to  carry out  this  exercise; however, continued improvements in
exposure modeling  suggest that it is feasible  to develop appropriate exposure
estimates.

     Presumably, the most sensitive validation test would be one  which
compares the total projected excess of cases throughout the period of
observation and across all exposed populations with the observed excess of
cases  developed through epidemiologic followup of  the impacted populations.
An additional consistency test might be one which  examines only persons in the
upper  portions  of  the  exposure distribution with sufficient latency to allow
for  manifestation  of a carcinogenic response.  From the standpoint of
determining that a dose-response model overestimates the risks, this exercise
is only viable  for substances  that were in commercial use prior to the mid-
1950s.  For evaluating the possibility that risks  have been underestimated by
the  model, the  approach has merit under a broader  range of circumstances.
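
     A minimal numerical sketch of such a consistency check, with wholly
hypothetical case counts, could take the following form.  It simply asks
whether the observed case count is compatible with the background expectation
plus the model-projected excess, treating counts as Poisson:

    # Sketch of a dose-response validation check: compare the model's
    # projected excess cases with the excess observed in epidemiologic
    # follow-up of the exposed populations.  All counts are hypothetical.
    from scipy.stats import poisson

    observed_cases = 34            # cases seen across the followed-up cohorts
    background_expected = 25.0     # expected from reference (unexposed) rates
    model_projected_excess = 15.0  # excess projected by the dose-response model

    observed_excess = observed_cases - background_expected
    # Probability of seeing this few cases (or fewer) if the model's
    # projection were correct; a very small value suggests the model
    # overstates the risk, given valid exposure estimates.
    p_low = poisson.cdf(observed_cases, background_expected + model_projected_excess)

    print(f"Observed excess cases: {observed_excess:.1f}")
    print(f"Model-projected excess cases: {model_projected_excess:.1f}")
    print(f"P(observed <= {observed_cases} | model) = {p_low:.2f}")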
                                    -78-

-------
TABLE I. HAZARDOUS WASTE CONTROL APPROACHES AND IMPACTED POPULATIONS

  CONTROL OF HAZARDOUS WASTES                  IMPACTED POPULATIONS

  ON-SITE CONTROLS
    Waste Minimization                         Employee Populations
    Recycling of Wastes                          Informed/Uninformed
    Effective on-site Confinement                Voluntary/Involuntary

  OFF-SITE CONTROLS
    Decreased Production of Goods and          Employee Populations
      Services                                 Community Populations
    Effective off-site Confinement             Ecologic Populations
    Ineffective Confinement                    Global Populations
    Recycling of Wastes (off-site)               Informed/Uninformed
    Intentional Dilution                         Voluntary/Involuntary
    Intentional Dispersion
                                           -79-

-------

-------
                   APPENDIX E




1986 GUIDELINES  FOR CARCINOGEN RISK ASSESSMENT
                       —81—

-------
                  51 FR 33992

GUIDELINES  FOR  CARCINOGEN RISK
ASSESSMENT

SUMMARY: On September 24, 1986, the U.S.
Environmental Protection Agency issued the
following five guidelines for assessing the health
risks of environmental pollutants.

    Guidelines for Carcinogen Risk Assessment

    Guidelines for Estimating Exposures

    Guidelines for Mutagenicity Risk Assessment

    Guidelines for the Health Assessment of Suspect
    Developmental Toxicants

    Guidelines for the Health Risk Assessment of
    Chemical Mixtures

This section contains the Guidelines for Carcinogen
Risk Assessment.

    The Guidelines for Carcinogen Risk Assessment
(hereafter  "Guidelines") are  intended to guide
Agency evaluation of  suspect carcinogens in line
with the  policies and procedures established in the
statutes administered by the EPA. These Guidelines
were developed as part of an interoffice guidelines
development program under the auspices of the
Office of Health and  Environmental Assessment
(OHEA)  in the Agency's Office of Research and
Development. They reflect Agency consideration of
public and Science Advisory Board (SAB) comments
on the Proposed Guidelines for  Carcinogen  Risk
Assessment published November 23, 1984 (49 FR
46294).

    This publication completes the first round of risk
assessment guidelines  development. These
Guidelines  will be revised, and new guidelines will
be developed, as appropriate.

FOR FURTHER INFORMATION CONTACT:

Dr. Robert E. McGaughy
Carcinogen Assessment Group
Office of  Health and Environmental Assessment
(RD-689)
401 M Street, S.W.
Washington, DC 20460
202-382-5898

SUPPLEMENTARY  INFORMATION: In 1983,
the National Academy of Sciences (NAS) published
its book entitled Risk Assessment in the Federal
Government: Managing the Process. In that book,
the NAS recommended  that Federal  regulatory
agencies establish "inference guidelines" to ensure
 consistency  and  technical quality  in  risk
 assessments and to ensure that the risk assessment
 process  was maintained  as a scientific effort
 separate from risk management. A task force within
 EPA accepted  that recommendation and requested
 that Agency scientists begin to develop such
 guidelines.

 General

     The guidelines are products of a two-year
 Agencywide effort, which has included many
 scientists from the larger scientific community.
 These guidelines set forth principles and procedures
 to guide EPA scientists in the conduct of Agency risk
 assessments, and to inform Agency decision makers
 and the public about these procedures. In particular,
 the guidelines emphasize that risk assessments will
 be conducted on a  case-by-case  basis, giving full
 consideration to all relevant scientific information.
 This case-by-case  approach  means that  Agency
 experts review the scientific  information on  each
 agent and use  the  most scientifically appropriate
 interpretation  to assess risk. The  guidelines also
 stress that this information will  be fully presented
 in Agency risk assessment documents, and  that
 Agency scientists will identify the strengths and
 weaknesses of each assessment by describing
 uncertainties, assumptions, and limitations, as well
 as the scientific basis and rationale for each
 assessment.

    Finally, the guidelines are formulated in part to
 bridge  gaps  in risk assessment  methodology and
 data. By  identifying these gaps and the importance
 of the missing information to the risk assessment
 process, EPA wishes to encourage research and
 analysis that  will  lead to new  risk assessment
 methods and data.

 Guidelines for Carcinogen Risk Assessment

    Work on the Guidelines for Carcinogen  Risk
 Assessment began in January 1984.  Draft
 guidelines were developed by Agency work groups
 composed of expert scientists from throughout the
 Agency. The drafts were peer-reviewed by expert
 scientists in the field of carcinogenesis from
 universities, environmental groups, industry, labor,
 and other governmental agencies. They were then
 proposed for public comment in the  FEDERAL
REGISTER (49 FR 46294). On November 9, 1984,
 the  Administrator directed that Agency offices use
 the proposed  guidelines in performing  risk
assessments until final guidelines become available.
                                             -82-

-------
    After the close of the public  comment period,
Agency staff prepared summaries of the comments,
analyses of the major issues presented by the
commentors, and proposed changes in the language
of the guidelines to deal with the issues raised.
These analyses were presented to review panels of
the SAB on March 4 and April 22-23, 1985, and to
the Executive Committee of the SAB on April 25-26,
1985. The SAB meetings were announced in the
FEDERAL REGISTER  as  follows: February 12,
1985  (50 FR 5811) and April 4, 1985 (50 FR 13420
and 13421).
    In a  letter to the Administrator dated June 19,
1985, the Executive Committee generally concurred
on  all five of the guidelines, but recommended
certain revisions, and requested that any revised
guidelines be  submitted to  the  appropriate  SAB
review panel chairman for review and concurrence
on behalf of the Executive Committee. As described
in the responses to comments (see Part B: Response
to  the Public and  Science Advisory Board
Comments), each guidelines document was revised,
where appropriate,  consistent with  the  SAB
recommendations, and revised draft guidelines were
submitted to the panel  chairmen. Revised draft
Guidelines for Carcinogen Risk Assessment  were
concurred on in a letter dated February 7,  1986.
Copies of the letters  are available at the Public
Information Reference Unit, EPA Headquarters
Library, as indicated elsewhere in  this section.

    Following this Preamble are two parts: Part A
contains the Guidelines and Part B, the Response to
the Public and Science Advisory Board Comments (a
summary of the major public  comments,  SAB
comments,  and  Agency  responses  to those
comments).

    The  Agency is continuing to study the risk
assessment issues raised in  the guidelines and will
revise these Guidelines in line with new information
as appropriate.
    References,  supporting documents,  and
comments received on the proposed guidelines, as
well as copies of the final guidelines, are available
for inspection and copying at the Public Information
Reference Unit (202-382-5926), EPA Headquarters
Library,  401 M Street, S.W., Washington, DC,
between the hours of 8:00 a.m. and 4:30 p.m.

    I  certify  that  these Guidelines are  not major
rules as defined by Executive Order 12291, because
they are nonbinding policy statements and have no
direct effect on the regulated community.  Therefore,
they will have no effect on costs or prices, and they
will
                  [51 FR 33993]
                                 have  no other
significant adverse effects on the economy. These
Guidelines  were reviewed by the  Office  of
Management and Budget under Executive  Order
12291.


August 22, 1986

Lee M. Thomas,

Administrator

CONTENTS

Part A: Guidelines for Carcinogen Risk Assessment

I. Introduction

II. Hazard Identification

   A. Overview
   B. Elements of Hazard Identification
     1. Physical-Chemical Properties and Routes and
        Patterns of Exposure
     2. Structure-Activity Relationships
     3. Metabolic and Pharmacokinetic Properties
     4. Toxicologic Effects
     5. Short-Term Tests
     6. Long-Term Animal Studies
     7. Human Studies
   C. Weight of Evidence
   D. Guidance for Dose-Response Assessment
   E. Summary and Conclusion

III. Dose-Response Assessment, Exposure Assessment, and Risk
    Characterization

   A. Dose-Response Assessment
     1. Selection of Data
     2. Choice of Mathematical Extrapolation
        Model
     3. Equivalent Exposure Units Among Species
   B. Exposure Assessment
   C. Risk Characterization
     1. Options for Numerical Risk Estimates
     2. Concurrent Exposure
     3. Summary of Risk Characterization

IV. EPA Classification System for Categorizing Weight of
    Evidence for Carcinogenicity from Human and Animal
    Studies (Adapted from IARC)

   A. Assessment of Weight of Evidence for Carcinogenicity from
      Studies in Humans
   B. Assessment of Weight of Evidence for Carcinogenicity from
      Studies in Experimental Animals
   C. Categorization of Overall Weight of Evidence for Human
      Carcinogenicity

V. References

Part B: Response to Public and Science Advisory Board
Comments

I. Introduction

II. Office of Science and Technology Policy Report on
   Chemical Carcinogens

III. Inference Guidelines

IV. Evaluation of Benign Tumors

V. Transplacental and Multigenerational Animal Bioassays

VI. Maximum Tolerated Dose

VII. Mouse Liver Tumors

VIII. Weight-of-Evidence Categories

IX. Quantitative Estimates of Risk
                                                    -83-

-------
Part A:  Guidelines  for Carcinogen  Risk
Assessment

/. Introduction
   This is the first revision of the 1976 Interim
Procedures  and Guidelines for Health Risk
Assessments of Suspected Carcinogens (U.S. EPA,
1976; Albert et  al.,  1977). The  impetus for this
revision  is the  need to incorporate into these
Guidelines the concepts and approaches  to
carcinogen risk assessment  that have been
developed  during the  last ten years. The purpose of
these Guidelines is to promote quality and
consistency of carcinogen risk assessments within
the EPA and to inform those outside the EPA about
its approach to carcinogen risk assessment. These
Guidelines emphasize  the broad but essential
aspects of risk assessment that are needed  by
experts in the various disciplines required (e.g.,
toxicology, pathology, pharmacology, and statistics)
for carcinogen risk assessment. Guidance is given in
general terms since the science of carcinogenesis is
in a state of rapid advancement, and overly specific
approaches may rapidly become obsolete.

   These Guidelines describe the  general
framework to be followed in developing an analysis
of carcinogenic risk and some salient principles to be
used in evaluating  the quality of data and in
formulating judgments concerning the nature and
magnitude of the cancer  hazard from suspect
carcinogens. It is the intent of these Guidelines to
permit sufficient flexibility to accommodate new
knowledge and new assessment  methods as they
emerge. It is also recognized that there is a need for
new methodology that has not been addressed in this
document in a number of areas,  e.g.,  the
characterization of uncertainty. As this  knowledge
and assessment methodology are developed, these
Guidelines will be revised whenever appropriate.
   A summary of the current state of knowledge in
the field of carcinogenesis and a statement of broad
scientific principles of carcinogen risk assessment,
which was developed by the Office of Science and
Technology Policy (OSTP, 1985), forms an important
basis for  these Guidelines; the format of these
Guidelines is similar  to that  proposed by the
National  Research Council (NRC) of the National
Academy  of Sciences in  a book entitled Risk
Assessment in the Federal Government: Managing
the Process (NRC, 1983).
   These Guidelines are  to be  used within the
policy framework already  provided by applicable
EPA statutes and do  not alter such policies. These
Guidelines provide general directions for analyzing
and organizing available data. They do not imply
that one kind of data or another is prerequisite for
regulatory action to control, prohibit, or allow  the
use of a carcinogen.
   Regulatory decision making  involves two
components: risk assessment and risk management.
Risk assessment defines the adverse health
consequences of exposure to toxic agents. The risk
assessments will be carried out independently from
considerations  of  the consequences of regulatory
action.  Risk management combines the risk
assessment with  the directives of regulatory
legislation, together with socioeconomic, technical,
political, and other considerations, to reach a
decision as to whether or how much to control future
exposure to the suspected toxic agents.

   Risk assessment includes  one or  more  of the
following components: hazard  identification, dose-
response assessment, exposure  assessment, and risk
characterization (NRC, 1983).

   Hazard identification is  a qualitative risk
assessment, dealing with the process of determining
whether exposure  to an agent  has the potential to
increase the incidence of cancer. For purposes of
these Guidelines, both malignant and benign
tumors are  used in  the evaluation of the
carcinogenic hazard.  The hazard  identification
component  qualitatively answers the question of
how likely an agent is to be a human carcinogen.

   Traditionally,  quantitative risk assessment has
been used as an inclusive term  to describe  all or
parts of dose-response assessment,  exposure
assessment, and risk characterization. Quantitative
risk assessment can be a useful general term in
some circumstances,  but  the more  explicit
terminology developed by the NRC (1983) is usually
preferred. The dose-response assessment defines the
relationship between the dose  of an agent and the
probability of induction of a carcinogenic effect. This
component usually entails an extrapolation from the
generally high doses administered to experimental
animals or exposures noted in epidemiologic studies
to the exposure levels expected from  human contact
with the agent in  the environment; it also includes
considerations  of  the  validity  of   these
extrapolations.

   The exposure assessment identifies populations
exposed to  the agent, describes  their composition
and size,  and  presents the types, magnitudes,
frequencies, and durations of exposure to the agents.
                 [51 FR 33994]
   In risk characterization, the results of the
exposure assessment  and  the dose-response
assessment are combined to estimate quantitatively
the  carcinogenic  risk.  As part  of risk
characterization, a summary of the strengths and
weaknesses in the hazard identification, dose-
response assessment, exposure assessment, and the
public health risk estimates is presented.  Major
assumptions, scientific judgments, and, to the extent
possible, estimates of the uncertainties embodied in
the assessment are also presented, distinguishing
clearly between fact, assumption, and science policy.
                                              -84-

-------
   The National Research Council (NRC, 1983)
pointed  out that  there  are  many questions
encountered in the risk assessment process that are
unanswerable given current scientific knowledge.
To bridge the uncertainty that exists in these areas
where there is no scientific consensus, inferences
must be made to ensure that progress continues  in
the assessment process. The OSTP (1985) reaffirmed
this position, and generally left  to the regulatory
agencies  the job of articulating  these inferences.
Accordingly, the Guidelines incorporate judgmental
positions (science policies) based on evaluation of the
presently  available  information  and on  the
regulatory mission of the Agency. The Guidelines
are consistent with the principles developed by the
OSTP (1985), although in many instances they are
necessarily more specific.
II. Hazard Identification
A. Overview

   The  qualitative assessment or  hazard
identification part of risk  assessment contains a
review of the relevant biological and chemical
information  bearing on whether or not an agent may
pose  a carcinogenic hazard. Since chemical  agents
seldom occur  in a  pure state and  are often
transformed in the body, the review should include
available information on contaminants, degradation
products, and metabolites.

   Studies  are evaluated according to  sound
biological and statistical considerations  and
procedures.  These have been described in several
publications (Interagency Regulatory  Liaison
Group, 1979; OSTP, 1985; Peto et al., 1980; Mantel,
1980; Mantel and Haenszel, 1959; Interdisciplinary
Panel on Carcinogenicity, 1984; National Center for
Toxicological Research, 1981; National Toxicology
Program, 1984;  U.S.  EPA, 1983a,  1983b,  1983c;
Haseman,  1984).  Results  and  conclusions
concerning the agent, derived from different types of
information, whether indicating positive or negative
responses, are melded together  into a weight-of-
evidence determination.  The  strength  of the
evidence  supporting a  potential  human
carcinogenicity judgment is developed in a  weight-
of-evidence stratification scheme.

B. Elements of Hazard Identification

   Hazard identification should include a review of
the following information to the extent that  it  is
available.

   1. Physical-Chemical Properties and Routes and
Patterns of Exposure. Parameters relevant  to
carcinogenesis,  including physical state, physical-
chemical properties, and exposure pathways in the
environment should be described where possible.

   2. Structure-Activity Relationships. This section
should summarize relevant structure-activity
correlations that support or argue  against the
prediction of potential carcinogenicity.

   3. Metabolic and Pharmacokinetic Properties.
This section should summarize relevant metabolic
information. Information such as whether the agent
is direct-acting or requires conversion to a reactive
carcinogenic  (e.g., an  electrophilic) species,
metabolic pathways  for  such  conversions,
macromolecular interactions, and fate (e.g.,
transport, storage, and excretion), as well as species
differences, should be discussed  and critically
evaluated. Pharmacokinetic properties determine
the biologically effective dose and may be relevant to
hazard identification and other components of risk
assessment.

   4. Toxicologic  Effects. Toxicologic  effects other
than  carcinogenicity (e.g., suppression of the
immune  system,  endocrine  disturbances, organ
damage)  that are  relevant to the evaluation of
carcinogenicity should be summarized. Interactions
with other chemicals or agents and with lifestyle
factors should be discussed. Prechronic and chronic
toxicity evaluations, as well  as other  test results,
may yield information on target  organ effects,
pathophysiological  reactions, and preneoplastic
lesions  that  bear  on the evaluation  of
carcinogenicity. Dose-response and time-to-response
analyses of these reactions may also be helpful.

   5. Short-Term Tests. Tests for point mutations,
numerical and structural  chromosome aberrations,
DNA  damage/repair, and in  vitro  transformation
provide supportive evidence of carcinogenicity and
may give information on potential carcinogenic
mechanisms. A range of tests from each of the above
end points helps to characterize an agent's response
spectrum.

   Short-term in vivo and in vitro tests that can
give indication of initiation and promotion activity
may  also  provide supportive  evidence  for
carcinogenicity. Lack of positive results in short-
term tests for genetic toxicity does not provide a
basis  for  discounting positive results  in long-term
animal studies.

   6. Long-Term Animal Studies.  Criteria for the
technical adequacy of animal carcinogenicity
studies have been published  (e.g., U.S. Food  and
Drug Administration, 1982; Interagency Regulatory
Liaison Group, 1979; National Toxicology Program,
1984; OSTP, 1985;  U.S. EPA, 1983a, 1983b, 1983c;
Feron et al., 1980; Mantel, 1980) and should be used
to  judge  the acceptability of individual studies.
Transplacental    and   multigenerational
carcinogenesis  studies, in addition to  more
conventional long-term animal studies, can yield
useful information about the carcinogenicity of
agents.

   It is  recognized that chemicals  that  induce
benign tumors frequently also induce malignant
                                              -85-

-------
tumors, and that benign tumors often progress to
malignant tumors  (Interdisciplinary Panel  on
Carcinogenicity, 1984). The incidence of benign and
malignant tumors will  be combined  when
scientifically defensible (OSTP, 1985; Principle 8).
For example, the Agency will, in general, consider
the combination of benign and malignant tumors to
be scientifically defensible unless the benign tumors
are not considered to have the potential to progress
to the associated  malignancies of  the  same
histogenic origin. If an increased incidence of benign
tumors is observed  in  the  absence  of malignant
tumors, in most cases the evidence  will  be
considered as limited evidence of carcinogenicity.

   The weight of  evidence that  an  agent is
potentially carcinogenic for  humans increases  (1)
with the increase in number of tissue sites affected
by the agent;  (2) with the increase in number of
animal species, strains, sexes, and number of
experiments and doses showing a  carcinogenic
response; (3) with the occurrence of clear-cut dose-
response relationships as well as a  high level of
statistical  significance of  the  increased tumor
incidence in treated compared to control groups; (4)
when there is a dose-related shortening of the time-
to-tumor occurrence  or time to death with tumor;
and (5) when there is a dose-related increase in the
proportion of tumors that are malignant.

   Long-term animal studies at  or near  the
maximum tolerated dose level (MTD) are used to
ensure an adequate power for the detection of
carcinogenic
                 [51 FR 33995]
                               activity (NTP,
1984;  IARC, 1982).  Negative long-term animal
studies at exposure levels above the MTD may not be
acceptable if animal survival is so impaired that the
sensitivity of  the  study is  significantly  reduced
below that of a conventional chronic animal study at
the MTD. The OSTP (1985; Principle 4) has stated
that,
The carcinogenic effects of agents may be influenced by non-
physiological responses (such as extensive organ damage, radical
disruption of hormonal  function, saturation of metabolic
pathways, formation of stones in the urinary tract, saturation of
DNA repair with a functional loss of the system) induced in the
model systems. Testing regimes inducing these responses should
be evaluated for their relevance to the human response to an
agent and evidence from such a study, whether positive or
negative, must be carefully reviewed.

Positive studies at levels above the MTD should be
carefully reviewed to ensure that the  responses are
not due to factors which do not operate at exposure
levels below the MTD. Evidence indicating that high
exposures alter tumor responses by indirect
mechanisms that may be unrelated to effects at
lower exposures  should be dealt with on  an
individual  basis. As noted  by the OSTP (1985),
"Normal metabolic activation of carcinogens may
possibly also be  altered and carcinogenic potential
reduced as a consequence [of high-dose testing]."
    Carcinogenic responses under conditions of the
experiment should be reviewed  carefully as  they
relate to the relevance of the  evidence  to  human
carcinogenic risks  (e.g., the occurrence of bladder
tumors in the presence  of bladder stones and
implantation site sarcomas). Interpretation of
animal studies is aided by the review of target organ
toxicity and other effects (e.g., changes  in the
immune and endocrine systems) that  may be noted
in prechronic or other  toxicological studies. Time
and dose-related  changes in the incidence of
preneoplastic lesions  may also  be helpful  in
interpreting animal studies.

    Agents that are positive in long-term  animal
experiments and also show evidence of promoting or
cocarcinogenic activity in specialized tests should be
considered as complete carcinogens unless there is
evidence to the contrary because it is, at present,
difficult to determine whether an agent is only a
promoting or cocarcinogenic agent.  Agents  that
show positive results in special tests  for initiation,
promotion, or cocarcinogenicity and no indication of
tumor response in well-conducted and  well-designed
long-term animal studies should be dealt with on an
individual basis.

    To evaluate carcinogenicity, the  primary
comparison is tumor response in dosed animals as
compared with  that  in contemporary  matched
control animals.  Historical control  data are often
valuable, however, and  could be used along  with
concurrent  control data in  the  evaluation of
carcinogenic responses  (Haseman et al.,  1984). For
the evaluation of rare  tumors, even  small tumor
responses may be significant compared to historical
data. The review of tumor data at  sites with  high
spontaneous background   requires  special
consideration (OSTP,  1985;  Principle  9). For
instance, a response that is significant with  respect
to the  experimental control  group  may  become
questionable if the historical control  data  indicate
that the  experimental control  group had an
unusually low background incidence (NTP, 1984).

    For a number of  reasons, there are  widely
diverging scientific views (OSTP,  1985; Ward et al.,
1979a, b; Tomatis, 1977; Nutrition  Foundation,
1983) about the validity of mouse liver tumors as an
indication of potential carcinogenicity in humans
when such  tumors occur in  strains with  high
spontaneous background incidence  and when  they
constitute the only tumor response  to  an agent.
These Guidelines take  the position that when the
only tumor response is in the mouse liver and when
other conditions for a  classification of "sufficient"
evidence in animal studies are met (e.g., replicate
studies, malignancy; see section IV), the data should
be  considered  as "sufficient" evidence  of
carcinogenicity.  It  is understood that  this
classification could be changed on  a  case-by-case
basis to "limited," if warranted, when factors such as
the following are observed: an increased incidence
                                              -86-

-------
of tumors only in the highest dose group and/or only
at the end of the study; no substantial dose-related
increase in  the proportion  of  tumors that are
malignant; the occurrence  of  tumors that are
predominantly benign; no dose-related shortening of
the time to the appearance of tumors; negative  or
inconclusive results from a spectrum of short-term
tests for mutagenic activity; the occurrence of excess
tumors only in a single sex.

    Data from all long-term animal studies are to be
considered  in the evaluation of carcinogenicity. A
positive   carcinogenic   response  in  one
species/strain/sex  is not  generally negated  by
negative  results in other  species/strain/sex.
Replicate  negative studies that are essentially
identical in all other respects to a positive study may
indicate that the positive results are spurious.

    Evidence  for carcinogenic action should be based
on the observation of statistically significant tumor
responses in specific organs or tissues. Appropriate
statistical analysis should be performed on data
from long-term studies to help determine whether
the effects are treatment-related  or possibly due to
chance. These should at least include a statistical
test for trend, including appropriate correction for
differences in survival. The weight to be given to the
level of statistical significance (the p-value) and to
other available  pieces of information is a matter of
overall scientific judgment.  A  statistically
significant excess of tumors of all  types  in the
aggregate, in the  absence of a  statistically
significant increase of any individual tumor type,
should be regarded as minimal  evidence  of
carcinogenic action unless there are persuasive
reasons to the contrary.
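
   As one hedged illustration of a trend test (a Cochran-
Armitage-type statistic on crude incidence, without the
survival correction the text calls for; all counts and dose
scores below are hypothetical), consider:

    # Sketch of a Cochran-Armitage-style trend test on tumor incidence
    # across dose groups (no survival adjustment; counts are hypothetical).
    import math
    from scipy.stats import norm

    doses   = [0.0, 1.0, 2.0, 4.0]     # dose-group scores
    tumors  = [2, 4, 7, 12]            # tumor-bearing animals per group
    at_risk = [50, 50, 50, 50]         # animals examined per group

    n_total = sum(at_risk)
    p_hat = sum(tumors) / n_total
    d_bar = sum(d * n for d, n in zip(doses, at_risk)) / n_total

    numerator = sum(t * (d - d_bar) for t, d in zip(tumors, doses))
    variance = p_hat * (1.0 - p_hat) * sum(n * (d - d_bar) ** 2
                                           for n, d in zip(at_risk, doses))
    z = numerator / math.sqrt(variance)
    print(f"Trend z = {z:.2f}, one-sided p = {norm.sf(z):.4f}")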

    7. Human Studies.  Epidemiologic studies
provide unique  information about the response of
humans who  have  been exposed  to  suspect
carcinogens.  Descriptive epidemiologic studies are
useful  in  generating hypotheses and providing
supporting data, but can rarely be used to make a
causal inference. Analytical epidemiologic studies of
the case-control or cohort variety, on the other hand,
are especially useful in assessing risks to exposed
humans.

    Criteria  for the  adequacy  of epidemiologic
studies  are well recognized. They include factors
such as the proper selection and characterization of
exposed and control groups,  the  adequacy  of
duration  and quality of follow-up, the proper
identification and characterization of confounding
factors and bias, the appropriate consideration of
latency effects, the valid ascertainment of the causes
of morbidity  and death, and the ability to detect
specific effects. Where  it can be calculated, the
statistical power to detect an appropriate outcome
should be included in the assessment.
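
   Where the data permit, the power calculation mentioned
above can be approximated directly.  The sketch below uses a
standard two-proportion approximation with hypothetical
cohort sizes and background risk:

    # Sketch of an approximate power calculation for a cohort study:
    # the probability of detecting a given relative risk of a cancer
    # with a known background risk.  All inputs are hypothetical.
    import math
    from scipy.stats import norm

    n_exposed  = 2000     # exposed cohort size
    n_referent = 2000     # comparison cohort size
    p0         = 0.01     # background risk over the follow-up period
    rr         = 2.0      # relative risk the study should detect
    alpha      = 0.05     # one-sided significance level

    p1 = min(rr * p0, 1.0)
    p_bar = (n_exposed * p1 + n_referent * p0) / (n_exposed + n_referent)

    z_alpha = norm.ppf(1 - alpha)
    diff = p1 - p0
    se_null = math.sqrt(p_bar * (1 - p_bar) * (1 / n_exposed + 1 / n_referent))
    se_alt  = math.sqrt(p1 * (1 - p1) / n_exposed + p0 * (1 - p0) / n_referent)

    power = norm.sf((z_alpha * se_null - diff) / se_alt)
    print(f"Approximate power to detect RR = {rr}: {power:.2f}")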

    The strength of the epidemiologic evidence for
carcinogenicity depends, among other things, on the
type of analysis and  on the  magnitude  and
specificity of the response. The weight of evidence
increases  rapidly with the number of adequate
studies that show comparable results on populations
exposed  to the  same  agent under different
conditions.

   It should be  recognized  that epidemiologic
studies are  inherently capable of detecting only
comparatively large increases in the relative risk of
                 [51 FR 33996]
cancer. Negative
results from such studies cannot prove the absence
of carcinogenic action; however, negative results
from a well-designed and well-conducted
epidemiologic study  that contains usable exposure
data can serve to define upper limits of risk; these
are useful if animal evidence indicates that the
agent is potentially carcinogenic in humans.
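
   A minimal sketch of how such an upper limit might be
derived from a negative cohort study (hypothetical counts;
the observed cases are treated as Poisson relative to the
expectation from reference rates) is:

    # Sketch: deriving an upper limit on relative risk from a "negative"
    # cohort study by placing an exact Poisson upper confidence limit on
    # the observed case count.  Counts are hypothetical.
    from scipy.stats import chi2

    observed = 18       # cases observed in the exposed cohort
    expected = 20.0     # cases expected from reference rates

    # Exact one-sided 95% Poisson upper limit on the observed count.
    upper_count = 0.5 * chi2.ppf(0.95, 2 * (observed + 1))
    upper_smr = upper_count / expected
    print(f"SMR = {observed / expected:.2f}, "
          f"95% upper limit on SMR = {upper_smr:.2f}")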

C. Weight  of Evidence

   Evidence of possible carcinogenicity in humans
comes primarily from two sources: long-term animal
tests and epidemiologic investigations. Results from
these studies are supplemented with available
information from short-term tests, pharmacokinetic
studies, comparative metabolism studies, structure-
activity relationships, and other relevant toxicologic
studies. The question of how likely an agent is to be
a human  carcinogen should  be  answered in the
framework  of  a  weight-of-evidence judgment.
Judgments about the weight of evidence involve
considerations of the quality and adequacy of the
data and the kinds  and consistency of responses
induced by a suspect carcinogen. There are three
major .steps to characterizing the weight of evidence
for carcinogenicity in humans: (1) characterization
of the evidence from human studies and from animal
studies individually, (2) combination of the
characterizations of these two types of data into an
indication of the overall weight of evidence for
human carcinogenicity, and (3) evaluation of all
supporting information  to determine if the overall
weight of evidence should be modified.

   EPA has developed a system for stratifying the
weight of  evidence  (see section IV). This
classification is not meant to be applied rigidly or
mechanically.  At various points in the  above
discussion, EPA has emphasized the need for an
overall, balanced judgment of the totality of the
available  evidence.  Particularly for well-studied
substances,  the scientific data base will have  a
complexity that cannot be captured by  any
classification  scheme. Therefore, the hazard
identification section should  include a  narrative
summary  of the strengths  and weaknesses of the
evidence as  well as  its categorization in the EPA
scheme.
   The EPA classification system is, in general, an
adaptation of the International Agency for Research
on Cancer (IARC, 1982) approach for classifying the
                                               -87-

-------
weight of evidence for human data and animal data.
The  EPA  classification  system  for  the
characterization of the overall weight of evidence for
carcinogenicity (animal,  human,  and  other
supportive data) includes: Group A -- Carcinogenic
to Humans; Group B -- Probably Carcinogenic to
Humans; Group C -- Possibly Carcinogenic to
Humans; Group D -- Not Classifiable as to Human
Carcinogenicity; and Group E -- Evidence of Non-
Carcinogenicity for Humans.
   The following modifications of the  IARC
approach have been made for classifying human and
animal studies.
   For human studies:
   (1) The observation of a statistically significant
association between an agent and life-threatening
benign tumors in humans is included in the
evaluation of risks to humans.
   (2) A "no data available" classification  is added.
   (3)  A  "no  evidence of  carcinogenicity"
classification is added.  This classification indicates
that no association was found between exposure and
increased risk of cancer  in  well-conducted, well-
designed, independent analytical epidemiologic
studies.
    For animal studies:
    (1) An increased  incidence of combined benign
and malignant tumors will be considered to provide
sufficient evidence of carcinogenicity if  the other
criteria defining the  "sufficient"  classification of
evidence  are  met (e.g.,  replicate  studies,
malignancy; see section IV). Benign and malignant
tumors will be combined  when scientifically
defensible.
    (2) An  increased incidence of benign tumors
alone generally  constitutes  "limited"  evidence of
carcinogenicity.
    (3) An  increased  incidence  of neoplasms  that
occur with high spontaneous background incidence
(e.g., mouse liver tumors and  rat pituitary tumors in
certain strains)  generally constitutes  "sufficient"
evidence of carcinogenicity, but may be changed to
"limited" when  warranted by the  specific
information available on the agent.
    (4) A "no data available"  classification has been
added.
    (5) A  "no  evidence of carcinogenicity"
classification  is also added.  This  operational
classification would include  substances  for which
there is no increased incidence of neoplasms in at
least  two well-designed and  well-conducted animal
studies of adequate  power  and dose  in different
species.
D. Guidance for Dose-Response Assessment

    The qualitative  evidence for carcinogenesis
should be discussed for purposes of guiding the dose-
response assessment. The guidance should be given
in terms of the appropriateness and limitations of
specific studies as well  as  pharmacokinetic
considerations that should be factored into the dose-
response assessment. The appropriate  method of
extrapolation should be  factored in  when the
experimental  route of exposure differs from that
occurring in humans.

   Agents that are judged to be in the EPA weight-
of-evidence stratification Groups A and B would be
regarded  as  suitable for quantitative  risk
assessments. Agents that are judged to be in Group
C will generally be  regarded as  suitable for
quantitative risk assessment, but judgments in this
regard may be made on a case-by-case basis. Agents
that are judged to be in Groups D and E would not
have quantitative risk assessments.

E. Summary and Conclusion

   The summary should  present all of  the key
findings in all of the sections of the qualitative
assessment and  the  interpretive rationale that
forms the basis for the conclusion. Assumptions,
uncertainties  in the evidence, and other factors that
may affect the relevance of the evidence to humans
should be discussed. The conclusion should present
both the weight-of-evidence ranking and a
description that brings out the more subtle aspects of
the evidence  that may not be evident from the
ranking alone.

///.  Dose-Response  Assessment, Exposure
Assessment, and Risk Characterization

   After  data  concerning the carcinogenic
properties of a  substance have been collected,
evaluated, and categorized, it is frequently desirable
to estimate the likely  range of excess cancer risk
associated with  given levels  and conditions of
human exposure. The first step of the analysis
needed to make such estimations is the development
of the likely relationship between dose and response
(cancer incidence) in the region  of human exposure.
This information on dose-response relationships is
coupled with information on the  nature and
magnitude of human exposure to yield an estimate
of human risk. The risk-characterization step also
includes an interpretation of these estimates in light
of the biological, statistical,  and  exposure
assumptions  and uncertainties that have arisen
throughout the process of assessing risk.

   The elements of dose-response  assessment are
described in  section III.A. Guidance on human
exposure assessment is provided in another EPA
                 [51 FR 33997]
                              document  (U.S.
EPA, 1986); however, section III.B. of these
Guidelines includes a brief description of the specific
type of exposure information  that  is  useful for
carcinogen risk assessment. Finally, in section III.C.
on risk characterization, there is a description of the
manner in which risk estimates  should be presented
so as to be most informative.

    It should be emphasized that calculation of
quantitative  estimates of cancer risk does not
                                               -88-

-------
require that an agent be carcinogenic in humans.
The likelihood that an agent is a human carcinogen
is a function of the weight of evidence, as this has
been described in the hazard identification section of
these Guidelines. It is nevertheless important to
present quantitative estimates,  appropriately
qualified and interpreted, in those circumstances in
which  there is  a reasonable  possibility, based on
human and animal  data, that  the  agent  is
carcinogenic in  humans.
    It should be emphasized  in every quantitative
risk estimation that the results are uncertain.
Uncertainties  due  to  experimental   and
epidemiologic variability as well  as uncertainty in
the exposure assessment can be important. There
are major uncertainties in extrapolating both  from
animals to humans and  from  high to low doses.
There are important species differences in uptake,
metabolism, and organ distribution of carcinogens,
as well as species  and strain differences in target-
site susceptibility. Human populations are variable
with  respect  to genetic  constitution, diet,
occupational and  home environment,  activity
patterns, and other cultural factors. Risk estimates
should be presented together with  the associated
hazard assessment (section III.C.3.) to ensure  that
there is an appreciation of the weight of evidence for
carcinogenicity that underlies the quantitative risk
estimates.

A. Dose-Response Assessment

    1. Selection  of Data. As indicated in section II.D.,
guidance needs to be given by the individuals doing
the  qualitative  assessment  (toxicologists,
pathologists, pharmacologists, etc.)  to  those doing
the quantitative assessment as to the  appropriate
data to be  used in the dose-response assessment.
This is determined by the quality of the data, its
relevance to human  modes of exposure, and other
technical details.
    If available, estimates based on adequate human
epidemiologic data are preferred over estimates
based on animal data. If adequate exposure  data
exist in a well-designed and well-conducted negative
epidemiologic study, it may be possible to obtain an
upper-bound estimate of risk from that study.
Animal-based estimates, if available, also should be
presented.
    In  the absence of appropriate human studies,
data from a species that responds most like humans
should be used, if information to  this effect exists.
Where, for a given  agent,  several studies are
available,  which may involve different animal
species, strains, and sexes at several doses and by
different routes of exposure, the following approach
to selecting the data sets is used:  (1) The  tumor
incidence data are separated according to organ site
and tumor type. (2) All biologically and statistically
acceptable data sets are presented. (3) The range of
the risk estimates is presented with due regard to
biological  relevance (particularly in the case of
animal studies) and appropriateness of route of
exposure.  (4) Because  it is possible that human
sensitivity is as  high as the  most sensitive
responding animal  species, in  the  absence of
evidence to the contrary, the biologically acceptable
data set from long-term animal studies showing the
greatest  sensitivity should generally be given the
greatest emphasis,  again with due  regard to
biological and statistical considerations.

    When the exposure route in  the species from
which  the  dose-response information is  obtained
differs from the route occurring in environmental
exposures,  the  considerations  used in making the
route-to-route extrapolation must be carefully
described.  All  assumptions should be presented
along with a discussion of the uncertainties in the
extrapolation. Whatever procedure is adopted in a
given case, it must be consistent with the existing
metabolic and pharmacokinetic information on the
chemical (e.g., absorption efficiency via the gut and
lung, target organ doses, and changes in  placental
transport throughout gestation for transplacental
carcinogens).

    Where  two or more significantly elevated tumor
sites or  types  are  observed in the same study,
extrapolations may be conducted on selected sites or
types. These selections  will be made on biological
grounds.  To obtain a total estimate of carcinogenic
risk, animals with one or more tumor sites or types
showing  significantly elevated tumor incidence
should be pooled and used for extrapolation. The
pooled estimates will generally be used in preference
to risk estimates based on single sites or types.
Quantitative risk extrapolations will generally not
be done on the basis of totals that include tumor sites
without statistically significant elevations.

    Benign tumors should generally be combined
with malignant tumors for risk estimates unless the
benign tumors are  not considered to have  the
potential to progress to the associated malignancies
of the same histogenic origin. The contribution of
the benign  tumors, however, to the total risk should
be indicated.

    2. Choice of Mathematical Extrapolation Model.
Since  risks at low exposure levels cannot be
measured directly either by animal experiments or
by epidemiologic studies, a number of mathematical
models have been developed to extrapolate from
high to low dose. Different  extrapolation models,
however, may fit the observed data reasonably well
but may  lead to large differences  in the projected
risk at low doses.

    As was pointed out  by  OSTP (1985; Principle
26),
    No single mathematical procedure is recognized as the most
appropriate for low-dose extrapolation in carcinogenesis. When
relevant biological evidence on mechanism of action exists (e.g.,
pharmacokinetics, target organ dose), the models or  procedures
                                               -89-

-------
employed should be consistent with the evidence. When data and
information are limited, however, and when much uncertainty
exists regarding the mechanism of carcinogenic action, models or
procedures which incorporate low-dose linearity are preferred
when compatible with the limited information.

At present,  mechanisms of the  carcinogenesis
process are largely unknown and data are generally
limited. If a carcinogenic  agent acts by accelerating
the same carcinogenic process that leads to the
background occurrence of cancer, the added effect of
the carcinogen at low  doses  is  expected  to be
virtually linear (Crump et al., 1976).

   The Agency will review each assessment  as to
the evidence on carcinogenesis mechanisms and
other biological or statistical evidence that indicates
the suitability of a particular extrapolation model.
Goodness-of-fit to the experimental observations is
not an effective means  of discriminating among
models (OSTP, 1985). A rationale will be included to
justify the use of the chosen model. In the absence of
adequate information to the contrary, the linearized
multistage procedure will  be employed. Where
appropriate,  the  results of using  various
extrapolation models may be useful for comparison
with the linearized multistage procedure. When
longitudinal data on tumor development are
available, time-to-tumor models may be used.
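
   For orientation only, the sketch below fits a two-stage
version of the multistage model to hypothetical bioassay
counts by maximum likelihood and reports the low-dose extra
risk implied by the fitted linear term.  The linearized
multistage procedure additionally replaces the fitted linear
coefficient with its upper confidence limit (q1*); that step
is omitted here for brevity.

    # Sketch: fitting a two-stage multistage model
    #   P(d) = 1 - exp(-(q0 + q1*d + q2*d**2)),  q_i >= 0,
    # to hypothetical bioassay counts by maximum likelihood.
    import numpy as np
    from scipy.optimize import minimize

    doses   = np.array([0.0, 10.0, 30.0, 100.0])   # mg/kg-day (hypothetical)
    tumors  = np.array([1, 4, 10, 28])              # tumor-bearing animals
    animals = np.array([50, 50, 50, 50])            # animals per dose group

    def neg_log_likelihood(q):
        q0, q1, q2 = q
        p = 1.0 - np.exp(-(q0 + q1 * doses + q2 * doses ** 2))
        p = np.clip(p, 1e-10, 1.0 - 1e-10)
        return -np.sum(tumors * np.log(p) + (animals - tumors) * np.log(1.0 - p))

    fit = minimize(neg_log_likelihood, x0=[0.01, 0.001, 1e-6],
                   bounds=[(0.0, None)] * 3, method="L-BFGS-B")
    q0, q1, q2 = fit.x

    low_dose = 0.01   # mg/kg-day, well below the experimental range
    extra_risk = 1.0 - np.exp(-(q1 * low_dose + q2 * low_dose ** 2))
    print(f"Fitted linear coefficient q1 = {q1:.4f} per mg/kg-day")
    print(f"Extra risk at {low_dose} mg/kg-day ~ {extra_risk:.1e}")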

    It should be emphasized that the linearized
multistage procedure leads to
                  [51 FR 33998]
                               a plausible upper
limit to the risk  that is consistent with  some
proposed mechanisms of carcinogenesis. Such an
estimate, however, does not necessarily give a
realistic prediction of the risk. The  true value of the
risk is unknown, and  may be as low as zero. The
range of risks,  defined by the upper limit given by
the chosen model and the lower limit which may be
as low as zero, should  be  explicitly stated.  An
established procedure does not yet  exist for making
"most likely" or "best" estimates of risk within the
range of uncertainty defined by the upper and  lower
limit estimates. If data  and procedures become
available, the Agency will also provide "most likely"
or "best" estimates of risk. This will be most feasible
when human data are available and when exposures
are in the dose range of the data.
       In certain cases, the linearized multistage
procedure cannot be used with the observed data as,
for example, when the data are nonmonotonic or
flatten out at high doses. In these cases, it may be
necessary to make adjustments to  achieve low-dose
linearity.
       When pharmacokinetic or metabolism data
are available, or when other substantial evidence on
the mechanistic  aspects of the carcinogenesis
process exists, a low-dose extrapolation model other
than the linearized multistage procedure might be
considered more appropriate on biological grounds.
When a different model  is chosen,  the  risk
assessment should clearly discuss the nature and
weight of evidence that  led to  the  choice.
Considerable  uncertainty will remain concerning
response at low doses; therefore, in most cases an
upper-limit risk  estimate  using the  linearized
multistage procedure should also be presented.
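
    A minimal numerical sketch of a linearized-multistage-style calculation is
given below. The model form P(d) = 1 - exp(-(q0 + q1*d + q2*d^2 + ...)) is standard,
but the coefficient values, the upper-bound slope q1*, and the dose are hypothetical
numbers chosen only for illustration; this is not EPA software or a fitted assessment.

        # Illustrative sketch of a linearized-multistage-style calculation;
        # all parameter values are hypothetical.
        import math

        def extra_risk(dose, q):
            """Extra risk over background for P(d) = 1 - exp(-(q0 + q1*d + ...)),
            expressed as [P(d) - P(0)] / [1 - P(0)]."""
            poly = sum(qi * dose**i for i, qi in enumerate(q))
            background = 1.0 - math.exp(-q[0])
            return (1.0 - math.exp(-poly) - background) / (1.0 - background)

        q = [0.01, 0.02, 0.005]           # hypothetical fitted coefficients (q0, q1, q2)
        q1_star = 0.03                    # hypothetical upper-bound slope on the linear term

        low_dose = 1e-4                   # mg/kg/day
        upper_limit = q1_star * low_dose  # linearized upper bound on extra risk
        print(f"extra risk (point estimate): {extra_risk(low_dose, q):.1e}")
        print(f"upper-limit extra risk:      {upper_limit:.1e} (lower limit may be as low as zero)")

The point of the sketch is that at doses well below the observed range the upper
bound on extra risk is essentially q1* times the dose, while the lower limit of the
range remains as low as zero.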

   3. Equivalent  Exposure Units Among Species.
Low-dose risk estimates derived from  laboratory
animal data extrapolated  to  humans  are
complicated by a variety of factors that differ among
species and  potentially affect the  response  to
carcinogens.  Included among  these factors are
differences between  humans and experimental test
animals with respect to life span, body size, genetic
variability, population homogeneity, existence of
concurrent disease, pharmacokinetic effects such as
metabolism  and excretion patterns, and the
exposure regimen.
   The usual approach for making interspecies
comparisons has been to use standardized scaling
factors. Commonly employed standardized dosage
scales include mg per kg body weight per day, ppm
in the diet or water, mg per m2 body surface area per
day, and mg per kg body weight per lifetime. In the
absence of comparative toxicological,  physiological,
metabolic, and pharmacokinetic data for a given
suspect carcinogen,  the Agency takes the position
that the extrapolation on the basis of surface area is
considered to be appropriate because certain
pharmacological effects commonly scale according to
surface area (Dedrick, 1973; Freireich et al., 1966;
Pinkel, 1958).
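
    The surface-area convention can be illustrated with a short sketch; the body
weights and the animal dose below are arbitrary examples, not Agency defaults.
Because surface area is roughly proportional to body weight to the 2/3 power,
equating dose per unit surface area across species amounts to multiplying the
animal dose (in mg/kg/day) by the cube root of the ratio of body weights.

        # Illustrative surface-area (body weight^(2/3)) scaling; values are examples only.
        def human_equivalent_dose(animal_dose_mg_kg_day, bw_animal_kg, bw_human_kg):
            """Scale an animal dose (mg/kg/day) to a human-equivalent dose assuming
            equal mg per unit body surface area is equally potent:
            dose_h = dose_a * (BW_a / BW_h) ** (1/3)."""
            return animal_dose_mg_kg_day * (bw_animal_kg / bw_human_kg) ** (1.0 / 3.0)

        # Example: 10 mg/kg/day in a 0.03-kg mouse scaled to a 70-kg human.
        print(human_equivalent_dose(10.0, 0.03, 70.0))   # about 0.75 mg/kg/day
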
B. Exposure Assessment
    In order to obtain a quantitative estimate of the
risk, the results of the dose-response assessment
must be combined with an estimate of the exposures
to which the populations of interest are likely to be
subject. While  the reader is referred to  the
Guidelines for Estimating Exposures (U.S. EPA,
1986) for specific details, it is important to convey an
appreciation of the impact of  the strengths and
weaknesses of exposure assessment on the overall
cancer risk assessment process.

    At present there  is no single  approach to
exposure  assessment that is appropriate for  all
cases. On  a case-by-case basis, appropriate methods
are selected to match the data on hand and the level
of sophistication  required. The assumptions,
approximations, and uncertainties need to be clearly
stated because, in some instances, these will have a
major effect on the risk assessment.
    In general, the magnitude,  duration, and
frequency  of exposure  provide  fundamental
information for estimating the concentration of the
carcinogen to which the organism is exposed. These
data are generated from monitoring information,
modeling  results, and/or reasoned estimates.  An
appropriate treatment of exposure should consider
                                                  -90-

-------
the potential for exposure via ingestion, inhalation,
and dermal penetration from relevant sources of
exposures including multiple avenues of intake from
the same source.

   Special  problems  arise  when the  human
exposure situation of concern suggests exposure
regimens, e.g.,  route and dosing schedule that are
substantially different from  those used  in  the
relevant animal studies. Unless there is evidence to
the contrary  in a particular case, the cumulative
dose received over a lifetime, expressed as average
daily exposure  prorated  over a  lifetime, is
recommended as an appropriate measure of
exposure to a carcinogen. That is, the assumption is
made that a high dose of a carcinogen received over a
short period of time is equivalent to a corresponding
low dose spread  over  a lifetime. This approach
becomes more problematical as the  exposures in
question become  more intense  but less frequent,
especially when there is evidence that the agent has
shown dose-rate effects.
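
    A minimal sketch of prorating exposure over a lifetime follows; every numeric
value is a hypothetical example rather than guidance, and the calculation simply
spreads the cumulative intake evenly over a nominal lifetime.

        # Lifetime average daily dose (illustrative values only).
        def lifetime_average_daily_dose(conc_mg_per_m3, intake_m3_per_day,
                                        days_per_year, years_exposed,
                                        body_weight_kg=70.0, lifetime_years=70.0):
            """Cumulative intake over the exposure period averaged over a full
            lifetime, expressed in mg/kg/day."""
            total_mg = conc_mg_per_m3 * intake_m3_per_day * days_per_year * years_exposed
            return total_mg / (body_weight_kg * lifetime_years * 365.0)

        # Example: 0.05 mg/m3 in air, 20 m3/day inhaled, 250 days/year for 10 years.
        print(lifetime_average_daily_dose(0.05, 20.0, 250, 10))   # about 1.4e-3 mg/kg/day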

   An attempt should be made to assess the level of
uncertainty associated  with the exposure
assessment which is to be used in a cancer risk
assessment. This  measure of uncertainty should be
included in the risk characterization (section III.C.)
in order to provide the decision-maker with a clear
understanding of the impact of this uncertainty on
any final quantitative risk estimate. Subpopulations
with heightened susceptibility (either because of
exposure or predisposition) should, when possible, be
identified.
C. Risk Characterization

    Risk characterization is  composed of two parts.
One is a presentation of the  numerical estimates of
risk; the other is a framework to help judge the
significance  of the  risk. Risk characterization
includes the exposure assessment and dose-response
assessment;  these are  used in  the estimation of
carcinogenic  risk. It  may also consist of a unit-risk
estimate which can be combined elsewhere with the
exposure assessment for the purposes of estimating
cancer risk.
    Hazard  identification and dose-response
assessment are covered in sections II. and III.A., and
a detailed discussion of exposure  assessment is
contained in EPA's Guidelines for  Estimating
Exposures (U.S. EPA, 1986). This section deals with
the numerical  risk estimates and the approach to
summarizing risk characterization.

    1. Options for Numerical Risk Estimates.
Depending on the needs of the individual program
offices, numerical estimates can be presented in one
or more of the following three ways.
    a. Unit Risk - Under an assumption of low-dose
 linearity, the unit cancer risk is the excess lifetime
 risk due to a continuous constant lifetime exposure
of one  unit  of carcinogen concentration.  Typical
exposure units include ppm or ppb in food or water,
mg/kg/day by ingestion, or ppm or µg/m3 in air.
   b. Dose Corresponding to a Given Level of Risk —
This approach can be useful, particularly when
using nonlinear extrapolation  models  where the
unit risk would differ at different dose levels.
   c. Individual and Population Risks — Risks may
be characterized either in terms of the excess
individual lifetime risks,  the excess number of
cancers
                 [51 FR 33999]
                               produced  per
year in the exposed population, or both.
      Irrespective of the options chosen, the degree
of precision and accuracy in the numerical risk
estimates currently does not permit more than one
significant figure to be presented. (A brief numerical
sketch of these presentation options follows below.)
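
    The three presentation options can be illustrated with one hypothetical set of
numbers; the slope factor, exposure assumptions (20 m3 of air inhaled per day by a
70-kg person), and population size below are illustrative assumptions, not Agency
values.

        # Hypothetical numbers: three ways of presenting the same risk estimate.
        slope_factor = 2.3e-2                      # per (mg/kg/day), illustrative upper bound

        # a. Unit risk: lifetime risk per ug/m3 of continuous exposure in air.
        unit_risk = slope_factor * 20.0 * 1e-3 / 70.0        # risk per ug/m3

        # b. Concentration corresponding to a given risk level, e.g., 1e-6.
        conc_at_risk = 1e-6 / unit_risk                      # ug/m3

        # c. Individual and population risk at an ambient level of 1.0 ug/m3.
        individual_risk = unit_risk * 1.0                    # excess lifetime risk
        cases_per_year = individual_risk * 1_000_000 / 70.0  # rough steady-state estimate
                                                             # for a population of one million

        # Report to one significant figure with the weight-of-evidence group,
        # e.g., something like "7e-06 [B2]".
        print(f"{individual_risk:.0e} [B2]")
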
   2. Concurrent Exposure. In characterizing the
risk due  to concurrent  exposure to  several
carcinogens, the risks are combined on the basis of
additivity  unless there is specific information to the
contrary. Interactions of cocarcinogens, promoters,
and initiators with known carcinogens should be
considered on a case-by-case basis.
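
    Under the default additivity assumption, the combined risk is simply the sum of
the individual risk estimates; the example risks below are hypothetical. For small
risks, simple addition is numerically close to the exact independent-probability
combination.

        # Default additivity for concurrent exposures (hypothetical risks).
        risks = [2e-5, 7e-6, 1e-6]            # excess lifetime risks for three carcinogens

        combined_additive = sum(risks)        # default: add risks absent contrary information

        exact = 1.0                           # exact combination assuming independence:
        for r in risks:                       # 1 - product(1 - r_i)
            exact *= (1.0 - r)
        exact = 1.0 - exact

        print(combined_additive, exact)       # 2.8e-05 vs. approximately 2.8e-05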

   3.  Summary of Risk  Characterization.
Whichever method of presentation is chosen, it is
critical that  the numerical estimates not be allowed
to stand  alone,  separated from the  various
assumptions and uncertainties upon which they are
based.  The risk characterization should contain  a
discussion and interpretation of the  numerical
estimates that affords the risk manager some
insight into  the degree  to which the quantitative
estimates  are likely to reflect the true magnitude of
human risk, which generally cannot be known with
the degree of quantitative accuracy reflected in the
numerical estimates. The final risk estimate will be
generally  rounded  to one significant figure and will
be coupled  with  the EPA classification of the
qualitative weight of evidence. For example, a
lifetime individual risk of 2 x 10^-4 resulting from
exposure to a "probable human carcinogen" (Group
B2) should be designated as 2 x 10^-4 [B2]. This
bracketed designation of the qualitative weight of
evidence should be included with all numerical risk
estimates (i.e., unit  risks, which  are risks at  a
specified concentration or  concentrations
corresponding to a given risk). Agency statements,
such as FEDERAL REGISTER notices, briefings,
and action memoranda, frequently include
numerical estimates of carcinogenic risk.  It is
recommended that whenever these  numerical
estimates are used, the  qualitative weight-of-
evidence classification should also be included.

    The section on risk characterization  should
summarize the hazard identification, dose-response
assessment, exposure assessment, and the public
health risk estimates. Major assumptions, scientific
judgments, and, to the extent possible, estimates of
                                               -91-

-------
 the uncertainties embodied in the assessment  are
 presented.

 IV. EPA Classification  System for Categorizing
 Weight of Evidence for Carcinogenicity from Human
 and Animal Studies (Adapted from IARC)
 A. Assessment  of Weight  of Evidence  for
 Carcinogenicity from Studies in Humans
    Evidence of carcinogenicity from human studies
 comes from three main sources:
    1. Case reports of individual cancer patients who
 were exposed to the agent(s).
    2. Descriptive epidemiologic studies in which  the
 incidence of cancer in human populations was found
 to  vary in space or time with exposure to  the
 agent(s).
    3. Analytical  epidemiologic  (case-control and
 cohort) studies in which individual exposure to  the
 agent(s)  was  found  to  be associated with  an
 increased risk of cancer.

    Three criteria must be met before a causal
 association can be inferred between exposure and
 cancer in humans:
    1. There is no identified bias  that could explain
 the association.
    2. The  possibility of confounding has  been
 considered and  ruled  out as explaining the
 association.
    3. The  association is  unlikely to  be due to
 chance.

    In general, although a single study may  be
 indicative of a cause-effect relationship, confidence
 in inferring a causal association  is increased when
 several independent  studies are concordant  in
 showing the association, when  the association is
 strong, when there is a dose-response relationship,
 or when a reduction in exposure is followed by a
 reduction in the incidence of cancer.
    The weight of evidence for carcinogenicity1 from
 studies in humans is classified as:
    1. Sufficient evidence of carcinogenicity, which
 indicates that there is a causal relationship between
 the agent and human cancer.
    2. Limited evidence of carcinogenicity, which
indicates that a causal interpretation is credible, but
 that alternative explanations, such as chance, bias,
or confounding, could not adequately be excluded.
     1 For purposes of public health protection, agents
 associated with life-threatening benign tumors in humans are
 included in the evaluation.
     2 An increased incidence of neoplasms that occur with high
 spontaneous background incidence (e.g., mouse liver tumors
 and rat pituitary tumors in certain strains) generally
 constitutes "sufficient" evidence of carcinogenicity, but may be
 changed to "limited" when warranted by the specific
 information available on the agent.
     3 Benign and malignant tumors will be combined unless
 the benign tumors are not considered to have the potential to
 progress to the associated malignancies of the same histogenic
 origin.
    3. Inadequate evidence, which indicates that one
 of two conditions  prevailed: (a)  there were few
 pertinent data, or  (b)  the available studies, while
 showing evidence of association, did not exclude
 chance, bias, or confounding, and therefore a causal
 interpretation is not credible.
    4. No data, which indicates that data are not
 available.
    5. No evidence, which indicates that no
 association was found between exposure  and an
 increased risk  of cancer in well-designed and  well-
 conducted independent analytical epidemiologic
 studies.

 B. Assessment of Weight of  Evidence  for
 Carcinogenicity from Studies in Experimental
 Animals

    These  assessments are  classified into five
 groups:
    1. Sufficient evidence2 of carcinogenicity, which
 indicates that  there is an increased incidence of
 malignant tumors or  combined  malignant  and
 benign tumors;3 (a) in multiple species or strains; or
 (b) in multiple experiments  (e.g., with different
 routes of administration or using different  dose
 levels); or (c)  to an unusual degree  in a single
 experiment with regard to high  incidence,  unusual
 site or type of tumor, or early age at onset.
    Additional evidence may be provided by data on
 dose-response effects, as well  as information from
 short-term tests or on chemical structure.
    2. Limited evidence of carcinogenicity, which
 means that the data suggest a carcinogenic effect
 but are limited because: (a) the studies involve a
 single species, strain, or experiment and do not meet
 criteria for sufficient evidence (see section IV. B. l.c);
 (b) the experiments are restricted by inadequate
 dosage levels, inadequate duration of exposure to the
 agent, inadequate period of follow-up, poor survival,
 too few animals, or inadequate reporting; or (c) an
 increase in the incidence of benign tumors only.
    3.  Inadequate evidence, which indicates  that
 because  of major  qualitative or  quantitative
 limitations, the studies cannot  be interpreted as
 showing either the  presence  or absence of  a
 carcinogenic effect.
    4.  No data, which indicates  that data  are  not
 available.
    5. No evidence,  which indicates that there is no
 increased incidence of neoplasms  in  at  least two
 well-designed
                 [51 FR 34000]
 and well-conducted animal studies in different species.

      The classifications "sufficient evidence" and
"limited evidence" refer only to  the weight of the
experimental  evidence that these agents are
carcinogenic  and  not  to the potency of their
carcinogenic action.
                                              -92-

-------
 C. Categorization of Overall Weight of Evidence for
 Human Carcinogenicity

    The overall  scheme for categorization of the
 weight of evidence of carcinogenicity of a chemical
 for humans uses a three-step process. (1) The weight
 of evidence in human studies or animal studies is
 summarized; (2) these lines of information are
 combined  to  yield a tentative  assignment  to  a
 category  (see  Table  1);  and  (3)  all  relevant
 supportive information is evaluated to see if the
 designation of the overall weight of evidence needs
 to be modified. Relevant factors to be included along
 with the tumor information from human and animal
 studies include  structure-activity relationships;
 short-term test  findings; results  of appropriate
 physiological,  biochemical,  and toxicological
 observations; and comparative metabolism and
 pharmacokinetic  studies.  The nature  of these
 findings  may cause one  to adjust the overall
 categorization of the weight of evidence.

    The agents are categorized into five groups as
 follows:

    Group A -- Human Carcinogen

    This group is used only when there is sufficient
 evidence from epidemiologic studies to support a
 causal association between  exposure  to the agents
 and cancer.

    Group B — Probable Human Carcinogen

    This group includes agents for which the weight
 of evidence of human  carcinogenicity based on
 epidemiologic studies is "limited" and also includes
 agents for which the weight of evidence of
 carcinogenicity based on animal studies is
 "sufficient." The group is divided into two
 subgroups. Usually, Group Bl is reserved for agents
 for which there is limited evidence of carcinogenicity
 from epidemiologic studies. It is reasonable, for
 practical purposes, to regard an agent for which
 there is "sufficient" evidence of carcinogenicity in
 animals as if it presented a carcinogenic  risk to
 humans.  Therefore, agents for which  there is
 "sufficient" evidence  from animal studies and for
 which there is "inadequate  evidence" or "no data"
 from  epidemiologic studies  would usually be
 categorized under Group B2.

    Group C — Possible Human Carcinogen

    This group is  used for agents with limited
 evidence  of carcinogenicity in animals  in the
 absence of human data. It includes a wide variety of
 evidence, e.g., (a) a malignant tumor response in a
 single well-conducted experiment that does not meet
 conditions for sufficient  evidence,  (b)  tumor
 responses  of  marginal  statistical  significance in
 studies  having inadequate design or  reporting, (c)
 benign but not malignant tumors with an agent
 showing no response in a variety of short-term tests
 for mutagenicity, and  (d) responses of marginal
 statistical significance in a tissue known to have a
 high or variable background rate.

    Group  D  -- Not Classifiable  as to  Human
 Carcinogenicity

    This group is generally used for agents with
 inadequate  human and  animal evidence  of
 carcinogenicity or for which no data are available.

    Group E — Evidence of Non-Carcinogenicity for
 Humans

    This group is used for agents that show no
 evidence for carcinogenicity in at least two adequate
 animal tests in different species or in both adequate
 epidemiologic and animal studies.

    The designation of an agent as being in Group E
 is based on the available evidence and should not be
 interpreted as a definitive conclusion that the agent
 will not be a carcinogen under any circumstances.
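
    The tentative assignments in Table 1 can be expressed as a simple lookup; the
mapping below reproduces only the illustrative entries of Table 1, and the final
categorization may still be adjusted by ancillary evidence as described above.

        # Tentative overall group from human and animal evidence (per Table 1;
        # illustrative only, and subject to adjustment by ancillary evidence).
        TABLE_1 = {
            ("sufficient", "sufficient"): "A",   ("sufficient", "limited"): "A",
            ("sufficient", "inadequate"): "A",   ("sufficient", "no data"): "A",
            ("sufficient", "no evidence"): "A",
            ("limited", "sufficient"): "B1",     ("limited", "limited"): "B1",
            ("limited", "inadequate"): "B1",     ("limited", "no data"): "B1",
            ("limited", "no evidence"): "B1",
            ("inadequate", "sufficient"): "B2",  ("inadequate", "limited"): "C",
            ("inadequate", "inadequate"): "D",   ("inadequate", "no data"): "D",
            ("inadequate", "no evidence"): "D",
            ("no data", "sufficient"): "B2",     ("no data", "limited"): "C",
            ("no data", "inadequate"): "D",      ("no data", "no data"): "D",
            ("no data", "no evidence"): "E",
            ("no evidence", "sufficient"): "B2", ("no evidence", "limited"): "C",
            ("no evidence", "inadequate"): "D",  ("no evidence", "no data"): "D",
            ("no evidence", "no evidence"): "E",
        }

        def tentative_group(human_evidence, animal_evidence):
            """Keys are (human evidence, animal evidence), lowercase."""
            return TABLE_1[(human_evidence.lower(), animal_evidence.lower())]

        print(tentative_group("inadequate", "sufficient"))   # B2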

 V. References

 Albert, R.E., Train, R.E., and Anderson, E. 1977. Rationale
    developed by the Environmental Protection Agency for the
    assessment of carcinogenic risks. J. Natl. Cancer Inst.
    58:1537-1541.
 Crump, K.S., Hoel, D.G., Langley, C.H., and Peto, R. 1976.
    Fundamental carcinogenic processes and their implications
    for low dose risk assessment. Cancer Res. 36:2973-2979.
 Dedrick, R.L. 1973. Animal Scale Up. J. Pharmacokinet.
    Biopharm. 1:435-461.
 Feron, V.J., Grice, H.C., Griesemer, R., Peto, R., Agthe, C., Althoff,
    J., Arnold, D.L., Blumenthal, H., Cabral, J.R.P., Della Porta,
    G., Ito, N., Kimmerle, G., Kroes, R., Mohr, U., Napalkov,
    N.P., Odashima, S., Page, N.P., Schramm, T., Steinhoff, D.,
    Sugar, J., Tomatis, L., Uehleke, H., and Vouk, V. 1980. Basic
    requirements for long-term assays for carcinogenicity. In:
    Long-term and short-term screening assays for carcinogens:
    a critical appraisal. IARC Monographs, Supplement 2. Lyon,
    France: International Agency for Research on Cancer, pp. 21-
    83.
 Freireich, E.J., Gehan, E.A., Rall, D.P., Schmidt, L.H., and
    Skipper, H.E. 1966. Quantitative comparison of toxicity of
    anticancer agents in mouse, rat, hamster, dog, monkey and
    man. Cancer Chemother. Rep. 50:219-244.
 Haseman, J.K. 1984. Statistical issues in the design, analysis and
    interpretation of animal carcinogenicity studies. Environ.
    Health Perspect. 58:385-392.
 Haseman, J.K., Huff, J., and Boorman, G.A. 1984. Use of
    historical control data in carcinogenicity studies in rodents.
    Toxicol. Pathol. 12:126-135.
 Interagency Regulatory Liaison Group (IRLG). 1979. Scientific
    basis for identification of potential  carcinogens and
    estimation of risks. J. Natl. Cancer Inst. 63:245-267.
 Interdisciplinary Panel on Carcinogenicity. 1984. Criteria for
    evidence of chemical carcinogenicity. Science 225:682-687.
 International Agency for Research on Cancer (IARC). 1982. IARC
    Monographs on the
                   [51 FR 34001]
                                 Evaluation  of the
Carcinogenic Risk of Chemicals to Humans, Supplement 4. Lyon,
France: International Agency for Research on Cancer.
 Mantel, N. 1980. Assessing laboratory evidence for neoplastic
    activity. Biometrics 36:381-399.
 Mantel, N., and Haenszel, W. 1959. Statistical aspects of the
    analysis of data from retrospective studies of disease. J. Natl.
    Cancer Inst. 22:719-748.
 National Center for Toxicological Research (NCTR). 1981.
    Guidelines for statistical tests for carcinogenicity in chronic
    bioassays. NCTR Biometry  Technical Report 81-001.
    Available from: National Center for Toxicological Research.
                                                  -93-

-------
  TABLE 1.--ILLUSTRATIVE CATEGORIZATION OF EVIDENCE BASED ON ANIMAL AND HUMAN DATA1

                                          Animal evidence
  Human evidence      Sufficient    Limited    Inadequate    No data    No evidence
  Sufficient              A            A           A            A           A
  Limited                 B1           B1          B1           B1          B1
  Inadequate              B2           C           D            D           D
  No data                 B2           C           D            D           E
  No evidence             B2           C           D            D           E
      1 The above assignments are presented for illustrative purposes. There may be nuances in the classification of both
  animal and human data indicating that different categorizations than those given in the table should be assigned.
  Furthermore, these assignments are tentative and may be modified by ancillary evidence. In this regard all relevant
  information should be evaluated to determine if the designation of the overall weight of evidence needs to be modified.
  Relevant factors to be included along with the tumor data from human and animal studies include structure-activity
  relationships, short-term test findings, results of appropriate physiological, biochemical, and toxicological observations, and
  comparative metabolism and pharmacokinetic studies. The nature of these findings may cause an adjustment of the overall
  categorization of the weight of evidence.
National Research Council (NRC). 1983. Risk assessment in the
    Federal government: managing the process. Washington,
    D.C.: National Academy Press.
National Toxicology Program. 1984. Report of the Ad Hoc Panel
    on Chemical Carcinogenesis Testing and Evaluation of the
    National Toxicology Program,  Board of Scientific
    Counselors. Available from: U.S. Government Printing
    Office, Washington, D.C. 1984-421 -132:4726.
Nutrition Foundation. 1983. The relevance of mouse liver
    hepatoma to human carcinogenic risk: a report  of  the
    International Expert Advisory Committee to the Nutrition
    Foundation. Available from: Nutrition Foundation. ISBN 0-
    935368-37-x.
Office of Science and Technology Policy (OSTP). 1985. Chemical
    carcinogens: review of the science and its  associated
    principles. Federal Register 50:10372-10442.
Peto, R., Pike, M., Day, N., Gray, R., Lee, P., Parish, S., Peto, J.,
    Richard, S., and Wahrendorf, J. 1980. Guidelines for simple,
    sensitive, significant tests for carcinogenic effects in long-
    term animal experiments. In: Monographs on the long-term
    and short-term screening assays for carcinogens: a critical
    appraisal. IARC Monographs, Supplement 2. Lyon, France:
    International Agency for Research on Cancer, pp. 311-426.
Pinkel, D. 1958. The use of body surface area as a criterion of drug
    dosage in cancer chemotherapy. Cancer Res. 18:853-856.
Tomatis, L. 1977. The value of long-term testing for the
    implementation of primary prevention. In: Origins of human
    cancer. Hiatt, H.H.,  Watson, J.D., and Winstein, J.A., eds.
    Cold Spring Harbor Laboratory, pp. 1339-1357.
U.S. Environmental Protection Agency (U.S. EPA). 1976. Interim
    procedures and guidelines for health risk and economic
    impact assessments of suspected carcinogens. Federal
    Register 41:21402-21405.
U.S. Environmental Protection Agency (U.S. EPA). 1980. Water
    quality criteria documents; availability. Federal Register
    45:79318-79379.
U.S. Environmental Protection Agency (U.S. EPA). 1983a. Good
    laboratory practices  standards - toxicology testing. Federal
    Register 48:53922.
U.S. Environmental Protection Agency (U.S. EPA). 1983b.
    Hazard evaluations: humans and domestic animals.
    Subdivision F. Available from: NTIS, Springfield, VA. PB 83-
    153916.
U.S. Environmental Protection Agency (U.S. EPA). 1983c. Health
    effects test guidelines. Available from: NTIS, Springfield,
    VA. PB 83-232984.
U.S. Environmental Protection Agency (U.S. EPA). 1986, Sept.
    24. Guidelines for estimating exposures. Federal Register
    51(185):34042-34054.
U.S. Food and Drug Administration (U.S. FDA). 1982.
    Toxicological principles for the safety assessment of direct
    food additives and color additives used in food. Available
    from: Bureau of Foods, U.S. Food and Drug Administration.
Ward, J.M., Griesemer, R.A., and Weisburger, E.K. 1979a. The
    mouse liver tumor as an endpoint in carcinogenesis tests.
    Toxicol. Appl. Pharmacol. 51:389-397.
Ward, J.M., Goodman, D.G., Squire, R.A., Chu, K.C., and Linhart,
    M.S. 1979b. Neoplastic and nonneoplastic lesions in aging
    mice. J. Natl. Cancer Inst. 63:849-854.

Part  B:  Response  to  Public and Science
Advisory Board Comments

I. Introduction
    This section summarizes the major issues raised
during both the public comment period  on the
Proposed Guidelines for  Carcinogen  Risk
Assessment published on November 23, 1984 (49 FR
46294),  and also during the April  22-23, 1985,
meeting of  the Carcinogen  Risk  Assessment
Guidelines Panel of the Science  Advisory Board
(SAB).
    In order  to respond to  these issues the  Agency
modified the proposed guidelines in  two  stages.
First,  changes resulting from consideration of the
public comments were made in a  draft sent to the
SAB review panel prior to  their April meeting.
Secondly, the guidelines were further  modified  in
response to the panel's recommendations.

    The Agency received 62 sets of comments during
the  public comment period, including 28 from
corporations,  9  from professional  or  trade
associations, and 4 from academic institutions.  In
general, the  comments  were  favorable.  The
commentors welcomed the update  of the 1976
guidelines and felt  that the proposed  guidelines of
                                                     -94-

-------
1985 reflected some of the progress that has occurred
in understanding the mechanisms of carcinogenesis.
Many commentors, however, felt that additional
changes were warranted.

   The SAB concluded that the guidelines are
"reasonably complete in their conceptual framework
and are sound in their overall interpretation of the
scientific  issues"   (Report  by   the  SAB
Carcinogenicity Guidelines Review Group, June 19,
1985). The SAB suggested various editorial changes
and raised some issues regarding the content of the
proposed guidelines, which are discussed below.
Based on these recommendations, the Agency has
modified the draft guidelines.

II. Office of Science and Technology Policy Report on
Chemical Carcinogens

   Many commentors requested  that the  final
guidelines not be issued until after publication of the
report of the Office of Science and Technology Policy
(OSTP) on chemical carcinogens.  They further
requested that this report be incorporated into the
final Guidelines for Carcinogen Risk Assessment.

   The final OSTP report was published in 1985 (50
FR 10372). In its deliberations, the Agency reviewed
the final OSTP report and feels that the  Agency's
guidelines are consistent with the  principles
established by the OSTP. In its review,  the SAB
agreed that the Agency guidelines are generally
consistent with the OSTP report. To emphasize this
consistency, the  OSTP principles have  been
incorporated into the guidelines when controversial
issues are discussed.

III. Inference Guidelines

   Many commentors felt that the proposed
guidelines did  not  provide a sufficient  distinction
between scientific fact and policy decisions. Others
felt that EPA should not attempt to propose firm
guidelines in the absence of scientific consensus. The
SAB report also indicated the need to "distinguish
recommendations based on scientific evidence from
those based on science policy decisions."

   The Agency agrees with the recommendation
that policy, judgmental, or inferential decisions
should be clearly identified. In its  revision of the
proposed  guidelines, the  Agency has included
phrases (e.g., "the Agency takes the position that")
to more clearly distinguish policy decisions.

   The Agency also recognizes the need  to establish
procedures for action on important issues in the
absence of complete scientific  knowledge or
consensus. This need was acknowledged in both the
National Academy of Sciences book entitled Risk
Assessment in the Federal Government: Managing
the  Process and  the OSTP report on chemical
carcinogens. As the NAS report states, "Risk
assessment is  an analytic  process  that is firmly
based on scientific considerations, but it also
requires judgments to be made when the available
information is incomplete.  These  judgments
inevitably draw on both  scientific  and policy
considerations."
                 [51 FR 34002]
    The judgments of the Agency have been based on
current available scientific information and on the
combined experience of Agency experts. These
judgments, and the  resulting guidance,  rely on
inference; however, the positions taken  in  these
inference guidelines are felt to be reasonable and
scientifically defensible. While all of the guidance is,
to some degree, based on inference,  the guidelines
have  attempted to distinguish those  issues that
depended more on judgment. In these cases, the
Agency has stated a position but has also retained
flexibility to accommodate new data or specific
circumstances that demonstrate that the proposed
position is inaccurate. The Agency recognizes that
scientific opinion will be divided on these issues.

    Knowledge   about   carcinogens   and
carcinogenesis is progressing at a rapid rate. While
these  guidelines are considered a best effort at the
present  time, the  Agency has attempted to
incorporate flexibility into the current guidelines
and also recommends that the guidelines be revised
as often as warranted by advances in the field.

IV. Evaluation of Benign Tumors

    Several commentors discussed the  appropriate
interpretation of an increased incidence of benign
tumors alone or with an increased incidence of
malignant tumors as part of the evaluation of the
carcinogenicity of an agent. Some comments were
supportive of the position in the proposed guidelines,
i.e., under certain circumstances, the incidence of
benign and malignant tumors would be combined,
and an increased incidence of benign tumors  alone
would be considered an indication, albeit limited, of
carcinogenic potential. Other commentors raised
concerns about the criteria that would be  used to
decide which tumors should be combined. Only a few
commentors felt that benign tumors should never be
considered in evaluating carcinogenic potential.

    The Agency believes that current information
supports  the use of benign tumors. The guidelines
have  been modified to incorporate the  language of
the OSTP report,  i.e., benign tumors  will be
combined  with  malignant  tumors  when
scientifically defensible. This  position allows
flexibility in evaluating the data base for each
agent. The guidelines have also  been  modified to
indicate  that, whenever benign and malignant
tumors have been  combined, and the agent is
considered  a candidate for  quantitative  risk
extrapolation, the contribution  of benign tumors to
the estimation of risk will be indicated.

V.  Transplacental and Multigenerational  Animal
Bioassays
                                              -95-

-------
    As one of its two proposals for additions to the
guidelines, the SAB recommended a discussion of
transplacental and multigenerational animal
bioassays for carcinogenicity.

    The  Agency  agrees that such  data, when
available, can provide useful information  in the
evaluation of a chemical's potential carcinogenicity
and has stated this in  the  final guidelines. The
Agency has also revised the guidelines to indicate
that  such  studies  may provide  additional
information on the metabolic and pharmacokinetic
properties of the chemical. More guidance on the
specific use of these studies will be considered in
future revisions of these guidelines.

VI. Maximum Tolerated Dose

    The  proposed guidelines discussed  the
implications of using a  maximum tolerated dose
(MTD) in bioassays  for carcinogenicity.  Many
commentors requested that EPA define MTD. The
tone  of the comments suggested that  the
commentors were concerned about the uses and
interpretations of high-dose testing.

    The  Agency  recognizes that controversy
currently surrounds these issues. The appropriate
text from the OSTP report has been  incorporated
into the final guidelines; it suggests that the
consequences of high-dose testing be evaluated on a
case-by-case basis.
VII. Mouse Liver Tumors

    A large number  of commentors expressed
opinions about the assessment of bioassays in which
the only increase  in  tumor  incidence was liver
tumors in the mouse.  Many felt  that mouse liver
tumors were afforded too much credence, especially
given existing information that indicates that they
might arise by a different mechanism, e.g., tissue
damage followed by regeneration. Others felt that
mouse liver tumors were  but one case of a high
background incidence of one particular type of
tumor and that all such tumors should be treated in
the same fashion.

    The Agency has reviewed these comments and
the OSTP principle regarding this issue. The OSTP
report does not reach conclusions as to the treatment
of tumors with a high spontaneous background rate,
but states, as is now  included in the text  of the
guidelines, that  these data  require  special
consideration. Although questions have been raised
regarding the validity of mouse liver tumors in
general, the Agency feels that mouse liver tumors
cannot be ignored as an indicator of carcinogenicity.
Thus, the position in the proposed guidelines has not
been changed: an increased incidence of only mouse
liver tumors will be regarded as  "sufficient"
evidence of carcinogenicity if all other criteria, e.g.,
replication and malignancy, are met, with the
understanding that this classification could  be
changed to "limited" if warranted. The factors that
may cause this re-evaluation are indicated in the
guidelines.

VIII. Weight-of-Evidence Categories

    The Agency was praised by both the public and
the SAB for  incorporating a weight-of-evidence
scheme into  its evaluation of carcinogenic  risk.
Certain specific aspects of the scheme, however,
were criticized.

    1. Several commentors noted that while the text
of the proposed guidelines  clearly states that EPA
will use all available data in its categorization of the
weight of the evidence  that a  chemical  is  a
carcinogen, the classification system in Part A,
section IV did not indicate the manner in which EPA
will use information other  than data from humans
and long-term animal studies in assigning a weight-
of-evidence classification.
    The Agency has added a discussion to Part A,
section IV.C.  dealing with  the characterization of
overall evidence for human carcinogenicity.  This
discussion clarifies EPA's  use of  supportive
information to adjust, as warranted, the designation
that would have been made solely on the basis of
human and long-term animal studies.

    2. The Agency agrees with the SAB and those
commentors who felt that a simple classification of
the weight of evidence, e.g., a single letter or even a
descriptive title, is inadequate to describe fully the
weight of evidence for each individual chemical. The
final guidelines  propose that  a  paragraph
summarizing the  data should accompany the
numerical estimate and weight-of-evidence
classification whenever possible.

    3.  Several  commentors objected to  the
descriptive title E (No Evidence of Carcinogenicity
for Humans)  because they  felt the title would be
confusing to people inexperienced with the
classification system. The title for  Group E, No
Evidence of  Carcinogenicity for Humans,  was
thought by these commentors to suggest the absence
of data. This group, however, is  intended  to be
 reserved for agents for which there exist credible
data  demonstrating that  the  agent is  not
carcinogenic.
    Based on  these comments  and further
discussion, the Agency has changed the
                 [51 FR 34003]
                               title of Group E
to "Evidence of Non-Carcinogenicity for Humans."

    4. Several commentors felt that the title for
Group C, Possible Human Carcinogen, was not
sufficiently distinctive from Group B, Probable
Human Carcinogen.  Other commentors felt that
those agents that minimally qualified for Group  C
would lack sufficient data for such a label.
    The Agency recognizes that Group C covers a
range of chemicals and has considered whether to
subdivide Group C. The consensus  of the Agency's
                                             -96-

-------
Carcinogen Risk Assessment Committee, however,
is that the current groups, which are based on the
IARC categories, are a reasonable stratification and
should be retained at present. The structure of the
groups will be reconsidered when the guidelines are
reviewed in  the future. The Agency also feels that
the descriptive title it originally selected best
conveys the meaning of the classification within the
context of EPA's past and current activities.

    5. Some  commentors indicated a concern about
the distinction between Bl and B2 on the basis of
epidemiologic evidence only. This issue has been
under discussion in the Agency and may be revised
in future versions of the guidelines.

    6. Comments were also received about the
possibility of keeping the groups for animal and
human data separate without reaching a combined
classification. The Agency feels  that a combined
classification is  useful; thus, the  combined
classification was retained in the final guidelines.

    The SAB suggested that a table be added to Part
A, section IV  to indicate  the  manner in which
human  and  animal  data  would  be combined  to
obtain an overall weight-of-evidence category. The
Agency realizes that a table that would present all
permutations of potentially available data would be
complex and possibly impossible to construct since
numerous combinations  of ancillary data  (e.g.,
genetic toxicity, pharmacokinetics) could be used to
raise or lower the weight-of-evidence classification.
Nevertheless, the Agency decided to include a table
to illustrate the most probable  weight-of-evidence
classification that would be assigned on the basis of
standard animal   and human data  without
consideration of the ancillary data. While it is hoped
that this table will clarify  the weight-of-evidence
classifications, it is also important to recognize that
an agent may be assigned to a final categorization
different from the category which would appear
appropriate from the table and still conform to the
guidelines.

IX. Quantitative Estimates of Risk

    The  method for  quantitative estimates of
carcinogenic risk in the proposed guidelines received
substantial comments from the  public. Five issues
were discussed by the Agency and have resulted in
modifications of the guidelines.

    1. The major criticism was the perception that
EPA  would  use  only one  method  for the
extrapolation  of carcinogenic risk and would,
therefore, obtain  one estimate of risk.  Even
commentors  who concur with the procedure usually
followed by EPA felt that some indication of the
uncertainty of the risk estimate should be included
with the risk estimate.
    The Agency feels  that the proposed guidelines
were  not intended  to suggest that EPA would
perform quantitative risk estimates in a rote or
 mechanical fashion. As indicated by the  OSTP
 report and paraphrased in the proposed guidelines,
 no single mathematical procedure has been
 determined to be the most appropriate method for
 risk extrapolation. The final guidelines quote rather
 than paraphrase the OSTP principle. The guidelines
 have been revised to stress the importance of
 considering all available data in the risk assessment
 and now  state, "The  Agency will review each
 assessment as to the  evidence on carcinogenic
 mechanisms and other biological or statistical
 evidence that indicates the suitability of a particular
 extrapolation model." Two issues are  emphasized:
 First,  the text  now indicates the potential for
 pharmacokinetic information to  contribute to the
 assessment of carcinogenic risk.  Second,  the final
 guidelines  state that  time-to-tumor risk
 extrapolation  models  may  be  used when
 longitudinal data  on  tumor development are
 available.

    2.  A number of commentors noted  that  the
 proposed  guidelines did not indicate  how the
 uncertainties of risk characterization would be
 presented. The Agency has revised the  proposed
 guidelines to indicate that major assumptions,
 scientific judgments, and, to the extent  possible,
 estimates  of the  uncertainties embodied in the risk
 assessment will be presented  along with the
 estimation of risk.

    3.  The proposed guidelines  stated  that the
 appropriateness of quantifying risks for chemicals in
 Group  C (Possible Human Carcinogen), specifically
 those agents that were on the boundary of Groups C
 and  D   (Not  Classifiable   as  to  Human
 Carcinogenicity), would be judged on a case-by-case
 basis. Some commentors felt that quantitative risk
 assessment should not be performed on any agent in
 Group C.
    Group C includes a wide range of agents,
 including  some for which there are positive results
 in one species in one  good bioassay. Thus, the
 Agency feels that many agents in Group C will be
 suitable for quantitative risk  assessment, but that
judgments in this regard will be made on a case-by-
 case basis.

    4. A few commentors felt that EPA intended to
 perform quantitative risk estimates on aggregate
 tumor incidence. While  EPA will consider an
 increase in total aggregate tumors as suggestive of
 potential carcinogenicity, EPA does not generally
 intend to make quantitative  estimates of
 carcinogenic risk based on total  aggregate tumor
 incidence.

    5. The proposed choice of body surface area as an
 interspecies scaling factor was criticized by several
 commentors who felt that body weight was also
 appropriate and that both methods should be used.
 The OSTP report recognizes that both scaling factors
 are in common use. The Agency feels that the choice
 of the  body surface area scaling factor can be
                                               -97-

-------
  justified from the data on effects of drugs in various
  species. Thus, EPA will continue to use this scaling
  factor unless data on a specific agent suggest that a
  different scaling factor is justified. The uncertainty
  engendered by the choice of scaling factor will be
  included in the summary of uncertainties associated
  with the assessment of risk  mentioned  in point 1,
  above.

     In the second of its two proposals for additions to
  the proposed guidelines, the SAB suggested that a
  sensitivity  analysis  be  included  in  EPA's
  quantitative estimate  of a chemical's carcinogenic
  potency. The Agency agrees that an analysis of the
  assumptions and  uncertainties  inherent in an
  assessment of carcinogenic risk  must be  accurately
  portrayed. Sections of  the final guidelines that deal
  with this issue have been strengthened to reflect the
  concerns of the SAB and the Agency. In particular,
  the last paragraph of the guidelines states  that
  "major assumptions, scientific judgments,  and, to
  the extent possible, estimates of the uncertainties
  embodied in the assessment" should be presented in
  the summary characterizing the risk.  Since  the
  assumptions and uncertainties  will vary for each
  assessment, the  Agency  feels that  a  formal
  requirement for a particular type of sensitivity
  analysis would be less useful than a case-by-case
  evaluation of the  particular  assumptions  and
  uncertainties most significant for a particular risk
  assessment.
                                               -98-

-------