
EPA
United States Environmental Protection Agency
Office of Research and Development
Washington, DC 20460
EPA/630/R-98/004
January 1999
Report of the Workshop on
Selecting Input
Distributions for
Probabilistic Assessments
RISK ASSESSMENT FORUM


EPA/630/R-98/004
January 1999
Report of the Workshop on
Selecting Input Distributions
For Probabilistic Assessments

U.S. Environmental Protection Agency
New York, NY
April 21-22, 1998
Risk Assessment Forum
U.S. Environmental Protection Agency
Washington, DC 20460
Printed on Recycled Paper

NOTICE
This document has been reviewed in accordance with U.S. Environmental Protection Agency
(EPA) policy and approved for publication.  Mention of trade names or commercial products does not
constitute endorsement or recommendation for use.

This report was prepared by Eastern Research Group, Inc. (ERG), an EPA contractor  (Contract
No. 68-DS-0028, Work Assignment No. 98-06) as a general record of discussions during the Workshop
on Selecting Input Distributions for Probabilistic Assessments. As requested by EPA, this report
captures the main points and highlights of discussions held during plenary sessions. The report is not a
complete record of all details discussed nor  does it embellish, interpret, or enlarge upon matters that were
incomplete or unclear. Statements represent the individual views of each workshop participant; none of
the statements represent analyses by or positions of the Risk Assessment Forum or the EPA.

CONTENTS
Page

SECTION ONE        INTRODUCTION		 1-1

1.1     Background and Purpose	 1-1
1.2     Workshop Organization	 1-2

SECTION TWO        CHAIRPERSON'S SUMMARY 		2-1

2.1     Representativeness	 2-1
2.2     Sensitivity Analysis	2-2
2.3     Making Adjustments to Improve Representation	 2-4
2.4     Empirical and Parametric Distribution Functions	 2-6
2.5     Goodness-of-Fit	 2-8

SECTION THREE      OPENING REMARKS	 3-1

3.1     Welcome and Regional Perspective	 3-1
3.2     Overview and Background	 3-1
3.3     Workshop Structure and Objectives  	 3-2

SECTION FOUR       ISSUE PAPER PRESENTATIONS	 4-1

4.1     Issue Paper on Evaluating Representativeness of Exposure Factors Data  	 4-1
4.2     Issue Paper on Empirical Distribution Functions and Non-Parametric Simulation	 4-2

SECTION FIVE        EVALUATING REPRESENTATIVENESS OF
EXPOSURE FACTORS DATA	 5-1

5.1     Problem Definition	 5-1

5.1.1     What information is required to specify a problem definition fully? .... 5-1
5.1.2     What constitutes representativeness (or lack thereof)?
What is "acceptable deviation"?	 5-3
5.1.3     What considerations should be included in, added to, or
excluded from the checklists?	 5-6

5.2     Sensitivity	 5-8
5.4     Summary of Expert Input on Evaluating Representativeness	  5-14

CONTENTS (Continued)

SECTION SIX        EMPIRICAL DISTRIBUTION FUNCTIONS AND
RESAMPLING VERSUS PARAMETRIC DISTRIBUTIONS	 6-1

6.1     Selecting an EDF or PDF	 6-1
6.2     Goodness-of-Fit (GoF)	 6-5
6.3     Summary of EDF/PDF and GoF Discussions	 6-7

SECTION EIGHT    REFERENCES	 8-1

APPENDICES

APPENDIX A  Issue Papers	A-1

APPENDIX B  List of Experts and Observers	B-1

APPENDIX C  Agenda	C-1

APPENDIX D  Workshop Charge	D-1

APPENDIX E  Breakout Session Notes	E-1

APPENDIX H  Presentation Materials	H-1

SECTION ONE

INTRODUCTION
1.1     BACKGROUND AND PURPOSE

The U.S. Environmental Protection Agency (EPA) has long emphasized the importance of
adequately characterizing uncertainty and variability in its risk assessments, and it continuously studies
various quantitative techniques for better characterizing uncertainty and variability. Historically, Agency
risk assessments have been deterministic (i.e., based on a point estimate), and uncertainty analyses have
been largely qualitative. In May 1997, the Agency issued a policy on the use of probabilistic techniques
in characterizing uncertainty and variability. This policy recognizes that probabilistic analysis tools such as
Monte Carlo analysis are acceptable provided that risk assessors present adequate supporting data and
credible assumptions. The policy also identifies several implementation activities that are designed to
help Agency assessors review and prepare probabilistic assessments.
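The kind of Monte Carlo analysis the policy refers to can be sketched briefly. The exposure equation and every input distribution below are illustrative placeholders chosen for the sketch, not values from the policy or from EPA guidance:

```python
import random

def simulate_exposure(n_trials=10_000, seed=42):
    """Toy Monte Carlo exposure model: dose = concentration * intake / body_weight.

    All input distributions below are hypothetical placeholders.
    """
    rng = random.Random(seed)
    doses = []
    for _ in range(n_trials):
        concentration = rng.lognormvariate(0.0, 0.5)   # mg/L, hypothetical
        intake = rng.lognormvariate(0.3, 0.4)          # L/day, hypothetical
        body_weight = rng.normalvariate(70.0, 10.0)    # kg, hypothetical
        doses.append(concentration * intake / body_weight)
    doses.sort()
    # Report a central estimate and an upper percentile, not a single point value.
    return {"median": doses[n_trials // 2], "p95": doses[int(0.95 * n_trials)]}
```

The point of such a simulation is that the output is a distribution of doses, from which both central and upper-percentile estimates can be reported, rather than the single point estimate of a deterministic assessment.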

To this end, EPA's Risk Assessment Forum (RAF) is developing a framework for selecting input
distributions for probabilistic assessment. This framework emphasizes parametric distributions,
estimations of the parameters of candidate distributions, and evaluations  of the candidate distributions'
quality of fit. A technical panel, convened under the auspices of the RAF, began work on the framework
in the summer of 1997. In September 1997, EPA sought input on the framework from 12 experts from
outside the Agency.  The group's recommendations included:

•      Expanding the framework's discussion of exploratory data analysis and graphical
methods for assessing the quality of fit.

•      Discussing distinctions between variability and uncertainty and their implications.

•      Discussing empirical distributions and bootstrapping.

•      Discussing correlation and its implications.

•      Making the framework available to the risk assessment community as soon as possible.

In response to this input, EPA initiated a pilot program in which the Research Triangle Institute
(RTI) applied the framework for fitting distributions to data from EPA's Exposure Factors Handbook
(EFH) (US EPA, 1996a). RTI used three exposure factors—drinking water intake, inhalation rate, and
residence time—as test cases. Issues highlighted as part of this effort fall into two broad categories: (1)
issues associated with the representativeness of the data, and (2) issues associated with using the
Empirical Distribution Function (EDF) (or resampling techniques) versus using a theoretical Parametric
Distribution Function (PDF).
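A minimal sketch of the parametric side of this comparison, fitting a lognormal PDF to a sample by maximum likelihood, might look like the following; the sample data and the choice of a lognormal are hypothetical, not taken from the EFH pilot:

```python
import math
import statistics

def fit_lognormal(data):
    """Fit a two-parameter lognormal by maximum likelihood.

    For lognormal data, the MLEs of the log-scale parameters are the mean and
    (population) standard deviation of the log-transformed sample.
    """
    logs = [math.log(x) for x in data]
    return statistics.fmean(logs), statistics.pstdev(logs)

def lognormal_percentile(mu, sigma, p):
    """Invert the fitted lognormal CDF at probability p via the normal quantile."""
    z = statistics.NormalDist().inv_cdf(p)
    return math.exp(mu + sigma * z)
```

A goodness-of-fit check (visual or statistical) would normally follow the fit before the distribution is used in an assessment.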

In April 1998, the RAF organized a 2-day workshop, "Selecting Input Distributions for
Probabilistic Assessments," to solicit expert input on these and related issues. Specific workshop goals
included:


•      Discussing issues associated with the selection of probability distributions.

•      Obtaining expert input on measurements, extrapolations, and adjustments.

•      Discussing qualitatively how to make quantitative adjustments.
EPA developed two issue papers to serve as a focal point for discussions:  "Evaluating
Representativeness of Exposure Factors Data" and "Empirical Distribution Functions and Non-
parametric Simulation."  These papers, which were developed strictly to prompt discussions during the
workshop, are found in Appendix A. Discussions during the 2-day workshop focused on technical issues,
not policy. The experts discussed issues that would apply to any exposure data.

This workshop report is intended to serve as an information piece for Agency assessors who
prepare or review assessments based on the use of probabilistic techniques and who work with various
exposure data. This report does not represent Agency guidance. It simply attempts to capture the
technical rigor of the workshop discussions and will be used to support further development and
application of probabilistic analysis techniques/approaches.
1.2    WORKSHOP ORGANIZATION

The workshop was held on April 21 and 22, 1998, at the EPA Region 2 offices in New York
City.  The 21  participants, experts in exposure and risk assessment, included biologists, chemists,
engineers, mathematicians, physicists, statisticians, and toxicologists, and represented industry,
academia, state agencies, EPA, and other federal agencies. A limited number of observers also attended
the workshop. The experts and observers are listed in Appendix B.

The workshop agenda is in Appendix C. Mr. McCabe (EPA Region 2), Steven Knott of the
RAF, and Dr. H. Christopher Frey, workshop facilitator, provided opening remarks. Before discussions
began, Ms. Jacqueline Moya and Dr. Timothy Barry of EPA summarized the two issue papers.

During the 2-day workshop, the technical experts exchanged ideas in plenary and four small
group breakout sessions. Discussions centered on the two issue papers distributed for review and
comment before the workshop. Detailed discussions focused primarily on the questions in the charge
(Appendix D). "Brainwriting" sessions were held within the smaller groups. Brainwriting, an  interactive
technique, enabled the experts to document their thoughts on a topic and build on each other's ideas.
Each  small group captured the essence of these sessions and presented the main ideas to the entire group
during plenary sessions. A compilation of notes from the breakout sessions is included in Appendix E.
Following expert input, observers were allowed to address the panel with questions or comments. In
addition to providing input at the workshop, several experts provided pre- and postmeeting comments,
which are in Appendices F and G, respectively.

Section Two of this report contains the chairperson's summary of the workshop. Section Three
highlights workshop opening remarks.  Section Four summarizes Agency presentations of the two issue
papers. Sections Five and Six describe  expert input on the two main topic areas—representativeness and
EDF/PDF issues. Speakers' presentation materials (overheads and supporting papers) are included in
Appendix H.

"                            1-2
image:

countPages += 1
if (countPages == 2) {
var el = document.getElementById("rankTop")
if (el)
if (-1 == -1)
el.innerText = "All"
var el = document.getElementById("rankBot")
if (el)
if (-1 == -1)
el.innerText = "All"
}

SECTION TWO

CHAIRPERSON'S SUMMARY
Prepared by: H. Christopher Frey, Ph.D.
The workshop consisted of five major sessions, three of which were devoted to the issue of
representativeness and two to issues regarding parametric versus empirical distributions and goodness-of-
fit. Each session began with a trigger question. For the three sessions on representativeness, there was
discussion in a plenary setting, as well as discussions within four breakout groups. For the two sessions
regarding selection of parametric versus empirical distributions and the use of goodness-of-fit tests, the
discussions were conducted in plenary sessions.
2.1    REPRESENTATIVENESS

The first session covered three main questions, based on the portion of the workshop charge
(Appendix D) requesting feedback on the representativeness issue paper.  After some general discussion,
the following three trigger questions were formulated and posed to the group:

1.      What information is required to fully specify a problem definition?

2.      What constitutes (lack of) representativeness?

3.      What considerations should be included in, added to, or excluded from the checklists
given in the issue paper on representativeness (Appendix A)?

The group was then divided into four breakout groups, each of which addressed all three of these
questions. Each group was asked to use an approach known as "brainwriting." Brainwriting is intended
to be a silent activity in which each member of a group at any given time puts thoughts down on paper in
response to a trigger question. After completing an idea, a group member exchanges papers with another
group member. Typically, upon reading what others have written, new ideas are generated and written
down.  Thus, each person has a chance to read and respond to what others have written. The advantages
of brainwriting are that all participants can generate  ideas simultaneously, there is less of a problem with
domination of the discussion by just a few people, and a written record is produced as part of the process.
A disadvantage is that there is less "interaction" with the entire group. After the brainwriting activity
was completed, a representative of each group reported the main ideas to the entire group.

The experts generally agreed that before addressing the issue of representativeness, it is
necessary to have a clear problem definition.  Therefore, there was considerable discussion of what
factors must be considered to ensure a complete problem definition. The most general  requirement for a
good problem definition, to which the group gave general assent, is to specify the "who, what, when,
where, why, and how." The "who"  addresses the population of interest.  "Where" addresses the spatial
characteristics of the assessment. "When" addresses the temporal characteristics of the assessment.
"What" relates to the specific chemicals and health effects of concern. "Why" and "how" may help
clarify the previous matters. For example, it is helpful to know that exposures occur because of a
particular behavior (e.g., fish consumption) when attempting to define an exposed population and  the
spatial and temporal extent of the problem. Knowledge of "why" and "how" is also useful later for


proposing mitigation or prevention strategies.  The group in general agreed upon these principles for a
problem definition, as well as the more specific suggestions detailed in Section 5.1.1 of this workshop
report.

In regard to the second trigger question, the group generally agreed that "representativeness" is
context-specific. Furthermore, there was a general trend toward finding other terminology instead of
using the term "representativeness." In particular, many in the group concurred that an objective in an
assessment is to make sure that it is "useful and informative" or "adequate" for the purpose at hand. The
adequacy of an assessment may be evaluated with respect to considerations such as "allowable error" as
well as practical matters such as the ability to make measurements that are reasonably free of major
errors or to reasonably interpret information from other sources that are used as an input to an
assessment. Adequacy may be quantified, in principle, in terms of the precision and accuracy of model
inputs and model outputs. There was some discussion of how the distinction between variability and
uncertainty relates to assessment of adequacy. For example, one may wish to have accurate predictions
of exposures for more than one percentile of the population, reflecting variability. For any given
percentile of the population, however, there may be uncertainty in the predictions of exposures.  Some
individuals pointed out that, because often it is not possible to fully validate many exposure predictions
or to obtain input information that is free of error or uncertainty, there is an inherently subjective element
in assessing adequacy. The stringency of the requirement for adequacy will depend on the purpose of the
assessment. It was noted, for example, that it may typically be easier to adequately define mean values of
exposure than upper percentile values of exposure. Adequacy is also a function of the level of detail of an
assessment; the requirements for adequacy of an initial, screening-level calculation will typically be less
rigorous than those for a more detailed analysis.

Regarding the third trigger question, the group was generally complimentary of the proposed
checklists in the representativeness issue paper (see Appendix A). The group, however, had many
suggestions for improving the checklists. Some of the broader concerns were about how to make the
checklists context-specific, because the degree of usefulness of information  depends on both the quality
of the information and the purpose of the assessment. Some of the specific suggestions included using
flowcharts rather than lists; avoiding overlap among the flowcharts or lists; developing an interactive
Web-based flowchart that would be flexible and context-specific; and clarifying terms used in the issue
paper (e.g., "external" versus "internal" distinction). The experts also suggested that the checklists or
flowcharts encourage additional data collection where appropriate and promote a "value of information"
approach to help prioritize additional data collection. Further discussion of the group's comments is
given in Section 5.1.3.

2.2    SENSITIVITY ANALYSIS

The second session was devoted to issues encapsulated in the following trigger questions:

How can one do sensitivity analysis to evaluate the implications of non-representativeness? In
other words, how do we assess the importance of non-representativeness?

The experts were asked to consider data, models, and methods in answering these questions.
Furthermore, the group was asked to keep in mind that the charge requested recommendations for
immediate, short-term, and long-term studies or activities that could be done to provide methods or


There were a variety of answers to these questions.  A number of individuals shared the view that
non-representativeness may not be important in many assessments. Specifically, they argued that many
assessments and decisions consider a range of scenarios and populations. Furthermore, populations and
exposure scenarios typically change over time, so that if one were to focus on making an assessment
"representative" for one point in time or space, it could fail to be representative at other points in time or
space or even for the original population of interest as individuals enter, leave, or change within the
exposed population.  Here again the notion of adequacy, rather than representativeness, was of concern to
the group.

The group reiterated that representativeness is context-specific.  Furthermore, there was some
discussion of situations in which data are collected for "blue chip" distributions that are not specific to
any particular decision.  The experts did recommend that, in situations where there may be a lack of
adequacy of model predictions based on available information, the sensitivity of decisions should be
evaluated under a range of plausible adjustments to the input assumptions. It was suggested that there
may be multiple tiers of analyses, each with a corresponding degree of effort and rigor regarding
sensitivity analyses.  In a "first-tier" analysis, the use of bounding estimates may be sufficient to establish
sensitivity of model predictions with respect to one or more model outputs, without need for a
probabilistic  analysis. After a preliminary identification of sensitive model inputs, the next step would
typically be to develop a probability distribution to represent a plausible range of outcomes for each of
the sensitive  inputs.  Key questions to be considered are whether to attempt to make adjustments to
improve the adequacy or representativeness of;the assumptions and/or whether to collect additional data
to improve the characterization of the input assumptions.

One question to consider is: "Are the data good enough to replace an assumption?"  If not, then additional data collection is
likely to be needed. One would need to assess whether the needed data can be collected. A "value of
information" approach can be useful in prioritizing data collection and in determining when sufficient
data have been collected.

There was some discussion of sensitivity analysis of uncertainty versus sensitivity analysis of
variability. The experts generally agreed that sensitivity analysis to identify  key sources of uncertainty is
a useful and appropriate thing to do. There was disagreement among the experts regarding the meaning
of identifying key sources of variability. One expert argued that identifying key sources of variability is
not useful, because variability is irreducible. However, knowledge of key sources of variability can be
useful in identifying key characteristics of highly exposed subpopulations or in formulating prevention or
mitigation measures. Many methods currently exist for doing sensitivity analysis,
including running models for alternative scenarios and input assumptions and the use of regression or
statistical methods to identify the most sensitive input distributions in a probabilistic analysis. In the
short-term and long-term, it was suggested that some efforts be devoted to the development of "blue
chip" distributions for quantities that are widely used in many exposure assessments (e.g., intake rates of
various foods). It was also suggested that new methods for sensitivity analysis might be obtained from
other fields, with specific examples based on classification schemes, time series, and "g-estimation."

2.3    MAKING ADJUSTMENTS TO IMPROVE REPRESENTATION

In the third session, the group responded to the following trigger question:

How can one make adjustments from the sample to better represent the population of interest?

The group was asked to consider "population," spatial, and temporal characteristics when
considering issues of representativeness and methods for making adjustments. The group was asked to
provide input regarding exemplary methods and information sources that are available now to help in
making such adjustments, as well as to consider short-term and long-term research needs.

The group clarified some of the terminology that was used in the issue paper and in the
discussions. The term "population" was defined as referring to "an identifiable group of people." The
experts noted that often one has a sample of data from a "surrogate population," which is not identical to
the "target population" of interest in a particular exposure assessment. The experts also noted that there
is a difference between the "analysis" of actual data pertaining to the target population and
"extrapolation" of information from data for a surrogate population to make inferences regarding a target
population. It was noted that extrapolation always "introduces" uncertainty.

On the temporal dimension, the experts noted that, when data are collected at one point in time
and are used in an assessment aimed at a different point in time, a potential problem may occur because
of shifts  in the characteristics of populations between the two periods.

Reweighting of data was one approach that was mentioned in the plenary discussion. There was
a discussion of "general" versus mechanistic approaches for making adjustments.  The distinction here
was that "general" approaches might be statistical, mathematical, or empirical in their foundations (e.g.,
regression analysis), whereas mechanistic approaches would rely on theory specific to a particular
problem area (e.g., a physical, biological, or chemical model).  It was noted that temporal and spatial
issues are often problem-specific, which makes it difficult to recommend universal approaches for
making adjustments.  The group generally agreed that it is desirable to include or state the uncertainties
associated with extrapolations.  Several participants strongly expressed the view that "it is okay to state
what you don't know," and there was no disagreement on this point.
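The reweighting approach mentioned above can be illustrated with a simple post-stratification sketch: each observation from a surrogate sample is weighted so that the weighted strata shares match the target population's shares. The strata labels and proportions here are hypothetical:

```python
def reweight(sample_values, sample_strata, target_proportions):
    """Post-stratification reweighting of a surrogate sample.

    sample_values: observed values (e.g., intake rates)
    sample_strata: stratum label for each observation (e.g., age group)
    target_proportions: {stratum: share of the target population}
    """
    counts = {}
    for s in sample_strata:
        counts[s] = counts.get(s, 0) + 1
    n = len(sample_strata)
    # weight = (target share) / (sample share) for the observation's stratum
    weights = [target_proportions[s] / (counts[s] / n) for s in sample_strata]
    # Weighted mean as an estimate for the target population
    return sum(w * v for w, v in zip(weights, sample_values)) / sum(weights)
```

If the surrogate sample over-represents a stratum relative to the target population, that stratum's observations receive weights below one, and vice versa.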

The group recommended that any adjustments to assumptions regarding
populations be predicated on stakeholder input and the examination of covariates.  The
group noted that methods for analyzing spatial and temporal aspects exist, if data exist. Of course, a
common problem is scarcity of data and a subsequent reliance on surrogate information. For assessment
of spatial variations, methods such as kriging and random fields were commonly suggested. For
assessment of temporal variations, time series methods were suggested.

There was a lively discussion regarding whether adjustments should be "conservative." Some
experts initially argued that, to protect public health, any adjustments to input assumptions should tend to
be biased in a conservative manner (so as not to make an error of understating a health risk, but with
some nonzero probability of making an error of overstating a particular risk). After some additional
discussion, it appeared that the experts were in agreement that one should strive primarily for accuracy
and that ideally any adjustments that introduce "conservatism" should be left to decision makers. It was
pointed out that invariably many judgments go into the development of input assumptions for an analysis
and that these judgments  in reality often introduce some conservatism.  Several pointed out that


"conservatism" can entail significant costs if it results in over control or misidentification of important
risks.  Thus, conservatism in individual assessments may not be optimal or even conservative in a
Therefore, the overall recommendation of the experts regarding this issue is to strive for accuracy rather
than conservatism, leaving the latter as an explicit policy issue for decision makers to introduce, although
it is clear that individual participants had somewhat differing views.

The group's recommendations regarding measures that can be taken now include the use of
stratification to try to reduce variability and correlation among inputs in an assessment, brainstorming to
generate ideas regarding possible adjustments that might be made to input assumptions, and stakeholder
input for much the same purpose, as well as to make sure that no significant pathways or scenarios have
been overlooked. It was agreed that "plausible extrapolations" are reasonable when making adjustments
to improve representativeness or adequacy.  What is "plausible" will be context-specific.

In the short term, the experts recommended that the following activities be conducted:

Numerical Experiments. Numerical experiments can be used to test existing and new methods
for making adjustments based on factors such as averaging times or averaging areas. For
example, the precision and accuracy of the Duan-Wallace model (described in the
representativeness issue paper in Appendix A) for making adjustments from one averaging time
to another can be evaluated under a variety of conditions via numerical experiments.

Workshop on Adjustment Methods. The experts agreed in general that there are many potentially
useful methods for analysis and adjustment but that many of these are to be found in fields
outside the risk analysis community. Therefore, it would be useful to convene a panel of experts
from other fields for the purpose of cross-disciplinary exchange of information regarding
methods applicable to risk analysis problems.  For example, it was suggested that geostatistical
methods should be investigated.

Put Data on the Web. There was a fervent plea from at least one expert that data for "blue chip"
and other commonly used distributions be placed on the Web to facilitate the dissemination and
analysis of such data.  A common concern is that often data are reported in summary form, which
makes it difficult to analyze the data (e.g., to fit distributions). Thus, the recommendation
includes the placement of actual data points, and not just summary data, on publicly accessible
Web sites.

Suggestions on How to Choose a Method. The group felt that, because of the potentially large
number of methods and the need for input from people in other fields, it was unrealistic to
provide recommendations regarding specific methods for making adjustments. However, they
did suggest that it would be possible to create a set of criteria regarding desirable features for
such methods that could help an assessor when making choices among many options.

In the longer term, the experts recommend that efforts be directed at more data collection, such
as improved national or regional surveys, to better capture variability as a function of different
populations, locations, and averaging times.  Along these lines, specific studies could be focused on the
development or refinement of a select set of "blue chip" distributions, as well as  targeted at updating or
extending existing data sets to improve their flexibility for use in assessments of various populations,

locations, and averaging times. The group also noted that because populations, pathways, and scenarios
change over time, there will be a continuing need to improve existing data sets.
2.4    EMPIRICAL AND PARAMETRIC DISTRIBUTION FUNCTIONS

In the fourth session, the experts began to address the second main set of issues as given in the
charge. The trigger question used to start the discussion was:

What are the primary considerations in choosing between the use of parametric distribution
functions (PDFs) and empirical distribution functions (EDFs)?

The group was asked to consider the advantages of using one versus the other, whether the
choice is merely a matter of preference, whether one is preferred, and whether there are cases when
neither should be used.

The initial discussion involved clarification of the difference between the terms EDF and
"bootstrap." Bootstrap simulation is a general technique for estimating confidence intervals and
characterizing sampling distributions for statistics, as described by Efron and Tibshirani (1993). An EDF
can be described as a stepwise cumulative distribution function or as a probability density function in
which each data point is assigned an equal probability. Non-parametric bootstrap can be used to quantify
sampling distributions or confidence intervals for statistics based upon the EDF, such as percentiles or
moments. Parametric bootstrap methods can be used to quantify sampling distributions or confidence
intervals for statistics based on PDFs. Bootstrap methods are also often referred to as "resampling"
methods. However, "bootstrap" and EDF are not the same thing.
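The distinction drawn above can be made concrete with a short sketch (assuming a hypothetical synthetic sample): the EDF assigns probability 1/n to each data point, while non-parametric bootstrap resamples the data to characterize the sampling distribution of a statistic.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.lognormal(mean=0.0, sigma=1.0, size=50)  # hypothetical sample

def edf(sample):
    """Stepwise EDF: each of the n observed points gets probability 1/n."""
    xs = np.sort(sample)
    probs = np.arange(1, len(xs) + 1) / len(xs)
    return xs, probs

def bootstrap_percentile_ci(sample, q=90, n_boot=2000, alpha=0.05, seed=1):
    """Non-parametric bootstrap: resample with replacement and recompute the
    statistic (here, the 90th percentile) to get a confidence interval."""
    rng = np.random.default_rng(seed)
    stats = np.array([
        np.percentile(rng.choice(sample, size=len(sample), replace=True), q)
        for _ in range(n_boot)
    ])
    return np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])

xs, probs = edf(data)
lo, hi = bootstrap_percentile_ci(data)
```

The EDF is a description of the data; the bootstrap is a simulation technique that can operate on top of it (non-parametrically, as here, or parametrically on a fitted PDF).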

The experts generally agreed that the choice of EDF versus PDF is usually a matter of
preference, and they also expressed the general opinion that there should be no rigid guidance requiring
the use of one or the other in any particular situation. The group briefly addressed the notion of
consistency. While consistency in the use of a particular method (e.g., EDF or PDF in this case) may
offer benefits in terms of simplifying analyses and helping decision makers, there was a concern that any
strict enforcement of consistency will inhibit the development of new methods or the acquisition of new
data and may also lead to compromises from better approaches that are context-specific.  Here again, it is
important to point out that the experts explicitly chose not to recommend the use of either EDF or PDF as
a single preferred approach but rather to recommend that this choice be left to the discretion of assessors
on a case-by-case basis. For example, it could be reasonable for an assessor to include EDFs for some
inputs and PDFs for others even within the same analysis.
Some participants gave examples of situations in which they might prefer to use an EDF, such as:
(a) when there are a large number of data points (e.g., 12,000); (b) when one has access to high-speed data
storage and retrieval systems; (c) when there is no theoretical basis for selecting a PDF; and/or (d) when
one has an "ideal" sample.  There was some discussion of preference for use of EDFs in "data-rich" situations rather
than  "data-poor" situations. However, it was noted that "data poor" is context-specific. For example, a
data set may be adequate for estimating the 90th percentile but not the 99th percentile. Therefore, one
may  be "data rich"  in the former case and "data poor" in the latter case with the same data set.
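The point that "data rich" is context-specific can be illustrated numerically. A minimal sketch, assuming a hypothetical right-skewed sample of 200 points: the relative width of a bootstrap confidence interval is modest for the 90th percentile but grows sharply for the 99th.

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.lognormal(0.0, 1.0, size=200)  # hypothetical sample

def boot_rel_ci_width(sample, q, n_boot=2000, seed=7):
    """Relative width of a 95% bootstrap confidence interval for percentile q."""
    rng = np.random.default_rng(seed)
    est = np.array([
        np.percentile(rng.choice(sample, size=len(sample), replace=True), q)
        for _ in range(n_boot)
    ])
    lo, hi = np.percentile(est, [2.5, 97.5])
    return (hi - lo) / np.percentile(sample, q)

w90 = boot_rel_ci_width(data, 90)   # relative uncertainty at the 90th percentile
w99 = boot_rel_ci_width(data, 99)   # relative uncertainty at the 99th percentile
```

The same 200 points are "data rich" for estimating the 90th percentile and comparatively "data poor" for the 99th.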

Some experts also gave examples of when they would prefer to  use PDFs. A potential limitation
of conventional EDFs is that they are restricted to the range of observed data.  In contrast, PDFs typically extend beyond the observed data, and a particular PDF may also have
intuitive or theoretical appeal.  PDFs are also preferred by some because they provide a compact
representation of data and can provide insight into generalizable features of a data set. Thus, in contrast
to the proponent of the use of an EDF for a data set of 12,000, another expert suggested it would be
easier to summarize the data with a PDF, as long as the fit was reasonable. At least one person suggested
that a PDF may be easier to defend in a legal setting, although there was no consensus on this point.

For both EDFs and PDFs, the issue of extrapolation beyond the range of observed data received
considerable discussion.  One expert stated that, the "further we go out in the tails, the less we know," to
which another responded, "when we go beyond the data, we know nothing." As a rebuttal, a third expert
asked "do we really know nothing beyond the maximum data point?" and suggested that analogies with
similar situations may provide a basis for judgments regarding extrapolation beyond the observed data.
Overall, most or all of the experts appeared to support some approach to extrapolation beyond observed
data, regardless of whether one prefers an EDF  or a PDF. Some argued that one has more control over
extrapolations with EDFs, because there are a variety of functional forms that can be appended to create
a "tail" beyond the range  of observed-data.  Examples of these are described in the issue paper. Others
argued that when there is a theoretical basis for selecting a PDF, there is also some theoretical basis for
extrapolating beyond the  observed data.  It was  pointed out that one should not always focus on the
"upper" tail; sometimes the lower tail of a model input may lead to extreme values of a model output
(e.g., when an input appears in a denominator).

There was some discussion of situations in which neither an EDF nor a PDF may be particularly
desirable. One suggestion was that there may be situations in which explicit enumeration of all
combinations of observed data values for all model inputs, as opposed to a probabilistic resampling
scheme, may be desired.  Such an approach can help, for example, in tracing combinations of input
values that produce extreme values  in model outputs. One expert suggested that neither EDFs nor PDFs
are useful when there must be large extrapolations into the tails of the distributions.

A question that the group chose to address was, "How much information do we lose in the tails
of a model output by not knowing the tails of the model inputs?"  One comment was that it may not be
necessary to accurately characterize the tails of all model inputs because the tails (or extreme values) of
model outputs  may depend on a variety of other combinations of model  input values.  Thus,  it is possible
that even if no effort is made to extrapolate beyond the range of observed data in model inputs, one may
still predict extreme values in the model outputs. The use of scenario analysis was suggested as an
alternative or supplement to probabilistic analysis in situations in which either a particular input cannot
reasonably be assigned a probability distribution or when it may be difficult to estimate the tails of an
important input distribution.  In the latter case, alternative upper bounds on the distribution, or alternative
assumptions regarding extrapolation to the tails, should be considered as scenarios.

Uncertainty in EDFs and PDFs was discussed. Techniques for estimating uncertainties in the
statistics (e.g.,  percentiles) of various distributions, such as bootstrap simulation, are available.  An
example was presented for a data set of nine measurements, illustrating how the uncertainty  in the fit of a
parametric distribution was greatest at the tails.  It was pointed out that when considering alternative
PDFs (e.g., Lognormal vs. Gamma) the range of uncertainty in the upper percentiles of the alternative
distributions will typically overlap; therefore, apparent differences in the fit of the tails may not be
particularly significant from a statistical perspective.  Such insights are obtained from an explicit
approach to distinguishing between variability and uncertainty in a "two-dimensional" probabilistic
framework.
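The small-sample example above can be sketched with a parametric bootstrap, assuming a hypothetical Lognormal fit to nine synthetic measurements: simulating and refitting many nine-point data sets shows the uncertainty band widening toward the tail.

```python
import numpy as np

rng = np.random.default_rng(3)
data = rng.lognormal(1.0, 0.8, size=9)  # hypothetical nine-measurement data set

# MLE fit of a Lognormal: mean and SD of the log-transformed data.
mu, sigma = np.log(data).mean(), np.log(data).std(ddof=0)

Z = {50: 0.0, 95: 1.6449}  # standard normal quantiles for the median and 95th

def boot_percentiles(mu, sigma, n=9, n_boot=2000, seed=5):
    """Parametric bootstrap: simulate n points from the fitted Lognormal,
    refit, and record percentiles of each refitted distribution."""
    rng = np.random.default_rng(seed)
    out = {q: np.empty(n_boot) for q in Z}
    for b in range(n_boot):
        sim = rng.lognormal(mu, sigma, size=n)
        m, s = np.log(sim).mean(), np.log(sim).std(ddof=0)
        for q, z in Z.items():
            out[q][b] = np.exp(m + s * z)
    return out

boot = boot_percentiles(mu, sigma)
# Ratio of the 97.5th to the 2.5th percentile of each uncertainty band:
spread = {q: np.percentile(v, 97.5) / np.percentile(v, 2.5) for q, v in boot.items()}
# The band is relatively wider at the 95th percentile (the tail) than at the median.
```

Comparing `spread[95]` against `spread[50]` makes the workshop's point quantitative: with nine data points, the fit is far more uncertain in the tail than near the center.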

The group discussed whether mixture distributions are useful. Some experts were clearly
proponents of using mixture distributions. A few individuals offered some cautions that it can be
difficult to know when to properly employ mixtures.  One example mentioned was a radon assessment that
assumed a Lognormal distribution; another participant responded that the concentration may
more appropriately be described as a mixture of normal distributions. There was no firm consensus on
whether it is better to use a mixture of distributions as opposed to a "generalized" distribution that can
take on many arbitrary shapes. Those who expressed opinions tended to prefer the use of mixtures
because they could offer more insight about processes that produced the data.
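As a minimal sketch of what a mixture model looks like in practice (the component weights, means, and standard deviations below are hypothetical, e.g., two sub-populations with different typical values):

```python
import numpy as np

# Hypothetical two-component normal mixture; weights must sum to 1.
weights = np.array([0.7, 0.3])
means = np.array([2.0, 6.0])
sds = np.array([0.5, 1.0])

def sample_mixture(n, seed=0):
    """Sample by first picking a component, then drawing from it."""
    rng = np.random.default_rng(seed)
    comp = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(means[comp], sds[comp])

x = sample_mixture(10_000)
mixture_mean = weights @ means  # weighted average of the component means
```

Because each draw is tagged to a component, a mixture can offer the kind of insight about the underlying processes that the discussion favored.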
Truncation of the tails of a PDF was discussed. Most of the experts seemed to view this as a last
resort fraught with imperfections. The need for truncation may be the result of an inappropriate selection
of a PDF. For example, one participant asked, "If you truncate a Lognormal, does this invalidate your
justification of the Lognormal?"  It was  suggested that alternative PDFs (perhaps ones that are less "tail
heavy") be explored. Some suggested that truncation  is often unnecessary. Depending upon the
probability mass of the portion of the distribution that is considered for truncation, the probability of
sampling an extreme value beyond a plausible upper bound may be so low that it does not occur in a
typical Monte Carlo simulation of only a few thousand iterations. Even if an unrealistic value is sampled
for one input, it may not produce an extreme value  in the model output.  If one does truncate a
distribution, it can potentially affect the mean and other moments of the distribution. Thus, one expert
summarized the issue  of truncation as "nitpicking" that potentially can lead to more problems than it
solves.
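The argument that truncation is often unnecessary, yet does shift the moments, can be checked with a sketch (assuming a hypothetical Lognormal input and truncation point):

```python
import numpy as np

mu, sigma, cap = 0.0, 1.0, 10.0  # hypothetical Lognormal and truncation point

def sample_truncated(n, seed=1):
    """Truncate by rejection: draw from the Lognormal, discard values > cap."""
    rng = np.random.default_rng(seed)
    out = np.empty(0)
    while out.size < n:
        draw = rng.lognormal(mu, sigma, size=n)
        out = np.concatenate([out, draw[draw <= cap]])
    return out[:n]

rng = np.random.default_rng(2)
full = rng.lognormal(mu, sigma, size=200_000)
trunc = sample_truncated(200_000)

p_above = (full > cap).mean()            # probability mass beyond the cap (small)
mean_shift = full.mean() - trunc.mean()  # truncation pulls the mean down
```

With only about one percent of the mass above the cap, extreme values are rarely sampled in a few thousand iterations anyway, yet the truncated distribution's mean is measurably lower, which is exactly the trade-off the experts flagged.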
2.5     GOODNESS-OF-FIT

The fifth and final session of the workshop was devoted to the following trigger question:

On what basis should it be decided whether a data set is adequately fitted by a parametric
distribution?

The premise of this session was the assumption that a decision had already been made to use a
PDF instead of an EDF. While not all participating experts were comfortable with this assumption, all
agreed to base the subsequent discussion on it.

The group agreed unanimously that visualization of both the data and the fitted distribution is the
most important approach for ascertaining the adequacy of fit. The group in general seemed to share a
view that conventional Goodness-of-Fit (GoF) tests have significant shortcomings and that they should
not be the only or perhaps even primary methods for determining the adequacy of fit.

One expert elaborated that any type of probability plot that allows one to transform data so that
they can be compared to a straight line, representing a perfect fit, is extremely useful. The human eye is
generally good at identifying discrepancies from the straight line perfect fit. Another pointed out that
visualization and visual inspection is routinely used in the medical community for evaluation of
information such as x-rays and CAT scans; thus, there is a credible basis for reliance on visualization as a
means for evaluating models and data.
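A probability plot of the kind described can be built by hand, using only the standard library's normal quantile function (the sample below is hypothetical): sorted log-transformed data are plotted against standard normal quantiles, and the probability-plot correlation coefficient serves as a numeric stand-in for the visual straight-line check.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(2)
data = rng.lognormal(0.0, 1.0, size=100)  # hypothetical sample

# Sorted log-data vs. standard normal quantiles at plotting positions
# (i - 0.5)/n; a straight line indicates a good Lognormal fit.
y = np.sort(np.log(data))
n = len(y)
x = np.array([NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)])

# Probability-plot correlation coefficient: near 1 means the points hug the
# straight line that represents a perfect fit.
r = np.corrcoef(x, y)[0, 1]
```

Plotting `x` against `y` gives the visual check the experts endorsed; `r` summarizes it in one number without replacing the plot.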

One of the potential problems with GoF tests is that they may be sensitive to imperfections in the
fit that are not of serious concern to an assessor or a decision maker. For example, if there are outliers at
the low or middle portions of the distribution, a GoF test may suggest that a particular PDF should be
rejected even though there is a good fit at the upper end of the distribution. In the absence of a visual
inspection of the fit, the assessor may have no insight as to why a particular PDF was rejected by a GoF
test.

The power of GoF tests was discussed. The group in general seemed comfortable with the
notion of overriding the results of a GoF test if what appeared to be a good fit, via visual inspection, was
rejected by the test, especially for large data sets or when the imperfections are in portions of the
distribution that are not of major concern to the assessor or decision maker. Some experts shared stories
of situations in which they found that a particular GoF test would reject a distribution due to only a few
"strange" data points in what otherwise appears to be a plausible fit. It was noted that GoF tests become
increasingly sensitive as the number of data points  increases, so that even what appear to be small or
negligible "blips" in a large data set are sufficient to lead to rejection of the fit. In contrast, for small data
sets, GoF tests tend to be "weak" and may fail to reject a wide range of PDFs. One person expressed
concern that any strict requirement for the use of GoF tests might reduce incentives for data collection,
because it is relatively easy to avoid rejecting a PDF with few data.
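The sample-size sensitivity described above can be demonstrated with a hand-rolled one-sample Kolmogorov-Smirnov test (the mean shift of 0.15 and both sample sizes are hypothetical): the same modest discrepancy that a small sample typically survives is decisively rejected at large n.

```python
import numpy as np
from statistics import NormalDist

def ks_stat(sample, cdf):
    """One-sample Kolmogorov-Smirnov statistic against a fully specified CDF."""
    xs = np.sort(sample)
    n = len(xs)
    F = cdf(xs)
    d_plus = np.max(np.arange(1, n + 1) / n - F)
    d_minus = np.max(F - np.arange(0, n) / n)
    return max(d_plus, d_minus)

norm_cdf = np.vectorize(NormalDist().cdf)

rng = np.random.default_rng(4)
# Data with a small "blip": normal with mean 0.15 instead of the hypothesized 0.
small = rng.normal(0.15, 1.0, size=30)
large = rng.normal(0.15, 1.0, size=30_000)

# Approximate 5% critical value for the KS statistic is 1.36 / sqrt(n).
reject_small = ks_stat(small, norm_cdf) > 1.36 / np.sqrt(len(small))
reject_large = ks_stat(large, norm_cdf) > 1.36 / np.sqrt(len(large))
```

With 30,000 points the critical value shrinks below the fixed discrepancy, so rejection is essentially guaranteed even though the fit might look acceptable to an assessor.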

The basis of GoF tests sparked some discussion. The "loss functions" assumed in many  tests
typically have to do with deviation of the fitted cumulative distribution function from the EDF for  the
data set.  Other criteria  are possible and, in principle, one could create any arbitrary GoF test. One expert
asked whether minimization of the loss function used in any particular GoF test might be used as a basis
for choosing parameter values when fitting a distribution to the data. There was no specific objection,
but it was pointed out that a degree-of-freedom correction would be needed.  Furthermore, other
methods, such as maximum likelihood estimation (MLE), have a stronger theoretical basis as a method
for parameter estimation.

The group discussed the role of the "significance level" and the "p-value" in GoF tests. One
expert stressed that the  significance level should be determined in advance of evaluating GoF and that it
must be applied consistently  in rejecting possible fits. Others, however, suggested that the appropriate
significance level would depend upon risk management objectives. One expert suggested that it is useful
to know the p-value of  every fitted distribution so that one may have an indication of how good or  weak
the fit may have been according to the  particular GoF test.

SECTION THREE

OPENING REMARKS
At the opening session of the workshop, representatives from EPA Region 2 and the RAF
welcomed members of the expert panel and observers. Following EPA remarks, the workshop facilitator
described the overall structure and objectives of the 2-day forum, which this section summarizes.

3.1    WELCOME AND REGIONAL PERSPECTIVE
Mr. William McCabe, Deputy Director, Program Support Branch, Emergency and
Remedial Response Division, U.S. EPA Region 2

William McCabe welcomed the group to EPA Region 2 and thanked everyone for participating
in the workshop. He noted that, in addition to this workshop, Region 2 also hosted the May 1996 Monte
Carlo workshop, which ultimately led to the release of EPA's May 1997 policy document on
probabilistic assessment. He commented on how this 2-day workshop was an important followup to the
May 1996 event. Mr. McCabe stressed that continued discussions on viable approaches to probabilistic
assessments are important because site-specific decisions rest on the merit of the risk assessment. He
stated that this type of workshop is an excellent opportunity for attendees to discuss effective methods
and expressed optimism that workshop discussions would provide additional insight and answers to
probabilistic assessment issues. Resolution of key probabilistic assessment issues, he noted, will help the
region members as they review risk assessments using probabilistic techniques. He mentioned, for
example, the ongoing Hudson River PCB study for which deterministic and probabilistic assessments
will be performed. In that case, as in others, Mr. McCabe said it will be critical for Agency reviewers to
put the results into the proper context and to validate/critically review probabilistic techniques employed
by the contractor(s) for the Potentially Responsible Parties.
3.2    OVERVIEW AND BACKGROUND
Mr. Steve Knott, U.S. EPA, Office of Research and Development, Risk Assessment Forum

On behalf of the RAF, Steve Knott thanked Region 2 for hosting the workshop. Mr. Knott briefly
explained how the RAF originated in the early 1980s and comprises approximately 30 scientists from
EPA program offices, laboratories, and regions. One primary RAF function is to bring experts together
to carefully study and help foster cross-agency consensus on tough risk assessment issues.

Mr. Knott described the following activities related to probabilistic analysis in which the RAF
has been involved:

•      Formation of the 1983 ad hoc technical panel on Monte Carlo analysis.

•      May 1996 workshop on Monte Carlo analysis (US EPA, 1996b).

•      Development of the guiding principles for Monte Carlo analysis (US EPA, 1997a).

•      EPA's general probabilistic analysis policy (US EPA, 1997b).


Mr. Knott reiterated the Agency's perspective on probabilistic techniques, stating that "the use of
probabilistic techniques can be a viable statistical tool for analyzing variability and uncertainty in risk
assessment" (US EPA, 1997b). Mr. Knott highlighted Condition 5 (on which this workshop was based)
of the eight conditions for acceptance listed in EPA's policy:

Information for each input and output distribution is to be provided in the report. This includes
tabular and graphical representations of the distributions (e.g., probability density function and
cumulative distribution function plots) that indicate the location of any point estimates of interest
(e.g., mean, median, 95th percentile). The selection of distributions is to be explained and
justified. For both the input and output distributions, variability and uncertainty are to be
differentiated where possible (US EPA, 1997b).

Mr. Knott referred to the recent RTI report, "Development of Statistical Distributions for
Exposure Factors" (1998), which presents a framework for fitting distributions and applies the
framework to three case studies.

Mr. Knott explained that the Agency is seeking input from workshop participants primarily in the
following areas:

•     Methods for fitting distributions to less-than-perfect data (i.e., data that are not perfectly
representative of the scenario(s) under study).

•      Using the EDF (or resampling techniques) versus the PDF.

These issues were the focus of the workshop. Mr. Knott noted that the workshop will enable EPA to
receive input from experts, build on existing guidance, and provide Agency assessors additional insight.
EPA will use the information from this workshop in future activities, including (1)  developing or revising
guidelines and models, (2) updating the Exposure Factors Handbook, (3) supporting modeling efforts,
and (4) applying probabilistic techniques to dose-response assessment.
3.3     WORKSHOP STRUCTURE AND OBJECTIVES
Dr. H. Christopher Frey, Workshop Chair

Dr. Frey, who served as workshop chair and facilitator, reiterated the purpose and goals of the
workshop. As facilitator, Dr. Frey noted, he would attempt to foster discussions that would further
illuminate and support probabilistic assessment activities.  Dr. Frey stated that workshop discussions
would center on the two issue papers mentioned previously.  He explained that the RTI report was
provided to experts for background purposes only. While the RTI report was not the review subject for
this workshop, Dr. Frey commented that it may provide pertinent examples.

The group's charge, according to Dr. Frey, was to advise EPA and the profession on
representativeness and distribution function issues.  Because a slightly greater need exists for discussing
representativeness issues and developing new techniques in this area, Dr. Frey explained that this topic
would receive the greatest attention during the 2-day workshop. He reemphasized that the workshop
would focus on technical issues, not policy issues.

Dr. Frey concluded his introductory remarks by stating that the overall goal of the workshop was
to provide a framework for addressing technical issues that may be applied widely to different future
activities (e.g., development of exposure factor distributions).
Workshop Structure and Expert Charge

Dr. Frey explained that the workshop would be structured around technical questions related to
the two issue papers.  Appendix D presents the charge provided to experts before the workshop,
including specific questions for consideration and comment. The workshop material, Dr. Frey noted, is
inherently technical. He, therefore, encouraged the experts to use plain language where possible.  He
also noted that the workshop was not intended to be a short course or tutorial. In introducing the key
topics for workshop discussions, Dr. Frey highlighted the following, which he perceived as the most
challenging issues and questions based on experts' premeeting comments:

Representativeness. How should assessors address representativeness? What deviation is
acceptable (given uncertainty and variability in data quality, how close will we come to
answering the question)? How do assessors work representativeness into their problem
definition (e.g., What are we asking? What form will the answer take?)

Sensitivity.  How important is the potential lack of representativeness? How do we evaluate
this?

Adjustment. Are there reasonable ways to adjust or extrapolate in cases where exposure data are
not representative of the  population of concern?

EDF/PDF. How do assessors choose between EDFs and theoretical PDFs? On what basis do
assessors decide whether a data set is adequately represented by a fitted analytic distribution?

Dr. Frey encouraged participants to remember the following general questions as they discussed
specific technical questions during plenary sessions, small group discussions, and brainwriting sessions:

•      What do we know today that we can apply to answer the questions or provide guidance?

•      What short-term studies (e.g., numerical experiments) could answer the questions or
provide guidance?

•      What long-term research (e.g., greater than 18 months) may be needed to answer the
questions?

According to Dr. Frey, the answers to these questions will help guide Agency activities related to
probabilistic assessments.

Dr. Frey also encouraged the group to consider what, if anything, is not covered in the issue
papers, but is related to the key topics. He noted the following examples:

•      Role of expert judgment and Bayesian methods, especially in making adjustments.

•      Is model output considered representative if all the inputs to the model are considered
representative? This issue relates, in part, to whether or not correlations or
dependencies among the inputs are properly addressed.

•      Role of representativeness in a default or generic assessment.
•      Role of the measurement process.
Lastly, Dr. Frey explained that the activities related to the workshop are public information.  The
workshop was advertised in the Federal Register and observers were welcomed. Time was set aside on
both days of the workshop for observer questions and comments.

SECTION FOUR

ISSUE PAPER PRESENTATIONS
Two issue papers were developed to present the expert panelists with pertinent issues and to
initiate workshop discussions. Prior to the plenary and small group discussions, EPA provided an
overview of each paper. This section provides a synopsis of each presentation. The two issue papers are
presented in Appendix A. The overheads are in Appendix H.
4.1     ISSUE PAPER ON EVALUATING REPRESENTATIVENESS OF EXPOSURE
FACTORS DATA
Jacqueline Moya, U.S. EPA, NCEA, Washington, DC

Ms. Moya opened her overview by noting that, while exposure distributions are available in the
Exposure Factors Handbook, there is still a need to fit distributions for these data. Ms. Moya noted that a
joint NCEA-RTI pilot project in September 1997 was established to do this.  She then discussed the
purpose of the issue paper and the main topics she planned to cover (i.e., framework for inferences,
components of representativeness, the checklists, and methods for improving representativeness). The
purpose of the issue paper, Ms. Moya reminded the group, was to introduce concepts and to prompt
discussions on how to evaluate representativeness and what to do if a sample is not representative.

Ms. Moya presented a flow chart (see Figure 1 in the issue paper) of the data-collection process
for a risk assessment. If data collection is not possible, she explained, surrogate data must be identified.
The next step is to ask whether the surrogate data represent the site or chemical. Ms. Moya pointed to
Checklist I (Assessing Internal Representativeness), which includes suggested questions for determining
whether the surrogate data are representative of the population of concern. If not, the assessor must ask,
"How do we adjust the data to make it more representative?"

Ms. Moya then briefly reviewed the key terms in the paper. Representativeness in the context of
an exposure/risk assessment refers to the comfort with which one can draw inferences from the data.
Population is defined in terms of its member characteristics (i.e., demographics, spatial and temporal
elements, behavioral patterns).  The assessor's population of concern is the population for which the
assessment is being conducted. The surrogate population is the population used when data on the
population of concern is not available.  The population of concern for the surrogate study is the sample
population for which the surrogate study was designed. The population sampled is a sample from the
population of concern of the surrogate study.

Ms. Moya briefly described the external and internal components of representativeness.  She
explained that external components reflect how well the surrogate population represents the population
of concern. Internal components refer to the surrogate study, specifically:

1.     How well do sampled individuals represent the surrogate population? This depends on
how well the study was designed. For example, was it random?
2.     How well do the respondents represent the sample population? For example, if
recreational fishermen are surveyed, is someone who fishes more frequently more likely
to respond to the survey, and therefore bias the response?

3.     How well does the measured value represent the true value for the measurement unit?
For example, are the recreational fishermen in the previous example accurately reporting
the sizes of the fish they catch?

Ms. Moya reviewed the four checklists in the issue paper, which may serve as tools for risk
assessors trying to evaluate data  representativeness. One checklist is for the population sampled versus
the population of concern for the surrogate study (internal representativeness). The other checklists refer
to the surrogate population versus the population of concern based on individual, spatial, and temporal
characteristics (external representativeness).  One goal of the workshop, Ms. Moya explained, was to
solicit input from experts on the  use of these checklists.  Specifically, she asked whether certain
questions should be eliminated (e.g., only a subset of the questions may be needed for a screening risk
assessment).

Lastly, Ms. Moya pointed to discussions in the issue paper on attempting to improve
representativeness. One section  refers to how to make adjustments for differences in population
characteristics (with discussions  geared toward using weights for  the sample). The second section refers
to time-unit differences and includes how to adjust for this. Ms. Moya asked the group to consider how
to evaluate the significance of population differences and how to perform extrapolations if they are
necessary.
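The "using weights for the sample" idea can be sketched with a small post-stratification example (all values, group labels, and target shares below are hypothetical): each survey record is weighted by the ratio of its group's population share to its sample share, so the weighted sample mimics the population of concern.

```python
import numpy as np

# Hypothetical surrogate survey: intake values tagged with a group label
# (say, two age classes) whose shares differ from the population of concern.
values = np.array([1.2, 0.8, 2.5, 1.9, 0.6, 3.1, 1.1, 0.9])
group = np.array([0, 0, 1, 1, 0, 1, 0, 0])   # 5 records in group 0, 3 in group 1

pop_share = np.array([0.4, 0.6])             # target shares in the population
sample_share = np.bincount(group) / len(group)

# Post-stratification weight for each record: population share / sample share.
w = pop_share[group] / sample_share[group]

weighted_mean = np.average(values, weights=w)    # matches population shares
unweighted_mean = values.mean()                  # reflects the sample shares
```

The weighted mean equals the population-share-weighted average of the group means, which is the adjustment the issue paper's first section describes.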

4.2    ISSUE PAPER ON EMPIRICAL DISTRIBUTION FUNCTIONS AND NON-
PARAMETRIC SIMULATION
Timothy Barry, U.S. EPA, NCEA, Washington, DC

Dr. Barry reviewed the issues of concern related to selecting and evaluating distribution
functions.  He explained that, assuming data are representative, the risk assessor has two methods for
representing an exposure factor in a probabilistic analysis: parametric (e.g., a Lognormal, Gamma, or
Weibull distribution) and non-parametric (i.e., use the sample data to define an EDF).

To illustrate how the EDF is generated, Dr. Barry presented equations and histograms (see
Appendix H). The basic EDF properties were defined as follows:

•      Values between any two consecutive samples, x_k and x_(k+1), cannot be simulated, nor can
values smaller than the sample minimum, x_1, or larger than the sample maximum, x_n, be
generated (i.e., x ≥ x_1 and x ≤ x_n).

•      The mean of the EDF equals the sample mean. The variance of the EDF is always smaller
than the sample variance; it equals (n-1)/n times the sample variance.

•      Expected values of simulated EDF percentiles are equal to the sample percentiles.

•      If the underlying distribution is skewed to the right (as are many environmental
quantities), the EDF tends to underestimate the true mean and variance.
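These moment properties can be verified numerically with a short sketch (assuming a hypothetical right-skewed sample):

```python
import numpy as np

rng = np.random.default_rng(6)
sample = rng.lognormal(0.0, 1.0, size=25)  # hypothetical right-skewed sample

# The EDF places probability 1/n on each observed point, so its moments are
# the moments of the sample treated as a full population.
edf_mean = sample.mean()
edf_var = np.mean((sample - edf_mean) ** 2)   # divides by n

sample_var = sample.var(ddof=1)               # usual estimate, divides by n - 1
n = len(sample)
# EDF variance = (n - 1)/n * sample variance, hence always slightly smaller.
```

The identity is exact for any sample, not just this one; the shortfall matters most for small n.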

In addition to the basic EDF, Dr. Barry explained, the following variations exist:

•      Linearized EDF. In this case, a linearized cumulative distribution pattern results. The
linearized EDF linearly interpolates between two observations.

•      Extended EDF. An extended EDF involves linearization and adds lower and upper tails
to the data to reflect a "more realistic range" of the exposure variable. Tails are added
based on expert judgment.

•      Mixed Exponential. In this case, an exponential upper tail is added to the EDF. This
approach is based on extreme value theory.
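The linearized variation can be sketched with linear interpolation between order statistics (the sample below is hypothetical); unlike the basic EDF, it can simulate values between observed data points, though still within the observed range.

```python
import numpy as np

rng = np.random.default_rng(8)
data = np.sort(rng.lognormal(0.0, 1.0, size=40))  # hypothetical sorted sample
n = len(data)
probs = np.arange(1, n + 1) / n  # cumulative probability at each order statistic

def sample_linearized_edf(n_draws, seed=0):
    """Linearized EDF: interpolate linearly between consecutive order
    statistics, so values between observations can be simulated."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(probs[0], 1.0, size=n_draws)
    return np.interp(u, probs, data)

draws = sample_linearized_edf(5_000)
# Draws stay within the observed range but need not equal observed values.
```

An extended EDF would go one step further, appending judgment-based lower and upper tails outside `[data[0], data[-1]]`.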

After describing the basic concepts of EDFs, Dr. Barry provided an example in which
investigators compared and contrasted parametric and non-parametric techniques. Specifically, 90 air
exchange data points were fit with a Weibull distribution. When a basic EDF is used for these data,
the mean and variance are reproduced well. It was concluded that if the goal is to reproduce the sample,
the Weibull does well on the mean but poorly at the high end.
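
The comparison can be reproduced in outline. The sketch below assumes SciPy, with synthetic data standing in for the 90 air exchange measurements (which are not reproduced here): fit a Weibull by maximum likelihood, then compare its mean and upper percentile with the sample's.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Synthetic stand-in for the 90 air exchange measurements.
data = stats.weibull_min.rvs(c=1.5, scale=0.8, size=90, random_state=rng)

# Parametric route: maximum-likelihood Weibull fit (location fixed at zero).
c, loc, scale = stats.weibull_min.fit(data, floc=0)
fitted = stats.weibull_min(c, loc=loc, scale=scale)

# Compare how each route reproduces the sample mean and the high end;
# the basic EDF reproduces the sample statistics by construction.
print("mean:     sample", data.mean(), "vs Weibull", fitted.mean())
print("95th pct: sample", np.percentile(data, 95), "vs Weibull", fitted.ppf(0.95))
```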

Dr. Barry encouraged the group to consider the following questions during the 2-day workshop:

•      Is an EDF preferred over a PDF in any circumstances?

•      Should an EDF not be used in certain situations?

•      When an EDF is used, should the linearized, extended,  or mixed version be used?

Dr. Barry briefly described the Goodness of Fit (GoF) questions the issue paper introduces. He
explained that, generally, assessors should pick  the simplest analytic distribution not rejected by the data.
Because rejection depends on the chosen statistic and on an arbitrary level of statistical significance, Dr.
Barry posed the following questions to the group:

•      What role should the GoF statistic and its p-value (when available) play in deciding on
the appropriate distribution?

•      What role should graphical assessments of fit play?

•      When none of the standard distributions fit well, should you investigate more flexible
families of distributions (e.g., four parameter gamma, four parameter F, mixtures)?
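
As a concrete (and deliberately simplified) illustration of these GoF questions, the sketch below assumes SciPy and applies a Kolmogorov-Smirnov test to two candidate families. Note that KS p-values are optimistic when the parameters are estimated from the same data, which is one reason the role of the p-value is debatable.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
data = rng.lognormal(mean=0.0, sigma=0.5, size=60)  # hypothetical exposure data

for name, dist in [("lognormal", stats.lognorm), ("gamma", stats.gamma)]:
    params = dist.fit(data, floc=0)          # maximum-likelihood fit
    ks = stats.kstest(data, dist.cdf, args=params)
    rejected = ks.pvalue < 0.05              # arbitrary significance level
    print(f"{name}: D={ks.statistic:.3f}  p={ks.pvalue:.3f}  rejected={rejected}")
```

Graphical checks (probability or quantile-quantile plots) would complement the statistic, as the questions above suggest.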

SECTION FIVE
EVALUATING REPRESENTATIVENESS OF EXPOSURE FACTORS DATA
Discussions on the first day and a half of the workshop focused on developing a framework for
characterizing and evaluating the representativeness of exposure data. The framework described in the
issue paper on representativeness (see Appendix A) is organized into three broad sets of questions: (1)
those related to differences in populations, (2) those related to differences in spatial coverage and scale,
and (3) those related to differences in temporal scale. Therefore, discussions were held in the context of
these three topic areas. The panel also discussed the strengths and weaknesses of the proposed
"checklists" in the issue paper, which were designed to help the assessor evaluate representativeness.
The last portion of the workshop session on representativeness included discussions on sensitivity
(assessing the importance of non-representativeness) and on the methods available to adjust data to better
represent the population of concern. This section describes the outcome of each of these discussions.

Initial deliberations centered on the need to define risk assessment objectives (i.e., problem
definition) before evaluating the representativeness of exposure data.

5.1     PROBLEM DEFINITION

The group agreed on two points: that "representativeness" depends on the problem at hand and
that the context of the risk analysis is critical. Several experts commented that assessors will have a
difficult time defining representativeness if the problem has not been well-defined.  The group therefore
spent a significant amount of time discussing problem definition and problem formulation in the context
of assessing representativeness. Several experts noted the importance of understanding the end use of the
assessment (e.g., site-specific or generic, national or regional analysis). The group agreed that the most
important step for assessors is to ask whether the data are representative enough for their intended use(s).

The group agreed that stakeholders and other data users should be involved in all phases of the
assessment process, including early brainstorming sessions. Two experts noted that problem definition
must address whether the assessment will adequately protect public health and the environment. Another
expert  stressed the importance of problem formulation, because not doing so risks running analyses or
engaging resources needlessly. One participant commented that the importance of representativeness
varies with the level (or tier) of the assessment. For example, if data are to be used  in a screening
manner, then conservativeness may be more important than representativeness.  If data are to be used in
something other than screening assessments, the assessor must consider the value added of more complex
analyses (i.e., additional site-specific data collection, modeling). Two  experts noted, however, that the
following general problem statement/question would not change with a more or less sophisticated (tiered)
assessment: Under an agreed upon set of exposure conditions, will the population of concern experience
unacceptable risks? A more sophisticated analysis would merely enable a closer look at less
conservative/more realistic conditions.

5.1.1  What information is required to specify a problem definition fully?

The group agreed that when defining any problem, the "fundamental who, what, when, where,
why, and how" questions must be answered. One individual noted that if assessors answer these
questions, they will be closer to determining if data are representative.  The degree to which each basic
question is important is specific to the problem or situation. Another reiterated the importance of
remembering that the premier consideration is public health protection; he noted that if only narrow
issues are discussed, the public health impact may be overlooked.

The group concurred that the problem must be defined in terms of location (space), time (over
what duration and when in time), and population (person or unit).  Some of these definitions may be
concrete (e.g., spatial locations around a site), while some, like people who live on a brownfield site, may
be more vague (e.g., because they may change with mobility and new land use). Because the problem
addresses a future context, it must be linked to observable data by a model and assumptions.  The
problem definition should include these models and assumptions.

Various experts provided the following specific examples of the questions assessors should
consider at the problem formulation stage of a risk assessment.

•      What is the purpose of the assessment (e.g., regulatory decision, setting cleanup
standards)?

•      What is the population of interest?

•      What type of assessment is being performed (site-specific or generic)?

•      How is the assessment information being used? How will data be used (e.g., screening
assessment versus court room)?

•      Who are the stakeholders?

•      What are the budget limitations?  What is the cost/benefit of performing a probabilistic
versus a deterministic assessment?
•     What population is exposed, and what are its characteristics?

•     How, when, and where are people exposed?

•     In what activities does the exposed population engage? When does the exposed
population engage in these activities, and for how long? Why are certain activities
performed?

•     What type of exposure is being evaluated (e.g., chronic/acute)?

•     What is the scenario of interest (e.g., what is future land use)?

•     What is the target or "acceptable" level of risk (e.g., 10^-2 versus 10^-6)?

•      What is the measurement error?

•      What is the acceptable level of error?

•      What is the geographic scale and location (e.g., city, county)?

•      What is the scale for data collection (e.g., regional/city, national)?

•      What are site/region-specific issues (e.g., how might a warm climate or poor-tasting
water affect drinking water consumption rates)?

•      What is the temporal scale (day, year, lifetime)?

•      What are the temporal characteristics of source emissions (e.g., continuous)?

•      What is/are the route(s) of exposure?

•      What is the dose (external, biological)?

•      What is/are the statistic(s) of interest (e.g., mean, uncertainty percentile)?

•      What is the plausible worst case?

•      What is the overall data quality?

•      What models must be used?

•      When would results change a decision?

Many of the preceding questions are linked closely to defining representativeness. One subgroup
compiled a list of key elements that are directly related to these types of questions when defining
representativeness (see the text box below).
5.1.2   What constitutes representativeness (or lack thereof)? What is "acceptable deviation"?

Several of the experts commented that, fundamentally, representativeness is a function of the
quality of the data but reiterated that it depends ultimately on the overall assessment objective. Almost
all data used in risk assessment fail to be representative in one or more ways. At issue is the effect of the
lack of representativeness on the risk assessment. One expert suggested that applying the established
concepts of EPA's data quality objective/data quality assessment process would help assessors evaluate
data representativeness. Because populations are not fixed in time, one expert cautioned that if a data set
is too representative, the risk assessment may be precise for only a moment. Another stressed the
importance of taking a credible story to the risk manager.  In that context, "precise representativeness"
may be less important than answering the question of whether we are being protective of public health.  It


Sources of Variability and Uncertainty Related to the Assessment of Data Representativeness

EPA policy sets the standard that risk assessors should seek to characterize central tendency and plausible upper
bounds on both individual risk and population risk for the overall target population as well as for sensitive
subpopulations.  To this extent, data representativeness cannot be separated from the assessment endpoint(s).
Following are some key elements that may affect data representativeness. These elements are not mutually
exclusive.

Exposed Population
General target population
Particular ethnic group
Known sensitive subgroup (e.g., children, elderly, asthmatics)
Occupational group (e.g., applicators)
Age group (e.g., infant, child, teen,  adult, whole life)
Gender
Activity group (e.g., sport fishermen, subsistence fishermen)

Geographic Scale, Location

Temporal Scale
Trends (e.g., stationary, nonstationary behaviors)
Past, present, future exposures
Less-than-lifetime exposures (e.g., hourly, daily, weekly,  annually)
Temporal characteristics of source(s) (e.g., continuous, intermittent, periodic, concentrated,
random)

Exposure Route
Inhalation
Ingestion (e.g., direct, indirect)
Dermal (direct) contact (by activity; e.g., swimming)
Multiple pathways

Exposure/Risk Assessment Endpoint
Cancer risk
Noncancer risk (margin of exposure, hazard index)
Potential dose, applied dose, internal dose, biologically effective dose
Risk statistic
Mean, uncertainty percentile of mean
Percentile of a distribution (e.g., 95th percentile risk)
Uncertainty limit of variability percentile (upper confidence limit on 95th percentile
risk)
Plausible worst case, uncertainty percentile of plausible worst case

Data Quality Issues
Direct measurement, indirect measurement (surrogates)
Modeling uncertainties
Measurement error (accuracy, precision, bias)
Sampling error (sample size, non-randomness, independence)
Monitoring issues (short-term, long-term, stationary, mobile)

is important to understand whether a lack of representativeness could mean the risk assessment results
fail to protect public health or that they grossly overestimate risks.

One participant expressed concern that assessors feel deviations from representativeness can be
measured. In reality, risk assessors may more often rely on qualitative or semiquantitative ways of
describing that deviation.  Another expert emphasized that assessors often have no basis on which to
judge the representativeness of surrogate data (e.g., drinking water consumption), because local data are
rarely available for comparison. Therefore, surrogate data must be accepted or modified based on some
qualitative information (e.g., the local area is hotter than the area on which the surrogate data are based).

The experts provided the following views on what constitutes representativeness and/or an
acceptable level of non-representativeness. These views were communicated during small group and
plenary discussions.

Nearly consistent with the definition in the issue paper, representativeness was defined by one
subgroup as "the degree to which a value for a given endpoint adequately describes the value of that
endpoint(s) likely seen in the target population." The term "adequately" replaces the terms "accurately
and precisely" in the issue paper definition. One expert suggested changing the word representative to
"useful  and informative." The latter terms imply that  one has learned something from the surrogate
population. For example, the assessor may not prove  the data are the same, but can, at minimum, capture
the extent to which they differ.  The term non-representativeness was defined as  "important differences
between target and surrogate populations with respect to the risk assessment objectives." Like others, this
subgroup noted that the context of observation is important (e.g., what is  being measured: environmental
sample  [water, air, soil] versus human recall [diet] versus tissue samples in humans [e.g., blood]).
Assessors must ask about internal sample consistency, inappropriate methods, lack of descriptors (e.g.,
demographic, temporal), and inadequate sample size for targeted measure.

The group agreed, overall, that assessing adequacy or representativeness  is inherently subjective.
However, differing opinions were offered in terms of how to address this subjectivity. Several
participants stressed the importance of removing subjectivity to the extent possible but without making
future guidance too rigid. Others noted, however, that expert judgment is and must remain an integral
part of the assessment process.

A common theme communicated by the experts was that representativeness depends on how
much uncertainty and variability between the population of concern and the surrogate population the
assessor is willing to accept. What is "good enough"  is case specific, as is the "allowable error." Several
experts  commented that it is also important for assessors to know if they are comparing data means or
tails. One expert suggested reviewing some case studies using assessments done for different purposes to
illuminate the process of defining representativeness. "With regard to exposure factors, we [EPA] need
to do a better job at specifying or providing better guidance on how to use the data that are available."
For example, the soil ingestion data for children are limited, but they may suffice to provide an estimate
of a mean. These data are not good enough to support a distribution or a good estimate of a high-end
value, however.

One subgroup described representativeness/non-representativeness as the degree of bias  between
a data set and the problem.  For example:

Scenario:      Is a future residential scenario appropriate to the problem? For prospective risk
assessment, there are usually irreducible uncertainties in making estimates
about a future unknown population; therefore, a certain amount of modeling
must occur.

Model:        Is a multiplicative, independent-variable model appropriate? Uncertainties in the
model can contribute to non-representativeness (e.g., it might not apply, it may
be wrong, or calculations may be incorrect).

Variables:     Is a particular study appropriate to the problem at hand—are the variables
biased, uncertain? It may be easy to get confused about distinctions between bias
(or inaccuracy), precision/imprecision, and representativeness/non-representativeness.
It is often assumed that a "representative" data set is one that
has been obtained with a certain amount of randomization. More often, however,
data that meet this definition are not available.

The group spokesperson explained that a well-designed and controlled randomized study
yielding two results can be "representative" of the mean and dispersion but highly imprecise.
Imprecision and representativeness are therefore different, but related; the central tendency of the
distribution may be accurately estimated, but the upper percentile may not.

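The two-observation example can be checked with a quick simulation (an illustrative sketch assuming NumPy; the lognormal population is hypothetical): the mean of a two-point random sample is unbiased, hence "representative" on average, yet its spread across repeated studies is large.

```python
import numpy as np

rng = np.random.default_rng(6)
population = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)

# Means of many independent two-observation random samples.
two_pt_means = rng.choice(population, size=(5_000, 2)).mean(axis=1)

print("bias of the estimator:", two_pt_means.mean() - population.mean())  # near zero
print("spread across studies:", two_pt_means.std())                       # large
```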
In summary, when assessing representativeness, the group agreed that emphasis should be placed
on the adequacy of the data and how useful and informative a data set is to the defined problem. The
group agreed that these terms are more appropriate than "accuracy and precision" in defining
representative data in the context of a risk assessment. The importance of considering end use of the data
was stressed and was a recurring theme in the discussions (i.e., how much representativeness is needed to
answer the problem). Because the subject population is often a moving target with unpredictable
direction in terms of its demographics and conditions of exposure, one expert commented that, in some
cases, representativeness of a given data set may not be a relevant concept and generic models may be
more appropriate.
5.1.3  What considerations should be included in, added to, or excluded from the checklists?
More than half the experts indicated that the checklists in Issue Paper 1 are useful for evaluating
representativeness. One expert noted that regulators are often forced to make decisions without
information. A checklist helps the assessor/risk manager evaluate the potential importance of missing
exposure data. One expert re-emphasized the importance of allowing for professional judgment and
expert elicitation when evaluating exposure data. Another panelist concurred, commenting that this type
of checklist is preferred over prescriptive guidance. Several of the experts noted, however, that
checklists could be improved and offered several recommendations.

The group agreed that the checklist should be flexible for various problems and that users should
be directed to consider the purpose of the risk assessment. The assessor must know the minimum
requirements for a screening versus a probabilistic assessment.  As one expert said, the requirements for
i screening level assessment rtiiist differ from those for a full-blown risk assessment: t>o I have enough
information about the population (e.g., type, space, time) to answer the questions at this tier, and is that

information complete enough to make a management decision? Do I need to go through all the checklists
before I can stop?

Instead of the binary (yes/no) and linear format of the checklists, several individuals suggested a
flowchart format centered on the critical elements of representativeness (i.e., a "conditional"
checklist)—to what extent does the representativeness of the data really matter? A flowchart would
allow for a more iterative process and would help the assessor work through problem-definition issues.
One expert suggested developing an interactive Web-based flowchart that would be flexible and context-
specific. Another agreed, adding that criteria are needed to guide the assessor on what to do if
information is not available. As one expert noted, questions should focus on the outcome of the risk
assessment. The assessor needs to evaluate whether the outcome of the assessment changes if the
populations differ.

One of the experts strongly encouraged collecting more/new data or information. Collection of
additional data, he noted, is needed to improve the utility of these checklists. Another participant
suggested that the user be alerted to the qualities of data that enable quantifying uncertainty and
reminded that the degree of representativeness cannot be defined in certain cases.  When biases due to
lack of representativeness are suspected, how can assessors judge the direction of those biases?

In addition to general comments and recommendations, several individuals offered the following
specific suggestions for the checklists:

•       Clarifying definitions (e.g., internal versus external).

•       Recategorizing. For example, use the following five categories: (1) interpreting
measurements (more of a validity than representative issue), (2) evaluating whether
sampling bias exists, (3) evaluating statistical sampling error, (4) evaluating whether the
study measured what must be known, and (5) evaluating differences in the population.
The first three issues are sources of internal error; the latter two are sources of external
representativeness.

•       Reducing the checklists. Several experts suggested combining Checklists II, III, and IV.

•       Combining temporal, spatial, and individual categories. Avoid overlap in questions. For
example, when overlap exists (e.g.,  in some spatial and temporal characteristics), which
questions in the checklist are critical? A Web-based checklist, with the flow of questions
appropriately programmed, could be designed to avoid duplication of questions.

•       Including other  populations of concern (e.g., ecological receptors).

•       Including worked examples that demonstrate the criteria for determining if a question is
important and that focus the assessor on the issues that are critical to representativeness.

•       Separating bias  and sampling quality and extrapolation from reanalysis and
reinterpretation.

•      Asking the following additional questions:

—     Relative to application, is there consistency in the survey instruments used to
collect the exposure data?  How was measurement error addressed?

—     Is the sample representative enough to bound the risk?

—     Are data available on population characterization factors (e.g., age, sex)?

—     What is known about the population of concern relative to the surrogate
population? (If the population of concern is inadequately characterized, then the
ability to consider the representativeness of the surrogate data is limited.)

In summary, the group agreed on the utility of the checklists but emphasized the need to include
in them decision criteria (i.e., how do we know if we have representative/non-representative data?). A
brief discussion on the need to collect data followed. Some experts posed the following questions:  How
important is it to have more data?  Is the risk assessment really driving decisions?  Is more information
needed to make good decisions? Is making risk assessment decisions on qualitative data acceptable?
What data must be collected, at minimum, to validate key assumptions? The results of the sensitivity
analysis, as one expert pointed out, are key to answering these questions.
5.2    SENSITIVITY

How do we assess the importance of non-representativeness?

In considering the implications of non-representativeness, the group was asked to consider how
one identifies the implications of non-representativeness in the context of the risk assessment. One
expert commented that the term "non-representativeness" may be a little misleading and, as discussed
earlier, found the terms data adequacy or data useability more fitting to the discussions at hand. The
expert noted that, from a Superfund perspective, data representativeness is only one consideration when
assessing overall data quality or useability. Others agreed. The workshop chair encouraged everyone to
discuss the suitability of the term "representativeness" while assessing its importance during the small
group discussions.

One group described a way in which to assess the issue of non-representativeness as follows:
The assessor must check the sensitivity of decisions to be made as a  result of the assessment.  That is,
under a range of plausible adjustments, will the risk decision change? Representativeness is often not
that important because risk management decisions depend on a range of target populations under various
scenarios. A few of the experts expressed concern that problems will likely arise if the exposure assessor
is separated from decision makers. One person noted that often an exposure assessment will be
done absent a specific decision (e.g., nonsite, non-Superfund situation). Another noted that in the
pesticide program  situations occur in which an exposure assessment is done before toxicity data are
available.  Such separations may be unavoidable. Another expert emphasized that  any future guidance
should stress the importance of assessors being cognizant of data distribution needs even if the assessors
are removed from  the decision or  have limited data.

One individual noted that examples would help. The assessor should perform context-specific
sensitivity analysis. It would help to develop case studies and see how sensitivity analysis affects
application (e.g., decision focus).

Another group discussed sensitivity analysis in the context of a tiered approach. For the first tier,
a value that is "biased high" should be selected (e.g., 95th percentile upper bound). The importance of a
parameter (as evidenced by a sensitivity analysis) is determined first, making the representativeness or
non-representativeness of the nonsensitive parameters unimportant. For the second tier (for sensitive
parameters), the assessor must consider whether averages or high end estimates are of greater
importance. This group presented an example using a corn oil scenario to illustrate when differences
between individuals (e.g., high end) and mixtures (averages) may be important. Because corn oil is a
blend with input from many ears of corn, if variability exists in the contaminant concentrations in
individual ears of corn, then corn oil will typically represent some type of average of those
concentrations. For such a mixture, representativeness is less of an issue.  It is not necessary to worry
about peak concentrations in one ear of corn.  Instead, one would be interested in situations which might
give rise to a relatively high average among the many ears of corn that comprise a given quantity of corn
oil. If one is considering individual ears of corn, it becomes more important to have a representative
sample; the tail of the distribution becomes of greater interest.
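
The corn oil argument (blending averages out ear-to-ear variability, so the distribution tail matters far less for the mixture than for individual ears) can be sketched numerically; this is an illustrative simulation with hypothetical concentrations, assuming NumPy.

```python
import numpy as np

rng = np.random.default_rng(5)
# Hypothetical contaminant concentrations in individual ears of corn:
# 10,000 oil batches, each blended from 500 ears.
ear_conc = rng.lognormal(mean=0.0, sigma=1.0, size=(10_000, 500))
batch_avg = ear_conc.mean(axis=1)  # each batch is roughly an average of its ears

print("95th pct, individual ears:", np.percentile(ear_conc, 95))
print("95th pct, blended batches:", np.percentile(batch_avg, 95))
```

The batch percentile sits close to the overall mean, while the individual-ear percentile reflects the full tail, which is why representativeness of the tail matters mainly for individual-level questions.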

A third subgroup noted that, given a model and parameters, assessors must determine whether
enough data exist to bound the estimates. If they can bound the estimates, a sensitivity analysis is
performed with the following considerations:  (1) identify the sensitive parameters in the model; (2)
focus on sensitive parameters and evaluate the distribution beyond the bounding estimate (i.e., identify
the variability of these parameters) for the identified sensitive parameters; (3) evaluate whether the
distribution is representative; and (4) evaluate whether more data should be collected or if an adjustment
is appropriate.

Members of the remaining subgroup noted, and others agreed, that a "perfect" risk assessment is
not possible.  They reiterated that it is key to evaluate the data in the context of the decision analysis.
Again, what are the consequences of being wrong, and what difference do decision errors make in the
estimate of the parameter being evaluated? This group emphasized that the question is situation-specific.
In addition, they noted the need for placing bounds on data used.

One question asked throughout these discussions was "Are the data good enough to replace an
existing assumption and, if not, can we  obtain such data?" One individual again stressed the need for
"blue chip" distributions at the national level (e.g., inhalation rate, drinking water).  Another expert
suggested adding activity patterns to the list of needed data.

In summary, the group generally agreed that the sensitivity of the risk assessment decision must
be considered before non-representativeness is considered problematic. In some cases, there may not be
an immediate decision, but good distributions are still important.

How can one  do sensitivity analysis to evaluate the implications of non-representativeness?

The workshop chair asked the group to consider the mechanics of a sensitivity analysis. For
example, is there a specific statistic that should be used, or is it decision dependent? One expert
responded by noting that sensitivity analysis can be equated to partial correlation coefficients (which are


internal to a model).  He noted, however, that sensitivity analysis in the context of exposure assessment is
more "bottom line" sensitivity (i.e., if an assumption is changed, how does the change affect the bottom
line?). The focus here is more external—what happens when you change the inputs to the model (e.g.,
the distributions)? Another pointed to ways in which to perform internal sensitivity analysis.  For
example, the sensitivity of uncertainty can be separated out from the sensitivity of the variability
component (see William Huber's premeeting comments on sensitivity). Another expert stressed,
however, that sensitivity analysis is inherently tied to uncertainty; it is not tied to variability unless the
variability is uncertain. It was noted that sensitivity analysis is an opportunity to view things that are
subjective. Variability, in contrast, is inherent in the data, unless there are too few data to estimate
variability sufficiently. One expert commented that it is useful to know which sources of variability are
most important in determining exposure and risk.
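
The internal/external distinction the experts drew can be sketched with a toy Monte Carlo model (assuming NumPy and SciPy; the model and distributions are hypothetical, not from the workshop): rank correlations give the internal view, while swapping an input distribution and re-running gives the "bottom line" view.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 10_000

# Hypothetical multiplicative exposure model: dose = conc * intake / weight.
conc = rng.lognormal(0.0, 1.0, n)
intake = rng.lognormal(0.5, 0.3, n)
weight = rng.normal(70.0, 10.0, n)
dose = conc * intake / weight

# Internal sensitivity: Spearman rank correlation of each input with the output.
for name, x in [("conc", conc), ("intake", intake), ("weight", weight)]:
    rho, _ = stats.spearmanr(x, dose)
    print(f"{name}: rho = {rho:+.2f}")

# Bottom-line sensitivity: widen one input distribution, re-run, and compare a
# decision-relevant statistic (here the 95th percentile of dose).
conc_alt = rng.lognormal(0.0, 1.5, n)
dose_alt = conc_alt * intake / weight
print("95th pct ratio:", np.percentile(dose_alt, 95) / np.percentile(dose, 95))
```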

One individual voiced concern regarding how available models address sensitivity. Another
questioned whether current software (e.g., Crystal Ball® and @Risk®) covers sensitivity coefficients
adequately (i.e., does it reflect the depth and breadth of existing literature?).

Lastly, the group discussed sensitivity analysis in the context of what we know now and what we
need to know to improve the existing methodology. Individuals suggested the following:
•      Add the ability to classify sample runs to available software. Classify inputs and
evaluate the effect on outputs.

•      Crystal Ball® and @Risk® are reliable for many calculations, but one expert noted they
may  not currently be useful for second-order estimates, nor can they use time runs. Time
series analyses are particularly important for Food Quality Protection Act (FQPA)
evaluations.

•      Consider possible biases built into the model due to residuals lost during regression
analyses. This factor is important to the sensitivity of the model prediction.
One expert pointed out that regression analyses can introduce bias because residuals are often
dropped out. Others agreed that this is an important issue. For example, it can make an order-of-
magnitude difference in body weight and surface area scaling. Another expert stated that this issue is of
special interest for work under the FQPA, where use of surrogate data and regression analysis is
receiving more and more attention. Another expert noted that "g-estimation" looks at this issue.  The
group revisited this issue during their discussions on adjustment.

How can one adjust the sample to better represent the population of interest?

The experts addressed adjustment in terms of population, spatial, and temporal characteristics.
The group was asked to identify currently available methods and information sources that enable such
adjustments, as well as short- and long-term research needs in this area. The workshop chair noted that the issue paper only includes
discussion on adjustments to account for time-scale differences.  The goal, therefore, was to generate
some discussion on spatial and population adjustments as well. Various approaches for making
adjustments were discussed, including general and mechanistic. General approaches include those that
are statistically-, mathematically-, or empirically-based (e.g., regression analysis). Mechanistic
approaches would involve applying a theory specific to a problem area (e.g., a biological, chemical, or
physical model).

Some differing opinions were provided as to how reliably we can apply available statistics to
adjust data. In time-space modeling, where primary data and multiple observations occur at different
spatial locations or in multiple measures over time, one expert noted that a fairly well-developed set of
analytic methods exists. These methods would fall under the category of mixed models, kriging studies for
spatial analysis, or random-effects models. The group agreed that extrapolating or postulating models are
less well-developed. One person noted that classical statistics fall short because they do not apply to
situations in which representativeness is a core concern. Instead, these methods focus more on the
accuracy or applicability of the model. The group agreed that statistical literature in this area is growing.

Another individual expressed concern that statistical tools and extrapolations introduce more
uncertainty to the assessment. This uncertainty may not be a problem if the assessor has good
information about the population of concern and is simply adjusting or reweighting the data, but when the
assessor is extrapolating the source term, demographics, and spatial characteristics simultaneously, more
assumptions and increasing uncertainty are introduced.
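The simpler of the two cases, reweighting a surrogate sample toward a known target population, can be sketched as post-stratification; the strata, values, and target shares below are invented for illustration:

```python
# Hypothetical surrogate sample: (stratum, value) pairs, e.g. intake by age group
surrogate = [("child", 0.8), ("child", 1.1), ("adult", 2.0),
             ("adult", 2.4), ("adult", 1.9), ("adult", 2.2)]

def shares(sample):
    """Fraction of the sample falling in each stratum."""
    counts = {}
    for stratum, _ in sample:
        counts[stratum] = counts.get(stratum, 0) + 1
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

surrogate_share = shares(surrogate)
target_share = {"child": 0.4, "adult": 0.6}   # assumed demographics of the target population

# Post-stratification weight per record: target share / surrogate share
weights = [target_share[s] / surrogate_share[s] for s, _ in surrogate]
values = [v for _, v in surrogate]

weighted_mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
plain_mean = sum(values) / len(values)
```

Here children are underrepresented in the surrogate sample, so reweighting pulls the mean toward the child stratum; the same machinery extends to any covariate for which target-population shares are known.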

In general, the group agreed that a model-based approach has merit in certain cases.  The
modeled approach, as one expert noted, is a cheap and effective approach and likely to support
informed/more  objective decisions.  The group agreed that validated models (e.g., spatial/fate and
transport models) should be used. Because information on populations may simply be unavailable to
validate some potentially useful models, several participants reemphasized the need to collect more data,
which was a recurring workshop theme.

One expert pointed out that the assessor must ask which unit of observation is of concern. For
example, when  evaluating cancer risk, temporal/spatial issues (e.g., residence time) are less important.
When evaluating developmental effects (when windows of time are important), however, the
temporal/spatial issues are more relevant.  Again, assessors must consider the problem at hand before
identifying the unit of time.

From a pesticide perspective, it was noted that new data cannot always be required of registrants.
When considering the effects of pesticides, for example, crop treatment rates change over time. As a
result, bridging studies are used to link available application data to crop residues (using a multiple linear
regression model).
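A bridging study of this kind can be sketched as a log-linear regression; the application rates and residue values below are hypothetical, and real FQPA bridging models are multiple (not single-predictor) regressions:

```python
import math

# Hypothetical bridging data: application rate (kg/ha) vs. crop residue (ppm)
rates = [0.5, 1.0, 1.5, 2.0, 3.0]
residues = [0.02, 0.05, 0.07, 0.11, 0.16]

# Fit log(residue) = a + b * log(rate) by simple least squares
x = [math.log(r) for r in rates]
y = [math.log(v) for v in residues]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
    sum((xi - xbar) ** 2 for xi in x)
a = ybar - b * xbar

def predicted_residue(rate):
    """Bridge a new application rate to an estimated residue (ppm)."""
    return math.exp(a + b * math.log(rate))
```

The fitted relationship then links a changed treatment rate to an expected residue level, which is the role the bridging study plays in the text above.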

One expert stressed the importance and need for assessors to recognize uncertainty. Practitioners
of probabilistic assessment should be encouraged to aggressively evaluate and discuss the uncertainties in
extrapolations and their consequences. Often, probabilistic techniques can provide better information for
better management decisions. The expert pointed out that, in some cases, one may not be able to assign a
distribution, or  one may choose not to do so because it would risk losing valuable information. In those
cases, multiple  scenarios and results reported in a nonprobabilistic way (both for communication and
management decisions) may be  appropriate.

At this point, one expert suggested that the discussion of multiple scenarios was straying from
the basic question to be answered: "If I have a data set that does not apply to my population, what do I
need to do, if anything?" Others disagreed, noting that it may make sense to run different scenarios and
evaluate the difference. If a different scenario makes a difference, more data must be collected.  One
expert argued, however, that we cannot wait to observe trends; assessors must predict the future based on
a "snapshot" of today.

One expert suggested the following hierarchy when deciding on the need to refine/adjust data:

•      Can the effect be bounded? If yes, no adjustment is needed.

•      If the bias is conservative, no adjustment is needed.

•      Use a simple model to adjust the data.

•      If adjustments fail, resample/collect more data, if possible.

The group then discussed the approaches and methods that are currently available to address non-
representative data, and indicated that the following approaches are viable:

1.      Start with brainstorming. Obtain stakeholder input to determine how the target
population differs from the population for which you have data.

2.      Look at covariates to get an idea of what adjustment might be needed. Stratify data to
see if correlation exists. Stratification is a good basis for adjustments.

3.      Use "kriging" techniques (deriving information from one sample to a smaller, sparser
data set). Kriging may not fully apply to spatial, temporal, and population adjustments,
however, because it applies to the theory of random fields. Kriging may help improve
the accuracy of existing data, but it does not enable extrapolation.

4.      Include time-steps in models to evaluate temporal trends.

5.      Use the "plausible extrapolation" model. This model is acceptable if biased
conservatively.

6.      Consider spatial estimates of covariate data (random fields).

7.      Use the scenario approach instead of a probabilistic approach.

8.      Bayesian statistical methods may be applicable and relevant.

One expert presented a brief case study as an example of Bayesian analysis of variability
and uncertainty and use of a covariate probability distribution model  based on regression
to allow extrapolation to different target populations. The paper he summarized,
"Bayesian Analysis of Variability and Uncertainty on Arsenic Concentrations in U.S.
Public Water Supplies," and supporting overheads, are in Appendix G. The paper
describes a Bayesian methodology for estimating the distribution and its dependence on
covariates. Posterior distributions were computed using Markov Chain Monte Carlo
(MCMC). In this example, uncertainties and variability were associated with time issues
and the self-selected nature of arsenic samples. After briefly reviewing model
specifications and distributional assumptions, the results and interpretations were
presented, including a presentation of MCMC output plots and the posterior cumulative
distribution of source water. The uncertainty of fitting site-specific data to the national
distribution of arsenic concentrations was then discussed. The results suggest that
Bayesian methodology powerfully characterizes variability and uncertainty in exposure
factors. The probability distribution model with covariates provides insights and a basis
for extrapolation to other targeted populations or subpopulations.  One of the main points
of presenting this methodology was to demonstrate the use of covariates.  This case study
showed that you can fit a model with covariates, explicitly account for residuals (which
may be important), and apply that same model to a separate subpopulation where you
know something about the covariates. According to the presenter, such an approach helps
reveal whether national data represent local data.
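The flavor of such a posterior computation can be sketched with a single-parameter random-walk Metropolis sampler. The data, the known log-scale spread, and the prior below are all invented for illustration; the Appendix G analysis is far richer (covariates, full variability/uncertainty separation):

```python
import math
import random

random.seed(7)

# Hypothetical small sample of concentrations (not the arsenic data of Appendix G)
data = [2.1, 3.4, 1.8, 5.0, 2.7, 4.2]
logs = [math.log(v) for v in data]
sigma = 0.5   # assumed known log-scale spread, for brevity

def log_posterior(mu):
    """Normal(0, 10) prior on mu plus a Lognormal likelihood for the data."""
    prior = -mu ** 2 / (2 * 10.0 ** 2)
    likelihood = -sum((x - mu) ** 2 for x in logs) / (2 * sigma ** 2)
    return prior + likelihood

# Random-walk Metropolis: propose a step, accept with the Metropolis ratio
mu, samples = 0.0, []
for step in range(20000):
    proposal = mu + random.gauss(0.0, 0.3)
    if math.log(random.random()) < log_posterior(proposal) - log_posterior(mu):
        mu = proposal
    if step >= 5000:          # discard burn-in
        samples.append(mu)

posterior_mean = sum(samples) / len(samples)
```

The retained `samples` play the role of the MCMC output plots mentioned above: quantiles of `samples` give uncertainty bounds on the distribution parameter.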

When evaluating research needs, one expert pointed out that assessors should identify the
minimal amount of information they need to analyze the data using available tools. The group offered
the following  suggestions for both short and long-term research areas. The discussion of short-term
needs also included recommendations for actions the assessors can take now or in the short term to
address the topics discussed in this workshop.

Short-term research areas and actions

1.      Design studies for data collection that are amenable to available methods for data
analysis.  Some existing methods are unusable because not all available data, which were
used to support the methods, are from well-designed studies.

2.      Validate existing models on population variability (e.g., the Duan-Wallace model
[Wallace et al., 1994] and models described by Buck et al. [1995]). This validation can
be achieved by collecting additional data.

3.      Run numerical experiments to test existing and new methods for making adjustments
based on factors such as averaging times or area. Explore and evaluate the Duan-Wallace
model.

4.      Hold a separate workshop on adjustment methods (e.g., geostatistical and time series
methods). Involve the modelers working with these techniques on a cross-disciplinary
panel to learn how particular techniques might apply to adjustment issues that are
specific to risk assessment.

5.      Provide guidelines on how to evaluate or choose an available method, instead of simply
describing available techniques. These guidelines would help the assessor determine
whether a method applies to a specific problem.

6.      To facilitate their access and utility, place national data on the Web (e.g., 3-day CSFII
data, 1994-1996 USDA food consumption data). Ideally, the complete data set, not just
summary data, could be placed on the Web because data in summary form are difficult to
analyze (e.g., to fit distributions).

Possible long-term research areas

1.      Collect additional exposure parameter data on the national and regional levels (e.g., "blue
chip" distributions). One expert cautioned that some sampling data have been or may be
collected by field investigators working independently of risk assessment efforts.
Therefore, risk assessors should have input in methods for designing data collection.

2.      Perform targeted studies (spatial/temporal characteristics) to update existing data.

Discussions of adjustment ended with emphasis on the fact that adjustment and the previously
described methods only need be considered if they impact the endpoint.  One expert reiterated that when
no quantitative or objective ways exist to adjust the surrogate data, a more generalized screening
approach should be used.

As a follow-up to the adjustment discussions, a few individuals briefly discussed the issue of
"bias/loss function" to society.  Because this issue is largely a policy issue, it only received brief
attention. One expert noted that overconservatism is undesirable. Another stressed that it is not in the
public interest to extrapolate in the direction of not protecting public health; assessors should apply
conservative bias but make risk managers aware of the biases. The other expert countered that blindly
applying conservative assumptions could result in suboptimal decisions, which should not be taken
lightly. In general, the group agreed on the following point:  Assessors should use their best scientific
judgment and strive for accuracy when considering representativeness and uncertainty issues. Which
choice will ensure protection of public health without unreasonable loss? It was noted that the cost of
overconservatism should drive the data-collection push (e.g., encourage industry to contribute to data
collection efforts because they ultimately pay for conservative risk assessments).

5.4     SUMMARY OF EXPERT INPUT ON EVALUATING REPRESENTATIVENESS
Workshop discussions on representativeness revealed some common themes.  The group
generally agreed that representativeness is context-specific.  Methods must be developed to ensure
representativeness exists in cases where lack of representativeness would substantially impact a risk-
management decision. Methods, the sensitivity analysis, and the decision endpoint are closely linked.
One expert warned that once the problem is defined, the assessor must understand how to use statistical
tools properly to meet assessment goals.  Blind application of these tools can result in wrong answers
(e.g., examining the tail versus the entire curve).

One or more experts raised the following issues related to evaluating the quality and
"representativeness" of exposure factors data:

"      Representativeness might be better termed "adequacy" or "usefulness."

•      Before evaluating representativeness, the risk assessor, with input from stakeholders,
must define the assessment problem clearly.
•      No data are perfect; assessors must recognize this fact, clearly present it in their
assessments, and adjust non-representative data as necessary using available tools. The
assessors must make plausible adjustments if non-representativeness matters to the
endpoint.

•      To perform a probabilistic assessment well, adequate data are necessary, even for an
assessment with a well-defined objective. In large part, current exposure distribution data
fall short of the risk assessors' needs.  Barriers to collecting new data must be identified,
then removed. Cost limitations were pointed out, however. One expert, therefore,
recommended that justification and priorities be established.

•      Methods must be  sensitive to needs broader than the Superfund/RCRA programs (e.g.,
food quality and pesticide programs).

•      When evaluating the importance of representativeness and/or adjusting for non-
representativeness, the assessor needs to make decisions that are adequately protective of
public health while still considering costs and other loss functions. Ultimately, the
assessor should strive for accuracy.

Options for the assessor when the population of concern has been shown to have different habits
than the surrogate population were summarized as follows:  (1) determine that  the data are clearly not
representative and cannot be used; (2) use the surrogate data and clearly state the uncertainties; or (3)
adjust the data, using what information is available to enable a reasonable adjustment.


SECTION SIX

EMPIRICAL DISTRIBUTION FUNCTIONS AND RESAMPLING
VERSUS PARAMETRIC DISTRIBUTIONS
Assessors often must understand and judge the use of parametric methods (e.g., using such
theoretical distribution functions as the Lognormal, Gamma, or Weibull distribution) versus non-
parametric methods (using an EDF) for a given assessment. The final session of the workshop was
therefore dedicated to exploring the strengths and weaknesses of EDFs and issues related to judging the
quality of fit for theoretical distributions.  Discussions centered largely on the topics in Issue Paper 2 (see
Appendix A for a copy of the paper and Section 3 for the workshop presentation of the paper). This
section presents a summary of expert input on these topics.

Some of the experts thought the issue paper imposed certain constraints on discussions because it
assumed that: (1) no theoretical premise exists for assuming a parametric distribution, and (2) the data
are representative of the exposure factor in question (i.e., obtained as a simple random sample and in the
proper scale). These experts noted that many of the conditions assumed in the issue paper do not hold in
reality. For example, one is unlikely to find a perfectly random sample for exposure parameter data.

As a result, the discussions that followed covered the relative advantages and  disadvantages of
parametric and non-parametric distributions under a broader range of conditions.
6.1    SELECTING AN EDF OR PDF

Experts were asked to consider the following questions.

What are the primary considerations in choosing between the use of EDFs and theoretical
PDFs?  What are the advantages of one versus the other? Is the choice a matter of preference?
Are there situations in which one method is preferred over the other? Are there cases in which
neither method should be used?

The group agreed that selecting an EDF versus a PDF is often a matter of personal preference or
professional judgment. It is not a matter of systematically selecting either a PDF- or EDF-based
approach for every input. It was emphasized that selection of a distribution type is case- or situation-
specific.  In some cases, both approaches might be used in a single assessment. The decision, as one
expert pointed out, is driven largely by data-rich versus data-poor situations.  The decision is based also
on the risk assessment objective. Several experts noted that the EDF and PDF have different strengths in
different situations and encouraged the Agency not to recommend the use of one over the other or to
develop guidance that is too rigid. Some experts disputed the extent to which a consistent approach
should be encouraged. While it was recognized that a consistent approach may benefit decision making,
the overall consensus was that too many constraints would inhibit the implementation of new/innovative
approaches, from which we could learn.

Technical discussions started with the group distinguishing between "bootstrap" methods and
EDFs. One expert questioned if the methods were synonymous. EDF, as one expert explained, is a
specific type of step-wise distribution that can be used as a basis for bootstrap simulations. EDF is one
way to describe a distribution using data; bootstrapping enables assessors to resample that distribution in
a special way (e.g., setting boundaries on the distribution of the mean or percentile) (Efron and
Tibshirani, 1993). Another expert distinguished between a parametric and non-parametric bootstrap,
stating that there are good reasons for using both methods. These reasons are well-covered in the
statistical literature. One expert noted that bootstrapping enables a better evaluation of the uncertainty of
the distribution.
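The distinction just drawn can be made concrete with a minimal non-parametric bootstrap, which resamples the EDF with replacement; the measurements below are invented:

```python
import random
import statistics

random.seed(3)

# Hypothetical exposure measurements (the EDF puts mass 1/n on each value)
data = [1.2, 0.8, 2.5, 1.9, 3.1, 0.6, 1.4, 2.2, 1.1, 1.7]

def bootstrap_means(sample, reps=2000):
    """Non-parametric bootstrap: resample the EDF with replacement,
    recompute the statistic of interest each time."""
    means = []
    for _ in range(reps):
        resample = [random.choice(sample) for _ in sample]
        means.append(statistics.mean(resample))
    return sorted(means)

means = bootstrap_means(data)
# Central 95% of the bootstrap replicates: an uncertainty interval for the mean
lo, hi = means[int(0.025 * len(means))], means[int(0.975 * len(means))]
```

A parametric bootstrap would differ only in the resampling step: draw synthetic samples from a fitted distribution rather than from the EDF itself.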

Subsequent discussion focused on expert input on deciding which distribution to fit, if any, for a
given risk assessment problem. That is, if the assessor has a data set that must be represented, is it better
to use the data set as is and not make  any assumptions or to fit the data set to a parametric distribution?
The following is a compilation of expert input.

•       Use of the EDF.  The use of an EDF may be preferable (1) when a large number of data
points exists, (2) when access is available to computers with high speed and storage
capabilities, (3) when no theoretical basis for  selecting a PDF exists, or (4) when a
"perfect" data set is available. With small data sets, it was noted that the EDF is unlikely
to represent an upper percentile adequately; EDFs are restricted to the range of observed
data. One expert stated that while choice of distribution largely depends on sample size,
in most cases he  would prefer the EDF.

When measurement or response error exists, one expert pointed out that an EDF should
not be used before looking at other options.

•       Use of the PDF.  One expert noted that it is easier to summarize a large data set with a
PDF as long as the fit is reasonable. Use of PDFs can provide estimates of "tails" of the
distribution beyond the range of observed data.  A parametric distribution is a convenient
way to concisely summarize a data set. That is, instead of reporting the individual data
values, one can report the distribution and estimated parameter values of the distribution.

While data may not be generated exactly according to a parametric distribution,
evaluating parametric distributions may provide insight to generalizable features of a
data set, such as moments, parameter values, or other statistics. Before deciding which
distribution to use, two experts pointed out the value of trying to fit a parametric
distribution to gain some insight about the data set (e.g., how particular parameters may
be related to other aspects of the data set). These experts felt there is great value in
examining larger data sets and thinking about what tools can be used to put data into
better perspective.  Another expert noted that the PDF is easier to defend at a public
meeting or in a legal  setting because it has some theoretical basis.

•      Assessing risk assessment outcome. The importance of understanding what the
implications of the distribution choice  are to the outcome of the risk assessment was
stressed. An example of fitting soil ingestion data to a number of parametric and non-
parametric distributions yielded very different results. Depending on which distribution
was used, cleanup goals were changed by approximately 2 to 3 times. Therefore, the
choice may have cost implications.
•      Assuming all data are empirical. One expert felt strongly that all distributions are
empirical. In data-poor situations, why assume that the data are Lognormal? The data
could be bimodal in the tails. If a data set is assumed to be empirical, there is some
control as to how to study the tails. Another expert reiterated that using EDFs in data
poor situations (e.g., six data points) does not enable simulation above or below known
data values. One expert disagreed, providing an example that legitimizes the concern for
assuming that data fit a parametric distribution.  He noted that if there is no mechanistic
basis for fitting a parametric distribution, and a small set of data points  by chance are at
the lower end of the distribution, the 90th percentile estimate will be wrong.

Evaluating uncertainty. Techniques for estimating uncertainty in EDFs and PDFs were
discussed.  The workshop chair presented an example in which he fit a distribution for
variability to nine data points.  He then placed uncertainty bands around the distributions
(both Normal and Lognormal curves) using parametric bootstrap simulation. (See Figure
6-1).  For example, bands were produced by plotting the results of 2,000 runs of a
synthetic data set of nine points sampled randomly from the Lognormal distribution fitted
to the original data set. The wide uncertainty (probability) bands indicate the confidence
in the distribution. This is one approach for quantifying how much is known about what
is going on at the tails, based on random sampling error. When this exercise was
performed for the Normal distribution, less uncertainty was predicted in the upper tail;
however, a lower tail with negative values was predicted, which is not appropriate for a
non-negative physical quantity such as concentrations. The chair noted that, if a stepwise
EDF had been used, high and low ends would be truncated and tail concentrations  would
not have been  predicted.  This illustrates that the estimate of uncertainty in the tails
depends on which assumption is made for the underlying distribution. Considering
uncertainty in this manner allows the assessor to evaluate alternative distributions and
gain insight on distinguishing between variability and uncertainty in a "2-dimensional
probabilistic framework." Several participants noted that this was a valuable example.

Figure 6-1:     Variability and Uncertainty in the Fit of a Lognormal Distribution to a
Data Set of n=9 (Frey, H.C. and D.E. Burmaster, 1998)
[Figure: bootstrap simulation (B=2,000, n=2,000) of Data Set 2 by the method of
matching moments; x-axis: PCB Concentration (ng/g, wet basis)]
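The chair's parametric bootstrap exercise can be sketched roughly as follows. The nine data values here are invented (not the PCB data of Figure 6-1), and the Lognormal is fitted from log-scale moments rather than by the method of matching moments:

```python
import math
import random
import statistics

random.seed(11)

# Hypothetical n=9 data set (not the PCB data of Figure 6-1)
data = [0.12, 0.25, 0.08, 0.40, 0.18, 0.31, 0.15, 0.22, 0.51]

def fit_lognormal(sample):
    """Fit a Lognormal by the mean and standard deviation of the logs."""
    logs = [math.log(v) for v in sample]
    return statistics.mean(logs), statistics.stdev(logs)

mu, s = fit_lognormal(data)

# Parametric bootstrap: draw synthetic n=9 sets from the fitted Lognormal,
# refit each; the spread of the refitted 95th percentiles is an uncertainty band
p95s = []
for _ in range(2000):
    synthetic = [random.lognormvariate(mu, s) for _ in data]
    m2, s2 = fit_lognormal(synthetic)
    p95s.append(math.exp(m2 + 1.645 * s2))   # Lognormal 95th percentile
p95s.sort()

band_low, band_high = p95s[50], p95s[1949]   # central ~95% of the 2,000 runs
point_p95 = math.exp(mu + 1.645 * s)
```

The width of `(band_low, band_high)` conveys how little a sample of nine says about the upper tail, which is the point of the example above.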
image:

countPages += 1
if (countPages == 2) {
var el = document.getElementById("rankTop")
if (el)
if (-1 == -1)
el.innerText = "All"
var el = document.getElementById("rankBot")
if (el)
if (-1 == -1)
el.innerText = "All"
}

•      Extrapolating beyond the range of observable data. The purpose of the risk analysis
drives what assessors must know about the tails of the distribution. One expert
emphasized that the further assessors go into the tails, the less they know.  Another
stressed that once assessors get outside the range of the data, they know nothing.
Another expert disagreed with the point that assessors know nothing beyond the highest
data point. He suggested using analogous data sets that are more data rich to help in
predicting the tails of the distribution. The primary issue becomes how much the
assessors are willing to extrapolate.
Several experts agreed that uncertainty in the tails is not always problematic.  If the
assessor wants to focus on a subgroup, for example, it is not necessary to look at the tail
of the larger group. Stratification, used routinely by epidemiologists, was suggested. With
stratification, the assessor would look at the subgroup and avoid having to perform an
exhaustive assessment of the tail, especially for more preliminary calculations used in a
tiered approach. In a tiered risk assessment system, if the assessor assumes the data are
Lognormal, standard multiplicative equations can be run on a simple calculator. While
Monte Carlo-type analyses can provide valuable information in many cases, several
experts agreed that probabilistic analyses are not always appropriate or necessary. It was
suggested that, in some cases, deterministic scenario-based analyses, rather than Monte
Carlo simulation, would be a useful way to evaluate extreme  values for a model output.
In a situation where a model is used to make predictions of some distribution, several
experts agreed that the absence of perfect information about the tails of the distribution
of each input does not mean that assessors will not have adequate information about the
tail of the model output. Even if all we have is good information about the central
portions of the  input distributions, it may be possible to simulate an extreme value for the
output.
•      Use of data in the tails of the distribution. One expert cautioned assessors to be sensitive
to potentially important data in the tails.  He provided an example in which assessors
relied on the "expert judgment" of facility operators in predicting contaminant releases
from a source; they failed to adequately predict "blips" that were later shown to exist in
20 to 30 percent of the distribution. Another expert noted that he was skeptical about
adding tails (but was not skeptical about setting upper and lower bounds). It was agreed
that, in general, assessors need to carefully consider what they do know about a given
data set that could enable them to set a realistic upper bound (e.g., body weight). The
goal is to provide the risk manager with an "unbiased estimate of risk." One expert
reiterated that subjective judgments are inherent in the risk assessment process.  In the
case of truncating data, such judgments must be explained clearly and justified to the risk manager. In contrast to truncation, one expert reminded the group that the risk manager decides what percentile of the tail is of interest. Because situations arise in which the risk manager may be looking for 90th to 99th percentile values, the assessor must know how to approach the problem and, ultimately, must clearly communicate the approach and the possibly large uncertainties.

•	Scenarios. The group discussed approaches for evaluating the high ends of distributions (e.g., the treatment "blips" mentioned previously or the pica child). Should the strategy for assessing overall risks include the high end of unusual behavior? Several experts felt that including extreme values in the overall distribution was not justified and suggested that the upper bounds in these cases be considered "scenarios." As with upper bounds, one expert noted that low-end values also need special attention in some cases (e.g., air exchange in a tight house).

•	Generalized distributions versus mixtures. Expert opinion differed on generalized versus mixture distributions. One expert was troubled by the notion of a mixture distribution and would rather use a more sophisticated generalized distribution. Another expert offered the example of radon, stating that it is likely a mixture of Normal distributions, not a Lognormal distribution; treating it as a mixture might therefore be a reasonable approach. Otherwise, assessors risk grossly underestimating risk in concentrated areas by assuming they know the parametric form of the underlying distribution.
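The radon point can be illustrated numerically. In the sketch below (all parameters invented for illustration), a population is a mixture of a large "background" component and a small "concentrated" component; a single distribution fitted by moments understates the empirical upper percentile:

```python
import math
import random

random.seed(7)

# Hypothetical two-component population (e.g., most locations low, a
# concentrated subpopulation high); all numbers invented.
def draw():
    if random.random() < 0.9:
        return random.gauss(2.0, 0.5)   # background component
    return random.gauss(8.0, 1.0)       # concentrated component

data = sorted(draw() for _ in range(50_000))
p99_empirical = data[int(0.99 * len(data))]

# A single distribution fitted by moments understates the upper tail.
n = len(data)
mean = sum(data) / n
sd = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
p99_single = mean + 2.326 * sd   # fitted single-Normal 99th percentile

print(f"empirical 99th: {p99_empirical:.2f}, fitted single 99th: {p99_single:.2f}")
```

In this made-up example the single fitted distribution puts its 99th percentile well below the empirical one, which is the underestimation the expert warned about.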
The same expert noted that the issue of mixtures highlights the importance of having some theoretical basis for applying available techniques (e.g., possible Bayesian methods). Another expert stated that he could justify using distributions that are mixtures, because in reality many data sets are inherently mixtures.

•	Truncation of distributions. Mixed opinions were voiced on this issue. One expert noted that assessors can extend a distribution to a plausible upper bound (e.g., assessors can predict air exchange rates because they know that at a certain point they will not go higher). Another expert noted that truncating the distribution at 2 or 3 standard deviations is not uncommon because, for example, the assessors simply do not want to generate 1,500-pound people. One individual questioned, however, whether truncating a Lognormal distribution invalidates calling the distribution Lognormal. Another commented on instances in which truncating the distribution may be problematic. For example, some relevant data may be rejected. Also, the need to truncate suggests that the fit is very poor. The only reason to truncate, in his opinion, is if one is concerned about getting a zero or negative value, or perhaps an extremely high outlier value. One expert noted that truncation clearly has a role, especially when a strong scientific or engineering basis can be demonstrated.

•	When should neither an EDF nor a PDF be used? Neither an EDF nor a PDF may be useful or appropriate when large extrapolations are needed or when the assessor is uncomfortable with extrapolation beyond the available data points. In these cases, scenario analyses may come into play.

In their final discussions on EDF/PDF, the group widely encouraged visual or graphical representation of data. Additional thoughts on visually plotting the data are presented in the following discussions of goodness of fit.
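In practice, bounds of the kind discussed for body weight are often imposed by rejection sampling: draw from the fitted distribution and discard values outside the plausible range. A sketch using hypothetical body-weight parameters (note that the truncated variate is, strictly, no longer Lognormal, since its density is renormalized over the bounds, which is the question raised in the discussion):

```python
import math
import random

random.seed(3)

def truncated_lognormal(mu, sigma, lower, upper):
    """Draw from a Lognormal, rejecting values outside [lower, upper].

    Simple rejection sampling; efficient enough when the bounds cut
    off only a small tail mass, as in this body-weight example."""
    while True:
        x = random.lognormvariate(mu, sigma)
        if lower <= x <= upper:
            return x

# Hypothetical adult body-weight distribution in kg (illustrative
# parameters): median about 70 kg, truncated at plausible bounds.
weights = [truncated_lognormal(math.log(70.0), 0.2, 30.0, 200.0)
           for _ in range(10_000)]
print(f"largest simulated weight: {max(weights):.1f} kg")
```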
6.2 GOODNESS-OF-FIT (GoF)

On what basis should it be decided whether a data set is adequately represented by a fitted parametric distribution?

The final workshop discussions related to the appropriateness of using available GoF test statistics in evaluating how well a data set is represented by a fitted distribution. Experts were asked to consider which options are best suited and how one chooses among multiple tests that may provide different answers. The following highlights the major points of these discussions.

•	Interpreting poor fit. GoF in the middle of the distribution is not as important as GoF in the tails (upper and lower percentiles). Poor fit may be due to outliers at the other end of the distribution; if there are even only a few outliers, GoF tests may provide the wrong answer.

•	Graphical representation of data is key to evaluating goodness or quality of fit. Unanimously, the experts agreed that using probability plots (e.g., EDF, QQ plots, PP plots) or other visual techniques in evaluating goodness of fit is an acceptable and recommended approach. In fact, the group felt that graphical methods should always be used. Generally, it is easier to judge the quality of fit using probability plots that compare data to a straight line. There may be cases in which a fit is rejected by a particular GoF test but appears reasonable when using visual techniques. The group supported the idea that GoF tests should not be the only consideration in fitting a distribution to data; decisions can be made based on visual inspection of the data. It was noted that graphical presentations help to show quirks in the data (e.g., mixture distributions).
It was also recommended that the assessor seek the consensus of a few trained individuals when interpreting data plots (as is done in the medical community when visually inspecting X-rays or CAT scans).

•	What is the significance of failing a weak test such as chi-square? Can we justify using data that fail a GoF test? GoF tests may be sensitive to imperfections in the fit that are not important to the assessor or decision maker. The group therefore agreed that the fitted distribution can be used, especially if the failure of the test is due to some part of the distribution that does not matter to the analysis (e.g., the lower end of the distribution). The reason the test failed, however, must be explained by the assessor. Failing a chi-square test is not problematic if the lower end of the distribution is the reason for the failure. One expert questioned whether the assessor could defend (in court) a failed statistical test. Another expert responded that a graphical presentation might be used to defend use of the data, showing, for example, that the poor fit was a result of data set size, not chance.

•	Considerations for risk assessors when GoF tests are used:
— The evaluation of distributions is an estimation process (e.g., PDFs). Using a systematic testing approach based on the straight-line null hypothesis may be problematic.
— R² is a poor way to assess GoF.
— The appropriate loss function must be identified.
— The significance level must be determined before the data are analyzed; otherwise, it is meaningless. It is a risk management decision, so the risk assessor and risk manager must speak early in the process.
— The risk manager must understand the significance level and its application.

•	Should GoF tests be used for parameter estimation (e.g., an objective function that minimizes the one-tail Anderson-Darling statistic)? A degrees-of-freedom correction is needed before the analysis is run. The basis for the fit must be clearly defined—are the objective and loss functions appropriate?

•	Maximum likelihood estimation (MLE) is a well-established statistical tool and provides a relatively easy path for separating variability from uncertainty.

•	The adequacy of Crystal Ball®'s curve-fitting capabilities was questioned. One of the experts explained that it runs three tests, then ranks them. If the assessor takes this one step further by calculating percentiles and setting up plots, it is an adequate tool.

•	The Office of Water collects large data sets. Some of the office's efforts might provide useful lessons in interpreting data in the context of this workshop.

•	What do we do if only summary statistics are available? Summary statistics are often all that are available for certain data sets. The group agreed that MLE can be used to estimate distribution parameters from summary data. In addition, one expert noted that probability plots are somewhat useful for evaluating percentile data: they enable assessors to evaluate the slope (standard deviation) and the intercept (mean). Confidence intervals cannot be examined, however, and uncertainty cannot be separated from variability.

In summary, the group identified possible weaknesses associated with using statistical GoF tests in the context described above. The experts agreed unanimously that graphical/visual techniques to evaluate how well data fit a given distribution (alone or in combination with GoF techniques) may be more useful than GoF techniques alone.

6.3 SUMMARY OF EDF/PDF AND GoF DISCUSSIONS

The experts agreed, in general, that the choice of an EDF versus a PDF is a matter of personal preference.
The group recommended, therefore, that no rigid guidance be developed requiring one or the other in a particular situation. The decision on which distribution function to use depends on several factors, including the number of data points, the outcome of interest, and how interested the assessor is in the tails of the distribution. Varied opinions were voiced on the use of mixture distributions and the appropriateness of truncating distributions. The use of scenario analysis was suggested as an alternative to probabilistic analysis when a particular input cannot be assigned a probability distribution or when estimating the tails of an important input distribution may be difficult. Regarding GoF, the group fully agreed that visualization/graphic representation of both the data and the fitted distribution is the most appropriate and useful approach for ascertaining adequacy of fit. In general, the group agreed that conventional GoF tests have significant shortcomings and should not be the primary method for determining adequacy of fit.

SECTION SEVEN
OBSERVER COMMENTS

This section presents observers' comments and questions during the workshop, as well as responses from the experts participating in the workshop.

DAY ONE: Tuesday, April 21, 1998

Comment 1
Helen Chernoff, TAMS Consultants

Helen Chernoff said that, with the release of the new policy, users are interested in guidance on how to apply the information on data representativeness and other issues related to probabilistic risk assessment.
She had believed that the workshop would focus more on application, rather than just on the background issues of probabilistic assessments. What methods could be used to adjust data and improve data representativeness (e.g., the difference between past and current data usage)?

Response

The workshop chair noted that the adjustment discussions during the second day of the workshop start to explore available methods. One expert stated that, based on his impression, the workshop was designed to gather input from experts in the field of risk assessment and probabilistic techniques. He noted that EPA's policy on probabilistic analysis emerged only after the 1996 workshop on Monte Carlo analysis. Similarly, EPA will use the information from this workshop to help build future guidance on probabilistic techniques, but EPA will not release specific guidance immediately (there may be an approximate two-year lag).

The chair noted that assessors may want to know when they can or should implement alternate approaches. He pointed out that the representativeness issue is not specific to probabilistic assessment; it applies to all assessments. Since EPA released its May 1997 policy on Monte Carlo analysis, representativeness has been emphasized more, especially in exposure factor and distribution evaluations. He noted, however, that data quality/representativeness is equally important when considering a point estimate, although it may not be as important if the point estimate is based on central tendency instead of an upper percentile, where there may be fewer data. Another expert agreed that the representativeness issue is more important for probabilistic risk assessment than for deterministic risk assessment (especially a point estimate based on central tendency).

Comment 2
Emran Dawoud, Human Health Risk Assessor, Oak Ridge National Laboratory

Mr. Dawoud commented that the representativeness question should reflect whether additional data must be collected.
He noted that the investment (cost/benefit) should be considered. From a risk assessment point of view, one must know how more data will affect the type or cost of remedial activity. In his opinion, if representativeness does not change the type or cost of remedial activity, further data collection is unwarranted.

Mr. Dawoud also commented that the risk model has three components: source, exposure, and dose-response. Has the sensitivity of the exposure component been measured relative to the sensitivity of the other two components? He noted the importance of the sensitivity of the source term, especially if fate and transport are involved.

Mr. Dawoud briefly noted that, in practice, a Lognormal distribution is often fit with only a few samples. Uncertainty of the source term in these cases is not quantified or incorporated into risk predictions; even if the standard deviation is noted, its contribution to the final risk prediction is not considered. Mr. Dawoud noted that the workshop discussions on the distributions around exposure parameters seem to be less important than variation around the source term. Likewise, he noted the uncertainties associated with the dose-response assessment as well (e.g., applying uncertainty factors of 10, 100, etc.).

Response

One participant noted that representativeness involves more than collecting more data. Evaluating representativeness is often about choosing from several data sets. He agreed that additional data are collected depending on how the collection efforts may affect the bottom-line assessment answer. He noted that if an input does not affect the output, then its distribution need not be described. Relative to Mr.
Dawoud's second point, it was noted that source term evaluation is part of exposure assessment. While exposure factors (e.g., soil ingestion and exposure duration) affect the risk assessment, one expert emphasized that the most important driving "factor" is the source term. As for dose-response, the industry is just beginning to explore how to quantify variability and uncertainty. The workshop chair noted that, methodologically, exposure and source terms are not markedly different. The source term has representativeness issues, and there are ways to distinguish between variability and uncertainty in the variability estimate. Lastly, more than one expert agreed that the prediction of risk for noncancer and cancer endpoints (based on the reference dose [RfD] and cancer slope factor [CSF], respectively) is very uncertain. The methods discussed during this workshop cannot be directly applied to RfDs and CSFs, but they could be used on other toxicologic data. More research is needed in this area.

Comment 3
Ed Garvey, TAMS Consultants

Mr. Garvey questioned whether examining factors of 2 or 3 on the exposure side is worthwhile, given the level of uncertainty in the source or dose-response term, which can be orders of magnitude.

Response

It was an EPA policy choice to examine distributions looking first at exposure parameters, according to one EPA panelist. He also reiterated that the evaluation of exposure includes the source term (i.e., exposure = concentration × uptake/averaging time). One person noted that it was time to "step up" on quantifying toxicity uncertainty. Exposure issues have been driven primarily by engineering approaches (e.g., the Gaussian plume model), whereas toxicity has historically been driven by toxicologists and statisticians and is more data oriented.
It was noted that, realistically, probabilistic risk assessments will be seen only when money is available to support the extra effort. Otherwise, 95% UCL concentrations and point estimates will continue to be used. Knowing that probabilistic techniques will enable better evaluations of variability and uncertainty, risk assessors must be explicitly encouraged to perform probabilistic assessments. We must accept that the existing approach to toxicity assessment, while lacking somewhat in scientific integrity, is the only option at present.

Comment 4
Emran Dawoud, Human Health Risk Assessor, Oak Ridge National Laboratory

Mr. Dawoud asked whether uncertainty analysis should be performed to evaluate fate and transport related estimates.

Response

One expert stressed that whenever direct measurements are not available, variability must be assessed. He commented that EPA's Probabilistic Risk Assessment Work Group is preparing two chapters for Risk Assessment Guidance for Superfund (RAGS): one on source term variability and another on time-dependent considerations of the source term.

Comment 5
Zubair Saleem, Office of Solid Waste, U.S. EPA

Mr. Saleem stated that he would like to reinforce certain workshop discussions. He commented that any guidance on probabilistic assessments should not be too rigid. Guidance should clearly state that the methodology is evolving and may be revised. Also, guidance users should be encouraged to collect additional data.

Response

The workshop chair acknowledged Mr. Saleem's comment, but noted that the experts participating in the workshop can only provide input and advice on methods and are not in a position to recommend specific guidelines to EPA.
DAY TWO: Wednesday, April 22, 1998

Comment 1
Lawrence Myers, Research Triangle Institute

Mr. Myers offered a word of caution regarding GoF tests. He agreed that many options do not work well, but stated that in an adversarial situation (e.g., a courtroom) he would rather defend data distributions based on a quantitative model than on a graphical representation. Mr. Myers noted that the problem with goodness of fit is the tightness of the null hypothesis (i.e., it specifies that the true model is exactly a member of the particular class being examined). Mr. Myers cited Hodges and Lehmann (1950s), who generalized chi-square in a way that may be meaningful to the issues discussed in this workshop. Specifically, because exact conformity is not expected, a more appropriate null hypothesis would be that the true distribution is "sufficiently close" to the family being examined.

Response

One expert reiterated that when a PDF is fitted, it is recognizably an approximation, which makes application of standard GoF statistics difficult. Another expressed concern that practitioners could go off on a "fishing expedition," especially in an adversarial situation, to find a GoF test that gives the right answer; he did not feel this is the message we want to be giving practitioners. A third expert noted a definite trend in the scientific community away from GoF tests and towards visualization.

SECTION EIGHT
REFERENCES

Buck, R.J., K.A. Hammerstrom, and P.B. Ryan, 1995.
Estimating Long-Term Exposures from Short-Term Measurements. Journal of Exposure Analysis and Environmental Epidemiology, Vol. 5, No. 3, pp. 359-373.

Efron, B. and R.J. Tibshirani, 1993. An Introduction to the Bootstrap. Chapman and Hall, New York.

Frey, H.C. and D.E. Burmaster, "Methods for Characterizing Variability and Uncertainty: Comparison of Bootstrap Simulation and Likelihood-Based Approaches," Risk Analysis (accepted 1998).

RTI, 1998. Development of Statistical Distributions for Exposure Factors. Final Report. Prepared by Research Triangle Institute. U.S. EPA Contract 68D40091, Work Assignment 97-12. March 18, 1998.

U.S. Environmental Protection Agency, 1996a. Office of Research and Development, National Center for Environmental Assessment. Exposure Factors Handbook, SAB Review Draft (EPA/600/P-95/002Ba).

U.S. Environmental Protection Agency, 1996b. Summary Report for the Workshop on Monte Carlo Analysis. EPA/630/R-96/010. September 1996.

U.S. Environmental Protection Agency, 1997a. Guiding Principles for Monte Carlo Analysis. EPA/630/R-97/001. March 1997.

U.S. Environmental Protection Agency, 1997b. Policy for Use of Probabilistic Analysis in Risk Assessment at the U.S. Environmental Protection Agency. May 15, 1997.

Wallace, L.A., N. Duan, and R. Ziegenfus, 1994. Can Long-Term Exposure Distributions Be Predicted from Short-Term Measurements? Risk Analysis, Vol. 14, No. 1, pp. 75-85.
APPENDIX A
ISSUE PAPERS

Issue Paper on Evaluating Representativeness of Exposure Factors Data

This paper is based on the Technical Memorandum dated March 4, 1998, submitted by Research Triangle Institute under U.S. EPA contract 68D40091.

1. INTRODUCTION

The purpose of this document is to discuss the concept of representativeness as it relates to assessing human exposures to environmental contaminants and to factors that affect exposures and that may be used in a risk assessment. (The factors, referred to as exposure factors, consist of measures like tapwater intake rates or the amount of time that people spend in a given microenvironment.) This is an extremely broad topic, but the intent of this document is to provide a useful starting point for discussing this extremely important concept. Section 2 furnishes some general definitions and notions of representativeness. Section 3 indicates a general framework for making inferences.
Components of representativeness are presented in Section 4, along with some checklists of questions that can help in the evaluation of representativeness in the context of exposures and exposure factors. Section 5 presents some techniques that may be used to improve representativeness. Section 6 provides our summary and conclusions.

2. GENERAL DEFINITIONS/NOTIONS OF REPRESENTATIVENESS

Representativeness is defined in American National Standard: Specifications and Guidelines for Quality Systems for Environmental Data and Environmental Technology Programs (ANSI/ASQC E4-1994) as follows:

The measure of the degree to which data accurately and precisely represent a characteristic of a population, parameter variations at a sampling point, a process condition, or an environmental condition.

Although Kendall and Buckland (A Dictionary of Statistical Terms, 1971) do not define representativeness, they do indicate that the term "representative sample" involves some confusion about whether this term refers to a sample "selected by some process which gives all samples an equal chance of appearing to represent the population" or to a sample that is "typical in respect of certain characteristics, however chosen." Kruskal and Mosteller (1979) point out that representativeness does not have an unambiguous definition; in a series of three papers, they present and discuss various notions of representativeness in the scientific, statistical, and other literature, with the intent of clarifying the technical meaning of the term.
In Chapter 1 of the Exposure Factors Handbook (EFH), the considerations for including the particular source studies are enumerated, and then these considerations are evaluated qualitatively at the end of each chapter (i.e., for each type of exposure factor data). One of the criteria is "representativeness of the population," although several other criteria clearly relate to various aspects of representativeness. These related criteria, with the EFH perspective on each, include the following:

•	focus on factor of interest: studies with this specific focus are preferred
•	data pertinent to U.S.: studies of U.S. residents are preferred
•	current information: recent studies are preferred, especially if changes over time are expected
•	adequacy of data collection period: generally the goal is to characterize long-term behavior
•	validity of approach: direct measurements are preferred
•	representativeness of the population: U.S. national studies are preferred
•	variability in the population: studies with adequate characterizations of variability are desirable
•	minimal (or defined) bias in study design: studies having designs with minimal bias are preferred (or with known direction of bias)
•	minimal (or defined) uncertainty in the data: large studies with high ratings on the above considerations are preferred

3.
A GENERAL FRAMEWORK FOR MAKING INFERENCES

Despite the lack of a specific definition of representativeness, it is clear in the present context that representativeness relates to the "comfort" with which one can draw inferences from some set(s) of extant data to the population of interest for which the assessment is to be conducted, and in particular, to certain characteristics of that population's exposure or exposure factor distribution. The following subsections provide some definitions of terms and attempt to break down the overall inference into some meaningful steps.

3.1 Inferences from a Sample to a Population

In this paper, the word population refers to a set of units which may be defined in terms of person and/or space and/or time characteristics. The population can thus be defined in terms of its individuals' characteristics (defined by demographic and socioeconomic factors, human behavior, and study design) (e.g., all persons aged 16 and over), its spatial characteristics (e.g., living in Chicago), and/or its temporal characteristics (e.g., during 1997). In conducting a risk assessment, the assessor needs to define the population of concern, that is, the set of units for which risks are to be assessed (e.g., lifetime risks of all U.S. residents). At a Superfund site, this population of concern is generally the population surrounding the site. In this document, the term population of concern refers to the population for which the assessor wishes to draw inferences. If it were practical, this is the population for which a census (a 100% sample) would exist or for which the assessor would conduct a probability-based study of exposures.
Figure 1 provides a diagram of the exposure assessor's decision process during the selection of data for an exposure assessment. As depicted in Figure 1, quite often it is not practical or feasible to obtain data on the population of concern, and the assessor has to rely on the use of surrogate data. These data generally come from studies conducted by researchers for a variety of purposes. Therefore, the assessor's population of concern may differ from the surrogate population. Note that the population differences may be in any one (or more) of the characteristics described earlier. For example, the surrogate population may cover only a subset of the individuals in the assessor's population of concern (Chicago residents rather than U.S. residents). Similarly, the surrogate data may have been collected during a short period of time (e.g., days), while the assessor may be concerned about chronic exposures (i.e., temporal characteristics).

The studies used to derive these surrogate data are generally designed with a population in mind. Since it may not be practical to sample everyone in that population, probability-based sampling is often conducted. This sampling scheme allows valid statistical (i.e., non-model-based) inferences, assuming there were no implementation difficulties (e.g., no nonresponse and valid measurements). Ideally, the implementation difficulties would not be severe (and hence could be ignored), so that the sampled individuals can be considered representative of the population. If there are implementation difficulties, adjustments are typically made (e.g., for nonresponse) to compensate for the population differences. Such remedies for overcoming inferential gaps are fairly well documented in the literature in the context of probability-based survey sampling (e.g., see Oh and Scheuren (1983)).
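A common form of such an adjustment is the weighting-class nonresponse adjustment: each respondent's design weight is inflated by the inverse of the response rate in its class, so respondents stand in for nonrespondents in the same class. A minimal sketch with hypothetical classes, counts, and weights (not taken from any study cited here):

```python
# Weighting-class nonresponse adjustment (a sketch of the kind of
# remedy discussed in the survey-sampling literature; classes, counts,
# and weights are all hypothetical).
selected = {"urban": 400, "rural": 100}          # units drawn per class
responded = {"urban": 320, "rural": 50}          # units actually measured
base_weight = {"urban": 250.0, "rural": 1000.0}  # design weights

# Inflate each class's design weight by the inverse response rate so
# respondents also stand in for that class's nonrespondents.
adjusted = {cls: base_weight[cls] * selected[cls] / responded[cls]
            for cls in selected}

# The weighted respondent total reproduces each class's population size.
for cls in selected:
    assert responded[cls] * adjusted[cls] == selected[cls] * base_weight[cls]
print(adjusted)
```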
If probability sampling is not employed, the relationships to the population the study was designed to address, both of the individuals selected for measurement and of the respondents for which data are actually acquired, are unclear.

There are cases where probability-based sampling is used and the study design allows some model-based inferences. For instance, food consumption data are often obtained using surveys which ask respondents to recall food eaten over a period of a few days. These data are usually collected throughout a one-year period to account for some seasonal variation in food consumption. Statistical inferences can then be made for the individuals surveyed within the time frame of the study. For example, one can estimate the mean, the 90th percentile, etc., for the number of days during which individuals were surveyed. However, if at least some of the selected individuals are surveyed over multiple periods of time during that year, then a model-based strategy might allow estimation of a distribution of long-term (e.g., annual) consumption patterns.

[Figure 1: Risk Assessment Data Collection Process. A flowchart leading from "Risk Assessment Data Needs" through "Are there site/chemical specific data?", "Are the data representative and adequate? (Use checklist I)", "Are the surrogate data representative and adequate? (Use checklists II, III, IV)", and "Can adjustments be made to extrapolate to site/chemical of interest?"]
If probability-based sampling is not used, model-based rather than statistical inferences are needed to extend the sample results to the population for which the study was designed.

In contrast to the inferences described above, which emanate from population differences and the sampling designs used in the study, there are two additional inferential aspects that relate to representativeness:

• The degree to which the study design is followed during its implementation
• The degree to which a measured value represents the true value for the measured unit

Both of these are components of measurement error. The first relates to an implementation error in which the unit selected for measurement is not precisely the one for which the measurement actually is made. For instance, the study's sampling design may call for people to record data for 24-hr periods starting at a given time of day, but there may be some departure from this ideal in the actual implementation. The second has to do with inaccuracy in the measurement itself, such as recall difficulties for activities or imprecision in a personal air monitoring device.

4. COMPONENTS OF REPRESENTATIVENESS

As described above, the evaluation of how representative a data set is begins with a clear definition of the population of concern (the population of interest for the given assessment), with attention to all three fundamental characteristics of the population: individual, spatial, and temporal characteristics. Potential inferential gaps between the data set and the population of concern (that is, potential sources of unrepresentativeness) can then be partitioned along these population characteristics. Components of representativeness are illustrated in Table 1: the rows correspond to the inferential steps and the columns correspond to the population characteristics.
The inferential steps are distinguished as being either internal or external to the source study:

4.1 Internal Components - Surrogate Data Versus the Study Population

After determining that a study provides information on the exposures or exposure factors of interest, it is important that the exposure assessor evaluate the representativeness of the surrogate study (or studies). This entails gaining an understanding of both the individuals sampled for the study and the degree to which the study achieved valid inferences to that population. The assessor should consider the questions in Checklist I in the appendix to help establish the degree of representativeness inherent to this internal component. In the context of the Exposure Factors Handbook (EFH), the representativeness issues listed in this checklist are presumably the types of considerations that led to selection of the source studies that appear in the EFH. As indicated previously, the focus for addressing representativeness in that context was national and long-term, which may or may not be consistent with the assessment of current interest.

Table 1. Elements of Representativeness

EXTERNAL TO STUDY: How well does the surrogate population represent the population of concern?

Individual characteristics:
• Exclusion or limited coverage of certain segments of the population of concern

Spatial characteristics:
• Exclusion or inadequate coverage of certain regions or types of areas (e.g., rural areas) that make up the population of concern

Temporal characteristics:
• Lack of currency
• Limited temporal coverage, including exclusion or inadequate coverage of seasons
• Inappropriate duration for observations (e.g., short-term measurements where concern is on chronic exposures)

INTERNAL TO STUDY: How well do the individuals sampled represent the population of concern for the study? How well do the actual respondents represent the sampled population? How well does the measured value represent the true value for the measured unit?

Individual characteristics:
• Imposed constraints that exclude certain segments of the study population
• Frame inadequacy (e.g., due to lack of current frame information)
• Non-probability sample of persons
• Excessive nonresponse
• Inadequate sample size
• Behavior changes resulting from participation in the study (Hawthorne effect)
• Measurement errors associated with people's ability or desire to respond accurately to questionnaire items
• Measurement error associated with within-specimen heterogeneity
• Inability to acquire a physical specimen with the exact size, shape, or volume desired

Spatial characteristics:
• Inadequate coverage (e.g., limited to a single urban area)
• Non-probability sample of spatial units (e.g., convenience or judgmental siting of ambient monitors)
• Inaccurate identification of sampled locations

Temporal characteristics:
• Limited temporal coverage (e.g., limited study duration)
• Inappropriate duration for observations
• Non-probability sample of observation times
• Deviation in times selected vs. those measured or reported (e.g., due to schedule slippage or incomplete response)
• Measurement errors related to time (e.g., recall difficulties for foods consumed or times in microenvironments)

4.2 External Components - Population of Concern Versus Surrogate Population

In many cases, the assessor will be faced with a situation in which the population of concern and surrogate population do not coincide in one or more aspects.
To address this external component of representativeness, the assessor needs to:

• determine the relationship between the two populations
• judge the importance of any discrepancies between the two populations
• assess whether adjustments can be made to reconcile or reduce differences.

To address these, the assessor needs to consider all characteristics of the populations. Relevant questions to consider are listed in Checklists II, III, and IV in the appendix for the individual, spatial, and temporal characteristics, respectively. Each checklist contains several questions related to each of the above bullets. For example, the first few items of each checklist relate to the first item above (the relationship of the two populations). There are several possible ways in which the two populations may relate to each other; these cases are listed below and can be addressed for each population dimension:

• Case 1: The population of concern and surrogate population are (essentially) the same.
• Case 2: The population of concern is a subset of the surrogate population.
  Case 2a: The subset is a large and identifiable subset.
  Case 2b: The subset is a small and/or unidentifiable subset.
• Case 3: The surrogate population is a subset of the population of concern.
• Case 4: The population of concern and surrogate population are disjoint.

Note that Case 2a implies that adequate data are available from the surrogate study to generate separate summary statistics (e.g., means, percentiles) for the population of concern. For example, if the population of concern were children and the surrogate population were a census or large probability study of all U.S. residents, then child-specific summaries would be possible. In such a situation, Case 2a reverts back to Case 1. Case 2b will be typical of situations in which large-scale (e.g., national or regional) data are available but assessments are needed for local areas (or for acute exposures).
As an example, suppose raw data from the National Food Consumption Survey (NFCS) can be used to form meaningful demographic subgroups and to estimate average tapwater consumption for such subgroups (e.g., see Section 5.1). If a risk assessment involving exposure from copper smelters is to be conducted for the southwestern U.S., for instance, tapwater consumption would probably be considered to be different for that area than for the U.S. as a whole, but the NFCS data for that area might be adequate. If so, this would be considered Case 2a. But if the risk assessment concerned workers at copper smelters, then an even greater discrepancy between the population of concern and the surrogate data might be expected, the NFCS data would likely be regarded as inadequate, and more speculative estimates would be needed. In contrast to Case 2, Case 3 will be typical of assessments that must use local and/or short-term data to extrapolate to regional or national scales and/or to long-term (chronic) exposures. Table 2 presents some hypothetical examples for each case. Note that, as illustrated here and as implied by the bulleted items in Checklist IV, the temporal characteristic has two sets of issues: one that relates to the currency and the temporal coverage (study duration) of the source study relative to the population of concern's time frame, and one that relates to the time unit of observation associated with the study.
Since most published references to the NFCS rely on the 1977-78 survey, exposure factor data based on that survey might well be considered as Case 4 with respect to temporal coverage, as trends such as consumption of bottled water and organic foods may not be well represented by 20-year-old data. A possible approach in this situation would be to obtain data from several NFCSs, to compare or test for differences between them, and to use them to extrapolate to the present or future. The NFCS also illustrates the other temporal aspect, dealing with a time-unit mismatch between the data and the population of concern, since the survey involves three consecutive days for each person, while typically a longer-term estimate would be desired, e.g., a person-year estimate (e.g., see Section 5.2).

While determining the relationship of the two populations will generally be straightforward (first bullet), determining the importance of discrepancies and making adjustments (the second and third bullets) may be highly subjective and require an understanding of what factors contribute to heterogeneity in exposure factor values, as well as speculation as to their influence on the exposure factor distribution. Cases 1 and 2a are the easiest, of course. In the other cases, it will generally be easier to speculate about how the mean and variability (perhaps expressed as a coefficient of variation (CV)) of the two populations may differ than to speculate on changes in a given percentile. Considerations of unaffected portions of the population must also be factored into the risk assessor's speculation. The difficulty in such speculation obviously increases dramatically when two or more factors affect heterogeneity, especially if the factors are anticipated to have opposite or dependent effects on the exposure factor values. Regardless of how such speculation is ultimately reflected in the assessment (either through ignoring the population differences or by adjusting estimated parameters of the study population), recognition of the increased uncertainty should be incorporated into sensitivity analyses. As part of such an analysis, it would be instructive to determine risks when, for each relevant factor (e.g., age category), several assessors independently speculate on the mean (e.g., a low, best guess, and high) and on the CV.

[Table 2. Hypothetical examples for each case, pairing a population of concern with a surrogate population along the individual, spatial, and temporal characteristics. Legible fragments of the table include examples such as asthmatic U.S. children vs. all U.S. residents; residents near hazardous waste sites vs. Northeast U.S. residents; summer 1998 vs. one year, 1998; and per-eating-occasion (acute) vs. longer-term time units. The full table layout did not survive scanning.]

5. ATTEMPTING TO IMPROVE REPRESENTATIVENESS

5.1 Adjustments to Account for Differences in Population Characteristics or Coverage.

If there is some overlap in information available for the population of concern and the surrogate population (e.g., age distributions), then adjustments to the sample data can be made that attempt to reduce the bias that would result from directly applying the study results to the population of concern. Such methods of adjustment can all be generally characterized as "direct standardization" techniques, but the specific methodology to use depends on whether one has access to the raw data or only to summary statistics, as is often the case when using data from the Exposure Factors Handbook. With access to the raw data, the applicable techniques also depend on whether one wants to standardize to a single known population-of-concern distribution (e.g., age categories), to two or more marginal distributions known for the population of concern, or even to population-of-concern totals for continuous variables.

Summary Statistics Available. Suppose that the available data are summary statistics such as the mean, standard deviation, and various percentiles for an exposure factor of interest (e.g., daily consumption of tap water). Furthermore, suppose that these statistics are available for subgroups based on age, say age groups g = 1, 2, ..., G. Furthermore, suppose we know that the age distribution of the population of concern differs from that represented by the sample data. We can then estimate linear characteristics of the population of concern, such as the mean or the proportion exceeding a fixed threshold, using a simple weighted average. For example, the mean of the population of concern can be estimated as

    x̄_ATP = Σ_g P_g x̄_g,

where Σ_g represents summation over the population of concern groups indexed by g, P_g is the proportion of the population of concern that belongs to group g, and x̄_g is the sample mean for group g.

Unfortunately, if one is interested in estimating a non-linear statistic for the population of concern, such as the variance or a percentile, this technique is not algebraically correct. However, lacking any other information from the sample, calculating this type of weighted average to estimate a non-linear population-of-concern characteristic is better than making no adjustment at all for known population differences. In the case of the population variance, we recommend calculating the weighted average of the group standard deviations, rather than their variances, and then squaring the estimated population-of-concern standard deviation to get the estimated population-of-concern variance.
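As a minimal sketch of this summary-statistics adjustment, the weighted mean and the recommended SD-then-square variance estimate might look as follows. The subgroup numbers here are hypothetical illustrations, not values from the NFCS or the EFH:

```python
# Direct standardization from published subgroup summary statistics.
# Hypothetical age-group summaries for daily tap water use (mL/day).
surrogate = {
    "child":  {"mean": 600.0,  "sd": 250.0},
    "adult":  {"mean": 1200.0, "sd": 450.0},
    "senior": {"mean": 1400.0, "sd": 500.0},
}

# Known (assumed) proportions of the population of concern in each group.
concern_props = {"child": 0.30, "adult": 0.55, "senior": 0.15}

def standardized_mean(stats, props):
    """Weighted average of subgroup means: x_ATP = sum_g P_g * x_g."""
    return sum(props[g] * stats[g]["mean"] for g in props)

def standardized_variance(stats, props):
    """Weighted average of subgroup SDs, then squared, per the text's
    recommendation (averaging variances is not algebraically correct)."""
    sd = sum(props[g] * stats[g]["sd"] for g in props)
    return sd ** 2

mean = standardized_mean(surrogate, concern_props)
var = standardized_variance(surrogate, concern_props)
```

With these illustrative numbers the standardized mean is 0.30(600) + 0.55(1200) + 0.15(1400) = 1050 mL/day.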

Raw Data Available. If one has access to the raw data, not just summary statistics, the options for standardization are more numerous and can be applied more rigorously. The options depend, in part, on whether or not the data already have statistical analysis weights, such as those appropriate for analysis of data from a probability-based sample survey.


Suppose that one has access to the raw data from a census or from a sample in which all units can be regarded as having been selected with equal probabilities (e.g., a simple random sample). In this case, if one knows the number, N_g, of population of concern members in group g, then the statistical analysis weight to associate with the i-th member of the g-th group is

    W_g(i) = N_g / n_g,

where the sample contains n_g members of group g. Alternatively, if one knows only the proportion of the population and sample that belong to each group, one can calculate the weights as

    W_g(i) = P_g / p_g,

where p_g is the proportion of the sample in group g. The latter weights differ from those above only by a constant, the reciprocal of the sampling fraction, and will produce equivalent results for means and proportions. However, the former weights must be used to estimate population totals. In either case, the population of concern mean can be estimated as

    x̄_ATP = Σ_g Σ_i W_g(i) x_g(i) / Σ_g Σ_i W_g(i),

where x_g(i) is the value of the characteristic of interest (e.g., daily tap water consumption) for the i-th sample member in group g.
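A sketch of these weight calculations with made-up sample values and group sizes (the data are purely illustrative):

```python
# Analysis weights for an equal-probability sample: W_g(i) = N_g / n_g.
# Hypothetical raw sample values x_g(i): daily tap water use (mL/day).
sample = {
    "child": [500.0, 700.0],            # n_child = 2
    "adult": [1000.0, 1200.0, 1400.0],  # n_adult = 3
}
N = {"child": 3000, "adult": 7000}      # assumed group sizes in the population of concern

def weighted_mean(sample, N):
    """Weighted population-of-concern mean: sum of W*x over sum of W."""
    num = 0.0
    den = 0.0
    for g, values in sample.items():
        w = N[g] / len(values)          # same weight for every member of group g
        for x in values:
            num += w * x
            den += w                    # den accumulates to the population size
    return num / den

pop_mean = weighted_mean(sample, N)
```

Here the group means are 600 and 1200, so the standardized mean is (3000(600) + 7000(1200)) / 10000 = 1020 mL/day; using proportions P_g / p_g instead of N_g / n_g would give the same result.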

In general, one may have access to weighted survey data, such as results from a probability-based sample of the surrogate population. In this case, the survey analysis weight, w(i), for the i-th sample member is the reciprocal of that person's probability of selection, with appropriate adjustments to reduce nonresponse bias and other potential sources of bias with respect to the surrogate population. Further adjustments for making inferences to the population of concern are considered below. These results can also be applied to the case of equally weighted survey data, considered above, by considering the survey analysis weight, w(i), to be unity (1.00) for each sample member.

If one knows the distribution of the population of concern with respect to a given characteristic (e.g., the age/race/gender distribution), then one can use the statistical technique of poststratification to adjust the survey data to provide estimates adjusted to that same population distribution (see, e.g., Holt and Smith, 1979).[1] In this case, the weight adjustment factor for each member of poststratum g is calculated as

    f_g = N_g / Σ_i w(i),

[1] Sampling variances are computed differently for standardized and poststratified estimates, but these details are suppressed in the present discussion (see, e.g., Shah et al., 1993).

A-ll
image:

countPages += 1
if (countPages == 2) {
var el = document.getElementById("rankTop")
if (el)
if (-1 == -1)
el.innerText = "All"
var el = document.getElementById("rankBot")
if (el)
if (-1 == -1)
el.innerText = "All"
}

where the summation is over all sample members belonging to poststratum g. The poststratified analysis weight for the i-th sample member belonging to poststratum g is then calculated as

    w'(i) = f_g w(i),

where f_g is the weight adjustment factor above. Using this weight, instead of the surrogate population weight, w(i), standardizes the survey estimates to the population of concern.
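The poststratification adjustment can be sketched as follows; the survey weights and poststratum totals below are hypothetical:

```python
# Poststratification: rescale existing survey weights w(i) so that each
# poststratum's weight total matches the known population count N_g.
weights = {"urban": [10.0, 30.0], "rural": [20.0, 20.0]}  # w(i) by poststratum
N = {"urban": 80.0, "rural": 120.0}  # assumed known poststratum totals

def poststratify(weights, N):
    adjusted = {}
    for g, ws in weights.items():
        f_g = N[g] / sum(ws)                  # adjustment factor for stratum g
        adjusted[g] = [f_g * w for w in ws]   # w'(i) = f_g * w(i)
    return adjusted

adj = poststratify(weights, N)
# After adjustment, each stratum's weights sum exactly to N_g.
```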

If one knows multiple marginal distributions for the population of concern but not their joint distribution (e.g., marginal age, race, and gender distributions), one can apply a statistical weight adjustment procedure known as raking, or iterative proportional fitting, to standardize the survey weights (see, e.g., Oh and Scheuren, 1983). Raking is an iterative procedure for scaling the survey weights to known marginal totals.
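A minimal sketch of raking on two margins follows. The members, starting weights, and marginal totals are invented for illustration; production survey software would add convergence checks and variance estimation:

```python
# Raking (iterative proportional fitting): alternately rescale weights so
# that each margin (here, age and sex) matches its known total.
members = [
    {"age": "young", "sex": "F", "w": 1.0},
    {"age": "young", "sex": "M", "w": 1.0},
    {"age": "old",   "sex": "F", "w": 1.0},
    {"age": "old",   "sex": "M", "w": 1.0},
]
age_totals = {"young": 60.0, "old": 40.0}   # assumed known margins
sex_totals = {"F": 55.0, "M": 45.0}

def rake(members, age_totals, sex_totals, iters=50):
    for _ in range(iters):
        for key, totals in (("age", age_totals), ("sex", sex_totals)):
            sums = {}                              # current weight total per category
            for m in members:
                sums[m[key]] = sums.get(m[key], 0.0) + m["w"]
            for m in members:                      # scale to the known margin
                m["w"] *= totals[m[key]] / sums[m[key]]
    return members

rake(members, age_totals, sex_totals)
```

After raking, the weight totals reproduce both sets of marginal totals even though the joint age-by-sex distribution was never supplied.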
If one knows population of concern subgroup totals for continuous variables, a generalized raking procedure can be used to standardize the survey weights to known distributions of categorical variables as well as known totals for continuous variables. The generalized raking procedures utilize non-linear, exponential modeling (see, e.g., Folsom, 1991).

Of course, none of these standardization procedures results in inferences to the population of concern that are as defensible as those from a well-designed sample survey selected from a sampling frame that completely and adequately covers the population of concern.

5.2 Adjustments to Account for Time-Unit Differences.

A common way in which the surrogate population and population of concern may differ is in the time unit of (desired) observation. Probably the most common situation occurs when the study data represent short-term measurements but chronic exposures are of interest. In this case, some type of model is needed to make the time-unit inference (e.g., from the distribution of person-day or person-week exposures to the distribution of annual or lifetime exposures). In general, it is convenient to break down the overall inference into two components: from the time unit of measurement to the time duration of the study (data to the surrogate population), and from the time duration of the surrogate population to the time unit of the population of concern. For specificity, let t denote the observation time (e.g., a day or a week); let τ denote the duration of the study (i.e., τ is the time duration associated with the surrogate population); and let T denote the time unit of the population of concern (e.g., a lifetime). In the case of chronic exposure concerns, t < τ < T.

Suppose that N denotes the number of persons in the surrogate population, and assume there are (conceptually) K disjoint time intervals of length t that span the surrogate population's time frame τ (i.e.,
Kt = τ). Thus a census of the surrogate population would involve NK short-term measurements (of exposures or of exposure factors). This can be viewed as a two-way array with N rows (persons) and K columns (time periods). Clearly, the distribution of these NK measurements, whose mean is the grand total over the NK cells divided by NK, encompasses both variability among people and variability among time periods within people (and, in practice, measurement error also). The average across the columns for a given row (the marginal mean) is the average exposure for the given person over a period of length τ. Since the mean of these τ-period "measurements" over the N rows leads to the same mean as before, it is clear that the mean of the t-period measurements and the mean of the τ-period measurements are the same. However, unless there is no within-person variability, the variability of the longer τ-period measurements will be smaller than the variability of the shorter t-period measurements. If the distribution of the shorter-term measurements is right-skewed, as is common, then one would expect the longer-term distribution to exhibit less skewness. Note that the degree to which the variability shrinks depends on the relation between the within-person and between-person components of variance, which is related to the temporal correlation. For example, if there is little within-person variability, then people with high (low) values will remain high (low) over time, implying that the autocorrelation is high and that the shrinkage in variability in going from days to years (say) will be minimal. If there is substantial within-person variation, then the autocorrelations will be low and substantial shrinkage in the within-person variance (on the order of a t/τ decrease) will occur.
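The shrinkage argument above can be illustrated with a small simulation of the N-by-K person-by-period array. The variance components are arbitrary choices for illustration:

```python
# Averaging K short-term periods per person shrinks the within-person
# variance by roughly 1/K, while the between-person variance is untouched.
import random

random.seed(1)
N, K = 2000, 12
sigma_b, sigma_w = 1.0, 2.0   # assumed between- and within-person SDs

person_means = [random.gauss(0.0, sigma_b) for _ in range(N)]
short_term = [[mu + random.gauss(0.0, sigma_w) for _ in range(K)]
              for mu in person_means]
long_term = [sum(row) / K for row in short_term]   # row (marginal) means

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

flat = [x for row in short_term for x in row]
# Var(short-term) is about sigma_b^2 + sigma_w^2 = 5.0, while
# Var(K-period average) is about sigma_b^2 + sigma_w^2 / K = 1.33;
# the two grand means are identical.
```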

To make this t-to-τ portion of the inference, we would therefore ideally have a valid probability-based sample of the NK person-periods, with data on the t-period exposures or exposure factors available for each of these sampling units. As a part of this study design, we would also want to ensure that at least some of the persons have measurements for more than one time period, since models that allow the time extrapolation will need data that, in essence, will support the estimation of within-person components of variability. There are several examples of models of this sort, some of which are described below.

Wallace et al. (1994) describe a model, which we refer to as the Duan-Wallace (DW) model, in which data over periods of length t, 2t, 3t, etc. (i.e., over any averaging period of length mt) are all conceptually regarded as approximately lognormal, with parameters that depend on a "lifetime" variance component and a short-term variance component. While such an assumption is theoretically inconsistent if exact lognormality is required, it may nevertheless serve well as an approximation. The basic notion of the DW method is that, while the mean of the exposures stays constant, the variability decreases as the number of periods averaged together increases. Hence it is assumed that the total variability for a distribution that averages over M periods (M = 1, 2, ...) can be expressed in terms of a long-term component and a short-term component. Let V_L and V_S denote, respectively, the log-scale variances for these two components. Under the lognormal model, Wallace et al. show that the log-scale variance for the M-period distribution (i.e., the distribution that averages over M periods) is given by

    V_M = log[1 + (e^{V_L} − 1) + (e^{V_S} − 1)/M].

Note that an implication of the DW model is that the geometric means for the various distributions will increase as M increases. In fact, the geometric mean (gm) associated with the average of M short-term measurements will be

    gm(M) = x̄ exp[−V_M/2],

where x̄ is the overall population mean of the exposures. As a consequence, if data are adequate for estimating the variance components (and the mean of the exposures), then an estimated distribution for any averaging time can be inferred. In particular, the DW method can be applied if data are available for estimating V_M for (at least) two values of M, since one is then able to determine values of the two variance components. For instance, if two observations per person are available, one can estimate the population mean and the population log-scale variance (V_1) for single measurements (M = 1), and, by averaging the two short-term measurements and then taking logs, one can estimate the population log-scale variance, V_2. (Sampling weights should be used when applicable.) By substituting into the above V_M equation for M = 1 and M = 2, the following formulas for estimating the variance components can be determined:

    e^{V_S} − 1 = 2(e^{V_1} − e^{V_2})   and   e^{V_L} − 1 = 2e^{V_2} − e^{V_1} − 1.

The distribution for any averaging time can then be estimated by choosing the appropriate M (e.g., M = 365 if the measurement time is one day) and substituting estimates into the V_M equation above. Similarly, a "lifetime" distribution (also assumed to be lognormal) is then estimated by letting M go to infinity (i.e., the influence of the short-term component vanishes). Wallace et al. (1994) caution that the data collection period should encompass all major long-term trends, such as seasonality.

Clayton et al. (1998) describe a study of personal exposures to airborne contaminants that employs a more sophisticated study design and model (that requires more data); the goal was to
employs a more sophisticated study design and model (that requires more data); the goal was to
estimate (Jjsjributipns of annual exposures from 3-4ay exposure measurements collected
thjffpghQut a 12-month period!  Two measurements per person (in different months) were
gygilable for some of the study participants. A multivariate lognormal distribution was assumed;
t|ie Ipgnormai parameters for each month's data were estimated, along with the correlations for
egcJi rjia.nthly lag (assumed to depend only on the length of the lag). Simulated data were
generated from tins multivariate distribution for a large number of "people;" each "person's"
exposures Were then averaged over the 12 months.  This approach assumes that the an average
over 12 observations, one per month, produces an adequate approximation to the annual
(Jistribution of exposures, the model results were compared to those obtained via a modification
6f the DW modef?
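Assuming the V_M and variance-component formulas as reconstructed above, the DW calculations might be sketched as follows. V_1 and V_2 below are hypothetical estimates, and the round trip through M = 1 and M = 2 simply checks the algebra:

```python
# Sketch of Duan-Wallace variance-component estimation and extrapolation.
import math

def dw_components(V1, V2):
    """Solve the M=1 and M=2 cases of the V_M equation for V_L and V_S."""
    VS = math.log(1.0 + 2.0 * (math.exp(V1) - math.exp(V2)))
    VL = math.log(2.0 * math.exp(V2) - math.exp(V1))
    return VL, VS

def dw_VM(VL, VS, M):
    """Log-scale variance of the M-period average under the DW model."""
    return math.log(1.0 + (math.exp(VL) - 1.0) + (math.exp(VS) - 1.0) / M)

def dw_gm(mean, VM):
    """Geometric mean of the M-period distribution: gm = mean*exp(-V_M/2)."""
    return mean * math.exp(-VM / 2.0)

# Hypothetical log-scale variances estimated from two observations/person:
V1, V2 = 0.9, 0.6
VL, VS = dw_components(V1, V2)
V365 = dw_VM(VL, VS, 365.0)   # approximate annual distribution for daily data
```

As M grows, V_M shrinks toward log[e^{V_L}] = V_L and the geometric mean rises toward the arithmetic-mean-consistent limit, matching the behavior described in the text.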

Buck et al. (1995, 1997) describe some general models (e.g., lognormality is not assumed); these, too, require multiple observations per person, and if the within-person variance is presumed to vary by person, then a fairly large number of observations per person may be needed. These papers give some insight into how estimated distributional parameters based on the short-term data relate to the long-term parameters. Reports by Carriquiry et al. (1995, 1996), Carriquiry (1996), and a paper by Nusser et al. (1996) deal with some of the same issues in the context of estimating distributions of "usual" food intake and nutrition from short-term dietary data.

The second part of the inference, extrapolation from the study time period (of duration τ) to the longer time T, is likely to be much less defensible than the first part if τ and T are very different. This part of the inference is really an issue of temporal coverage. If the study involves person-day measurements conducted over a two-month period in the summer, and annual or lifetime inferences are desired, then little can be said regarding the relative variability or mean levels of the short-term and T-term data, basically because of uncertainty regarding the stationarity of the exposure factor over seasons and years. The above-described approach of Wallace et al., for instance, includes statements that recognize the need for a population stationarity assumption that essentially requires that the processes underlying the exposure factor data that occur outside the time period of the surrogate population be like those that occur within the surrogate population. Applying some of the above methods on an age-cohort-specific basis, and then combining the results over cohorts, offers one possible way of improving the inference (e.g., see Hartwell et al., 1992).

6. SUMMARY AND CONCLUSIONS

Representativeness is concerned with the degree to which "good" inferences can be made from a set of exposure factor data to the population of concern. Thus, evaluating the representativeness of exposure factor data involves achieving an understanding of the source study, making an appraisal of the appropriateness of its internal inferences, assessing how and how much the surrogate population and population of concern differ, and evaluating the importance of the differences. Clearly, this can be an extremely difficult and subjective task. It is, however, very important, and sensitivity analyses should be included in the risk assessment that reflect the uncertainties of the process.

In an attempt to ensure that all aspects of representativeness are considered by analysts,
we have partitioned the overall inferential process into components, some of which are
concerned with design and measurement features of the source study that affect the internal
inferences, and some of which are concerned with the differences between the surrogate
population and the population of concern, which affect the external portion of the inference. We
also partition the inferential process along the lines of the population characteristics —
individual, spatial, and temporal — in an attempt to assess where overlaps and gaps exist
between the data and the population of concern. In the individual and spatial characteristics,
representativeness involves consideration of bounds and coverage issues.  In the temporal
characteristic, these same issues (i.e., study duration and currency) are important, but the time
unit associated with the measurements or observations is also important, since time unit
differences often occur between the data and the population of concern. Checklists are provided
to aid in assessing the various components of representativeness.

When some aspect of representativeness is lacking in the available data, assessors are
faced with the task of trying to make the data "more representative." We describe several
techniques (and cite some others) for accomplishing these types of tasks; generally, making such
adjustments for known differences will reduce bias. However, it should be emphasized that these
adjustment techniques cannot guarantee representativeness in the resultant statistics.  For
supporting future, large-scale (e.g., regional or national) risk assessments, one of the best avenues
for improving the exposure factors data would be to get assessors involved in the design process,
so that appropriate modifications to the survey designs of future source studies can be
considered. For example, the design might be altered to provide better coverage of certain
segments of the population that may be the focus of risk assessments (e.g., more data on children
could be sought). The use of multiple observations per person also could lead to improvement in
those assessments concerned with chronic exposures.

7. BIBLIOGRAPHY

American Society for Quality Control (1994). American National Standard: Specifications and
Guidelines for Quality Systems for Environmental Data and Environmental Technology
Programs (ANSI/ASQC E4). Milwaukee, WI.

Barton, M., A. Clayton, K. Johnson, and R. Whitmore (1996). "G-5 Representativeness." Research
Triangle Institute Report (Project 91U-6342-116), prepared for U.S. EPA under Contract No.
68D40091.

Buck, R.J., K.A. Hammerstrom, and P.B. Ryan (1995). "Estimating Long-Term Exposures from
Short-Term Measurements." Journal of Exposure Analysis and Environmental Epidemiology.
Burmaster, D.E. and A.M. Wilson (1996). "An Introduction to Second-Order Random Variables
in Human Health Risk Assessments." Human and Ecological Risk Assessment, Vol. 2, No. 4, pp.
892-919.
Carriquiry, A.L. (1996). "Assessing the Adequacy of Diets: A Brief Commentary" (Report
prepared under Cooperative Agreement No. 58-3198-2-006, Agricultural Research Service,
USDA, and Iowa State University).

Carriquiry, A.L., J.J. Goyeneche, and W.A. Fuller (1996). "Estimation of Bivariate Usual Intake
Distributions" (Report prepared under Cooperative Agreement No. 58-3198-2-006, Agricultural
Research Service, USDA, and Iowa State University).


Carriquiry, A.L., W.A. Fuller, J.J. Goyeneche, and H.H. Jensen (1995). "Estimated Correlations
Among Days for the Combined 1989-91 CSFII" (Dietary Assessment Research Series Report 4
under Cooperative Agreement No. 58-3198-2-006, Agricultural Research Service, USDA, and
Iowa State University).

Clayton, C.A., E.D. Pellizzari, C.E. Rodes, R.E. Mason, and L.L. Piper (1998). "Estimating
Distributions of Long-Term Particulate Matter and Manganese Exposures for Residents of
Toronto, Canada." Submitted to Atmospheric Environment.

Cohen, J.T., M.A. Lampson, and T.S. Bowers (1996). "The Use of Two-Stage Monte Carlo
Simulation Techniques to Characterize Variability and Uncertainty in Risk Analysis." Human
and Ecological Risk Assessment, Vol. 2, No. 4, pp. 939-971.

Corder, L.S., L. LaVange, M.A. Woodbury, and K.G. Manton (1990). "Longitudinal Weighting
and Analysis Issues for Nationally Representative Data Sets." Proceedings of the American
Statistical Association, Section on Survey Research, pp. 468-473.

Deville, J., Sarndal, C., and Sautory, O. (1993). "Generalized Raking Procedures in Survey
Sampling." Journal of the American Statistical Association, Vol. 88, No. 423, pp. 1013-1020.

Ferson, Scott (1996). "What Monte Carlo Methods Cannot Do." Human and Ecological Risk
Assessment, Vol. 2, No. 4, pp. 990-1007.

Folsom, R.E. (1991). "Exponential and Logistic Weight Adjustments for Sampling and
Nonresponse Error Reduction." Proceedings of the Social Statistics Section of
the American Statistical Association, pp. 191-202.

Francis, Marcie and Paul Feder, Battelle Memorial Institute (1997). "Development of Long-Term
and Short-Term Inhalation Rate Distributions." Prepared for Research Triangle Institute.

Hartwell, T.D.,  C.A. Clayton, and R.W. Whitmore (1992). "Field Studies of Human Exposure to
Environmental Contaminants." Proceedings of the American Statistical Association, Section on
Statistics and the Environment, pp. 20-29.

Holt, D. and Smith,  T.M.F. (1979). "Post Stratification." Journal of the Royal Statistical Society,
Vol. 142, Part 1, pp. 33-46.

Kendall, M.G. and W.R. Buckland (1971). A Dictionary of Statistical Terms. Published for the
International Statistical Institute, Third Edition, New York: Hafner Publishing Company Inc p
129.

Kruskal, W. and F. Mosteller (1979). "Representative Sampling, I: Non-Scientific Literature."
International Statistical Review, Vol. 47, pp. 13-24.

Kruskal, W. and F. Mosteller (1979). "Representative Sampling, II: Scientific Literature,
Excluding Statistics." International Statistical Review, Vol. 47, pp. 111-127.

Kruskal, W. and F. Mosteller (1979). "Representative Sampling, III: The Current Statistical
Literature." International Statistical Review, Vol. 47, pp. 245-265.
Nusser, S.M., A.L. Carriquiry, K.W. Dodd, and W.A. Fuller (1996). "A Semiparametric
Transformation Approach to Estimating Usual Daily Intake Distributions." Journal of the
American Statistical Association.
Oh, H.L. and Scheuren, F.J. (1983). "Weighting Adjustment for Unit Nonresponse." In:
Incomplete Data in Sample Surveys.
Shah, B., R. Folsom, L. LaVange, S. Wheeless, et al. Statistical Methods and Mathematical
Algorithms Used in SUDAAN. Research Triangle Institute.
Myers, L.E. and M.J. Messner (1988). "Evaluating and Presenting Quality Assurance Sampling
Data." In: Keith, L.H. (Ed.), Principles of Environmental Sampling. American Chemical Society.
Stanek, E.J., III (1996). "Estimating ... Assessment." Human and Ecological Risk Assessment.

Wallace, L.A., N. Duan, and R. Ziegenfus (1994). "Can Long-Term Exposure Distributions Be
Predicted from Short-Term Measurements?" Risk Analysis, Vol. 14, No. 1, pp. 75-85.

CHECKLIST I. ASSESSING INTERNAL REPRESENTATIVENESS: POPULATION SAMPLED VS.
POPULATION OF CONCERN FOR THE SURROGATE STUDY

• What is the study population?
•       What are the individual characteristics (i.e., defined by demographic, socioeconomic
factors, human behavior and other study design factors)?
•       What are the spatial characteristics?
•       What are the temporal characteristics?
•       What are the units of observation (e.g., person-days or person-weeks)?
•       What, if any, are the population subgroups for which inferences were especially
desired?

• Are valid statistical inferences  to the study population possible?
•       Was the whole population sampled (i.e., was a census conducted)?
•       If not, was the sample design appropriate and adequate?
•       Was a probability sample used? If not, how reasonable does the method of
sample selection appear to be?
•       Was the response rate satisfactory?
•       Was the sample size adequate for estimating central tendency measures?
•       Was the sample size adequate for estimating other types of parameters (e.g.,
upper percentiles)?
•       For what population or subpopulation size was the sample size adequate for
estimating measures of central tendency?
•       For what population or subpopulation size was the sample size adequate for
estimating other types of parameters (e.g., upper percentiles)?
•       What biases are known or suspected as a result of the design or
implementation of the study? What is the direction of the bias?

• Does the study appear to have and use a valid measurement protocol?
•       What is  the likelihood of Hawthorne effects? What impact might this have on
bias or variability?
•       What are other sources of measurement errors (e.g., recall difficulties)? What
impact might  they have on bias or variability?

• Does the study design allow (model-based) inferences to other time units?
•       What model is most appropriate?
•       What assumptions are inherent to the model?

CHECKLIST II. ASSESSING EXTERNAL REPRESENTATIVENESS: SURROGATE POPULATION
VS. EXPOSURE ASSESSOR'S POPULATION OF CONCERN - INDIVIDUAL CHARACTERISTICS
How does the population of concern relate to the surrogate study population in terms of the individuals'
characteristics?
•      Case 1: Are the individuals in the two populations essentially the same?
•      Case 2: Are the individuals in the population of concern a subset of those in the study
population? If so, is there adequate information available to allow for the analysis of
the population of concern? (Note: If so [Case 2a], we can redefine the surrogate data
to include only persons in the population of concern and then treat this case as Case
1.)
•      Case 3: Are the individuals in the surrogate study population a subset of those in the
population of concern?
•      Case 4: Are the two populations disjoint — in terms of individual characteristics?

How important is the difference in the two populations (population of concern and surrogate
population) with regard to the individuals' characteristics? To what extent is the difference between
the individuals of the two populations expected to affect the population parameters?
•      With respect to the central tendency of the two populations?
•      With respect to the variability of the two populations?
•      With respect to the shape and/or upper percentiles of the two populations?

Is there a reasonable way of adjusting or extrapolating from the surrogate population to the
population of concern — in terms of the individuals' characteristics?
•      What method(s) should be used?
•      Is there adequate information available to implement it?


CHECKLIST III. ASSESSING EXTERNAL REPRESENTATIVENESS: SURROGATE
POPULATION VS. EXPOSURE ASSESSOR'S POPULATION OF CONCERN - SPATIAL
CHARACTERISTICS
How does the population of concern relate to the surrogate population in terms of the spatial
characteristics?
•      Case 1: Do they cover the same geographic area?
•      Case 2: Is the geographic area of the population of concern a subset of the area of the
surrogate population? If so, is there adequate information available to allow the
analysis of the population of concern? (Note: If so [Case 2a], we can redefine the
surrogate population to include only regions or types of geographic areas in the
population of concern and then treat this case as Case 1.)
•      Case 3: Is the geographic area covered by the surrogate population a subset of that
covered by the population of concern?
•      Case 4: Are the two populations disjoint — in terms of the spatial characteristics?

How important is the difference in the two target populations with regard to the spatial
characteristics? To what extent is the difference in the spatial characteristics of the two populations
expected to affect the population parameters?
•      With respect to central tendency of the two populations?
•      With respect to the variability of the two populations?
•      With respect to the shape and/or upper percentiles of the two populations?

Is there a reasonable way of adjusting or extrapolating from the surrogate population to the
population of concern — in terms of the spatial characteristics?
•      What method(s) should be used?
•      Is there adequate information available to implement it?

CHECKLIST IV. ASSESSING EXTERNAL REPRESENTATIVENESS: SURROGATE
POPULATION VS. EXPOSURE ASSESSOR'S POPULATION OF CONCERN - TEMPORAL
CHARACTERISTICS
How does the population of concern relate to the surrogate population in terms of currency and temporal
coverage (study duration)?
•      Case 1: Are the duration and currency of the surrogate data compatible with the
population of concern needs?
•      Case 2: Is the temporal coverage of the population of concern a subset of the
surrogate population? If so, is there adequate information available to allow the
analysis of the population of concern? (Note:  If so [Case 2a], we can redefine the
surrogate population to include only time periods (e.g., seasons) of interest to the
assessor and then treat this case as Case 1.)
•      Case 3: Is the temporal coverage of the surrogate population a subset of that covered
by the population of concern?
•      Case 4: Are the two populations disjoint — in terms of study duration and currency?

• How does the population of concern relate to the surrogate population in terms of the time unit (either
the observed time unit or, if appropriate, a modeled time unit)?
•      Case 1: Are the time units compatible?
•      Case 2: Is the time unit for the population of concern shorter than that of the surrogate
population? If so, are data available for the shorter time unit associated with the
population of concern? (If so [Case 2a], this can be treated as Case 1.)
•      Case 3: Is the time unit for the population of concern longer than that of the surrogate
population?

• How important is the difference in the two populations (i.e., population of concern and surrogate
population) with regard to the temporal coverage and currency? To what extent is the difference in
the temporal coverage and currency of the two populations expected to affect the population
parameters?
•      With respect to central tendency of the two populations?
•      With respect to the variability of the two populations?
•      With respect to the shape and/or upper percentiles of the two populations?

• Is there a reasonable way of adjusting or extrapolating from the surrogate population to the
population of concern -- to account for differences in temporal coverage or currency?
•       What method(s) should be used?
•       Is there adequate information available to implement it?

• How important is the difference in the two  populations (i.e., population of concern and surrogate
population) with regard to the time unit of observation? To what extent is the difference in the
observation time unit of the two populations expected to affect the population parameters?
•       With respect to central tendency of the two populations?
•       With respect to the variability of the two populations?
•       With respect to the shape and/or upper percentiles of the two populations?

• Is there a reasonable way of adjusting or extrapolating from the surrogate population to the
population of concern -- to  account for differences in observation time units?
•       What method(s) should be used?
•       Is there adequate information available to implement it?

Issue Paper on Empirical Distribution Functions and

Non-parametric  Simulation

Introduction
One of the issues facing risk assessors relates to the best use of empirical distribution
functions (EDFs) to represent stochastic variability intrinsic to an exposure factor. Generally,
one of two situations occurs.  In the first situation, the risk assessor is reviewing an assessment in
which an EDF has been used. The risk assessor needs to make a judgement whether or not the
use of the EDF is appropriate for this particular analysis. In the second situation, the risk
assessor is conducting his/her own assessment and must decide whether a parametric
representation or non-parametric representation is best suited to the assessment.  The objective of
this issue paper is to help focus discussion on the key issues and choices facing the assessor
under these circumstances.

We make the initial assumption that the data are sufficiently representative of the
exposure factor in question. Here, representative is taken to mean that the data were obtained as
a simple random sample of the relevant characteristic of the correct population, that the data were
measured in the proper scale (time and space), and that the data are of acceptable quality
(accuracy and precision).

We also make the assumption that the analysis involves an exposure/risk model which
includes additional exposure factors, some of which also exhibit natural variation. Ultimately,
we are interested in estimating some key aspects of the variation in predicted exposure/risk.  As a
minimum, we are interested in statistical measures of central tendency (e.g., median), the mean,
and some measure of plausible upper bound or high-end exposure (e.g., 95th, 97.5th, or 99th
percentiles of exposure). Thus, how variable factors algebraically and statistically interact is
important.

Further, we assume that Monte Carlo methods will be used to investigate the variation in
exposure/risk. Obviously, other methods can be used, but it is clear from experience that
simulation-based techniques will be used in the vast majority of applications.

Conventional wisdom advises that when there is an underlying theory supporting the use
of a particular theoretical distribution function (TDF), then the data should be used to fit the
distribution and that distribution should be used in the analysis. For example, it has been argued
that repeated dilution and mixing of an environmental pollutant should eventually result in a
lognormal distribution of concentrations.  While this is an agreeable concept in principle, it is a
rare situation in which a theory-based TDF is available for a particular exposure factor.
Furthermore, theory-based TDFs are often only valid in the asymptotic sense.  Convergence
may be very slow, and, in the early stages, the data may be very poorly modeled by the
asymptotic form of the TDF.  For this issue paper, we assume that no theory-based TDFs are
available.

The issue paper is written in two parts. Part I addresses the strengths and weaknesses of
empirical distribution functions; Part II addresses issues related to judging quality of fit for
theoretical distributions.

Part I.  Empirical Distribution Functions

Definitions. Given representative data, X = {x1, x2, ..., xn}, the risk assessor has two basic
techniques for representing an exposure factor in a Monte Carlo analysis:

parametric methods, which attempt to characterize the exposure factor using a TDF. For
example, a lognormal, gamma, or Weibull distribution is used to represent the exposure factor,
and the data are used to estimate values for its intrinsic parameters.

non-parametric methods, which use the sample data to define an empirical distribution function
(EDF) or a modified version of the EDF.

EDF. Sorted from smallest to largest, x1 ≤ x2 ≤ ... ≤ xn, the EDF is the cumulative distribution
function defined by

    Fn(x) = (number of xk ≤ x) / n

or, equivalently,

    Fn(x) = (1/n) Σ(k=1 to n) H(x − xk)

where H(u) is the unit step function which jumps from 0 to 1 when u ≥ 0. The values of the EDF
are the discrete set of cumulative probabilities (0, 1/n, 2/n, ..., n/n). Figure 1 illustrates a basic
EDF for 50 samples drawn from a lognormal distribution with a geometric mean of 100 and a
geometric standard deviation of 3, i.e., X ~ LN(100,3).

Figure 1. Example of EDF

In a Monte Carlo simulation, an EDF is generated by randomly sampling the raw data
with replacement (simple bootstrapping), so that each observation in the data set, xk, has an
equal probability of selection, i.e., prob(xk) = 1/n.
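This bootstrap sampling scheme can be sketched in a few lines of Python. The code below is an illustration only (not part of the report); the lognormal parameters mirror the Figure 1 example, and the function names are ours:

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Stand-in for measured exposure-factor data: 50 draws from a lognormal
# with geometric mean 100 and geometric standard deviation 3 (as in Figure 1).
data = rng.lognormal(mean=np.log(100), sigma=np.log(3), size=50)

def edf(x, sample):
    """Empirical CDF: the fraction of sample values <= x."""
    return np.mean(sample <= x)

def sample_edf(sample, n_draws, rng):
    """Monte Carlo draws from the EDF: resample the raw data with
    replacement (simple bootstrapping), so prob(x_k) = 1/n."""
    return rng.choice(sample, size=n_draws, replace=True)

draws = sample_edf(data, 25_000, rng)
```

Because each draw is one of the original observations, the simulated values can never fall outside the observed range, which leads directly to the EDF properties discussed next.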

Properties of the EDF.  The following summarizes some of the basic properties of the EDF:

1.  Values between any two consecutive samples, xk and xk+1, cannot be simulated, nor can
values smaller than the sample minimum, x1, or larger than the sample maximum, xn, be
generated, i.e., x ≥ x1 and x ≤ xn.

2.  The mean of the EDF is equal to the sample mean.  The variance of the EDF mean is
always smaller than the variance of the sample mean; it is equal to (n−1)/n times the
variance of the sample mean.

3.  The variance of the EDF is equal to (n−1)/n times the sample variance.

4.  Expected values of the EDF percentiles are equal to the sample percentiles.

5.  If the underlying distribution is skewed to the right (as are many environmental
quantities), the EDF will tend to under-estimate the true mean and variance.
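Properties 2 and 3 follow because the EDF places probability 1/n on each observation, so its moments are "population-style" moments of the sample. A small numerical check (illustrative only; the simulated data are ours, not the report's):

```python
import numpy as np

rng = np.random.default_rng(seed=2)
x = rng.lognormal(mean=np.log(100), sigma=np.log(3), size=100)
n = len(x)

# Variance of the EDF = population-style variance of the sample (ddof=0),
# which equals (n-1)/n times the usual unbiased sample variance (ddof=1).
edf_variance = np.var(x, ddof=0)
sample_variance = np.var(x, ddof=1)

# The mean of the EDF is exactly the sample mean: each point has weight 1/n.
edf_mean = np.average(x, weights=np.full(n, 1.0 / n))
```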

Figures 2 and 3 below illustrate typical Monte Carlo behavior of the EDF in reproducing the
sample mean, variance, and 95th percentile of the underlying sample.  Here X ~ LN(100,3) with
a sample size of N = 100, and the relative error is defined as 100 × (simulated − sample)/sample.
The oscillatory nature of the simulated 95th percentile reflects the normalized magnitude of the
difference between adjacent order statistics in the sample, x(95) and x(96), and shows the Monte
Carlo estimate flip-flopping between these two ranks.

Figure 2. Convergence of the Mean and Variance

Figure 3. Convergence of the 95th Percentile

Linearly Interpolated EDF (Linearized EDF). For continuous random variables, it may be
troubling to define the EDF as a step function, and so interpolation is often used to estimate the
probabilities of values in between sample values. Generally, for values between observations,
linear interpolation is favored, although higher-order interpolation is sometimes used. Figure 4
compares a linearly interpolated EDF with the basic EDF. The linearly interpolated EDF will
tend to underestimate the sample mean and variance. It will converge to the appropriate sample
percentile, but takes longer to do so when compared to the simple EDF. These differences tend to
diminish as the sample size increases. Table 1 illustrates differences between the EDF, the
linearized EDF, and the best-fit TDF for residential room air exchange rates. The EDF statistics are
based on a Monte Carlo simulation with 25,000 replications. Clearly, the simple EDF is best at
reproducing sample moments and sample percentiles.

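One way to implement a linearized EDF is to attach cumulative probabilities 1/n, 2/n, ..., n/n to the sorted sample and invert by linear interpolation. The plotting positions below are an assumption made for illustration (other conventions, such as (k − 0.5)/n, are also used); the report does not specify its exact construction:

```python
import numpy as np

rng = np.random.default_rng(seed=3)
data = np.sort(rng.lognormal(mean=np.log(100), sigma=np.log(3), size=90))
n = len(data)

# Cumulative probabilities 1/n, 2/n, ..., n/n for the sorted values
# (one convention among several -- an assumption for this sketch).
probs = np.arange(1, n + 1) / n

def sample_linearized_edf(size, rng):
    """Inverse-CDF sampling with linear interpolation between the sorted
    observations; no values outside [x_1, x_n] are produced."""
    u = rng.uniform(probs[0], 1.0, size=size)
    return np.interp(u, probs, data)

draws = sample_linearized_edf(25_000, rng)
```

As the text notes, the interpolated draws still cannot fall below the sample minimum or above the sample maximum.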
Table 1. Comparison of key summary statistics

Statistic    ACH Sample   EDF       Linearized   Best Fit
             (N = 90)               EDF          Weibull PDF
mean         0.6822       0.6821    0.6747       0.6782
variance     0.2387       0.2358    0.2089       0.2479
skewness     1.4638       1.4890    1.2426       1.2329
kurtosis     6.6290       6.7845    5.6966       4.9668
5%           0.1334       0.1320    0.1307       0.0881
10%          0.1839       0.1840    0.1840       0.1452
50%          0.6020       0.6160    0.6032       0.5691
90%          1.2423       1.2390    1.2398       1.3592
95%          1.3556       1.3820    1.3600       1.6450

Figure 4. Comparison of Basic EDF and Linearly Interpolated EDF

Extended EDF. Neither the simple EDF nor the interpolated EDF can produce values beyond
the sample minimum or maximum. This may be an unreasonable restriction in many cases. For
example, the probability that a previously observed largest value in a sample based on n
observations will be exceeded in a sample of N future observations may be estimated using the
relationship prob = 1 − n/(N + n).  If the next sample size is the same as the original sample size,
there is a 50% likelihood that the new sample will have a largest value greater than the original
sample's largest value. Restricting the EDF to the smallest and largest sample values will
produce distributional tails that are too short. In order to get around this problem, one may
extend the EDF by adding plausible lower and upper bound values to the data. The actual values
are usually based on theoretical considerations or on expert judgement. For right-skewed data,
adding a new minimum and maximum would tend to increase the mean and variance of the EDF.
This same sort of rationale is used when continuous, unbounded TDFs are truncated at the low
and high end to avoid generating unrealistic values during Monte Carlo simulation (e.g., 15 kg
adult males, females over 2.5 m tall, etc.).
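The exceedance relationship above is easy to compute directly (a sketch; the function name is ours):

```python
def prob_exceed_max(n, N):
    """Probability that the largest of n past observations is exceeded
    somewhere among N future observations: 1 - n/(N + n)."""
    return 1.0 - n / (N + n)

# Equal past and future sample sizes give a 50% chance of a new maximum.
half = prob_exceed_max(50, 50)
```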
Mixed Empirical-Exponential Distribution. An alternative approach to extending the upper
tail of an empirical distribution beyond the sample data has been suggested by Bratley et al.  In
their method, an exponential tail is fit to the last five or ten percent of the data.  This method is
based on extreme value theory and the observation that extreme values for many continuous,
unbounded distributions follow an exponential distribution.
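A minimal sketch of this mixed empirical-exponential idea follows, assuming a 10% tail fraction and a maximum-likelihood exponential fit to the exceedances over the threshold; both choices are ours for illustration, not Bratley et al.'s exact recipe:

```python
import numpy as np

rng = np.random.default_rng(seed=4)
# Illustrative right-skewed data; stands in for an exposure-factor sample.
data = rng.lognormal(mean=np.log(100), sigma=np.log(3), size=500)

tail_frac = 0.10
threshold = np.quantile(data, 1.0 - tail_frac)
# MLE for an exponential fit to the exceedances: the mean excess over the
# threshold is the fitted exponential's scale parameter.
mean_excess = (data[data > threshold] - threshold).mean()

def sample_mixed(size, rng):
    """Draw from the EDF below the threshold; replace the upper tail_frac
    of the distribution with the fitted exponential tail."""
    u = rng.uniform(size=size)
    body = rng.choice(data[data <= threshold], size=size, replace=True)
    tail = threshold + rng.exponential(scale=mean_excess, size=size)
    return np.where(u < 1.0 - tail_frac, body, tail)

draws = sample_mixed(25_000, rng)
# Unlike the plain EDF, this can generate values beyond the sample maximum.
```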
Starting Points
The following table summarizes the results of an informal survey of experts who were asked to
contribute their observations and thoughts on the strengths and weaknesses of EDFs by
addressing a list of questions and issues.  Based on this survey:

1.  The world seems to be divided into TDF'ers and EDF'ers.

2.  There are no clear-cut, unambiguous statistical reasons for choosing EDFs over TDFs or
vice versa.

3.  Many of the criticisms leveled at EDFs also apply to TDFs (e.g., the data must be simple
random samples).

4.  One aspect that may have important implications for our discussion is the nature of
the decision and how sensitive the outcome is to the choice of an EDF.

5.  Generally, contributors did not express much support for either the linearized EDF or the
extended EDF.  Why they seem to be comfortable with TDFs, which essentially
interpolate between data points as well as extrapolate beyond the data, is unclear.

,

,1:

1
il!

o
/A
K
Jtfl

to
t—
known about the quantity for which the dislributibi
.S2
"to
•5
c
o
"TT.
Yes, but perhaps an incomplete 'representat
needed.
II
ro ra

§ E
ijj o
££
cu o
CD M
15 °
Q. >,
o ™
8 -g
CD o
S -c
> JS
e 5
O.JD
co co
U. T)
O 0>
LU j£
53
^°

i ^.5
2 £• £ o .-
Dn parametric assumptions) is true arid is awelf-kr
letric procedures make some strdng^ssumptioris.
thd class of pbssible probability distributions to a
finite number of real numbers, or parameters. In
ledians of two sets of data) the data are mddeled 1
s" thatthd. members of each pair "are. the same ^ t
, using a'ri EDF is sbrrieithirig entireiy'different'than
'distributions. Usually, ybuuse'ari EDF as 'a'tooi t
73 SS:E ro fc ro iP. .2
i If Iff Ills
I 11:? Iff! Si
m C = CD 3 0 S5 U- tO -m
C ~ CO .c .„ -.— ,n »5
One has to assume a representative randor
As another example, advantage 2 (EDFs'dc
advantage. Less Well known is that almost
Technically, a parametric situation is onewl
collection that can be Described in a natural
common non-parametric situations (such as
pairs of distributions, but there is stilla restr
'distribution except for a change of location.
set of assumptions you make about the clas
make an estimate: that is; as a computation
_o
^3
CD
E
ro
&§-
ro £
ol
"O "O

CD (Q
CD ^

Q S
C
« ">
o c
;O
u- |i- to
Q E -35
UJ S "a
<e o
. CO rZ
CM ro E
C? "° &• 0
>•£ ra ^_ CD o
^S.o>--5 .-> ,^
2 that the EDF cd'nvfe'rges in prpbability to the undf
ntso^data.' bWd'irn^bVia'hfissuelnrls^k^aSsessmb
usually means' we'afe rfbw'here''riearsa" liniitmg pas
ig Maximum Likelihbo'd/withput direful evaluation
iverge'yERYsfdwl/to\he;'underiying;d1s^^^^^
i). therefdre this convergence" pheftorrienb'n is1 'hoi
' Accuracy of any 'iritervafls 'd'riye^pyla sfen'dard
iracy, with 20 iritervals.'you wo'ulcf'riesd mbVe'tharf
cal purposes, Unless'yoU'reHhe Qi^isUs' Bure|iu.
% -§ .!2 =5 ^g.ic ^2 o t3
_c o • .c » co "• jS ^ j — *
j^""* tjs ,;J5 ™ . 3 ,SJ '-co o - co
:s2;;<2^ro w » .£ .n:^
jti.'jS ^ ^ p -£ CD ,ro 42
"tO 5> 3 ^ LL.t? S> C <D
Although for 'most well-behaved 'distribution
Distribution, cbnve'rgence 'often requires urii
'the near universalsituatibn of having toofe
!ithat'we shbuld beware ALL 'asymptotic /met
their applicability to bur small 'data:set. 'EDI
'(especially if-ybU're trying to characterize e:
cornforting or useful.
: EDFs are almost Usetess,' except in veryla'l
-deviation of sqrt(ri) in that interval. For eve
underlying observations.
Thisls useless, since "large" is Uriattainabi
S
* Q) ^i

d) *o
^
§1
"to -ra

uj ^
to °
CD ^*~
ts
±±
{0 ^
w ^i
CD 1*2
S>'iS2
ll

co.^

*s
ra
•7-1 >
quite wfde in some da'ses. ! For example, 'a'smali i
frbm a pbsifively^skewed pdpuiatibh.
^tB-jgj
ro >§
||
Yes, but the confidence limits on those esti… set that is negatively skewed could be a ra…
…ing data to a fitted parametric distribution or missi…
Maybe. Not sure how this is different than … distributions.
A-28

A-29


What you need is random, representative data, and to feel comfortable that your data include the lower and upper bounds of the quantity. The number of data points in itself is not particularly important.
How many data? Two. This somewhat flippant answer simply highlights the important fact that you need to ask the question in the context of (a) what decision is being made and (b) what its risk function is (how bad is it if the decision is incorrect?). If the risk function is low (it doesn't matter much if we are wrong) and the decision is really obvious, then sometimes all you need is a reality check. Hence the need for one datum. People make mistakes and Murphy's Law applies, so experience dictates a second datum. I know you guys at EPA and in the states are competent and sensible and often very good at this stuff, but there are still many people and many agencies out there that are just too uncomfortable with common sense like this, so it pays to repeat it. (The comment cuts both ways: sometimes I am asked by clients to gather more data to show that they don't have a problem, when all their data point to serious contamination. Most of them back down right away when confronted with the common-sense approach: "you obviously have a problem, so let's talk instead about how to remedy it, since honest statistics won't make it go away.")

I would not approach the topic this way. I would ask, instead, how do I characterize an amount of data, and given these summary characteristics, what methods are appropriate?

At a minimum number of points per …: interpolation of most density curves. For bimodal, etc., double the number of intervals.

Gee, that depends. I think … no place to put weight on the 99th percentile, then 100 … you want to know. You are optim… good. This is similar to the "how many iterations is enough?" … iterations, then I think it is pretty hard to justify NOT using an EDF.

If you are going to place a lot of weight on the 99th percentile, then 100 data points are … to know.

EXCEPTION: Not with much accuracy. The theory is simple and the example illustrates the issue. By definition of percentile, there is 0.99 … distribution does not occur in a random sample. In a sample … distribution values between the 99th and 100th percentiles therefore do not occur with probability (0.99)^N, … which is extremely close to 1/e, or almost 40%. Therefore there are … you have not even seen anything as high as the 99th percentile yet. To be fairly sure of seeing a value that high, you need to solve (0.99)^N <= assurance-value (such as 5%) for N. That would …
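The arithmetic in this answer is easy to verify with a short script (illustrative only, following the (0.99)^N reasoning above):

```python
import math

# Chance of never seeing a value above the 99th percentile in N draws is
# 0.99**N. With N = 100 that chance is about 1/e (~37%), as noted above.
print(0.99 ** 100)  # ~0.366

# Smallest N with 0.99**N <= 0.05, i.e., 95% assurance of seeing the tail:
N = math.ceil(math.log(0.05) / math.log(0.99))
print(N)  # 299
```

So roughly 300 observations are needed before one can be fairly confident of having seen even a single value beyond the 99th percentile.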




A true EDF uses step functions; this is resampling of the data in which each data point has a probability 1/n. Use of linear interpolation will typically lead to lower estimates of the standard deviation, since you are not guaranteed to sample the min and max data points.
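A minimal numerical sketch of this point (the data and sample sizes are hypothetical, chosen only for illustration): resampling the step-function EDF draws each observation with probability 1/n and can reproduce the extremes, while drawing from a linearly interpolated quantile function always stays between order statistics.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.lognormal(mean=0.0, sigma=1.0, size=50)

# True EDF: resample the data, each point with probability 1/n.
edf_draws = rng.choice(data, size=10_000, replace=True)

# Smoothed alternative: invert a linearly interpolated quantile function.
interp_draws = np.quantile(data, rng.uniform(0.0, 1.0, size=10_000))

print(edf_draws.max() == data.max())     # the EDF resample can hit the sample max
print(interp_draws.max() <= data.max())  # interpolated draws cannot exceed it
```

Because the interpolated draws cannot reach the sample extremes, their spread, and hence the estimated standard deviation, tends to come out slightly smaller, which is the respondent's point.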
Now you're going down a slippery slope. As soon as you linearize your EDF you are entering into the land of semi-parametric techniques, smoothing, modeling, and assumptions. You're not using the EDF any more. The EDF is accurately and correctly described by its cumulative distribution function, which will be a step function.

If you aren't using a continuous distribution, why not just go with the data? The diversity of distributions is very rich. For example, see Evans, Hastings, and Peacock, Statistical Distributions, 2nd Ed., Wiley (1993) for 39 of them. Using some kind of test for fit of the continuous distribution to your data, e.g., quantiles, you usually can obtain a reasonable fit. See J.W. Tukey, Exploratory Data Analysis, Addison-Wesley (1977). If not, e.g., bimodal, you will have to decompose or transform your data, and you already start to make important assumptions.
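The quantile check suggested here can be sketched numerically. This is a toy illustration only; the moment-matching fit and the correlation summary are my assumptions, not the respondent's method.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(3)
data = rng.lognormal(0.0, 1.0, size=200)

# Fit a lognormal by matching the mean and std of log(data).
mu, sigma = np.log(data).mean(), np.log(data).std()

# Compare sorted data against the fitted distribution's quantiles at the
# usual plotting positions (i - 0.5) / n.
probs = (np.arange(1, data.size + 1) - 0.5) / data.size
fitted_q = np.exp([NormalDist(mu, sigma).inv_cdf(p) for p in probs])
data_q = np.sort(data)

# A good fit keeps the QQ pairs near the x = y line; their correlation is a
# crude numeric stand-in for examining the plot by eye.
print(np.corrcoef(data_q, fitted_q)[0, 1])
```

In practice one would plot data_q against fitted_q and look for systematic departures from the x = y line, as Tukey's quantile methods recommend.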

Smoothing EDFs within the bulk of the probability curve causes no serious errors. Extrapolation beyond the limits of the data violates the very concept of the EDF, and is intrinsically dependent on the parameterization used.

The simple solution is to use the midpoint rule (apply the probability at the interval midpoint). Alternatively, use the trapezoidal rule (straight-line interpolation). For a continuous curve, a straight-line interpolation averages properly and improves discretization bias. I, however, would suggest using resampling as a better approach than smoothing.

I usually use percentiles, but if you have enough data to use an EDF, then it shouldn't matter much.

In this case, the difference between step functions and linear interpolations becomes small. Why bin? You lose information that way. If you have large segments of the CDF that are approximately piecewise uniform, then binning the data won't result in much loss of information.

There's a lot of literature on binning data, mostly in terms of how the perception of the histogram can change. I would suggest, in the spirit of the response to question 1, that you consider the effect the binning process has on the outcome of your work, since your question really is one of computational practice, not conceptual approach. Bin the data to speed your process (simulation, bootstrapping, whatever), but in a way in which you can demonstrate your answers are not materially different than what you would get with a more accurate procedure. How do you know what a material difference is? Look at your decision space and your risk function.
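One way to demonstrate that binning does not change the answer materially is a direct numerical check. The sketch below uses assumed, illustrative data and bin counts; it compares the mean computed from midpoint-binned data against the raw-data mean. For skewed data the difference may or may not be acceptable, which is exactly what the check is for.

```python
import numpy as np

rng = np.random.default_rng(7)
data = rng.lognormal(0.0, 1.0, size=5_000)

# Bin the data; represent each bin by its midpoint, weighted by its count.
counts, edges = np.histogram(data, bins=20)
midpoints = 0.5 * (edges[:-1] + edges[1:])
binned_mean = np.average(midpoints, weights=counts)

# Material difference? Compare against the decision-relevant raw statistic.
rel_err = abs(binned_mean - data.mean()) / data.mean()
print(rel_err)
```

For strongly skewed samples like this one, 20 equal-width bins can shift the mean by several percent or more, so the acceptable error really does depend on the decision at hand.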

No! This approach causes more mischief in epidemiology than in exposure analysis, but anytime you summarize the data, you lose information. If the data set is large, feel grateful.

The intervals or bins used are mathematical estimators of the underlying density or distribution curve. This is a numerical integration or interpolation issue. Typically 10-20 intervals gives good performance on a unimodal density function, particularly if linear interpolation is used.
No.

A-31

… let the data talk to you …
… this is a tail problem … you base your judgment on wh… I punt. This …
… as described above. I don't see what … offers in co…
… don't like this meth… …ould seem to open the analyst up for excessive criticism.
A-32

… assume the data are … representative random … Probably nothing needs to be done …

… estimates of standard … and other statistics … look noisy or jumpy due to the gaps … interpolations can lead to different e…

… fitting suggestion. You have partially answered this question … purely parametric … interpolating and fitting curves to your … can't trust using ju… of the EDF advantages you so care… parameterizing your …

As I understand bootstrap, you must … step takes care of interpolation. … bootstraps. I wouldn't … take percentiles …

You could bootstrap from percentiles … difference.

A-33

Part II. Issues Related to Fitting Theoretical Distributions

Suppose the following set of circumstances:

(1) that we have a random sample of an exposure parameter which exhibits natural variation

(2) that the collected data are representative of the exposure parameter of interest (i.e., the data measure the right population, in the right time and spatial scales, etc.)
(3) that estimates of measurement error are available.

(4) that there is no available physical model to describe the distribution of the data (i.e., there is no theoretical basis to say that the data are lognormal, gamma, Weibull, etc.).

(5) that we wish to characterize and account for the variation in the parameter in an analysis of environmental exposures.

(6) we run the data through our favorite distribution-fitting software and get goodness of fit statistics (e.g., chi-square, Kolmogorov-Smirnov, Cramer-von Mises, Anderson-Darling, Watson, etc.) and their statistical significance.

(7) rankings based on the goodness of fit results are mixed, depending on the statistic and p-values.

(8) graphical examination of the quality of fit (QQ plots, PP plots, histogram overlays, residual plots, etc.) presents a mixed picture, reinforcing the differences observed in the goodness of fit statistics.

Questions

1) A statistician might say that one should pick the simplest distribution not rejected by the data. But what does that mean when rejection is dependent on the statistic chosen and an arbitrary level of statistical significance?

2) On what basis should it be decided whether or not a data set is adequately represented by a fitted analytic distribution?

3) Specifically, what role should the p-value of the goodness of fit statistic play in that judgment?

4) What role should graphical examination of fit play?

A-34

Respondent #1

All distributions are, in fact, empirical. Parametric distributions are merely theoretical constructs. There is no reason to believe that any given distribution is, in fact, log-normal (or any other specific parametric type).
That we agree to call a distribution log-normal is (or at least should be) merely a shorthand by which we mean that it looks sufficiently like a theoretical log-normal distribution to save ourselves the extra work involved in specifying the empirical distribution. Other than analyses where we are dealing strictly with hypothetical constructs (e.g., what if we say that such-and-such distribution is lognormal and such-and-such distribution is normal...), I can see no theoretical justification for a parametric distribution other than the convenience gained. When the empirical data are sparse in the tails, we, of course, run into trouble in needing to specify an arbitrary maximum and minimum for the empirical distribution. While this may introduce considerable uncertainty, it is not necessarily a more uncertain practice than allowing the parametric construct to dictate the shape of the tails, or for that matter arbitrarily truncating the upper tail of a parametric distribution. This becomes less of a problem if the analyst's goal in constructing an input distribution is to describe the existing data with as little extrapolation as necessary rather than to predict the "theoretical" underlying distribution. This distinction gets us close to the frequentist/subjectivist schism where many, if not all, MC roads eventually seem to lead.

Respondent #2

...if you use p-bounds you don't have to choose a single distribution. You can use the entire equivalence class of distributions (be it a large or small class). I mean, if you can't discriminate between them on the basis of goodness of fit, maybe you do the problem a disservice to try. And operationalizing the criterion for "simplest" distribution is no picnic either.

Respondent #3

Why not try the KISS method: Keep It Simple & Sound. The Ranked Order Data assuming uniform probability intervals is a method that makes no assumptions as to the nature of the distribution.
It also tends to the true distribution function as the number of data points increases. If you have replicate measurements (on each random sample), then the mean of these should be used. The method yields simple, rapid random number generators, and one can obtain any desired statistical parameter of the distribution. However, use of the distribution function in any estimate is advised. Given the high level of approximation and/or bias in most risk assessment data and models, any approximation to the true PDF should be adequate. There is one occasion when the theoretical PDF may be better than the empirical PDF. That is when it comes from the solution of equations based on fundamental laws constraining the solution to a specified form. Even in this case, agreement with data is required. This is not usually the case in risk assessment PDFs.

A-35

Respondent #4

Since I am blessed not to be a statistician, I have no problem disputing their "statement" about the "simplest" distribution. I don't know what they mean either. What really matters physically is picking a distribution that has the fewest variables and that is easy to apply, given the kind of analysis you want to do. You want one that does not make assumptions in its construction that contradict processes operating in your data. If you are generating equally bad fits with a variety of the usual distributions anyway, by all means choose the one that is easiest to use. For time-sliced exposure data, the "right" distribution almost always means a lognormal distribution. A physical basis for the lognormal does exist for exposure data, and empirically, most exposure data fit lognormals. [Your assumption "A" does not hold for typical exposure processes.]
Wayne Ott, who probably does not even remember it, taught me this one afternoon in the back of a meeting room. See "A Probabilistic Methodology for Analyzing Water Quality Effects of Urban Runoff on Rivers and Streams," Office of Water, February 15, 1984. Just tell people that you have used a lognormal distribution for convenience, although it does not fit particularly well, then provide some summary statistics that describe the poorness of fit. Problems begin when you get a poor fit to a lognormal distribution but a good fit with a different distribution. Say you get a better fit to the Cauchy distribution, because the tails of your pdf have more density. Now things get more fun. Statisticians would say that you should use the Cauchy distribution, because it is a better fit. I say that you should still use the lognormal, because you can interpret manipulations of the data more easily, and just note that the lognormal fit is poor. Problems will arise, however, if you want to reach conclusions that rely on the tails of the distribution, and you use the lognormal pdf formulation instead of your actual data. I somewhat anticipated your dilemma in my previous e-mail to you: if you don't need to use a continuous distribution, just go with the data! For time-dependent exposure data, the situation gets much more complex. I prefer to work with Weibull distributions, but I see lots of studies that use Box-Jenkins models. And you also asked: On what basis do I decide whether my data are adequately represented by a fitted analytic distribution? Specifically, what role should the p-value of the goodness of fit statistic play in my choice? What role should graphical examination of fit play? To me, the data are adequately represented when the analytical distribution adequately fills the role you intend it to have.
In other words, if you substitute a lognormal distribution for your data, as a surrogate, then carry out some operations and obtain a result, the lognormal is adequate, unless it leads to a different conclusion than the actual data would support. The same statement is true of any continuous distribution. Similarly, as a Bayesian, I think that the proper role of a p-value is the role you believe it should play. I don't think that p-values have much meaning in these kinds of analyses, but if you think they should, you should state the desired value before beginning to analyze the data, and not proceed until you obtain this degree of fittedness or better. If small differences in p-value make much difference in your analysis, your conclusions are probably too evanescent to have much usefulness.

A-36

The quantiles approach that I previously commended to you is a graphical method. [See J.W. Tukey, Exploratory Data Analysis, Addison-Wesley (1977).] In it, you would display the distribution of your data, mapped against the prediction from the continuous distribution you have chosen, with both displayed as order statistics. If your data fit your distribution well, the points (data quantiles versus distribution quantiles) will fall along a straight (x=y) line. Systematic differences in location, spread, and/or shape will show up fairly dramatically. Such visual inspection is much more informative than perusing summary statistics. No "statistical fitting" is involved. [Also see J.M. Chambers et al., Graphical Methods for Data Analysis, Cole Publishing (1983).]

Respondent #5

I have several thoughts on the goodness of fit question.
First, visual examination of the data is likely to yield more insight into the REASONS for the mixed behavior of the various statistics; i.e., in what regions of the variable of interest does a particular theoretical distribution not fit well, and in what direction is the error? Then choosing a particular parametric distribution can be influenced by the purpose of the analysis. For example, if you are interested in tail probabilities, then fitting well in the tails will be more important than fitting well in the central region of the distribution, and vice versa. A good understanding of the theoretical properties of the various distributions is also handy. For example, the heavy tails of the lognormal mean that the moments can be very strongly influenced by relatively low-probability tails. If that seems appropriate, fine; if not, the analyst should be aware of that, etc. I don't think there is a simple answer; it all depends on what you are trying to do and why!

Respondent #6

In broad overview, I have these suggestions, all of which are subject to modification depending on the situation.

1. Professional judgment is **unavoidable** and is **always** a major part of every statistical analysis and/or risk assessment. Even a (dumb) decision to rely **exclusively** on one particular GOF statistic is an act of professional judgment. There is no way to make any decision based exclusively on "objective information" because the decision on what is considered objective contains unavoidable subjective components. There is no way out of any problem except to use and to celebrate professional judgment. As a profession, we risk assessors need to get over this hang-up and move ahead.

2. It is **always** necessary and appropriate to fit several different parametric distributions to a data set. We make choices on the adequacy of a fit by comparison to alternatives.
Sometimes we decide that one 2-parameter distribution fits well enough (and better than the reasonable alternatives), so that we will use this distribution.

A-37

Sometimes we decide that it is necessary to use a more complicated parametric distribution (e.g., a 5-parameter "mixture" distribution) to fit the data well (and better than the reasonable alternatives). And sometimes we decide that no parametric distribution can do the job adequately well, hence the need for bootstrapping and other methods.

3. The human eye is far, far better at **judging** the overall match (or lack thereof) between a fitted distribution and the data under analysis than any statistical test ever devised. GOF tests are "blind" to the data! We need to visualize, visualize, and visualize the data, as compared to the alternative fitted distributions, to **see** how the various fits compare to the data. Mosteller, Tukey, and Cleveland, three of the most distinguished statisticians of the last 50 years, have all stressed the **essential** nature of visualization and human judgment relying thereon (in lieu of GOF tests). BTW, these graphs and visualizations *must* be published for all to see and understand.

4. In situations where no single parametric distribution provides an **adequate** fit to the data, there are several possible approaches to keep moving ahead. Here are my favorites.

A. (standard approach) Fit a "mixture" distribution to the data.

B. Use the two or three or four parametric distributions that offer the most appealing fit in a sensitivity analysis to see if the differences among the candidate distributions really make a difference in the decision at hand.
Get the computer to simulate the results of choosing among the different candidate distributions. This leads to keen insights as to the "value of information."

C. (see references below, and references cited therein) By extension of the previous idea, analysts can fit and use "second-order" distributions that contain both **Variability** and **Uncertainty**. These second-order distributions have many appealing properties, especially the property that they allow the analyst to propagate Variability and Uncertainty **separately**, so the risk assessor, the risk manager, and the public can all see how the Var and Unc combine throughout the computation/simulation into the final answer.

Respondent #7

[RE comments #1, #3, respondent #6] ...the motivation behind having standardized methods: Professional judgment does not always produce the same result. Your professional judgment does not necessarily coincide with someone else's professional judgment. Surely, you've noticed this. The problem isn't that no one is celebrating their professional judgment; the problem is that we have more than one party.

The bigger and more unique the problem, the less standardization matters. But if you are trying to compare, say, the risk from thousands of Superfund sites, you can't very well reinvent risk analysis for every one and expect to get comparable results; whatever you do for one you must do for all.

A-38

Have you tried to produce a GOF statistic that matches your visual preference? I have.
For instance, I think fitting predicted percentiles produces better looking fits than fitting observed values (e.g., maximum likelihood), because this naturally gives deviations at extreme values less weight, where "extreme value" is model dependent.

A-39

APPENDIX B

LIST OF EXPERTS AND OBSERVERS

EPA, United States Environmental Protection Agency, Risk Assessment Forum

Workshop on Selecting Input Distributions for Probabilistic Assessment
U.S. Environmental Protection Agency, New York, NY, April 21-22, 1998

List of Experts

Sheila Abraham, Environmental Specialist, Risk Assessment/Management, Northeast District Office, Ohio Environmental Protection Agency, 2110 East Aurora Road, Twinsburg, OH 44087. 330-963-1290; Fax: 330-487-0769; E-mail: sabraham@epa.state.oh.us

Hans Allender, U.S. Environmental Protection Agency, 401 M Street, SW (7509), Washington, DC 20460. 703-305-7883; E-mail: allender.hans@epamail.epa.gov

Timothy Barry, Office of Science Policy, Planning, and Evaluation, U.S.
Environmental Protection Agency, 401 M Street, SW (2174), Washington, DC 20460. 202-260-2038; E-mail: barry.timothy@epamail.epa.gov

Robert Blaisdell, Associate Toxicologist, California Office of Environmental Health Hazard Assessment, 2151 Berkeley Way, Annex 11, 2nd Floor, Berkeley, CA 94704. 510-540-3487; Fax: 510-540-2923; E-mail: bblaisde@berkeley.cahwnet.gov

David Burmaster, President, Alceon Corporation, P.O. Box 382669, Cambridge, MA 02238-2669. 617-864-4300; Fax: 617-864-9954; E-mail: deb@alceon.com

Christopher Frey, Assistant Professor, Department of Civil Engineering, North Carolina State University, P.O. Box 7908, Raleigh, NC 27695-7908. 919-515-1155; Fax: 919-515-7908; E-mail: frey@eos.ncsu.edu

Susan Griffin, Environmental Scientist, Superfund Remedial Branch, Hazardous Waste Management Division, U.S. Environmental Protection Agency, 999 18th Street (8EPR-PS), Suite 500, Denver, CO 80202-2466. 303-312-6651; Fax: 303-312-6065; E-mail: griffin.susan@epamail.epa.gov

Bruce Hope, Environmental Toxicologist, Oregon Department of Environmental Quality, 811 Southwest 6th Avenue, Portland, OR 97204. 503-229-6251; Fax: 503-229-6977; E-mail: hope.bruce@deq.state.or.us

William Huber, President, Quantitative Decisions, 539 Valley View Road, Merion, PA 19066. 610-771-0606; Fax: 610-771-0607; E-mail: whuber@quantdec.com

B-1

Robert Lee, Risk Analyst, Golder Associates, Inc., 4104 148th Avenue, NW, Redmond, WA 98052. 206-367-2673; Fax: 206-616-4875; E-mail: rclee@u.washington.edu

David Miller, Chemist, Office of Pesticide Programs, Health Effects Division, U.S. Environmental Protection Agency, 401 M Street, SW (7509), Washington, DC 20460. 703-305-5352; Fax: 703-305-5147; E-mail: miller.david@epamail.epa.gov
Samuel Morris, Environmental Scientist, Deputy Division Head, Brookhaven National Laboratory, Building 815, 815 Rutherford Avenue, Upton, NY 11973. 516-344-2018; Fax: 516-344-7905; E-mail: morris3@bnl.gov

Jacqueline Moya, Environmental Engineer, National Center for Environmental Assessment, Office of Research and Development, U.S. Environmental Protection Agency, 401 M Street, SW (8623D), Washington, DC 20460. 202-664-3245; Fax: 202-565-0052; E-mail: moya.jacqueline@epamail.epa.gov

Christopher Portier, Chief, Laboratory of Computational Biology and Risk Analysis, National Institute of Environmental Health Sciences, P.O. Box 12233 (MD-A306), Research Triangle Park, NC 27709. 919-541-4999; Fax: 919-541-1479; E-mail: portier@niehs.nih.gov

P. Barry Ryan, Professor, Exposure Assessment and Environmental Chemistry, Rollins School of Public Health, Emory University, 1518 Clifton Road, NE, Atlanta, GA 30322. 404-727-3826; Fax: 404-727-8744; E-mail: bryan@sph.emory.edu

Brian Sassaman, Bioenvironmental Engineer, U.S. Air Force, DET 1, HSC/OEMH, 2402 E Drive, Brooks Air Force Base, TX 78235-5114. 210-536-6122; Fax: 210-536-1130; E-mail: brian.sassaman@guardian.brooks.af.mil

Ted Simon, Toxicologist, Federal Facilities Branch, Waste Management Division, U.S. Environmental Protection Agency, Atlanta Federal Center, 61 Forsyth Street, SW, Atlanta, GA 30303-3415. 404-562-8642; Fax: 404-562-8566; E-mail: simon.ted@epamail.epa.gov

Mitchell J.
Small Professor Departments of Civil & Environmental Engineering and Engineering & Public Policy Carnegie Mellon University Porter Hall 119, Frew Street Pittsburgh, PA 15213-3890 412-268-8782 Fax:412-268-7813 E-mail: ms35@andrew.cmu.edu Edward Stanek Professor of Biostatistics Department of Biostatistics and Epidemiology University of Massachusetts 404 Arnold Hall Amherst, MA 01003-0430 413-545-4603 Fax:413-545-1645 E-mail: stanek@schoolph.umass.edu Alan Stern Acting Chief Bureau of Risk Analysis Division of Science and Research New Jersey Department of Environmental Protection 401 East State Street P.O. Box 409 Trenton, NJ 08625 609-633-2374 Fax: 609-292-7340 E-mail: astern@dep.state.nj.us Paul White Environmental Engineer National Center for Environmental Assessment Office of Research and Development U.S. Environmental Protection Agency 401 M Street, SW (8623D) Washington, DC 20460 202-564-3289 Fax: 202-565-0078 E-mail: white.paul@epamail.epa.gov (over) B-2 image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } SEPA United States Environmental Protection Agency Risk Assessment Forum Workshop on Selecting Input Distributions for Probabilistic Assessment U.S. Environmental Protection Agency New York, NY April 21-22, 1998 Final List of Observers Samantha Bates Graduate Student/ Research Assistant Department of Statistics University of Washington Box 354322 Seattle, WA 98195 206-543-8484 Fax: 206-685-7419 E-mail: sam@stat.washington.edu Steve Chang Environmental Engineer Office of Emergency and Remedial Response U.S. Environmental Protection Agency 401 M Street, SW (5204G) Washington, DC 20460 703-603-9017 Fax: 703-603-9103 E-mail: chang.steve@ epamail.epa.gov Helen Chernoff Senior Scientist JAMS Consultants, Inc. 
655 Third Avenue, New York, NY 10017; 212-867-1777; Fax: 212-697-6354; E-mail: hchernoff@tamsconsultants.com

Christine Daily, Health Physicist, Radiation Protection & Health Effects Branch, Division of Regulatory Applications, U.S. Nuclear Regulatory Commission (T-9C24), Washington, DC 20555; 301-415-6026; Fax: 301-415-5385; E-mail: cxd@nrc.gov

Emran Dawoud, Human Health Risk Assessor, Toxicology and Risk Analysis Section, Life Science Division, Oak Ridge National Laboratory, 1060 Commerce Park Drive (MS-6480), Oak Ridge, TN 37830; 423-241-4739; Fax: 423-574-0004; E-mail: dawoudea@ornl.gov

Audrey Galizia, Environmental Scientist, Program Support Branch, Emergency and Remedial Response Division, U.S. Environmental Protection Agency, 290 Broadway, New York, NY 10007; 212-637-4352; Fax: 212-637-4360; E-mail: galizia.audrey@epamail.epa.gov

Ed Garvey, TAMS Consultants, 300 Broadacres Drive, Bloomfield, NJ 07003; 973-338-6680; Fax: 973-338-1052; E-mail: egarvey@tamsconsultants.com

Gerry Harris, UMDNJ-RWJMS, UMDNJ-EOHSI, Rutgers University, 170 Frelinghuysen Road - Room 234, Piscataway, NJ 08855-1179; 732-235-5069; E-mail: gharris@gpph.rutgers.edu

David Hohreiter, Senior Scientist, BBL, 6723 Towpath Road, P.O. Box 66, Syracuse, NY 13214; 315-446-9120; Fax: 315-446-7485; E-mail: dh%bbl@mcimail.com

Nancy Jafolla, Environmental Scientist, Hazardous Waste Management Division, U.S. Environmental Protection Agency, 841 Chestnut Building (3H541), Philadelphia, PA 19107; 215-556-3324; E-mail: jafolla.nancy@epamail.epa.gov

Alan Kao, Senior Science Advisor, ENVIRON Corporation, 4350 North Fairfax Drive - Suite 300, Arlington, VA 22203; 703-516-2308; Fax: 703-516-2393

Steve Knott, Executive Director, Risk Assessment Forum, Office of Research and Development, National Center for Environmental Assessment, U.S. Environmental Protection Agency, 401 M Street, SW (8601-D), Washington, DC 20460; 202-564-3359; Fax: 202-565-0062; E-mail: knott.steve@epamail.epa.gov

Stephen Kroner, Environmental Scientist, Office of Solid Waste, U.S. Environmental Protection Agency, 401 M Street, SW (5307W), Washington, DC 20460; 703-308-0468; E-mail: kroner.stephen@epamail.epa.gov

Anne LeHuray, Regional Risk Assessment Lead, Foster-Wheeler Environmental Corporation, 8100 Professional Place - Suite 308, Lanham, MD 20785; 301-429-2116; Fax: 301-429-2111; E-mail: alehuray@fwenc.com

Toby Levin, Attorney, Advertising Practices, Federal Trade Commission, 601 Pennsylvania Avenue, NW, Suite 4110, Washington, DC 20852; 202-326-3156; Fax: 202-326-3259

Lawrence Myers, Statistician, Research Triangle Institute, P.O. Box 12194, Research Triangle Park, NC 27709; 919-541-6932; Fax: 919-541-5966; E-mail: lem@rti.org

Marian Olsen, Environmental Scientist, Technical Support Section, Program Support Branch, Emergency and Remedial Response Division, U.S. Environmental Protection Agency, 290 Broadway, New York, NY 10007; 212-637-4313; Fax: 212-637-4360; E-mail: olsen.marion@epamail.epa.gov

Lenwood Owens, President, Boiler Servicing, 1 Laguardia Road, Chester, NY 10918

Zubair Saleem, Office of Solid Waste, U.S. Environmental Protection Agency, 401 M Street, SW (5307W), Washington, DC 20460; 703-308-0467; Fax: 703-308-0511; E-mail: saleem.zubair@epamail.epa.gov

Swati Tappin, Research Scientist, New Jersey Department of Environmental Protection, 401 East State Street, P.O.
Box 413, Trenton, NJ 08625; 609-633-1348

Joan Tell, Senior Environmental Scientist, Exxon Biomedical, Mettlers Road (CN 2350), East Millstone, NJ 08875; 732-873-6304; Fax: 732-873-6009; E-mail: joan.tell@exxon.sprint.com

Bill Wood, Executive Director, Risk Assessment Forum, Office of Research and Development, National Center for Environmental Assessment, U.S. Environmental Protection Agency, 401 M Street, SW (8601-D), Washington, DC 20460; 202-564-3358; Fax: 202-565-0062; E-mail: wood.bill@epamail.epa.gov

APPENDIX C
AGENDA

&EPA United States Environmental Protection Agency
Risk Assessment Forum Workshop on Selecting Input Distributions for Probabilistic Assessment
U.S. Environmental Protection Agency, New York, NY, April 21-22, 1998

Agenda
Workshop Chair: Christopher Frey, North Carolina State University

TUESDAY, APRIL 21, 1998
8:00 AM  Registration/Check-In
9:00 AM  Welcome Remarks: Representative from Region 2, U.S. Environmental Protection Agency (U.S. EPA), New York, NY
9:10 AM  Overview and Background: Steve Knott, U.S. EPA, Office of Research and Development (ORD), Risk Assessment Forum, Washington, DC
9:30 AM  Workshop Structure and Objectives: Christopher Frey, Workshop Chair
9:45 AM  Introduction of Invited Experts
10:00 AM  Presentation: Issue Paper #1 - Evaluating Representativeness of Exposure Factors Data: Jacqueline Moya, U.S. EPA, National Center for Environmental Assessment (NCEA), Washington, DC
10:15 AM  Presentation: Issue Paper #2 - Empirical Distribution Functions and Non-Parametric Simulation: Tim Barry, U.S. EPA, NCEA, Washington, DC
10:30 AM  BREAK
10:45 AM  Charge to the Panel: Christopher Frey, Workshop Chair
11:00 AM  Discussion on Issue #1: Representativeness
12:00 PM  LUNCH

TUESDAY, APRIL 21, 1998 (continued)
1:30 PM  Discussion on Issue #1 Continues
3:00 PM  BREAK
3:15 PM  Discussion on Issue #1 Continues: Christopher Frey, Workshop Chair
4:15 PM  Observer Comments
4:45 PM  Review of Charge for Day Two: Christopher Frey, Workshop Chair; Writing Assignments
5:00 PM  ADJOURN

WEDNESDAY, APRIL 22, 1998
8:30 AM  Planning and Logistics: Christopher Frey, Workshop Chair
8:40 AM  Summary of Discussion on Issue #1
10:00 AM  BREAK
10:15 AM  Discussion on Issue #2: Empirical Distribution Functions and Resampling Versus Parametric Distributions
12:00 PM  LUNCH
1:30 PM  Discussion on Issue #2 Continues
3:00 PM  BREAK
3:15 PM  Summary of Discussion on Issue #2: Christopher Frey, Workshop Chair; Writing Assignments/Session
4:15 PM  Observer Comments
4:45 PM  Closing Remarks
5:00 PM  ADJOURN

APPENDIX D
WORKSHOP CHARGE

Workshop on Selecting Input Distributions for Probabilistic Assessment
U.S. Environmental Protection Agency, New York, NY, April 21-22, 1998

Charge to Experts/Discussion Issues

This workshop is being held to discuss issues associated with the selection of probability distributions to represent exposure factors in a probabilistic risk assessment. The workshop discussions will focus on generic technical issues applicable to any exposure data. It is not the intent of this workshop to formulate decisions specific to any particular exposure factors. Rather, the goal of the workshop is to capture a discussion of generic issues that will be informative to Agency assessors working with a variety of exposure data.

On May 15, 1997, the U.S.
Environmental Protection Agency (EPA) Deputy Administrator signed the Agency's "Policy for Use of Probabilistic Analysis in Risk Assessment." This policy establishes the Agency's position that "such probabilistic analysis techniques as Monte Carlo Analysis, given adequate supporting data and credible assumptions, can be viable statistical tools for analyzing variability and uncertainty in risk assessments." The policy also identifies several implementation activities designed to assist Agency assessors with their review and preparation of probabilistic assessments. These activities include a commitment by the EPA Risk Assessment Forum (RAF) to organize workshops or colloquia to facilitate the development of distributions for exposure factors.

In the summer of 1997, a technical panel, convened under the auspices of the RAF, began work on a framework for selecting input distributions for use in Monte Carlo analyses. The framework emphasized parametric methods and was organized around three fundamental activities: selecting candidate theoretical distributions, estimating the parameters of the candidate distributions, and evaluating the quality of the fit of the candidate distributions. In September of 1997, input on the framework was sought from a 12-member panel of experts from outside of the EPA. The recommendations of this panel include:

• expanding the framework's discussion of exploratory data analysis and graphical methods for assessing the quality of fit,
• discussing distinctions between variability and uncertainty and their implications,
• discussing empirical distributions and bootstrapping,
• discussing correlation and its implications,
• making the framework available to the risk assessment community as soon as possible.
Subsequent to receiving this input, some changes were made to the framework and it was applied to selecting distributions for three exposure factors: water intake per body weight, inhalation rate, and residence time. The results of this work are presented in the attached report entitled "Development of Statistical Distributions for Exposure Factors." Applying the framework to the three exposure factors highlighted several issues. These issues resolved into two broad categories: issues associated with the representativeness of the data, and issues associated with using the empirical distribution function (or resampling techniques) versus using a theoretical parametric distribution function. Summaries for these issues are presented in the attached issue papers. These issues will be the focal point for discussions during this workshop. The following questions are intended to help structure and guide these discussions. In addressing these questions, workshop participants are asked to consider: what do we know today that can be applied to answering the question or providing additional guidance on the topic; what short-term studies (e.g., numerical experiments) could be conducted to answer the question or provide additional guidance; and what longer term research may be needed to answer the question or provide additional guidance.

Representativeness (Issue Paper #1)

1) The Issue Paper

Checklists I through IV in the issue paper present a framework for characterizing and evaluating the representativeness of exposure data.
This framework is organized into three broad sets of questions: questions related to differences in populations, questions related to differences in spatial coverage and scale, and questions related to differences in temporal scale. Do these issues cover the most important considerations for representativeness? Are the lists of questions associated with each issue complete? If not, what questions should be added? In a tiered approach to risk assessment (e.g., a progression from simpler screening level assessments to more complex assessments), how might the framework be tailored to each tier? For example, is there a subset of questions that adequately addresses our concerns about representativeness for a screening level risk assessment?

2) Sensitivity

The framework asks how important are (or how sensitive is the analysis to) population, spatial, and temporal differences between the sample (for which you have the data) and the population of interest. For example, to what extent do these differences affect our estimates of the mean and variance of the population and what is the magnitude and direction of these effects? What guidance can be provided to help answer these questions? What sources of information exist to help with these questions? Having answered these questions, what are the implications for the use of the data (e.g., use of the data may be restricted to screening level assessments in certain circumstances)?
What differences could be considered critical (i.e., what differences could lead to the conclusion that the assessment can't be done without the collection of additional information)?

3) Adjustments

The framework asks, is there a reasonable way of adjusting or extrapolating from the sample (for which you have data) to the population of interest in terms of the population, spatial, and temporal characteristics? If so, what methods should be used? Is there adequate information available to implement these methods? What guidance can be provided to help answer these questions? Can exemplary methods for making adjustments be proposed? What sources of information exist to help with these questions? What research could address some of these issues? Section 5 of the issue paper on representativeness describes methods for adjustments to account for differences in population and temporal scales. What other methods exist? What methods are available for spatial scales? Are there short-term studies that can be done to develop these methods further? Are there data available to develop these methods further? Are there numerical experiments (e.g., simulations) that can be done to explore these methods further?

Empirical Distribution Functions and Resampling Versus Parametric Distributions (Issue Paper #2)

1) Selecting the EDF or PDF

What are the primary considerations for assessors in choosing between the use of theoretical parametric distribution functions (PDFs) and empirical distribution functions (EDFs) to represent an exposure factor? Do the advantages of one method significantly outweigh the advantages of the other? Is the choice inherently one of preference? Are there situations in which one method is clearly preferred over the other? Are there circumstances in which either method of representation should not be used?

2) Goodness of Fit

On what basis should it be decided whether or not a data set is adequately represented by a fitted analytic distribution?
What role should the goodness-of-fit test statistic play (e.g., chi-square, Kolmogorov-Smirnov, Anderson-Darling, Cramer-von Mises, etc.)? How should the level of significance, i.e., p-value, of the goodness-of-fit statistic be chosen? What are the implications or consequences for exposure assessors when acceptance/rejection is dependent on the goodness-of-fit statistic chosen and an arbitrary level of statistical significance? What role should graphical examination of the quality of fit play in the decision as to whether a fit is acceptable or not? When the only data readily available are summary statistics (e.g., selected percentiles, mean, and variance), are fits to analytic distributions based on those summary statistics acceptable? Should any limitations or restrictions be placed in these situations? When the better known theoretical distributions (e.g., lognormal, gamma, Weibull, log-logistic, etc.) cannot provide an acceptable fit to a particular set of data, is there value in testing the fit of the more flexible generalized distributions (e.g., the generalized gamma and generalized F distributions) even though they are considerably more complicated and difficult to work with?

3) Uncertainty

Are there preferred methods for assessing uncertainty in the fitted parameters (e.g., methods based on maximum likelihood and asymptotic normality, bootstrapping, etc.)?
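The goodness-of-fit and uncertainty questions in the charge can be made concrete with a small numerical sketch: fit a candidate distribution by maximum likelihood, measure its Kolmogorov-Smirnov distance from the empirical distribution function, and bootstrap the fitted parameters. Everything here is illustrative, not part of the workshop materials: the sample is synthetic, the lognormal is just one candidate family, and the sample and bootstrap sizes are arbitrary choices.

```python
import math
import random
import statistics

random.seed(1)

# Synthetic stand-in for an exposure-factor sample (e.g., intake rates);
# the lognormal parent and the sample size are assumptions for the sketch.
data = [random.lognormvariate(0.0, 0.5) for _ in range(200)]

def fit_lognormal(xs):
    """Maximum-likelihood fit of a two-parameter lognormal distribution."""
    logs = [math.log(x) for x in xs]
    return statistics.fmean(logs), statistics.pstdev(logs)

def lognormal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))

def ks_statistic(xs, mu, sigma):
    """Kolmogorov-Smirnov distance between the EDF and the fitted CDF."""
    n, d = len(xs), 0.0
    for i, x in enumerate(sorted(xs)):
        f = lognormal_cdf(x, mu, sigma)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

mu_hat, sigma_hat = fit_lognormal(data)
d_stat = ks_statistic(data, mu_hat, sigma_hat)

# Nonparametric bootstrap: resample with replacement and refit, giving a
# rough picture of the uncertainty in the fitted location parameter.
boot = sorted(fit_lognormal(random.choices(data, k=len(data)))[0]
              for _ in range(500))
lo, hi = boot[12], boot[487]  # approximate 2.5th and 97.5th percentiles of 500
print(f"KS D = {d_stat:.3f}; bootstrap 95% interval for mu: ({lo:.3f}, {hi:.3f})")
```

A library-based workflow (e.g., `scipy.stats.lognorm.fit` and `scipy.stats.kstest`) would replace most of this with two calls; the hand-rolled version is shown only to keep the sketch self-contained. Note also that when parameters are estimated from the same data being tested, standard KS critical values are no longer exact, which bears directly on the charge's question about arbitrary significance levels.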
APPENDIX E
BREAKOUT SESSION NOTES

APPENDIX E: SMALL GROUP DISCUSSIONS/BRAINWRITING SESSIONS

During the workshop, the experts worked at times in smaller groups to discuss specific technical questions. Some of these sessions involved open discussions. Other sessions involved "brainwriting," during which individuals captured their thoughts on paper, in sequence, and then discussed similar and/or opposing views within each group. The outcomes of these sessions were captured by group rapporteurs and individual group members and are summarized below. This summary is a transcription of handwritten notes and is, as such, considered rough working notes. Information from these smaller group discussions was presented and deliberated in the plenary session, and partially forms the basis of the points presented in the main text of this report.

What information is required to fully specify a problem definition?

• Population at risk
• Sample under study (include biases)
• Spatial extent of exposure (micro, meso, macro scale)
• Exposure-dose relationship
• Dose-response-risk relationship
• Temporal extent (hours, days, months, years)
• Temporal variability about trend
• What is the "acceptable error"?
— yes/no
— categorization
— continuous
— quantitative
• Variability/uncertainty partitioning
— not needed
— desirable
— mandatory
• User of output
— scientific community
— regulatory community
— general public

One expert noted that the "previous problem definition" forces the blurring of the boundaries between modeling and problem description—for example, many may not consider the dose-exposure-risk relationship to be part of the problem definition. Another expert asked, "How much information do we have to translate from measured value to population of concern?" He described the population of concern, surrogate population, individuals sampled from the surrogate population, and how well measured value represents true value. Another agreed, emphasizing the importance of temporal, spatial, and temporal-spatial representativeness (e.g., Idaho potatoes versus Maine potatoes).
The problem definition should include these models (no population change over time) or assumptions (exposure calculated over 50- year duration/time frame). One must define the health outcome being targeted (e.g., acute vs. cancer vs. developmental). Define how you will link the exposure measure to a model for hazard and/or risk (margin of exposure has different data needs from an estimate of population risk). Also, one should consider the type of observation being evaluated (blood measurements vs. dietary vs. ecological). Tli is is more likely to have an impact on the representativeness of the data sample than anything else. " ' ' "i,1 '• - , ' ' • ,' • • i, "' ,i :,. • Define the target risk level; this will dictate what kind of data will be necessary. Xnother panelist agreed these are important points but questioned, however, whether these factors were part of problem definition. Specify the scope and purpose of the assessment (e.g., regulatory decision, set cleanup standards, etc.) Determining how much error we are willing to live with will determine how representative the data are. Specify the population of concern (who they are, where they live, what kinds of activities they are involved with). Problem definition is the most critical part of the process, and all stakeholders should be involved as much as possible. If the stakeholders come to a common understanding of the objectives of the process, the situation becomes focused. Although EPA has provided much guidance for problem definition (DQOs, DQAs, etc.), what data are necessary (and to what extent it must be representative) is a function of each individual problem. 
Certain basic questions are common to all problem definitions (who, what, when, E-2 image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } how); the degree to which each basic question is important is a function of the actual problem/situation. Decision performance requirements: What is acceptable at a specific site for a specific problem (i.e., what is the degree of decision error)? An answer to this question should be decided up front as much as possible to alleviate "bias" concerns. Attributes of the exposed population are key issues: — Who are they? — What are their activities/behaviors? —• Where are they? — When do they engage in activities and for how long? — Why are certain activities performed? The potential imprecision of "national" populations seems significant. Scale is important; maybe regional is as large as it gets. If representativeness is a property of the population, then we should focus on methods for collecting more specific data. Variability within a super-population (e.g., a national study) provides useful, quantifiable bounds to potential bias and gives an upper bound on the variability that could be found in a subpopulation. This suggests that there are quantitative ways to guide the use "reduce sparingly." The assessor needs to ask the following questions: Is a risk assessment necessary? What is the level of detail needed for the decision at hand? What is the scope of the problem? For example, -T- Who is at risk? — Who has standing [e.g., stakeholders]? — Who has special concerns? — What is of concern? — When are people exposed? (timeframe [frequency and duration], chronic vs. acute, level of time steps needed) — Where are people exposed—spatial considerations; scope of the problem (national, regional, site?) — How are people exposed? 
• The time step used in the model must be specified. The assessor must distinguish between a distribution needed for a one-day time step as compared to a one-year time step. Some models may run at different time steps (e.g., drinking water at a one-week time step to include seasonal variation; body weight at a one-year time step to include growth of a child).
• Consideration of a tiered approach is important in problem formulation. How are data to be used? If data are to be used in a screening manner, then conservativeness is even more important than representativeness. If more than a screening assessment is proposed, the assessor should consider what is the value added from more complex analyses (site-specific data collection, modeling, etc.).
• As probabilistic methods continue to be developed, it will become increasingly important to specify constraints in distributions. Boundaries exist. For example, no person can eat multiple food groups at the 95th percentile.
• Two panelists noted that tiered approaches would not change the problem definition. Generally, the problem is: Under an agreed set of exposure conditions, will the population of concern experience unacceptable risks? This question would not change with a more or less sophisticated (tiered) assessment.
• When evaluating unknown future population characteristics, we are dealing with essentially unknown conditions. It is not feasible, therefore, to have as a criterion that additional information will not significantly change the outcome of the analysis.
Instead, the problem needs to be defined in terms of a precise definition of population (in time and space) which is to be protected. To the extent that this is uncertain, it needs to be defined in a generalized, generic manner.
• Considerations of the "external" representativeness of the data to the population of concern are absolutely critical for "on the ground" risk assessments. The "internal" validity of the data is often a statistical question. It seems more important to ensure that the outcome of the assessment will not change based on the consideration of "external" representativeness of the data set to the population of concern.

What constitutes (lack of) representativeness?

General

The issue of data representativeness begs the question "representative of what?" In many (most?) cases, we are working backwards, using data in hand for purposes that may or may not be directly related to the reason the data were collected in the first place. Ideally, we would have a well-posed assessment problem with well-defined assessment endpoints. From that starting point, we would collect the relevant data necessary for good statistical characterization of the key exposure factors. More generally, we are faced with the question, "Can I use these data in my analysis?" To make that judgment fairly, we would have to go through a series of questions related to the data itself and to the use we intend to make of the data. We usually ignore many of these questions, either explicitly or implicitly. The following is an attempt at listing the issues that ought to affect our judgment of data relevance.

Sources of Variability and Uncertainty Related to the Assessment of Data Representativeness

EPA policy sets the standard that risk assessors should seek to characterize central tendency and plausible upper bounds on both individual risk and population risk for the overall target population as well as for sensitive subpopulations. To this extent, data representativeness cannot be separated from the assessment endpoint(s). The following outlines some of the key elements affecting data representativeness. The elements are not mutually exclusive.
To this extent, data representativeness cannot be separated from the assessment endpoint(s). the following outlines some of the key elements affecting data febfeSehtativeness. The elements are not mutually exclusive. I. *• ' . ill. ' . MI!, < I ' . • 'I .. ' „''!!" "I' ' , , ' ,, ', 'I 1 i, " . „ , , ,, E-4 image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } Exposed Population general target population particular ethnic group known sensitive subgroup (children, elderly, asthmatics, etc.) occupational group (applicators, etc.) age group (infant, child, teen, adult, whole life) sex activity group (sport fishermen, subsistence fishermen, etc.) Geographic Scale, Location trends (stationary, non-stationary behaviors) past, present, future exposures lifetime exposures less-than-lifetime exposures (hourly, daily, weekly, annually, etc.) 
• temporal characteristics of source(s): continuous, intermittent, periodic, concentrated (spike), random

Exposure Route
• inhalation
• ingestion (direct, indirect)
• dermal (direct)
• contact (by activity, e.g., swimming)
• multiple pathways

Exposure/Risk Assessment Endpoint
• cancer risk
• non-cancer risk (margin of exposure, hazard index)
• potential dose, applied dose, internal dose, biologically effective dose
• risk statistic: mean, uncertainty percentile of mean, percentile of a distribution (e.g., 95th percentile risk), uncertainty percentile of a variability percentile (upper credibility limit on 95th percentile risk), plausible worst case, uncertainty percentile of plausible worst case

Data Quality Issues
• direct measurement, indirect measurement (surrogates)
• modeling uncertainties
• measurement error (accuracy, precision, bias)
• sampling error (sample size, non-randomness, independence)
• monitoring issues (short-term, long-term, stationary, mobile)

• Almost all data used in risk assessment are not representative in one or more ways. What is important is the effect the lack of representativeness has on the risk assessment in question. If the water pathway, for example, is of minor concern, it will not matter if the water-consumption rate distribution is not representative. A lack of representativeness could mean the risk assessment results fail to be protective of public health or grossly overestimate risks.
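The "uncertainty percentile of a variability percentile" listed among the risk statistics is the kind of quantity a two-dimensional Monte Carlo analysis produces: an outer loop samples uncertain distribution parameters, an inner loop samples person-to-person variability given those parameters, and the upper credibility limit on a variability percentile is read off across outer-loop iterations. The sketch below is purely illustrative; the lognormal intake model, the parameter values, and their standard errors are invented for the example, not taken from any assessment.

```python
import random

random.seed(2)

# Outer loop: uncertainty about the parameters of an intake distribution
# (the means, sds, and standard errors below are invented assumptions).
# Inner loop: person-to-person variability given those parameters.
N_UNCERTAINTY, N_VARIABILITY = 200, 1000
p95_values = []
for _ in range(N_UNCERTAINTY):
    mu = random.gauss(2.0, 0.1)           # uncertain log-scale mean
    sigma = abs(random.gauss(0.5, 0.05))  # uncertain log-scale sd
    people = sorted(random.lognormvariate(mu, sigma)
                    for _ in range(N_VARIABILITY))
    # 95th percentile of variability for this parameter draw
    p95_values.append(people[int(0.95 * N_VARIABILITY) - 1])
p95_values.sort()
# 95% upper credibility limit on the 95th percentile of variability
ucl = p95_values[int(0.95 * N_UNCERTAINTY) - 1]
print(f"95% UCL on the 95th-percentile value: {ucl:.1f}")
```

The design choice worth noting is the strict separation of the two loops: collapsing variability and uncertainty into a single sampling loop would yield one mixed distribution from which a statistic like an upper credibility limit on a percentile cannot be recovered.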
Factors that could have a major impact in terms of one problem/site need not have the same impact across all problems/sites. Decision performance requirements should therefore be considered with problem/site-specific goals and objectives factored into the process. The definition of representativeness depends on how much error we are willing to live with. What is "good enough" will be case specific. Going through some case studies using assessments done for different purposes can shed some light on defining representativeness.

"With regard to exposure factors, we [EPA] need to do a better job at specifying or providing better guidance on how to use the data that are available." For example, the soil ingestion data for children are limited, but may be good enough to provide an estimate of a mean. The data are not good enough to support a distribution or a good estimate of a high-end value.

Representativeness measures the degree to which a sample of values for a given endpoint accurately and precisely (adequately) describes the value(s) of that endpoint likely to be seen in a target population. A number of issues relate to the lack of representativeness which one can use to decide upon use of a sample in a given case: The context of the observation is important. In addition to those mentioned in the Issue Paper (demographic, technical, social), other concerns include what is being measured: environmental sample (water, air, soil) versus human recall (diet) versus tissue samples in humans (e.g., blood). In most cases, provided good demographic and social information is available on key issues associated with the exposure, adjustments can be made to make a sample representative for a new population. Technical issues sometimes must be "guessed" from one sample to another (key issues like different or poor analytic techniques, altered consumption rates, etc.).
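The point about the child soil-ingestion data (a mean may be estimable while the high end is not) can be illustrated by simulation. The lognormal shape and the study size of 25 below are assumptions for illustration, not actual EPA values:

```python
import math
import random

random.seed(2)

MU, SIGMA = 3.0, 1.0                       # illustrative lognormal parameters
true_mean = math.exp(MU + SIGMA ** 2 / 2)  # analytic lognormal mean
true_p95 = math.exp(MU + 1.645 * SIGMA)    # analytic 95th percentile

def small_study(n=25):
    """One hypothetical small study: estimate the mean and 95th percentile."""
    draws = sorted(random.lognormvariate(MU, SIGMA) for _ in range(n))
    return sum(draws) / n, draws[int(0.95 * n)]

means, p95s = [], []
for _ in range(2000):
    m, p = small_study()
    means.append(m)
    p95s.append(p)

def rel_spread(estimates, truth):
    """Width of the central 90% of the estimates, relative to the truth."""
    s = sorted(estimates)
    return (s[1899] - s[100]) / truth

print("relative spread of the mean estimator:", round(rel_spread(means, true_mean), 2))
print("relative spread of the p95 estimator:", round(rel_spread(p95s, true_p95), 2))
```

Across many replications of the same small study, the upper-percentile estimator scatters far more (relative to its target) than the mean estimator, which is why limited data may support a mean but not a high-end value.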
A sample should not be used if it is flawed due to one of the following factors:
1) inappropriate methods (sample design and technical methods)
2) lack of descriptors (demographic, technical, social) to make adjustments
3) inadequate size for target measure

The above applies to the internal analysis of a sample. Human recall includes behavioral activities (e.g., time spent outdoors or indoors, number of days away from site). Identifying differences (as defined by the final objective) between characteristics of the subject population and the surrogate population will generally be subjective because there are usually no data for the subject population. Differences might be due to socioeconomic differences, race, or climate. Lack of representativeness should not be treated "too rigidly," partly due to uncertainties and partly because the subject population usually includes a future population that is even less well defined than the current population.

The surrogate population may overlap (as in age/sex distribution) with the target population. A context is needed to determine what constitutes "lack of representativeness." For example, if soil ingestion is not related to gender, then while the surrogate population may be all female, it may not imply that the estimates from the surrogate population cannot be used for a target population (including males and females). Bottom line: the factor being represented (such as gender) needs to be related to the outcome (soil ingestion) before the non-representativeness is important. Lack of representativeness "depends" in this sense on the association. Another panelist expanded on the above, noting that the outcome determines the representativeness of the surrogate data set.
If in the eyes of the "beholder" the data are "equivalent," they represent the actual population well. Defining representativeness is like defining art: one cannot describe it well; it is easily recognized, but recognition is observer-dependent. We should strive to remove subjectivity as best as possible without making inflexible choices. Representativeness suggests that our exposure/risk model results are a reasonable approximation of reality. At minimum, they pass a straight-face test. Representativeness could therefore be assessed via model calibration and validation. Representativeness often cannot be addressed unless an expert-judgment-based approach is used. It requires brainstorming based upon some knowledge of how the target population may differ from the surrogate one. In the long run, collection of more data is needed to reduce the non-representativeness of those distributions upon which decisions are based.

Define the characteristics to be examined, define the population to be evaluated, select a statistically significant sample that reflects defined characteristics of the population (another expert noted that statistical significance has little relevance to the problem of representativeness—the issue is the degree of uncertainty or bias). Ensure randomness of a sample to capture the entire range of population characteristics. (Another noted that the problem is that we usually don't have such a sample but have to make a decision or take action now. If we can quantitatively evaluate representativeness, then we can at least make objective determinations of whether this lack of representativeness will materially affect the decisions.)

The degree of bias that exists between a data set or sample and the problem at hand—is the sample even relevant to the problem? Types:
- Scenario: Is a "future residential" scenario appropriate to the problem at hand?
- Model: Is a multiplicative, independent-variable model appropriate?
- Variables: Is a particular study appropriate to the problem? Is it biased? Uncertain?

Two experts agreed that statistical significance has little relevance to the problem of representativeness. A well-designed controlled randomized study yielding two results can be "representative" of the mean and dispersion, albeit highly imprecise.

Representativeness exists when the data sample is drawn at random from the population (including temporal and spatial characteristics) of concern, or is a census in the absence of measurement error. This condition is potentially lacking when using surrogate data that are for a population that differs in any way from the population of concern. Important differences include:
- characteristics of individuals (e.g., age, sex, etc.)
- geographic locations
- averaging time
- dynamics of population characteristics over the time frame needed in the study
- large measurement errors

Non-representativeness poses a problem if we have biases in any statistic of interest (i.e., lack of representativeness can lead to biases in the mean, standard deviation, 95th percentile, etc.). Bias, or lack of accuracy, is typically more important than lack of precision. For example, we can expect some imprecision in our estimate of the 95th percentile of a population characteristic (e.g., intake rate) due to lack of relevant "census" data, but we hope that on average our assessment methods do not produce a bias or systematic error.
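The bias-versus-precision point can be made concrete with two hypothetical studies of the same target mean: one unbiased but small, one large but drawn from a shifted surrogate population. All numbers are illustrative:

```python
import math
import random

random.seed(3)

TRUE_MU, SD = 1.0, 0.5
true_mean = math.exp(TRUE_MU + SD ** 2 / 2)  # analytic mean of the target lognormal

def study_mean(mu, n):
    """Mean intake from one simulated study of size n."""
    return sum(random.lognormvariate(mu, SD) for _ in range(n)) / n

# Study A: unbiased but imprecise (small n, right population).
# Study B: precise but biased (large n, shifted surrogate population).
a_means = [study_mean(TRUE_MU, 10) for _ in range(500)]
b_means = [study_mean(TRUE_MU + 0.4, 2000) for _ in range(500)]

a_err = sum(a_means) / 500 - true_mean  # averages out to near zero
b_err = sum(b_means) / 500 - true_mean  # systematic offset that never averages out
print(f"average error, unbiased but imprecise study: {a_err:+.3f}")
print(f"average error, precise but biased study:     {b_err:+.3f}")
```

Imprecision shrinks as studies are repeated or enlarged; the surrogate's systematic offset does not, which is why bias is typically the more serious concern.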
Conversely, if we have a large amount of uncertainty in our estimates for a sample distribution, then it is harder to claim non-representativeness than when a particular distribution for a surrogate is estimated precisely. In the following example, the distribution for the surrogate population is non-representative of the target population since it has too wide a variance. However, the uncertainty in the surrogate encompasses outcomes which could include the target population. Thus, in this case it may be difficult to conclude, based upon the wide range of uncertainty, that the surrogate is non-representative.

[Figure: distribution for the target population; nominal distribution for the surrogate population; range of uncertainty on the surrogate population distribution due to measurement error, small sample size, etc.]

Representativeness in a given exposure variable is determined by how well a given data set reflects the characteristics of the population of concern. Known characteristics of the data that distinguish the data set from the population of concern may indicate a need for adjustment. Areas of ignorance regarding the data set and the population of concern should be considered uncertainties. Representativeness or lack thereof should be determined in a brainstorming session among stakeholders; toxicologists, statisticians, engineers, and others may all have information that bears on the representativeness of the data. Known or suspected differences between the data set and the population of concern diminish representativeness.

The question as to what constitutes representativeness is contingent on the problem definition—that is, who is to be represented, at what point in time, etc.
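Whether a surrogate sample can even be distinguished from the target, given sampling uncertainty, can be checked with an EDF-based comparison. Below is a minimal two-sample Kolmogorov-Smirnov sketch with a permutation reference; both distributions are hypothetical, with the surrogate deliberately given too wide a variance as in the example above:

```python
import random

random.seed(4)

# Target and surrogate differ (the surrogate variance is too wide), but with
# small samples the difference may be impossible to establish.
target = [random.gauss(10.0, 1.0) for _ in range(30)]
surrogate = [random.gauss(10.0, 2.0) for _ in range(30)]

def ks_stat(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max distance between EDFs."""
    def edf(data, x):
        return sum(v <= x for v in data) / len(data)
    return max(abs(edf(a, x) - edf(b, x)) for x in sorted(a + b))

observed = ks_stat(target, surrogate)

# Permutation reference: how large does the statistic get when the two
# labels are shuffled, i.e., under "no real difference"?
pooled = target + surrogate
exceed = 0
for _ in range(500):
    random.shuffle(pooled)
    if ks_stat(pooled[:30], pooled[30:]) >= observed:
        exceed += 1
p_value = exceed / 500

print(f"KS statistic: {observed:.2f}, permutation p-value: {p_value:.2f}")
```

With only 30 observations apiece the test may well fail to flag the mismatch, which is the situation described above: the surrogate is non-representative, yet its uncertainty range covers the target.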
If the goal is to represent a well-characterized population in the present, representativeness for a given parameter (e.g., drinking water consumption) should be evaluated based on the match of the surrogate data to the data for the population of concern relative to key correlates of the parameter (e.g., for drinking water volume, age, average ambient temperature, etc.). If, on the other hand, the population of concern is not well characterized in the present, or if the intent of the risk assessment is to address risk into the indefinite future, representativeness does not appear to have a clear meaning. The goal in such cases should be to define reasonable screening characteristics of a population at an indefinite point in time (e.g., maximum value, minimum value, estimated 10th percentile, estimated 90th percentile) and select such values from a semi-quantitative analysis of the available surrogate data. A representative surrogate sample is one that adds information to the assessment beyond the current state of knowledge. However, both the degree to which it adds information and the remaining uncertainty in the risk characterization must be identified. Suggestion: Replace the word representative with "useful and informative." A data set is representative of a characteristic of the population if it can be shown that differences between the data set and the population of concern will not change the outcome of the assessment. In practice, a data set should be considered in terms of its similarity and difference to the population of concern and expectations as to how the differences might change the outcome. Of course, these expectations may lead to adjustments in the data set which would make it potentially more representative of the population. In part, what degree of comfort the risk assessor/reviewer needs to have for the population under consideration determines how representative data have to be. 
Also of concern is where in the population of concern observations will take place. Are we comparing data means or tails (outliers)? What degree of uncertainty and variability between the population of concern and the surrogate data is the assessor willing to live with?

We may be using the term "representativeness" too broadly. Many of the issues seem to address the "validity" of the study being evaluated. However, keeping with the broad definition, the following apply to internal representativeness:
- Measurement reliability. Measurement reliability refers to whether the study correctly measures what it set out to measure and provides some basis for evaluating the error in measurement.
- Bias in sampling. Bias in sampling presupposes that there is a "population" that was sampled and not just a haphazard collection of observations and measurements.
- Statistical sampling error.

The following issues apply to external representativeness:
- Did the study measure what we need to know (e.g., short-term vs. long-term studies)? If there is a statistical procedure for translating measurements into an estimate of the needed values, the validity and errors involved must be considered.
- "Representativeness" implies that the sample data are appropriate to another population in an assessment.

What considerations should be included in, added to, or excluded from the checklists?
Expand to include other populations of concern (e.g., ecological, produce). The issue paper and checklist seem to presuppose that the population of concern is the human population.

Include more discussion on criteria for determining if a question is adequately and appropriately answered. Clarify definitions (e.g., internal versus external). Include "worked" examples:
- Superfund-type risk assessment
- Source-exposure-dose-effect-risk example
- Include effect of bias, misclassification, and other problems

Ask if factors are known or suspected of being associated with the outcome measured. Was the distribution of factors known or suspected to be associated with the outcome spanned by the sample data? Focus on the outcome of risk assessments (if populations are different, does it make any real difference in the outcome of the assessment?).

How will the exposures be used in risk assessment? For example, is the sample representative enough to bound the risk? In judging the quality of a sample, especially with questionnaire-based data, determine whether a consistency check was put in the forms and the degree to which individual samples are consistent. Risk assessors must be able to review the survey instrument.

Internal and external lists may each need some reorganization (for example, measurement issues vs. statistical bias and sampling issues for "internal;" extrapolation to a different population vs. reanalysis/reinterpretation of measurement data for "external").
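Extrapolation to a different population often reduces to reweighting the surrogate sample on shared covariates. A minimal post-stratification sketch follows; the records, shares, and field names are all hypothetical:

```python
from collections import Counter

# Tiny hypothetical surrogate sample with one covariate (age group).
surrogate = [
    {"age_group": "child", "intake": 0.8},
    {"age_group": "child", "intake": 1.1},
    {"age_group": "adult", "intake": 1.9},
    {"age_group": "adult", "intake": 2.3},
    {"age_group": "adult", "intake": 2.0},
]

# Assumed age-group shares in the population of concern (census-type data).
target_shares = {"child": 0.6, "adult": 0.4}

counts = Counter(r["age_group"] for r in surrogate)
n = len(surrogate)

# Weight each record by (target share) / (surrogate share) for its group.
for r in surrogate:
    share_in_sample = counts[r["age_group"]] / n
    r["weight"] = target_shares[r["age_group"]] / share_in_sample

unweighted = sum(r["intake"] for r in surrogate) / n
weighted = sum(r["weight"] * r["intake"] for r in surrogate) / sum(
    r["weight"] for r in surrogate
)
print(f"unweighted mean intake: {unweighted:.2f}")
print(f"reweighted mean intake: {weighted:.2f}")
```

Consistent with the earlier gender/soil-ingestion point, this adjustment only helps when the covariate is actually associated with the outcome; reweighting on an irrelevant factor changes nothing of substance.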
Is a good set of subject descriptors (covariates such as age, ethnicity, income, education, or other factors that can affect behavior or response) available for both the population sampled and the population of concern to allow for correlations and adjustments based on these? How valuable would some new or additional data collection be for the population of concern to confirm the degree of representativeness of the surrogate population and better identify and estimate the adjustment procedure? What is the endpoint of concern and what decision will be based on the information that is gathered?

Since risk assessment involves a tiered approach, the checklist should focus around the following type of question: Do I have enough information about the population (type, space, time) to allow answering the questions at this tier, and is my information complete enough that I can make a management decision? Do I need to go through all of the checklists before I can stop? (Questioning application/implementation.)

The checklists should address how much is known about the population of concern relative to the adaptation of the surrogate data. If the population of concern is inadequately characterized, then the ability to consider the representativeness of the surrogate data is limited, and meaningless adjustment will result. One consideration that is missing from the checklists is the fact that risk assessments are done for a variety of purposes. A screening-level assessment may not need the level of detail that the checklists include. The checklists should be kept as simple and short as possible, trying to avoid redundancy. The checklist should be flexible enough to cover a variety of different problems and should be only a guide on how to approach the problem. The more considerations included the better. Guidance is needed on how to address overlap of the checklists.
For example, when overlap exists (e.g., in some spatial and temporal characteristics), which questions in the checklist are critical? The guidance could use real-life case studies to help focus the risk assessor on the issues that are critical to representativeness. Move from a linear checklist format to a flowchart/framework centered around the "critical" elements of representativeness. Fold in the nature of tiered analysis. The requirements of a screening-level assessment must be different from those of a full-blown risk assessment. Identify threshold (make or break) issues to the extent possible (i.e., minimum requirements). When biases due to lack of representativeness are suspected, how can we judge which direction those biases take (high or low)?

Include a "box" describing cases when "nonrepresentative" and "inadequate" data will need to be used in a risk assessment (which is common)... Figure 1? Define ambiguous terms, such as "reasonable" and "important." Make the checklist more than binary (yes, no)—allow for qualitative evaluation of data. Key questions: Can data be used at all? If so, do we have a great deal of confidence in them or not? Are data biased high or low? Can data be used in a quantitative, semi-quantitative, or only a qualitative manner? Standards according to which checklist items are evaluated should be consistent with the stated objective (e.g., a screening assessment will require less stringent evaluation of a data set than a site assessment where community concerns or economic costs are critical issues).
Allow for professional judgement and expert elicitation. What are the representativeness decision criteria? Data only have to be good enough for the problem at hand; there are no perfect data. List some considerations pertaining to the acceptance/rejection criteria.

- The 95th percentile of each input distribution is not needed to forecast risk at the 95th percentile with high accuracy and low uncertainty.
- What is the study population doing? (i.e., were the sample population and study population engaged in similar activities?) Consider how their behavior affects the ability to represent.
- Combine Checklists II, III, and IV into one.
- Distinguish between marginal distributions vs. joint distributions vs. functional relationships.
- Distinguish variability from uncertainty. Add a crisp definition of each (e.g., Burmaster's premeeting comments).
- Add explicit encouragement and positive incentives to collect and analyze new data.
- Add an explicit statement that the agency encourages the development and use of new methods and that nothing in this guidance should be interpreted as blocking the use of alternative or new methods.
- Add an explicit statement that it is always appropriate to combine information from several studies to develop a distribution for an exposure factor. (This also applies to toxicology and the development of distributions for reference doses and cancer slope factors.)

How can one perform a sensitivity analysis to evaluate the implications of non-representativeness? How do we assess the importance of non-representativeness?
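The first bullet above, that the 95th percentile of every input is not needed to forecast risk at the 95th percentile, can be checked directly by Monte Carlo. The two-input multiplicative model and lognormal inputs here are illustrative:

```python
import random

random.seed(5)

# Illustrative multiplicative risk model: risk = concentration x intake.
N = 20000
conc = [random.lognormvariate(0.0, 0.5) for _ in range(N)]
intake = [random.lognormvariate(0.0, 0.5) for _ in range(N)]
risk = [c * i for c, i in zip(conc, intake)]

def p95(values):
    return sorted(values)[int(0.95 * len(values))]

# Setting every input at its 95th percentile overshoots the 95th
# percentile of the output distribution.
compounded = p95(conc) * p95(intake)
direct = p95(risk)
print(f"product of the input 95th percentiles: {compounded:.2f}")
print(f"95th percentile of simulated risk:     {direct:.2f}")
```

Because it is unlikely that independent inputs are simultaneously at their high ends, the compounded point estimate sits well above the true 95th percentile of risk.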
The assessor should ask, "under a range of plausible adjustments from the surrogate population to the population of concern, does (or can) the risk management decision change?" That is, do these particular assumptions and their uncertainty matter (among all others)? Representativeness is often not that important, because risk management decisions are usually not designed to protect just the current population at a particular location, but a range of possible target populations (e.g., future site or product users) under different possible scenarios.

Theoretically, we can come up with a "perfect" risk assessment in terms of representativeness, but if the factor(s) being evaluated is not important, then the utility of this perfectly representative data is limited. The important question to ask is: If one is wrong, what are the consequences, and what difference do the decision errors make in the estimate of the parameter being evaluated?

The question of data representativeness can be asked absent the context/model/parameter, or it can be asked in the context of a decision or analysis (are the data adequate?). The key is placing bounds on the use of the data. Assessments should be put in context, as should the level at which surrogate data may be representative. It should be defined in the context of the purpose of the original study. Two other factors are critical: sensitivity and cost/resource allocation. The question, therefore, is situation-specific. A sensitivity analysis can be conducted in the context of the following tiered approach: the importance of a parameter (as evidenced by a sensitivity analysis) is determined first, making the representativeness or non-representativeness of the non-sensitive parameters unimportant.
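A first-tier sensitivity screen of the sort described above can be as simple as a rank correlation between each input and the output of a Monte Carlo run. The model and inputs here are illustrative only:

```python
import random

random.seed(6)

# Illustrative additive exposure model in which the water term dominates.
N = 5000
water = [random.lognormvariate(0.0, 0.8) for _ in range(N)]
soil = [random.lognormvariate(0.0, 0.1) for _ in range(N)]
risk = [w + 0.01 * s for w, s in zip(water, soil)]

def ranks(values):
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order):
        r[i] = rank
    return r

def rank_corr(a, b):
    """Spearman rank correlation (no ties expected for continuous draws)."""
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    mean = (n - 1) / 2
    cov = sum((x - mean) * (y - mean) for x, y in zip(ra, rb)) / n
    var = sum((x - mean) ** 2 for x in ra) / n
    return cov / var

print(f"rank correlation, water vs. risk: {rank_corr(water, risk):.2f}")
print(f"rank correlation, soil vs. risk:  {rank_corr(soil, risk):.2f}")
```

Inputs with negligible rank correlation to the output can tolerate crude, possibly non-representative distributions at the first tier; effort is better spent on the dominant inputs.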
Representativeness is not a standard statistical term. Statistical terms that may be preferable include bias and consistency. When evaluating the importance of non-representativeness, one needs to evaluate the uncertainty on the data set and on the individual. At the first level the assessor may choose a value biased high (could be a point value or a distribution that is shifted up). At the second level, one can use an average, but must still be sensitive to whether acute or chronic effects are being evaluated. When looking at the individual sample it is more important to have a representative sample because the relevant data are in the tails (more important for acute toxicity). When using a mixture, representativeness is less of a problem.

Adjustments
- Take more human tissue samples to back-calculate; this makes the local population happier. Determine the need for cleanup based on tissue sample findings.
- Re-do large samples (e.g., food consumption, tapwater consumption). Look at demographics, etc., and determine the most sensitive factor(s).

[Flowchart: given a model and parameters, is there enough data for a sensitivity analysis? If not, collect more data if possible. After the sensitivity analysis: is there enough data to characterize parameter variability, and is it representative of the population? If yes, proceed to risk analysis; if not, is there enough data to bound the parameter estimate (bounding estimate), or is adjustment needed?]

- Use a general model. Discuss with stakeholders the degree of inclusion in general. Adjust the model with survey data if it is not applicable to stakeholders. Use a special model for subpopulations if necessary.
"Change of support" analysis; time-series analysis — non-CERCLA, important to the Food Quality Protection Act/ Conduct three-day surveys with year-long adjustments. E-14 image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } Hypothesis methods will work, but need to be tested. The group recommended holding a workshop for experts in related fields to share existing theory and methods on adjustment (across fields). General guidelines for adjustments will be acceptable, but often site-specific needs dictate what adjustments must be made. Example adjustment: Fish consumption: If you collect data 3 days per week, you may miss those who might eat less—a case of inter- versus intra-individual variability. Adjustment is often difficult because of site specifics and evaluator bias or professional judgement. Sometimes it is not possible to adjust. Using an alternate surrogate data set makes it possible to set some plausible bounds to perform a screening risk assessment. Stratify data to see if any correlation exists. Start with brainstorming. Regression relationship versus threshold. Covariance; good statistical power to sample population. Correlation is equivalent to regression analysis as long as you keep the residual (Bayesian presentation). Instead of looking at the population, look at the individual (e.g., breathing rates or body weight for individuals from ages 0 to 30) to establish correlations. What if the population was misrepresented? For example, population of concern is sport fishermen but the national data represent other types of fishermen. Set up a hierarchy: — do nothing (may fall out when bounded) — conservative/plausible upper bound — use simple model to adjust the data (may be worth the effort if credibility issues are dealt with) . 
- resample/collect more data

Before considering a bounding approach (model development), consider if refining is necessary or cost-beneficial. Are there situations in which "g-estimates" are worthwhile? What is gained by making adjustments?

Short-term studies overestimate variability because they do not separate interindividual from intraindividual variability (the upper tail is overstated).

Can we estimate the direction of biases when populations are mismatched? If the bias is conservative, then we are being protective. But what if the bias is nonconservative (e.g., drinking water in the Mojave Desert or by construction workers)?

Appropriate models:
- Simplistic: How speculative? Identify potential damage due to credibility issues.
- Complex: Identify the bias: high (conservative), or low (different scenario used than plausible bounding analysis)?

Unless one has a sense of the likelihood of the scenario, what does one do?
- Risk management can address it.
- Present qualitative statements about uncertainty.
- Value of information approaches (e.g., does weather change drinking water data?).

Short-term Research: Evaluate a short-term data set; make assumptions, devise models on population variability (Ryan paper) (Wallace and Buck). Look at behavior patterns, information biases. Flesh out Chris Portier's suggestion on extrapolating 3-day data to 6 months or years.
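The short-term-survey problem noted above can be sketched with a simple variance-components simulation: each person has a long-run mean (between-person variability) plus day-to-day scatter (within-person variability), so a 3-day survey overstates the spread of long-run averages. All parameters are illustrative:

```python
import random

random.seed(7)

PEOPLE = 2000
BETWEEN_SD, WITHIN_SD = 1.0, 2.0  # assumed variance components

def average_over(n_days):
    """One person's average intake over an n_days survey window."""
    personal_mean = random.gauss(10.0, BETWEEN_SD)
    return sum(random.gauss(personal_mean, WITHIN_SD) for _ in range(n_days)) / n_days

short_term = [average_over(3) for _ in range(PEOPLE)]   # 3-day survey averages
long_term = [average_over(365) for _ in range(PEOPLE)]  # ~annual averages

def sd(values):
    m = sum(values) / len(values)
    return (sum((v - m) ** 2 for v in values) / len(values)) ** 0.5

print(f"spread (SD) of 3-day averages:  {sd(short_term):.2f}")
print(f"spread (SD) of annual averages: {sd(long_term):.2f}")
```

Averaging over more days shrinks the within-person component toward zero, leaving only true between-person variability; an extrapolation model of this kind is one way to deflate the overstated upper tail of a short survey.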
This would give the assessor some confidence in the interindividual variability.

Long-term Research: Collect more data! Possible ORD funding? Look at breathing rates, soil ingestion, infrequently consumed items, frequently consumed items.

APPENDIX F

PREMEETING COMMENTS

Workshop on Selecting Input Distributions for Probabilistic Assessments
Premeeting Comments
New York, New York
April 21-22, 1998

Compiled by:
Eastern Research Group, Inc.
110 Hartwell Avenue
Lexington, MA 02173

Table of Contents: Reviewer Comments
Sheila Abraham ......... F-3
Robert Blaisdell ....... F-10
David Burmaster ........ F-17
Bruce Hope ............. F-22
William Huber .......... F-25
Robert Lee ............. F-33
Samuel Morris .......... F-38
P. Barry Ryan .......... F-43
Mitchell Small ......... F-50
Edward Stanek .......... F-56
Alan Stern ............. F-63

Sheila Abraham
EPA Probability Workshop

COMMENTS ON THE ISSUE PAPERS / DISCUSSION ISSUES FOR THE EPA WORKSHOP ON SELECTING INPUT DISTRIBUTIONS FOR PROBABILISTIC ASSESSMENT

Probabilistic analysis techniques are, as stated in EPA's May 1997 "Guiding Principles for Monte Carlo Analysis," viable tools in the risk assessment process provided they are supported by adequate data and credible assumptions. In this context, the risk assessor (or risk assessment reviewer) needs to be sensitive to the real-life implications on the receptors of site-specific decisions based on the analysis of variability and uncertainty. The focus should be on the site, in a holistic manner, and all components of the risk assessment should be recognized as tools and techniques used to arrive at appropriate site-specific decisions. Preliminary (generalized) comments from a risk assessment perspective on the issue papers are provided below, as requested.

Evaluating Representativeness of Exposure Factors Data (Issue Paper #1)

1) The Issue Paper (Framework/Checklists): Overall, the issue paper provides a structured framework for a systematic approach for characterizing and evaluating the representativeness of exposure data. However, one of the clarifications that could be provided (in the narrative, checklists and figure) relates to the explicit delineation of the objectives of the exercise of evaluating data representativeness. The purpose of the original study should also be evaluated in the context of the population of concern. In other words, factoring the Data Quality Objectives (DQOs) and the Data Quality Assessment (DQA) premises into the process could help define decision performance requirements.
It could also help to evaluate sampling design performance over a wide range of possible outcomes, and address the necessity for multi-staged assessment of representativeness. As stated in the DQA
Guidance (1997), data  quality (including representativeness) is meaningful only when it
relates to the intended  use of the data.

On the query related to the tiered approach to ("forward") risk assessment: site-specific
screening risk assessments typically tend to be deterministic and have been conducted
using conservative default assumptions; the screening level tables provided by certain
U.S. EPA regions have to this point also been deterministic. Therefore the utility of the
checklists at this type of screening  level might be extremely limited. As one progresses
through increasing levels of analytical sophistication, the screening numbers generated
from probabilistic assessment may require a subset of the checklists to be developed;
the specificity of the checklists should be a function of the critical exposure parameters
identified through a sensitivity analysis. Such analyses might also help refine the
protocol (criteria and hierarchy) for assessing data set representativeness in the event
of overlap of the individual, population and temporal characteristics (example, inhalation
activity in elementary school students in the Columbus area exposed to contaminants at
a school ballfield).

2) Sensitivity:
The utility of a sensitivity analysis cannot be overemphasized.  Currently, there appears
to be a tendency to use readily available software to generate these analyses; guidance
on this in the context of project/ site-specific risk assessments should be provided.
Providing examples as done in the Region VIII guidance on Monte Carlo simulations
facilitates the process.

On the issue of representativeness in making inferences from a sample to a population
and the ambiguity of the term "representative sample", process-driven selection might
be  appropriate for homogenous populations, but for the risk assessor, sampling that
captures the characteristics of the  population might be more relevant in the context of
the use of the data. This issue appears to have been captured in the discussion on
attempting to improve representativeness.
Empirical Distribution Functions (EDFs) versus Parametric Distributions (PDFs)
(Issue Paper #2)
1) Selection of the Empirical Distribution Function (EDF) or Parametric Distribution
Function (PDF):
The focus of the issue paper is the Empirical Distribution Function (EDF), and a number
of assumptions have been made to focus the discussion on EDFs. However, for a
clearer understanding of the issues and to facilitate the appropriate choice of analytical
approach, a discussion of less constraining situations would be beneficial. The rationale
for this is that the decision on whether to apply the EDF or the PDF should not be a
question of choice or even mutual exclusivity, but a sequential process that is flexible
enough to evaluate the merits and demerits of both approaches in the context of the data.

In general, from a site/ project perspective, there may be definite advantages to PDFs
when the data are limited, provided the fit of the theoretical distribution to the data is
good, and there is a theoretical or mechanistic basis supporting the chosen parametric
distribution.  The advantages to the PDF approach are more fully discussed in several
references (Law and Kelton 1991). These advantages need to be evaluated in a
project-specific context; they could include the compact representation of observations/
data, and the capacity to extrapolate beyond the range of observed data, as well as the
"smoothing out" of data.  (In contrast, the disadvantages imposed by the possible
distortion of information in the fitting process should not be overlooked.  Further,  the
(traditional use of) EDFs that limit extrapolation beyond the extreme data points,
perhaps underestimating the probability of an extreme event, may need to be
considered. This could be a handicap in certain situations, where the risk
assessment demands an interest in outlier values. In such situations, a fuller
discussion of alternate approaches such as a mixed distribution (Bratley et al., 1987)
may be warranted.) Finally, the PDFs, given their already established theoretical basis,
may lend themselves to more defensible and credible decision-making, particularly at
contentious sites.
This predisposition to PDFs certainly does not preclude the evaluation of the EDF in the
process. The advantage accruing from having the data "speak" to the risk assessor/
reviewer should not be minimized. Depending on the project/ site involved, the benefits
of the complete representation of data, the direct information provided on the shape of
the underlying distribution, and even on peculiarities such as outlier values should be
discussed, as well as relevant drawbacks (sensitivity to random occurrences, potential
underestimation of the probability of extreme events, perhaps cumbersome nature if the
data points are individually represented). In this context, some of the comments in the
"Issue/ Comments" Table ("issues" presumably derived from D'Agostino and Stephens,
1986) can serve as the basis for additional discussion.
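The EDF/PDF trade-off discussed above can be illustrated with a small sketch. The data, the sample size, and the lognormal candidate family are all hypothetical assumptions for illustration, not anything drawn from the issue papers:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical exposure-factor sample (e.g., daily tap-water intake, L/day).
sample = rng.lognormal(mean=0.0, sigma=0.5, size=50)

# EDF: a step function that puts probability 1/n on each observation.
def edf(x, data):
    return np.mean(np.sort(data)[:, None] <= x, axis=0)

# PDF approach: fit a parametric (lognormal) model to the same data.
shape, loc, scale = stats.lognorm.fit(sample, floc=0.0)
fitted = stats.lognorm(shape, loc=loc, scale=scale)

# The EDF cannot extrapolate beyond the sample maximum ...
print("EDF mass beyond 2x max:", 1.0 - edf(np.array([sample.max() * 2]), sample)[0])
# ... while the fitted PDF assigns positive probability to larger values.
print("PDF mass beyond 2x max:", 1.0 - fitted.cdf(sample.max() * 2))
```

The first quantity is exactly zero by construction, which is the "underestimation of the probability of an extreme event" noted above; the second is small but positive.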

2) Goodness of Fit:
The decision whether the data are adequately represented by a fitted theoretical
distribution is an aggregative process, and goodness-of-fit is part of the sequential
exercise. Preliminary assessments of the general families of distributions that appear
to best match the data (based on prior knowledge and exploratory data analysis) are
often conducted initially; the mechanistic process for choice of a distributional family,
the discrete/ continuous and bounded/ unbounded nature of the variable are evaluated.
Summary statistics, including measures of shape, are evaluated and the parameters of
the (candidate) family are estimated. The goodness-of-fit statistics should factor into
the whole process, as should graphical comparisons of the fitted and empirical
distributions. Goodness-of-fit tests can be an excellent confirmatory tool for verifying
the chosen distribution, when used in conjunction with statistical measures and
probability plots.

However, caution should be exercised in situations where these tests could conceivably
lead  an analyst to support a distribution that a visual inspection of the data does not
support. Also, it should be emphasized that (for example for certain physiological
parameters), even if the distribution fits, maintaining the integrity of the (biological) data
should override goodness-of-fit considerations.  Ultimately, the persuasive power of
graphical methods for assessing fit should not be underestimated.

On the question of how the level of significance of the goodness-of-fit statistic should be
chosen, this is often a function of the data quality assessment (DQA) for that particular
site or situation; an idea of the consequences in terms of real-life examples can be
gathered from EPA's Guidance for Data Quality Assessment (1997). On the whole, I
tend to agree with the respondent (#4) who states that the desired level of significance
should be determined prior to analyzing the data. Again, as the respondent states, if
minor differences in the p-value impinge substantially on the analysis, the "conclusions
are probably too evanescent to have much usefulness".
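The practice endorsed above, fixing the significance level before analyzing the data and then using a goodness-of-fit test only as a confirmatory tool, might be sketched as follows. The synthetic data and the lognormal candidate are illustrative assumptions; note also the caveat in the comment about testing a fit whose parameters came from the same data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
alpha = 0.05  # significance level chosen BEFORE looking at the data
sample = rng.lognormal(0.0, 0.4, size=40)  # hypothetical exposure data

# Fit the candidate family, then test the fit. Estimating parameters from the
# same data makes the standard K-S p-value approximate (optimistic); a
# Lilliefors-type correction or bootstrapped critical values would be stricter.
shape, loc, scale = stats.lognorm.fit(sample, floc=0.0)
stat, p = stats.kstest(sample, stats.lognorm(shape, loc=loc, scale=scale).cdf)
print(f"KS statistic = {stat:.3f}, p = {p:.3f}, reject fit = {p < alpha}")
```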

Summary statistics are useful, particularly in the initial characterization of the data (as
previously mentioned). Given the constraints imposed  by the project/ site logistics, all
too often these are the only data available, and they have been used as the basis for
analytical distribution fits (Ohio EPA, 1996).  Caution should be exercised in implying a
level of accuracy based on limited knowledge. Sensitivity analyses might help clarify
the limitations that need to be placed in such situations, particularly when dealing with
an exposure parameter of considerable impact; further, the utility of such an exercise
for a parameter with minor impact (as revealed by the sensitivity analysis) could be
questionable.
On the question of the value of testing the fit of the more generalized distributions
(presumably in lieu of the EDF), this could be a useful exercise, but the project
logistics may factor into this, as may the DQA premises. Project resources available
and the defensibility of the decision-making process need to be factored into the
situation. The issue of fitting an artificial distribution to a data set, and ultimately
arriving at a distribution removed from reality, also needs to be evaluated in the
project-specific context.
3) Uncertainty:
The discussion in the "Development of Statistical Distributions for Exposure Factors"
(Research Triangle Institute) paper is interesting in terms of the approaches suggested
for evaluating parameter uncertainty; Hattis and Burmaster's comment cited in the
paper that only a trivial proportion of the overall uncertainty may be revealed is
important. Certain methods (for example, bootstrapping) appear to have intriguing
potential for accounting for "hot spots".
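As a rough illustration of the bootstrapping idea mentioned above, one might resample a small data set with replacement to see how uncertain an upper percentile is. The data set, its size, and the choice of the 95th percentile are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(3)
sample = rng.lognormal(0.0, 0.5, size=30)  # hypothetical small exposure data set

# Nonparametric bootstrap: resample the data with replacement and recompute
# the quantity of interest (here the 95th percentile) to gauge its uncertainty.
B = 2000
p95 = np.empty(B)
for b in range(B):
    resample = rng.choice(sample, size=sample.size, replace=True)
    p95[b] = np.percentile(resample, 95)

lo, hi = np.percentile(p95, [2.5, 97.5])
print(f"95th percentile estimate: {np.percentile(sample, 95):.2f} "
      f"(bootstrap 95% interval {lo:.2f} to {hi:.2f})")
```

With only 30 observations the interval is typically wide, which is the point: the percentile itself is an uncertain quantity, not a single number.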
Finally, the risk assessor/ reviewer needs to be aware that the analysis of variability and
uncertainty is a simulation, based on hypothetical receptors. However, as stated
initially, this sometimes academic exercise can have multi-million dollar implications,
and intimately affect real-life human and ecological receptors; the risk assessor/
reviewer should always be cognizant of this consequence.

References:
Bratley, P., B.L. Fox, and L.E. Schrage (1987) "A Guide to Simulation". Springer-Verlag,
New York.

D'Agostino, R.B. and M.A. Stephens (1986) "Goodness-of-Fit Techniques". Marcel
Dekker, New York.

Law, A.M. and Kelton, W.D. (1991) "Simulation Modeling and Analysis" (pp. 325-419).
McGraw-Hill, New York.

Ohio EPA (1996) "Support Document for the Development of Generic Numerical
Standards and Risk Assessment Procedures". The Voluntary Action Program, Division
of Emergency and Remedial Response, Ohio EPA.

U.S. EPA (1994) "Guidance for the Data Quality Objectives Process" (EPA QA/G-4)
EPA/600/R-96/055.

U.S. EPA (1997) "Guidance for Data Quality Assessments - Practical Methods for Data
Analysis" (EPA QA/G-9, QA-97 Version) EPA/600/R-96/084 (January 1998)
Robert J. Blaisdell, Ph.D.

Comments on Issue Paper on Evaluating Representativeness of Exposure
Factors Data

The Issue Paper on Evaluating Representativeness of Exposure Factors Data is a well
written, clear discussion of the theoretical issues of representativeness.  I was
particularly interested in the discussion of time unit differences. The Office of
Environmental Health Hazard Assessment (OEHHA) is grappling with this issue with
several of the distributions which we want to use for determining chronic exposure.

The issue of representativeness of a sample is often complicated by lack of knowledge
about the demographics of the population under consideration. An accurate
determination of the population under consideration may not be part of the risk
assessment requirements of regulatory programs. If the population of concern has not
been characterized, the determination of the representativeness of the data being used
in the assessment is not possible.
The issue of representativeness of the sample to the population is an important
question. For example, populations which are exposed to Superfund toxicants or
airborne pollution from stationary sources may be from lower socioeconomic groups.
Unfortunately, most of the information which is available on mobility is from the general
population. It may be that low income home owners have a much longer residency time
than people of median or higher income. It may also be that low income non-home
owners in certain age groups have a higher mobility than the general population. We
therefore suspected that the available distributions were not representative. In addition,
the U.S. Census data, the basis for the available residency distributions, are not
longitudinal. Another problem with the residency data when evaluating stationary
sources is the issue of where the person moves to. A person moving may not
necessarily move out of the isopleth of the facility; the likelihood of moving out of the
isopleth of a stationary facility also may be related to socioeconomic status.
In order to address this problem, OEHHA proposed not using a distribution for
residence time in our Public Review Draft Exposure Assessment and Stochastic
Analysis Technical Support Document (1996). Instead we proposed doing a separate
stochastic analysis scenario for 9, 30 and 70 years. We did not think that the 9, 30 or
70 years time points evaluated were necessarily representative of actual residence
times, but that these were useful, reasonably spaced intervals for residents to compare
with their own known residency time.

Using three scenarios complicates the analysis, but we felt that the approach had some
advantages over using a distribution. The California "Hot Spots" program is a public
right-to-know act which assesses risks of airborne pollutants from stationary sources.
Public notification is required above a certain level of risk. An individual resident who
has received notice is aware of the amount of the time that he or she has lived, or in
many cases plans to live, in the vicinity of the facility. Therefore the individual could more
accurately assess his or her individual cancer risk. The relationship between the
residency time assumption and the resulting risk is clear, not buried in the overall
range of the uncertainty or variability of the risk estimate.

This approach might possibly be used in other cases where representative data is not
available or where the representativeness is questionable. For example, if the drinking
water pathway is of concern and representative information is not available for the
population of a Mojave Desert town, the range or point estimate of cancer risk from
drinking 1, 2, 4, and 8 liters of contaminated tap water per day could be presented.

In some cases, each situation that a regulatory risk assessment program will be
evaluating will be almost unique, and therefore anything other than site-specific data will
not be representative. OEHHA characterized a fish consumption distribution for anglers
consuming non-commercial fish using the Santa  Monica Bay Seafood Consumption
Study Final Report (6/94) raw data. We compared the Santa Monica Bay distribution to
the fish consumption distribution for the Great Lakes (Murray and Burmaster, 1994).
We found that the differences in the two distributions could be attributed to
methodological differences in the two studies; thus the assumption that a salt water
fish consumption distribution was comparable to a fish consumption distribution for a
large fresh water body was not implausible. However, the data gathered from large
bodies of water are probably not representative of small lakes and ponds with limited
productivity and where other fishing options may exist. For such bodies of water a
site-specific angler survey is probably the only way of obtaining representative data.
For cost reasons, this option is not likely to be pursued except in a risk assessment with
very high financial stakes. We chose to recommend using the Santa Monica Bay fish
consumption distribution. It could be multiplied by a fraction to be determined by expert
judgment to adjust for site-specific conditions such as productivity. The Santa Monica
Bay fish distribution may not be representative in other ways in a given situation but may
still be the most practical option. It is clearly not temporally representative for chronic
cancer risk assessment.

Cost is often a factor that limits representativeness.

On page 8, paragraph 5 of the Issues paper there is a discussion of determining the
relationship between two populations and making adjustments in distributions based on
speculative estimates of the differences in means and the coefficients of variation.
Perhaps in many instances, another option would be to state that the information from a
surrogate population is being used and that the actual population is known to be
different, or may be different by an unknown amount. There are many questions in risk
assessment for which expert opinion is no better than uninformed opinion in attempting
to quantify the unknown. An example of this is the shape of the dose-response curve
for cancer for most chemicals at low concentrations. A frank admission of ignorance
may be more credible than an attempted quantification of ignorance in many cases.
The methods discussed for estimating intraindividual variability from data collected over
varying short periods of time relative to the longer time period of interest are interesting
and would appear to be useful for the NFCS data. OEHHA is giving some
consideration to using the techniques described by Nusser et al. 1996 to adjust the
food consumption distributions that we have developed using
the Continuing Survey of Food Intakes by Individuals 1989-91 raw data. I would be
curious to know if these methods have been validated on any actual longitudinal data.
The assumption of the lognormal model needed by the method of Wallace et al. (1994)
may in some cases be limiting. We have discovered when we evaluated broad
categories of produce consumption using the CSFII 89-91 data that some of the
distributions for certain age groups were closer to a normal model than a lognormal
model.
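The observation that some consumption distributions fit a normal model better than a lognormal one can be checked informally by comparing the maximized log-likelihoods of the two fits. The synthetic "consumption" data below are an invented assumption, and fixing the lognormal location at zero keeps both models at two free parameters so the comparison is also an AIC comparison:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
# Hypothetical produce-consumption data (g/day) that happen to be near-normal.
data = rng.normal(loc=200.0, scale=30.0, size=200)
data = data[data > 0]  # consumption cannot be negative

def loglik(dist, params, x):
    # Sum of log-densities at the fitted parameters.
    return np.sum(dist.logpdf(x, *params))

norm_params = stats.norm.fit(data)
lnorm_params = stats.lognorm.fit(data, floc=0.0)

ll_norm = loglik(stats.norm, norm_params, data)
ll_lnorm = loglik(stats.lognorm, lnorm_params, data)
print("normal fits better than lognormal:", ll_norm > ll_lnorm)
```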

The Representativeness Issue paper discusses the importance of using current data.
The continued use of the 1977-78 NFCS  study is cited as an example. The raw data
from the 1989-91 CSFII has been available for some time as an alternative to the
1977-78 NFCS survey. Raw data from the 1992-93 CSFII survey is now available.
OEHHA has used that data to develop produce, meat and dairy products consumption
distributions for the California population. It is admittedly not a trivial exercise to extract
the relevant data from the huge raw CSFII data sets but this alternative has existed for
several years.  The 1989-91 CSFII data is clearly different in some cases from the
1977-78 NFCS. Beef consumption appears to have declined.  As a matter of policy,
there should be a stated preference for using the available data over attempting to use
expert judgment to guess at the appropriate means, coefficients of variation and
parametric model. In some of the Monte  Carlo risk assessment literature, the
preference appears to be for expert judgment rather than data.
The use of related data may in some cases be useful in giving some insight into the
representativeness of data collected over the short term for chronic scenarios. OEHHA
has used the data on total energy expenditure as measured by the doubly labeled water
method to look at the representativeness of our breathing rate distribution, based in part
on a one day 24 hour activity pattern survey. The information on total energy
expenditure gave an indication that intraindividual variability was a huge fraction of the
total variability (intraindividual plus interindividual variability).

The intraindividual variability for a broad category of produce such as leafy vegetables
may not be very great relative to the interindividual variability. The intraindividual
variability for a single, less frequently consumed item such as strawberries is
probably much greater than for broad categories. Thus, short term survey data which
look at broader categories of produce are probably more applicable to chronic risk
assessment than single item distributions.
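The point made above and in the breathing-rate example, that day-to-day (intraindividual) variability inflates the spread seen in a short-term survey relative to the variability of long-term averages, can be sketched with a simple simulation. All parameter values are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(5)
n_people, n_days = 500, 365

# Hypothetical model: each person has a true long-term mean intake, and each
# day's intake multiplies that mean by independent day-to-day noise.
true_means = rng.lognormal(3.0, 0.3, size=n_people)              # interindividual
daily = true_means[:, None] * rng.lognormal(0.0, 0.6, size=(n_people, n_days))

one_day = daily[:, 0]            # what a one-day survey observes
long_term = daily.mean(axis=1)   # what a chronic assessment actually needs

# A one-day survey overstates the spread of long-term (usual) intake:
print("variance of one-day observations:", round(one_day.var(), 1))
print("variance of long-term means    :", round(long_term.var(), 1))
```

Treating the one-day spread as if it were the distribution of usual intake would overstate interindividual variability, which is the concern the Nusser et al. adjustment addresses.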

Research Needs
The information which is needed to develop more accurate distributions for many if not
most variates needed for chronic stochastic human health risk assessment is simply
not available. In particular there is a lack of longitudinal data for breathing rates, soil
ingestion, water consumption rates, produce ingestion, non-commercial fish
consumption, dairy product consumption and meat ingestion. Some distributions in
common use, such as water consumption, are based on out-of-date studies. More
research is needed on bioconcentration and biotransfer factors. Longitudinal data on
activity patterns and mobility patterns would also be very useful. There needs to be
much more research on dermal absorption factors and factors which influence dermal
absorption. More research needs to be done on children and the ways that they differ
from adults.
Summary

The overall lack of data, particularly longitudinal data, for risk assessment variates is
probably the most important single factor limiting representativeness.  If the purpose of
the risk assessment is to inform the exposed public, it may be possible and even
preferable to use  point estimates for multiple scenarios in the absence of some
representative data. The statistical methods for adopting short term data for use in
chronic risk assessment presented the Issue paper appear to be reasonable
approaches in instances where the required data is available. More longitudinal studies
would be valuable for validation of these methods as well as improving the temporal
representativeness of distributions used in risk assessment.  Most of the data used in
stochastic risk assessment will probably be nonrepresentative in one or more of the
ways discussed in the Issues paper for a long time into the future.

References

Murray DM., and Burmaster DE. (1994).  Estimated distribution for average daily
consumption of total and self-caught fish for adults in Michigan angler households.
Risk Analysis 14, 513-519.

Nusser, S.M., Carriquiry, A.L., Dodd, K.W., and Fuller, W.A. (1996). A semiparametric
transformation approach to estimating usual daily intake distributions. J. Am. Statistical
Association 91: 1440-1449.

Southern California Coastal Water Research Project and MBC Applied Environmental
Sciences (SCCWRP and MBC).  (1994). Santa Monica Bay Seafood Consumption
Study.  Final Report. June.

USDA (U.S. Department of Agriculture) 1989-91. Nationwide Food Consumption
Survey. Continuing Survey of Food Intakes by Individuals (Data Tapes). Hyattsville, MD:
Nutrition Monitoring Division, Human Nutrition Information Service.

David Burmaster
13 April 1998

Memorandum
To: Participants, US EPA's Workshop on Selecting Input Distributions
for Probabilistic Analyses
Via: Beth A. O'Connor, ERG
From: David E. Burmaster

Thank you for inviting me to participate in this Workshop in New York City.

Here are my initial thoughts and comments, along with suggestions for additional topics
for discussion. Since I have just returned from 3 weeks of travel overseas, I will keep
these brief.
1.     Models and Data

In  1979, George Box wrote, "All models are wrong, but some are useful."

May I propose a new corollary for discussion? "All data are wrong, but some are
useful."

Alceon ® Corporation • PO Box 382669 • Harvard Square Station • Cambridge, MA 02238-2669 • Tel: 617-864-4300
2.    Definitions for Variability and Uncertainty
The Issue Papers lack crisp definitions for variability and uncertainty as well as a
discussion about why variability and uncertainty are important considerations in risk
assessment and risk management. (See, for example, NCRP, 1996.) In particular, I
recommend definitions along these lines for these two key terms:
Variability represents true heterogeneity in the biochemistry or physiology (e.g.,
body weight) or behavior (e.g., time spent showering) in a population which
cannot be reduced through further measurement or study (although such
heterogeneity may be disaggregated into different components associated with
different subgroups in the population). For example, different children in a
population ingest different amounts of tap water each day. Thus variability is a
fundamental property of the exposed population and or the exposure scenario(s)
in the assessment. Variability in a population is best analyzed and modeled in
terms of a full probability distribution,  usually a first-order parametric distribution
with constant parameters.

Uncertainty represents ignorance -- or lack of perfect knowledge -- about a phenomenon for a population as a whole or for an individual in a population, which may sometimes be reduced through further measurement or study. For example, although we may not know much about the issue now, we may learn more about certain people's ingestion of whole fish through suitable measurements or questionnaires. In contrast, through measurements today, we cannot now eliminate our uncertainty about the number of children who will play in a new park scheduled for construction in 2001. Thus, uncertainty is a property of the analyst performing the risk assessment. Uncertainty about the variability in a population can be well analyzed and modeled in terms of a full probability

distribution, usually a second-order parametric distribution with nonconstant (distributional) parameters.

Second-order random variables (Burmaster and Wilson, 1996; and references therein) provide a powerful method to quantify and propagate variability and uncertainty separately.
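The two-dimensional (second-order) Monte Carlo idea can be sketched as follows. The inner loop simulates variability; the outer loop simulates uncertainty by redrawing the distributional parameters themselves. All parameter values and their uncertainty distributions here are hypothetical illustrations, not taken from Burmaster and Wilson (1996).

```python
import random
import statistics


def second_order_mc(n_outer=200, n_inner=2000, seed=7):
    """Two-dimensional Monte Carlo for a second-order random variable.

    The inner loop simulates variability (a lognormal intake, say);
    the outer loop simulates uncertainty by redrawing the lognormal's
    parameters from (hypothetical) uncertainty distributions.  Returns
    the uncertainty distribution of the 95th-percentile intake.
    """
    rng = random.Random(seed)
    p95_samples = []
    for _ in range(n_outer):
        # Uncertainty: the distributional parameters are themselves random.
        mu = rng.gauss(0.0, 0.1)           # uncertain log-scale mean
        sigma = abs(rng.gauss(0.8, 0.05))  # uncertain log-scale sd
        # Variability: a population of individual values given (mu, sigma).
        values = sorted(rng.lognormvariate(mu, sigma) for _ in range(n_inner))
        p95_samples.append(values[int(0.95 * n_inner)])
    return p95_samples


p95 = second_order_mc()
# The spread of p95 across outer draws quantifies uncertainty about a
# percentile of the variability distribution, keeping V and U separate.
print(min(p95), statistics.median(p95), max(p95))
```

Collapsing the outer loop to fixed parameters would recover an ordinary first-order simulation of variability alone.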

3.    Positive Incentives to Collect New Data and Develop New Methods

I urge the Agency to print this Notice inside the front cover and inside the rear cover of each Issue Paper / Handbook / Guidance Manual, etc., related to probabilistic analyses -- and on the first Web page housing the electronic version of the Issue Paper / Handbook / Guidance Manual:

    This Issue Paper / Handbook / Guidance Manual contains guidelines and suggestions for use in probabilistic exposure assessments.

    Given the breadth and depth of probabilistic methods and statistics, and given the rapid development of new probabilistic methods, the Agency cannot list all the possible techniques that a risk assessor may use for a particular assessment.

    The US EPA emphatically encourages the development and application of new methods in exposure assessments and the collection of new data for exposure assessments, and nothing in this Issue Paper / Handbook / Guidance Manual can or should be construed as limiting the development or application of new methods and/or the collection of new data whose power and sophistication may rival, improve upon, or exceed the guidelines contained in this Issue Paper / Handbook / Guidance Manual.
4,"     Truncating the Tails of LogNormal Distributions

'l:',:,,  •    '      ' ,». ill  	 .   •  „ '  'i,!  .!	     ,  :      '.   •'     : i,:''1'  i ..:.'         -    !'<
1  i',  	'   lidi    ;"  ,  "    -    ,     •     ,>:    :' • ; ; = ,,'   ;	i:< •!, ',, ; -  ,•, „	: .:••'
,i||        ,,' .i"'     r       '     .!     •''•   :,   '' , ,

While LogNormal distributions provide excellent fits to the data for many exposure

variables, e.g., body weight, skin area, drinking water ingestion rate (total and tap),
ft'	/r
Showering time, arid others, it is important to truncate the tails of these distributions. For

i ..... ''•' .......... ij:'1, ......  ""  . ::]»! :•  ..  >,f ..... .'.;   i -..'• 2   ..... ' .........  ........ ';• •• • ••• •' ! ....... ••"  -! .................. '5    2 .....................

example, no individual has 1 cm of skin area, no individual has 10  cm  of skin area,

and no individual can shower 25 hr/d.

<i . •. ••  • 'i   •••'   ";;;;i:i, *•   •.,,   .<:   '  •-    •  .',". i  •• •  "   •   • •   ;  - •    .'•>••'•

: .•   " : ' "   * "A  >  " '   , , :  "; '   :";!  il1'',"' •. '''  '  ' •.. ' "';'"' •• '   :i.,i  . "  -' ; :, i'1 •:
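One simple way to implement such truncation is rejection sampling. The lognormal parameters and truncation bounds below are hypothetical stand-ins for a skin-area distribution, chosen purely for illustration, not values from any EPA handbook.

```python
import random


def truncated_lognormal(mu, sigma, lo, hi, n, seed=1):
    """Draw n values from LogNormal(mu, sigma) truncated to [lo, hi]
    by simple rejection sampling: out-of-range draws are discarded."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        x = rng.lognormvariate(mu, sigma)
        if lo <= x <= hi:  # keep only physically possible values
            draws.append(x)
    return draws


# Hypothetical skin-area example: median about 1.8e4 cm^2, truncated to
# illustrative physical bounds so no impossible individuals are simulated.
areas = truncated_lognormal(mu=9.8, sigma=0.2, lo=5.0e3, hi=3.5e4, n=1000)
print(min(areas) >= 5.0e3 and max(areas) <= 3.5e4)  # True
```

Rejection sampling is adequate when the bounds exclude only a small fraction of the untruncated distribution; tighter bounds call for inverse-CDF methods.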

5.     Mixing Apples and Oranges

It is wholly inconsistent for the Agency to proceed with policies that legitimize the use of probabilistic techniques for exposure factors while preventing the use of probabilistic techniques in dose-response assessment. By doing so, the Agency double counts the effects of variability and uncertainty, all on a log10 scale -- i.e., by several orders of magnitude.

6.     Report by RTI

I disagree strongly with many of the approaches and conclusions found in RTI's Final Report dated 18 March 1998.

References

Box, G.E.P., 1979, Robustness Is the Strategy of Scientific Model Building, in Robustness in Statistics, R.L. Launer and G.N. Wilkinson, eds., Academic Press, New York, NY

Burmaster, D.E. and A.M. Wilson, 1996, An Introduction to Second-Order Random Variables in Human Health Risk Assessment, Human and Ecological Risk Assessment, Volume 2, Number 4, pp. 892-919
Bruce Hope

REPRESENTATIVENESS (Issue Paper #1)

1) The Issue Paper

We would use probabilistic methods specifically for the purpose of assessing risks from the uncontrolled release of hazardous substances at a specific location (site). Our overall goal will be to feel confident that the entire risk assessment (and not just a few of its components) is representative of site-specific conditions. Our objective is better risk management decisions. This requires us to keep a few other considerations in mind.

The issue of representativeness in terms of a fit between available exposure factors data and resulting distributions is dealt with in the issue paper. However, a risk assessment cannot be performed with exposure factor distributions alone -- some type of exposure model is required. We should therefore also be concerned with the representativeness of the exposure model within which the individual exposure factors are used. Correlation between exposure factors could significantly affect the representativeness of the resulting risk assessment. It appears possible to have too much or too little correlation between factors. In some cases, the correlation is not necessarily with body weight and/or age but with an underlying activity pattern (human behavior) that may not be fully known. The nature and extent of correlation should be a factor in evaluating representativeness.

The issue of data and statistical inferences at the extreme upper bounds (e.g.,
99.9th percentile) of a distribution has been raised in the literature, on the Web, and in other U.S. EPA forums. As a matter of policy, we regulate at the 90th percentile, feel that decisions based on extreme upper bound estimates are potentially unreasonable, and thus have truncated the upper bound (not allowed its extension to +∞) of many of the exposure factor distributions. How any such truncation of a distribution affects its representativeness should also be discussed.

The suggestion that probabilistic methods could be used in any form of "screening-level" risk assessment is of concern. We view screening as a quick but highly conservative comparison of environmental media concentrations with published toxicity data that occurs early in a remedial investigation (RI) for the sole purpose of narrowing the focus of the baseline risk assessment. Under our current guidance, we are reserving probabilistic methods for use only in a baseline assessment.

2) Sensitivity

When various exposure factors are combined within a given exposure model, it is typically the case that a few of them have a disproportionate influence on the outcome. For example, soil ingestion rate, soil adherence factor, and exposure duration are often primary drivers, as well as major sources of uncertainty.
We should broaden the discussion to consider whether all exposure factors are of equal importance, in terms of their influence on the outcome of the risk assessment, so as to better focus our distribution development efforts.

3) Adjustments

Concern has been expressed that any "default" exposure factor distributions proposed by U.S. EPA will, perhaps unintentionally, evolve into inflexible or "standard" requirements. To counter this, as well as to allow for inclusion of regional and local influences, U.S. EPA should propose, in addition to any de facto "default" distributions, an exemplary method(s) for establishing exposure factor distributions. This exemplary method should be as straightforward, transparent, and explainable (primarily to risk managers) as possible. It should also describe quality assurance (QA) and quality control (QC) procedures to allow for the expedient and thorough review of probabilistic risk assessments submitted to regulatory agencies by outside contractors.

EMPIRICAL DISTRIBUTION FUNCTIONS (Issue Paper #2)

{I did not have time to fully review paper #2, so only have input on this one item at this time}

Goodness of Fit

We should also ask, if the overall risk assessment is sensitive to both the exposure model and only a few of many exposure factors, just how "good" does every other distribution have to be in order to support credible risk management decisions? For example, if a relatively esoteric and hard-to-conceptualize distribution best fits available data, but a much more common and more easily understood distribution fits almost as well (say within 20%), would there not be some advantage in use of the latter?
In addition, if toxicity data remain as point estimates with uncertainty approaching an order of magnitude, it would appear that there should be some leeway in how we choose or define certain exposure factors.

William A. Huber

Representativeness (Issue Paper #1)

1) The Issue Paper

1.1 The checklists

Section 3 of the Issue Paper regards the inferential process as consisting of several stages of inference and measurement: Population of interest -> Population(s) actually studied -> Set of individuals measured (the "sample") -> The measurements. The three stages are denoted "external" inference, "internal" inference, and measurement, respectively. This appears to be a useful framework.

However, the four checklists address the first two stages only. Checklist I concerns the "internal" inference; Checklists II through IV concern the "external" inference. No checklist specifically addresses measurement. This approach is unbalanced. The obvious parallelism among Checklists II through IV emphasizes the lack of balance.

We should consider whether a better organization of checklists might be achieved. One possible organization could be:

Checklist A: Assessing measurement representativeness
Checklist B: Assessing internal representativeness
Checklist C: Assessing external representativeness
Checklist D: "Reality checks," or overview.

Checklist B and Checklist I would nearly coincide. Checklist C would incorporate the (common) questions of Checklists II through IV. Checklists A and D are new. Checklist A would incorporate certain questions sprinkled throughout Checklists I-IV, such as:

- Does the study appear to have and use a valid measurement protocol?
- To what degree was the study design followed during its implementation?
- What are the precision and accuracy of the measurements used in the study?
- Did the study actually measure what it claimed to?

The questions in Checklist D would focus on the fundamental questions:

- Has the data set captured the variability within the population of interest?
- Is it sufficient in size and quality to support the estimates, decisions, or actions recommended in this risk assessment?
- Can we quantify potential departures of our estimates from their correct (but unknown) values? Why and how?

Each of the bulleted items above has some detailed questions associated with it.

1.2 Tiered risk assessments

There is no subset of questions that can be selected, since it cannot be foreseen which question is critical to evaluating a particular study. However, there is a basis for limiting the effort needed to establish representativeness. First, materially unimportant variables -- as established, for example, by a sensitivity analysis -- need not be fully addressed. Second, many of the checklist questions are relevant when variability and extreme percentiles must be characterized; they become less consequential when only a central tendency need be assessed. Finally, for a screening risk assessment, only qualitative degrees of representativeness are needed. For example, if it is known only that study results will conservatively overestimate exposures, then that study could be useful for a screening-level risk assessment, but probably not for subsequent tiers.
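The sensitivity-analysis idea just mentioned, establishing which inputs are materially unimportant, can be sketched by collapsing one input distribution at a time to a point and watching an upper-percentile output. The toy exposure model and all parameter values here are hypothetical, not from the Issue Papers.

```python
import random


def simulate(collapse=None, n=20000, seed=3):
    """Monte Carlo dose calculation for a toy exposure model:
    dose = concentration * intake / body_weight.
    If collapse names an input, that input is frozen at (roughly) its
    median, i.e., a degenerate distribution, switching off its
    variability.  All distributions below are hypothetical.
    """
    rng = random.Random(seed)
    doses = []
    for _ in range(n):
        conc = 50.0 if collapse == "conc" else rng.lognormvariate(3.9, 0.8)
        intake = 1.0 if collapse == "intake" else rng.lognormvariate(0.0, 0.2)
        bw = 70.0 if collapse == "bw" else rng.gauss(70.0, 10.0)
        doses.append(conc * intake / bw)
    doses.sort()
    return doses[int(0.95 * n)]  # 95th-percentile dose


base = simulate()
for name in ("conc", "intake", "bw"):
    # A large shift in the 95th percentile when an input is collapsed
    # marks the calculation as sensitive to that input's variability.
    print(name, round(simulate(collapse=name) / base, 2))
```

An input whose collapse barely moves the percentile of interest is a candidate for the reduced-effort treatment described above.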
2) Sensitivity

There are two kinds of sensitivity in a probabilistic calculation. They are related to the distinction between variability and uncertainty. We may, with some loss of generality, suppose that the calculation is a determined procedure F that processes a collection S = {p1, p2, ..., pN} of "inputs," each of which is a (possibly degenerate) probability distribution, and outputs a single probability distribution F(S). If there is a material change in inferences based on F(S) when one of the input distributions, say pi, is collapsed to a point, then the calculation is sensitive to the variability in pi. Otherwise, the distribution pi can, with some safety, be replaced by a single number (a degenerate distribution).

Uncertainty in the input pi can often be described as a collection of possible distributions {pi'} that are "close" to pi in some sense. A typical example is when pi is parametric and {pi'} is described by a set of alternate values of the parameters. There may even be a probability distribution on {pi'} (a Bayesian "prior"). If, by replacing pi by an arbitrary element of {pi'}, the inferences based on F(S) change in a material way, then the calculation is sensitive to the uncertainty in pi.

The data must be sufficient to establish either that a variable is not a sensitive input or, if it is, the data must be sufficient to characterize the variability or the uncertainty or both, depending on which contribute to the sensitivity. This provides one basis for deciding when data are adequate.

However, it could be argued that any data acceptable for use in a screening risk assessment are necessarily acceptable in subsequent tiers -- at a cost. To be specific, for data to be acceptable at all they must provide some valid information about the population of interest, and some quantifiable level of uncertainty must be established (no matter how great that level is). This is true for any risk assessment at any tier, not just for probabilistic risk assessments. For screening use, inputs would have to be set at extreme (but realistic) levels consistent with the data and their uncertainty, in such a way as to ensure a "conservative" estimate of risk -- that is, one biased high. Once this is accomplished, it would seem there is no obstacle to using the
This provides one basis for deciding when data are adequate. However, it could be argued that any data acceptable for use in a screening risk assessment are necessarily acceptable in subsequent tiers—at a cost. To be specific, for data to be acceptable at all they must provide some valid information about the population of interest and some quantifiable level of uncertainty must be established (no matter how great that level is). This is true for any risk assessment at any tier, not just for probabilistic risk assessments. For screening use, inputs would have to be set at extreme (but realistic) levels consistent with the data and their uncertainty, in such a way as to ensure a "conservative" estimate of risk—that is, one biased high. Once this is accomplished, it would seem there is no obstacle to using the F-27 image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } . same data in the same way in subsequent tiers, with the price for doing so being estimates that are still biased high. 3) Adjustments .:".' '$:    •	-;' ,'V'S1	!' pj'l  " •'"•', - ""!'!';  -  ^ ,   	•   : '"    *..'.'•' .'   " '' '" ', ,"< • if!;;"!1"  ', ' • •  "  , V ."   ,.  '"   !'         I1
i'1"'    ;;i,!, i  ,;i!;l,,i- -i  ,	,;Jki  : .:.f.',.':.,	'*].••$* ,'>''v .,• /. ,•. •. !• •;,-,:;:,, t;";,;»,; - v '$v :*•>!.	'••.:	,i>	 ;< •'/'<..!'         ''
Geostatistical methods are available for certain adjustments of spatial scales. Good references are Cressie, N., "Statistics for Spatial Data," and Journel, A. and C. Huijbregts, "Mining Geostatistics." In particular, methods such as "conservation of lognormality" have been developed to adjust for differences in spatial measurement scale (this has been termed the "change of support" problem). This is the spatial analog of the DW model.

Adjustments should be applied with extreme caution because results can be very sensitive to them. Similarly, surrogate data should be used very cautiously. A good point of departure for considering adjustments is the following definition, constructed to capture the use of "representative" in EPA guidance ("Guiding Principles for Monte Carlo Analysis," EPA/630/R-97/001):

    Data are "representative" when they admit objective and quantifiable statements concerning the accuracy of the relevant inferences made from them.

From this point of view, adjustments can be considered (and defended) when made in a way that allows the potential bias or imprecision thereby introduced to be quantified in the risk assessment.
EDFs (Issue Paper #2)

1) Selecting an EDF or PDF

The primary consideration is the effect the choice will have on the risk assessment results. Each choice has relative advantages and disadvantages. They come down to this: using the EDF honors the data but subjects the calculation to the risk that the EDF poorly represents population variability and percentiles, a risk that can sometimes be decreased by using a well-chosen PDF. Using a PDF requires some theory and professional judgment and subjects the calculation to the risk that either (or both) could be wrong or inapplicable.

The choice is not inherently one of preference. With small data sets especially, an EDF is unlikely to represent an upper percentile adequately and so is manifestly a bad choice. (That's not to say that any particular PDF fit to the data is necessarily better!) When measurement error is large, the EDF will not appropriately separate variability and uncertainty. On the other hand, when the data set is large and not fit well by any theoretical distribution function, using the EDF is an excellent approach.

So we come back to the basic point: what effect will the choice of distribution function(s) have on the risk assessment results? This is determined in part by sensitivity analysis. For this, the exponential tail fitting approach is particularly intriguing, because it seems to provide a robust opportunity to explore how relatively more or less extrapolation beyond the sample maximum (or minimum) will influence the results.
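The small-sample trade-off described above can be illustrated with a sketch (all numbers hypothetical): draw a small sample from a known lognormal population and compare the EDF's 95th-percentile estimate, which is just a sample order statistic, with one read off a lognormal fitted to the sample.

```python
import math
import random
import statistics


def p95_edf_vs_fit(n=25, seed=11):
    """Compare EDF and fitted-lognormal estimates of the 95th percentile
    for a small sample from a known LogNormal(0, 1) population."""
    rng = random.Random(seed)
    sample = [rng.lognormvariate(0.0, 1.0) for _ in range(n)]
    # EDF estimate: an upper order statistic of the sample.
    edf_p95 = sorted(sample)[int(0.95 * n)]
    # Parametric estimate: fit mu and sigma on the log scale, then read
    # off the 95th percentile of the fitted lognormal.
    logs = [math.log(x) for x in sample]
    mu, sigma = statistics.mean(logs), statistics.stdev(logs)
    fit_p95 = math.exp(mu + 1.645 * sigma)
    true_p95 = math.exp(1.645)  # known population value for reference
    return edf_p95, fit_p95, true_p95


print(p95_edf_vs_fit())
```

With n = 25 the EDF estimate rests on one or two extreme observations, so repeated draws scatter it widely around the true value; whether the parametric estimate does better depends on whether the lognormal assumption holds, which is exactly the trade-off described above.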
2) Goodness of Fit

The best basis for concluding that a fitted distribution adequately represents a data set is when (1) there is a theoretical reason to presuppose the data will be represented by such a distribution and (2) the fit is consistent with that presupposition. In this situation, P-values are meaningful and useful, provided that one appropriate goodness-of-fit (GOF) test is chosen before obtaining and testing the data.

Graphical examination of the distribution is crucial. All empirical distributions will depart from the theoretical fit, so the nature and amount of departure must be assessed. It is highly unlikely that any standard GOF test will produce P-values that reflect the sensitivity of the risk assessment results to these departures. In particular, goodness of fit in the upper (sometimes lower) percentiles is usually far more important than goodness of fit elsewhere.
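A quick illustration of why tail fit matters so much (parameters arbitrary): a normal distribution matched to a lognormal's mean and variance agrees with it in the first two moments yet badly understates its 99th percentile.

```python
import math

# Two distributions with identical mean and variance can differ sharply
# in the upper tail.  Parameters here are illustrative only.
mu, sigma = 0.0, 1.0
m = math.exp(mu + sigma**2 / 2)                            # lognormal mean
v = (math.exp(sigma**2) - 1) * math.exp(2 * mu + sigma**2)  # lognormal variance

# 99th percentiles: the lognormal vs a normal matched to the same moments.
z99 = 2.326  # standard normal 99th-percentile quantile
p99_lognormal = math.exp(mu + z99 * sigma)
p99_normal = m + z99 * math.sqrt(v)

# The lognormal's 99th percentile (~10.2) far exceeds the
# moment-matched normal's (~6.7), despite equal means and variances.
print(round(p99_lognormal, 2), round(p99_normal, 2))
```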
In many cases, where many input variables are involved in a risk calculation, using fitted distributions that reproduce the means and variances of the data is likely to produce adequate results. So, more than any P-value or selection of GOF test, these three criteria will be practically useful for risk assessments:

1. Correctly represent the centers (means and medians) of the input distributions.
2. Correctly represent the variances of the input distributions.
3. Fit the important tails of the data as well as possible. (The "important tails" are the tails most influencing the upper percentile risk estimates. The definition of the tail -- e.g., data beyond what percentile -- will depend on which upper percentiles are being characterized in the risk assessment.)

Note that EDFs will satisfy the third criterion only when data sets are large enough to estimate extreme percentiles with confidence.

When only summary statistics are available, there is an inherent problem in fitting any distribution: it is impossible to estimate uncertainty. Using additional information about possible limits to the data (that is, what the most extreme values could be), one should over-estimate the amount of uncertainty in the fit and use that in a sensitivity analysis. Uncertainty in the variance of the data is particularly important for probabilistic risk assessments.

When the better known distributions do not fit the data, there is exceptionally little advantage to resorting to someone's system of distributions, such as the generalized F. First, there is usually no theoretical basis for adopting any of these distributions. Second, there is little assurance that the best fitting distribution in a family will adequately represent what is of importance, namely the variance and tails. Third, reproducing the calculations can be difficult if the family of distributions is not in general use or is ad hoc, like the five-parameter generalized F distribution. Fourth, many of these families of distributions include obscure members whose estimation theory might not be well understood or even known.
It would be better for the risk assessor to work with familiar constructs whose properties (especially with regard to influencing the risk assessment outcome) are well known.

3) Uncertainty

Every standard method of assessing uncertainty has limitations. Maximum likelihood methods often are based on asymptotic normality, which sometimes is not achieved even for impractically large data sets. There are applications where the bootstrap does not work -- it is not theoretically justified. Certain methods, such as pretending the likelihood function is a probability distribution, simply have no justification (based on the theory of estimation).

In general, uncertainty should be assessed as aggressively as possible. As many possible contributors to uncertainty should be considered, and as many of these as possible should be incorporated in the risk assessment, because their effects accumulate.

An excellent method for assessing uncertainty is to randomly divide datasets into parts, perform calculations (such as fitting distributions, estimating statistics, and computing risk) based on each part, and evaluate the differences that arise. Certain forms of the bootstrap and its relatives, such as the jackknife, automate parts of this procedure.

Robert C. Lee, Golder Associates Inc.
Comments Regarding "Issue Paper on Evaluating Representativeness of Exposure Factors Data" 1 . The issue of representativeness relates to how the risk assessor makes judgments and corrections regarding uncertainty inherent in a nonrepresentative sample. Discussion of the differences between uncertainty (bias and/or error) and variability (heterogeneity) would be useful to avoid confusion. For example, Checklist I misleadingly implies that measurement error can have an effect on variability, which is an inherent property of a population. Uncertainty can either be characterized as systematic (bias) or nonsystematic (error). Uncertainty in exposure assessment may stem from: Model errors Errors in the design of the assessment method (i.e. measure of exposure) Errors in the use of the method Subject limitations Analytical errors One way to represent bias and error is as follows. A measured or observed value X, can be represented as a function of the true value 7], bias b, and nonsystematic error Ej, as: The population distribution of Ts represents variability. However, perfect knowledge is rarely available. Therefore, E can be represented, for example, as a normal distribution with a mean of zero and variance as: o2= o-2T F-33 image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } Robert 6. Lee, Golder Associates Inc. \g/here cr^ is the variance of the uncertain measure X, and o*T is the true variance (assuming independence). §ias (which can be positive or negative) can be represented as a deterministic shift in the mean of X as compared to the mean of T, as: Thus, error and bias can have an effect on the estimated population distribution, fcjut not on the true variability. .'• " , " "ill ! : • , • , -I;-':-- ; i,;, . . , ;, .1, , , , ..' . 
2. In many cases, an approach that uses "reference individuals" or strata, rather than attempting to evaluate or estimate variability in a broad population, may be useful. For instance, if one is concerned about children's exposure to lead in a Western mining town, it may be simpler as a first step to hypothesize a few examples of children with deterministic characteristics with regard to site-specific population variability, and then evaluate the uncertainty associated with these reference individuals' exposures. This method can be relatively inexpensive and easy compared to population sampling, and could be used as a screening step in an iterative decision-making framework.

3. The exact meanings of the terms "probability sample" and "probability sampling" as used in the Issue Paper are unclear. Presumably these are broad terms covering schemes such as random, stratified, cluster, composite, etc. sampling. If so, then there should be clarification and discussion regarding the methodological and inferential differences between these methods. For example, simple random sampling may not be appropriate for all environmental exposure variables. If an exposure factor varies geographically, then it may be more appropriate to spatially stratify the population, and characterize the factor within each stratum as accurately and precisely as possible.

4. As stated in the text (page 8, final paragraph), the process of determining the "importance of discrepancies and making adjustments" may be highly "subjective".
However, the remainder of the discussion focuses heavily on frequentist methods of accounting for sources of uncertainty, which may not be the most appropriate approach. There should be discussion regarding both empirical and nonempirical Bayesian methods of population inference, since these methods are very powerful and are increasingly used in risk applications. A major advantage of Bayesian methods is that they allow refinement or "updating" of a priori knowledge with additional data or information.

5. More attention is devoted to "temporal" characteristics of a population than "individual" or "spatial" characteristics in the text. The reason for this is unclear. There should be discussion of how to determine the relative importance of these characteristics in risk assessment.

6. Discussion of Bayesian techniques may be useful in Section 5 of the paper, which covers issues involved with improving representativeness.

7. Discussion of the use of simulations for future scenarios would be useful. For example, if the characteristics of a population are changing over time, time trends could be incorporated into a simulation to determine the parameters of a particular exposure variable in, say, 20 years.

Comments Regarding "Issue Paper on Empirical Distribution Functions and Nonparametric Simulation"

1. The assumptions listed in the Introduction of the Issue Paper are important and should be discussed further. The first assumption,
"...data are sufficiently representative of the exposure factor in question", is rarely met. Uncertainty associated with representativeness is often considerable. The second assumption, "...the analysis involves an exposure/risk model which includes additional exposure factors", is often true, although evaluation of the upper tail of a variability distribution is often difficult because of its uncertainty. If the tail is of interest, it may be preferable to stratify the analysis so that the mean of a high-exposure stratum can be used in the risk assessment. The third assumption, "...Monte Carlo methods will be used to investigate the variation in exposure/risk", may be true in practice, but other simple analytical and numerical methods exist. Given simple distributional assumptions (e.g., lognormality), a hand calculator can be used to calculate probabilistic output of many regulatory risk assessment models.

2. Examples of EDFs that have been used in risk assessments would be useful.

3. The statement implying that it is rare that theoretical probability distribution functions are "available" for exposure factors deserves discussion. For example, under the maximum-entropy criterion, theoretical PDFs may be fit in a rigorous manner using various combinations of limited a priori information.
Furthermore, the assumption of lognormality for many exposure variables and models has a theoretical as well as a mechanistic basis. It is hard to argue against using lognormal distributions when non-negative, unimodal, positively skewed data are available. Regardless, there is a practical continuum between using an EDF and, say, a maximum-entropy theoretical distribution. The issue of sensitivity is important; i.e., when does it make a difference in a risk assessment? In general, EDFs may take more time to develop. Discussions of the utility of particular distributions should be separated from theoretical arguments. An iterative approach to refinement of environmental exposure distribution functions should be discussed. This could potentially avoid inefficiency, and could be used to focus research dollars. If conducted within a Bayesian framework, prior EDFs or PDFs can be refined given additional data.

4. Much discussion in the text centers on the appropriateness of particular goodness-of-fit methods, visualization, etc. All of these methods are "blunt tools". Most statisticians simply use a number of different methods simultaneously or iteratively. If all the methods agree that a particular parametric distribution "fits" the data, then that distribution is probably appropriate. If they disagree, then the mechanistic and statistical justification for a particular distribution form and the sensitivity of the model output to the distribution defined should be examined; an EDF may be more appropriate.
If the model output is insensitive to the particular PDF defined for a particular variable, then it probably does not matter what shape it takes.

Samuel Morris

Comments on Issue Paper on Evaluating Representativeness of Exposure Factors Data

Inferences from a sample to a population

The population of concern at a Superfund site is generally the population surrounding the site. This is true if the concern is for exposures during remediation activities. If there is some residual risk that may last over an extended time, the population of concern may change. In a brownfields situation, for example, the population of concern may be people who will work at the site years into the future. These people may be quite different than the population currently living around the site.

4. COMPONENTS OF REPRESENTATIVENESS

There is no question that one would like a clear definition of the population of concern,
but if a representative sampling of the characteristics of that population has not been done, that definition doesn't exist. Isn't that why one uses information from a surrogate population? The question then is, if one cannot characterize the population of concern, how can one know if the surrogate population is suitable to represent the population of concern? The answer is a practical one. It depends on the availability of data, which in turn one hopes depends on how severe the risk is judged to be.

4.1 Internal components - surrogate data versus the study population

Certainly the representativeness of the surrogate study for its own study population should be evaluated. This paragraph seems to suggest that every assessor that makes use of a surrogate study should make this evaluation. Good surrogate studies are generally used over and over again by many assessors. Such an evaluation should only need to be made once, with the results made available to all assessors. Along with this evaluation should be an evaluation of the character of the population for which the particular surrogate study is useful. This could go further to provide some limiting population characteristics beyond which the surrogate would not be recommended.
4.2 External components - population of concern versus surrogate population

The suggestion of using several national Food Consumption Surveys as a basis to extrapolate dietary habits into the present or future seems like a rather precarious thing to do. It also is something that could only be done for an extremely large, important, and well-funded assessment. It is another study that, if done at all, should only be done once and results made available widely. Regarding several assessors independently speculating on the mean and coefficient of variation of a parameter (expert judgment?), to avoid the phenomenon of anchoring, a useful protocol is to have the experts begin from the extremes and move toward the central point, rather than beginning with the mean.

Checklist I. I don't understand the questions, "For what population or subpopulation size was the sample size adequate for estimating measures of central tendency...and other types of parameters?" The previous questions ask if the sample size was adequate, etc. Presumably this means it is adequate for the size of the population that was studied. I am assuming that this checklist pertains to an internal analysis of the surrogate study and has nothing at this point to do with a different population that is of concern to the assessor.

Checklist II. I suspect that in most situations, the answer to the first question will be that the two populations are disjoint.

Checklist III. These questions concern whether the two populations inhabit the same geographic area. Presumably the interest is in similar climate, activity patterns, etc. Spatial characteristics convey a broader, in fact a different, meaning to me. It suggests how
It suggests how :," 1 •]./. '_" "' J I IJflJ! •,,..,•' _/ •„.. ;,,; • , - i , • , i , .,; _ ::,™, , ... . . , ,r ,•',;, the population is distributed in space. Is it a high density area or a low density area? Are there clusters of housing separated by open space? Responses to the Questions on Representativeness I ' :ji',' l|;!, ',,',:" '.: I'll: : '".'•",'*;,• . ! !';•''"•', ' ' '''••'.;!• ! ':'•'' •' *'.*;'''''Ll. :''J! .' |f ' '" '" '":' '"'',!', ' ' '•:'*1 " ','' '"''' Issue Paper oh Empirical Distribution functions and Non-Parametric Simulation introduction Is stochastic variability really the right term here? Juslto make sure I am interpreting this right, 1 take "variability" to mean that, for example, some people drink more tap water than others and thus have a greater exposure. The Big difference between f liability and scientific uncertainty or random error is that it is presumably possible to identify which individuals drink 2 liters/day and which drink 0.5 liters/day, or they can Identify themselves. This is important because it provides a tool for intervention. For ;;,!,'',"„• . i , ;, '-.I'M. I"-.- !! • ","!•;; • •: !•.',/. '.-.,-. ; -'".•" •.!;-. •• L'.I i •'>•;.'! •' iiu-'J" - i: :. j •• i1;.1'1 ' ' . ;* -' •' j'1!';,'1 example, we can warn pregnant women to reduce their intake offish rather than setting a standardI requiring' everyone to eat fewer fish. "Stdchastic variability" seems to imply f ariability that Is so randomized that we-nor the individuals involved-cannot determine E'1 ;, "•),•, „:, isfif ;•• :": . i"; ;: ' v • ' :.„'«.. i^ • '•.-:" ..t i-11."^1/;;1'^1!- •'•• .•v'";i"i''1i^!^"i;:' , ,•;• ;• • /.I'V"^ |/ho has a high exposure and who has a low exposure. In that sense, it is the same as I; ;.••" C' t^.; ** :' li"-'": •'/' i- •• ' :"' •' :' '' '•1;';'"";:': '•"'• •••"" !/ •ii'v' V; - •• •';; ' r':';- a Cancer dose-response function. ''ill'!"!!. ' ' ' 'i!|i " .''''''iiiili !n i ' 'i ii. n ..I ,. 
Why do we write off the use of theoretically based distribution functions? Many environmental variables do seem to be distributed lognormally. It isn't just coincidence. I believe that we are often better off fitting our data to a lognormal than trying to develop an empirical distribution based on what is typically a rather small data set. I once got some good advice when I was a junior engineer trying to figure out how much water was flowing in a pipe. My boss told me, "We have a good theory explaining the flow of water in pipes, but our meters have a 5% error at best. If there is a difference between the theory and the data, assume the meters are wrong." My only problem with lognormals is how well they continue to map nature out in the extreme tails. Even there, however, how much confidence do we have in the 99th percentile of an empirically based distribution?

Part 1. Empirical Distribution Functions

Extended EDF

The EDF is extended by adding plausible lower and upper bounds, but the paper does not mention how one extends the linearized curve to reach those bounds. Presumably by using a curve-fitting routine of some kind. In many cases, there is no clearly obvious point for the upper or lower bound. We know we do not have any one-kg adult males, but how do we decide to stop at 15 kg and not 14? Expert judgment is used. Expert judgment may be all we have, but it is not a great justification, and it is important that we provide justification. I believe it is worthwhile to do a sensitivity analysis to find the difference between using quasi-arbitrary bounds and letting the curve run out to zero or infinity.
It might also be worthwhile to check the difference with stricter, but perhaps more reasonable bounds, say a 40 kg adult male.

Mixed Empirical-Exponential Distribution

I think that mixing theoretical distributions with empirical distributions in some kind of composite sounds like a good idea.

Starting Points

The smaller the data set, the greater the rationale for using a standard distribution. Responding to #5, people feel more comfortable with a theoretical distribution because it has a theoretical basis that supports interpolation between data points and extensions beyond the data, although I was always told never to do the latter. When plotting empirical data without a theory, one never knows if there is some big discontinuity between two completely innocent looking data points. The problem is that the theory behind the distribution is mathematical, not physical. To be comfortable interpolating or extrapolating in either case, one must have a theory of the physical process involved.

P. Barry Ryan

Workshop on Selecting Input Distributions for Probabilistic Assessment

In the transmittal letter dated March 27, 1998, Beth O'Connor asked us as reviewers to provide "...not...comprehensive comments, but rather your initial reaction and feedback on the issues...." Further, we have been asked to focus on the so-called "Representativeness" Issue Paper.
My discussion focuses on that manuscript to start.

First Reactions

My first thoughts on this paper center on the need for an "audience" to be selected. Issue papers such as this one will lead, eventually, to guidance documents similar to those supplied as background reading. But what is the audience of this document? To a degree, the audience must be viewed as one and the same. This document will be referenced in a guidance document. Assuming this, a diligent worker looking for more information will seek out this manuscript. Hence it should be readable and accessible to practitioners of risk assessment and exposure assessment science. With this assumed audience in mind, I continue with my initial reaction to the Issue Paper.

The Introduction commences with a single sentence that concisely describes the purpose of the document. This is a good start; the reader is entitled to know what is being discussed. Unfortunately, the next sentence is a parenthetical notation. Is this statement unimportant, less important, to be ignored, or what? The third sentence has a relative pronoun as the first word but the antecedent is unclear. To what does "This" refer? Exposure factors? Representativeness? Whatever it may be, it is both extremely broad and extremely important, as the rest of the sentence tells us.

Before the above is dismissed as grammatical nitpicking, consider the following. At this point, we are only three sentences into the document and I, considered to be an expert reviewer, am uncertain as to what is being discussed.
A gentle introduction to a difficult subject goes a long way toward keeping the reader "on line." A little editing for style up front will make this document much more useful.

Let us continue. The next paragraph is a roadmap describing the way through the remainder of the document. These two paragraphs provide the Introduction. More is needed. Why is this important? When should it be applied? What has been done in the past? These are all reasonable questions to ask.

The next section begins the meat of the Issue Paper. General Definitions/Notions of Representativeness is a real mouthful of a title. The term "Notions" has the connotation of uncertain knowledge. Definitions are quite the opposite. Will we be treated to contradictory information in this section? Apparently the answer is "Yes" because, as the Issue Paper continues, a reference to Kruskal and Mosteller indicates that the term on which we are seeking guidance has no "...unambiguous definition...." Why is it necessary so early on in the discussion to confuse the issue in the mind of the reader by saying that no definition exists? Why would a reader of this document continue reading rather than throwing his or her hands up in despair?

The next paragraph (and accompanying table) adds further fuel to the fire. What is the purpose of this table?
How does it contribute to the definitions or notions of representativeness? There is no discussion of the importance of the terms, how they might be used in assessing representativeness, nor the purpose of the table.

So, again, we have a section that needs significant editing. It is not clear to me that this section adds any insight into the notion (or definition) of representativeness. The elementary concept is not difficult. The attempt to be all-inclusive at the very beginning, however, is doomed to failure. It is difficult to tell someone what works by telling him or her all of the problems with the system first. It would be better to adopt a working definition, show how it can be applied to many situations, then list some problems with the working definition. This allows the reader to gain some understanding of the concepts, without having to grasp the entire subject a priori.

I have, until this point, spent a great deal of time discussing a very small part of the Issue Paper. In particular, I may have spent more space on the discussion than the manuscript length to this point. However, the first page or two of any document sets the tone for the whole piece. The tone for this manuscript ranges from one of despair to one of disorganization. There is very little room in that continuum for gaining new insight. I urge a re-write of these early sections.

Moving on, the next section, A General Framework for Making Inferences, begins the "meat" of the manuscript. As a matter of style, I do not care for a series of parenthetical notations in sentences. I believe that it obscures the meaning of the prose. Shorter sentences fully describing each of the activities are better.
This is a recurring style point throughout the document. I will not comment on it further.

Figure 1 represents a nice, concise "decision tree" approach to risk assessment data collection. The discussion is muddied somewhat by the introduction of the (undefined) concept of surrogate data. Reordering of sentences in the paragraph to bring the example closer to the first use of the word surrogate would clarify substantially. But we quickly go far afield from our discussion of representativeness. The manuscript needs to focus on this concept. Indeed, the entire section on Inferences seems misplaced. Should it not be at the end of the document? On the other hand, Figure 1 is useful to the discussion of representativeness. The branches in which one must assess this factor offer an excellent opportunity to introduce techniques, etc., to assess representativeness. For example, the figure instructs the reader to follow the algorithms outlined in Checklists I-IV. Why not discuss them now? It would seem that a discussion of Figure 1 in light of representativeness would be a more useful first step than to develop concepts of inference from it. The figure is designed to result in an inference, granted, but the pedagogical role of the figure here is to help the reader understand the concept of representativeness.

The next section, Components of Representativeness, begins to dissect the concept into more manageable pieces. The table, Table 1, and the coupling of the discussion to the Checklists in the appendix, are perhaps the strongest parts of the Issue Paper. Table 1 is especially noteworthy.
It presents the fundamental questions and parses them out according to the "population" characteristics under investigation. These include Individual Characteristics, Spatial (here misspelled as "Spacial") Characteristics, and Temporal Characteristics. Further, the characteristics are divided between exogenous and endogenous effects, a very useful division. The focus should remain on this table. Discussion should expand, examples be given, and understanding reached. These are the essential concepts of the Issue Paper.

Unfortunately, the manuscript gets bogged down a bit at this point with the "Case" scenarios. I kept getting confused between Case 2, Case 2a, etc. Also, the introduction of the National Food Consumption Survey confused rather than helped. I found myself wondering if this approach was only applicable to the NFCS or did it have more general applicability. The topic is very general and the specificity of the example obscured that. Again, the tabular presentation is much more straightforward and helpful. Table 2 could be discussed without reference to the NFCS and the different components of representativeness addressed much more clearly and generally.

With section 5, Attempting to Improve Representativeness, the tenor of the Issue Paper changes dramatically to become much more statistical in nature. It also becomes more difficult to follow. At points in this section, the authors go off on tangents. See for example the discussion on raking techniques on page 12. A better approach would include more on when such data are likely to be suspect and a better description of the weighting techniques that have been advocated.
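For readers unfamiliar with the raking Ryan mentions, the core of the technique is iterative proportional fitting: sample cell counts (or weights) are rescaled until their row and column totals match known population margins. A minimal sketch; all numbers are hypothetical, not drawn from the issue paper:

```python
# Raking (iterative proportional fitting) on a 2x2 table of sample
# counts, adjusted to known population margins.
row_targets = [600.0, 400.0]      # e.g., population totals by age group
col_targets = [500.0, 500.0]      # e.g., population totals by region
cells = [[50.0, 30.0],            # observed sample counts
         [10.0, 40.0]]

for _ in range(50):               # alternate row and column adjustments
    for i, target in enumerate(row_targets):
        s = sum(cells[i])
        cells[i] = [c * target / s for c in cells[i]]
    for j, target in enumerate(col_targets):
        s = cells[0][j] + cells[1][j]
        for i in range(2):
            cells[i][j] *= target / s

# After convergence the adjusted table reproduces both sets of margins.
print([round(sum(row)) for row in cells])                     # [600, 400]
print([round(cells[0][j] + cells[1][j]) for j in range(2)])   # [500, 500]
```

The appeal of raking is exactly what Ryan asks the paper to explain: when the sample's joint distribution is suspect but reliable marginal totals exist for the target population, the cell-level associations in the sample are preserved while the margins are forced to agree.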
In the sub-section Adjustments to Account for Time-Unit Differences, there is considerable discussion of the Wallace et al. approach to inferring temporal effects. No mention is made, however, of the work of Slob (see Risk Analysis 16, 195-200, 1996), who advocates a different technique and evaluates both. Regardless of this missing reference, one questions why it is here at all. It is very detailed and, in my opinion, should be described briefly in terms of its logic, then detailed in an appendix. The brief reviews of the Clayton et al. paper, the two Buck et al. papers, and the work by Carriquiry and co-workers should receive the same treatment.

The section Summary and Conclusions is really only a summary. The first two paragraphs perhaps should have come earlier in the document rather than at the very end. They express the philosophy of what needs to be done. This is a good thing; it sets the stage for the Issue Paper.

Continued Thoughts

After the above impressions while reading the document, I have come away with the impression of a fairly uneven presentation that may not be especially valuable either to the risk assessment community or to EPA. The idea of an Issue Paper addressing the concept of representativeness is a good one. Data are often used in a willy-nilly fashion
with little regard for the way in which they were collected nor what the study design intended to do. Because of this, erroneous conclusions can be drawn, resulting in much wasted effort and, sometimes, money.

I think the document as now presented does not present the issues well. However, the Figure, Tables, and Checklists are excellent. They provide a strong foundation for a document useful for both the neophyte and expert alike. As an exposure assessor, I am always trying to come up with clean definitions of the parameters I am measuring. Is it exposure? Is it dose? Is it applied dose? The authors of this Issue Paper draft crafted answers to similar questions associated with the representativeness of data, surrogates for data, and the pitfalls of ignoring the problem altogether. Unfortunately, these gems are buried in a veritable rockslide of other information. They are not given their proper attention in the Issue Paper. The science and EPA would be well served by asking for a re-write based on the Figure, Tables, and Checklists. Some introductory prose should be placed up front to set the stage, perhaps the two paragraphs (or modifications thereof) found at the beginning of the Summary. This material would be descriptive of the problem at hand, answering questions such as why
representativeness is critical, how it is often lacking, and why attempts to improve the representativeness of a sample must be done carefully. This would then be followed by Figure 1 and its description, which leads further on to Table 1. The description of Table 1 and Figure 1 give the essentials of the representativeness argument. The next section would use Table 2 as its focus. Table 2 expands on the ideas of Table 1 and thus is an excellent follow-on. The "examples" could be relegated to an appendix with more complete examples chosen and more detailed calculations worked out. Finally, the Checklists should be given a more prominent placement, and a more complete discussion.

Comments on Pre-Workshop Issue Papers: "Evaluating Representativeness of Exposure Factors Data" and "Empirical Distribution Functions and Non-parametric Simulation" for US EPA Workshop on Selecting Input Distributions for Probabilistic Assessment (New York, NY, April 21-22, 1998)

Issue Paper on Representativeness

I find the discussion of representativeness in this first issue paper to be generally
modelers. The major issues of target populations versus sampled or surrogate populations, and differences in available vs. desired spatial and/or temporal coverage and scale, are addressed in a clear and comprehensive manner.

Tiered Approach and Sensitivity Analysis

The issue of tailoring the framework to a tiered approach to risk assessment is integrally linked to the importance of, and need for, sensitivity analysis when the tiered approach is used. When simpler screening-level assessments are pursued, sensitivity analysis is critical to determine whether a significant problem, worthy of attention or remediation, could occur. Sensitivity analysis is always most meaningful in a decision-analytic framework: can the decision derived from the risk assessment change as a result of a change in the simplifying assumption (in this case, the use of data or distributions derived from a sample of questionable representativeness)?
The only way to determine whether this is so is to repeat the analysis with the underlying data or derived distributions modified in a manner consistent with known or suspected differences, over the range of plausible adjustments.

Mitchell Small

If a plausible adjustment does lead to a change in the risk management decision, then the analyst must first consider a more rigorous basis for determining the adjustment. If, with a better basis for making the adjustment, the range of predicted exposure or risk still "straddles" multiple decision regimes (i.e., different management decisions are still possible given the improved adjustment and the overall uncertainty from other assumptions/parameters in the assessment), then this suggests the need to move to the next level of sophistication in the tiered approach. This could include the use of a more detailed and rigorous exposure and risk assessment model, as well as collection of a more representative sample for the target population.

Adjustment

The discussion of methods for modifying statistical estimates derived from a surrogate population to obtain results applicable to a different target population is thorough and informative. I do have a few insights to add: on encouraging the use of hierarchical models with covariates to derive more representative distributions for the target population; on variance adjustment methods for spatial data; and on the use of Bayesian methods for combining information from surrogate (e.g., national) and target (e.g., site-specific) samples.
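The decision-focused check described above (repeat the analysis over the range of plausible adjustments and see whether the management decision flips) can be sketched in a few lines. Everything in this sketch is hypothetical: the lognormal intake distribution, the multiplicative adjustment factors, the P95 decision statistic, and the action threshold are invented for illustration, not taken from the issue papers.

```python
import math
import random

random.seed(42)

# Hypothetical surrogate-based intake distribution (lognormal, illustrative values).
MU_LN, SIGMA_LN = math.log(1.0), 0.6   # ln-scale mean and sd of intake
THRESHOLD = 2.5                        # illustrative regulatory action level

def p95_intake(adjustment, n=20000):
    """Simulate the 95th percentile of intake after multiplying the surrogate
    distribution by a representativeness adjustment factor."""
    draws = sorted(adjustment * random.lognormvariate(MU_LN, SIGMA_LN)
                   for _ in range(n))
    return draws[int(0.95 * n)]

# Repeat the analysis over a plausible range of adjustments and note whether
# the management decision (act vs. no action) changes.
for adj in (0.8, 1.0, 1.2, 1.5):
    p95 = p95_intake(adj)
    decision = "act" if p95 > THRESHOLD else "no action"
    print(f"adjustment {adj:.1f}: P95 = {p95:.2f} -> {decision}")
```

If the decision is the same at every plausible adjustment, the representativeness concern is immaterial to this decision; if it flips, the analysis is telling us the adjustment needs a more rigorous basis or a higher assessment tier.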
Adjustments based on covariates: The discussion in Section 5.1 covers the usual methods for weighting sample observations or sample statistics to adjust for stratification of the target population in the sampled population (either intended, as is the case in a pre-planned survey of the target population, or unintended, as is the case addressed in the issue paper, when the stratification weights are a matter of happenstance). The discussion does recognize the utility of covariates (either continuous or discrete) for determining sample weights, and mentions the method of "raking" for deriving these.

I think more could be done to encourage the collection and use of covariate data, in particular, using these data to develop "derived distributions" for the target population. Derived distributions arise when a relationship between the parameter of interest and the covariates can be established in a surrogate population. [This relationship could be modified for the target population based on a small sample and Bayesian methods; see my discussion below for how this might be done.] The relationship is combined with the distribution of the covariates in the target population to derive the distribution of the parameter of interest in the target population. The relationship need not be deterministic; the method is quite amenable to use with the usual regression relationships (with explicit distributions of residuals) that are developed in exposure assessment.
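The simplest of the weighting adjustments discussed above, post-stratification on a single discrete covariate, amounts to a few lines of arithmetic: weight each stratum's sample mean by the stratum's known share of the target population. The stratum labels, sample values, and target proportions below are all invented for illustration.

```python
# Surrogate sample: (stratum, observed exposure value) pairs; values are made up.
sample = [("urban", 4.0), ("urban", 5.0), ("urban", 6.0),
          ("rural", 1.0), ("rural", 2.0)]

# Known stratum proportions in the target population (hypothetical).
target_props = {"urban": 0.40, "rural": 0.60}

def poststratified_mean(sample, target_props):
    """Weight each stratum's sample mean by its share of the target population."""
    by_stratum = {}
    for stratum, x in sample:
        by_stratum.setdefault(stratum, []).append(x)
    total = 0.0
    for stratum, p in target_props.items():
        xs = by_stratum[stratum]
        total += p * sum(xs) / len(xs)
    return total

unweighted = sum(x for _, x in sample) / len(sample)   # 3.6, tilted urban
adjusted = poststratified_mean(sample, target_props)   # 0.4*5.0 + 0.6*1.5 = 2.9
print(unweighted, adjusted)
```

The sample over-represents the urban stratum (3 of 5 observations versus 40% of the target population), so the unweighted mean overstates the target-population mean; post-stratification corrects it.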
Consider the following example with a simple, closed-form solution. For subgroup j (e.g., based on gender, ethnicity, urban vs. rural, etc.), the natural logarithm of house-dust lead (HDL) for person k is related to income, I, by the regression relationship

    ln(HDL_kj) = a_j + b_j ln(I_kj) + e_kj

where a_j is the intercept, b_j the slope, and e_kj the residual, with e_kj normally distributed with mean zero and standard deviation s_ej. If income I for subgroup j is lognormal with ln-scale parameters (m_Ij, s_Ij), then HDL for subgroup j is also lognormal, with ln-scale parameters

    m_HDL,j = a_j + b_j m_Ij,    s_HDL,j = [b_j^2 s_Ij^2 + s_ej^2]^0.5

The distribution of HDL for the entire target population with subgroup proportions P_j is the P_j-weighted mixture of the lognormal distributions determined for each subgroup.

For more complicated relationships between the parameter of interest and the covariates, or a more complicated distribution of covariates in the target population, Monte Carlo simulation methods may be required to derive the distribution. An example of this (entitled "Bayesian Analysis of Variability and Uncertainty of Arsenic Concentrations in U.S. Public Water Supplies," by Lockwood et al.) is attached. It presents early results of a project for the EPA Office of Ground Water and Drinking Water (OGWDW) to estimate a national distribution of arsenic occurrence in source water used by drinking water utilities, based on a stratified national survey. The application is an example of Case 3 in Table 2, where the surrogate population is a subset of the population of concern. The most pertinent part of the attachment is highlighted, noting that the national distribution is synthesized by sampling the covariates of the target population.
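The closed-form mixture above can also be checked by brute-force simulation, which is the same Monte Carlo route one would take for messier relationships. In the sketch below, the regression coefficients, income parameters, and subgroup proportions are invented for illustration:

```python
import random

random.seed(7)

# Hypothetical parameters for two subgroups j:
# ln(HDL) = a_j + b_j*ln(I) + e_j,  e_j ~ N(0, s_ej^2),  ln(I) ~ N(m_Ij, s_Ij^2)
groups = [
    # (P_j,  a_j,  b_j, m_Ij, s_Ij, s_ej)
    (0.3, 1.0, 0.50, 10.0, 0.50, 0.40),
    (0.7, 0.5, 0.40, 10.5, 0.60, 0.30),
]

def draw_ln_hdl():
    """One Monte Carlo draw of ln(house-dust lead) from the P_j-weighted mixture."""
    u, cum = random.random(), 0.0
    for p, a, b, m_i, s_i, s_e in groups:
        cum += p
        if u <= cum:
            break
    ln_income = random.gauss(m_i, s_i)
    return a + b * ln_income + random.gauss(0.0, s_e)

draws = [draw_ln_hdl() for _ in range(50000)]
mc_mean = sum(draws) / len(draws)

# Closed form: each subgroup's ln(HDL) is normal with mean a_j + b_j*m_Ij,
# so the mixture mean is the P_j-weighted average of those subgroup means.
exact_mean = sum(p * (a + b * m_i) for p, a, b, m_i, _, _ in groups)
print(round(mc_mean, 3), round(exact_mean, 3))
```

The simulated mean of ln(HDL) should agree with the closed-form mixture mean to within Monte Carlo error; the full set of draws likewise approximates the mixture distribution itself.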
The use of covariates for deriving distributions of exposure factors in a target population is a powerful tool that should be encouraged in the issue paper with more examples and methods. It would also encourage exposure assessors and analysts to be more careful and thorough in their collection of covariate data as part of their monitoring programs.

Variance adjustment for spatial data: The report does a good job covering the options for adjusting bias and variance for time-unit differences; similar methods can be utilized for differing scales of spatial representation. A good reference for this is Random Functions and Hydrology (Bras, R.L. and I. Rodriguez-Iturbe, Addison-Wesley, Reading, MA, 1985), especially Section 6.8, Sampling of Hydrologic Random Fields. Methods are presented for accounting for spatial correlation when determining the variance of an area average. (The other thing we should do is vote on the correct spelling of spatial/spacial.)

Bayesian methods for combining information from surrogate- and target-population samples: I have learned a lot recently about Bayesian methods for combining expert judgment and observed data to estimate distributions. Some of these are discussed in the attached paper by Lockwood et al. The Bayesian method allows a prior judgment for distribution parameters to be updated based on an observed data set, yielding a posterior distribution for the distribution parameters. The posterior distribution characterizes the uncertainty in the resulting estimation, but can also be used for "best-fit" point estimates (e.g., based on the mean or mode of the posterior distribution).
Bayesian estimates converge to those of classical methods when "vague" or "informationless" priors are used, so that the information in the sample dominates that of the prior.

Bayesian methods can add a lot to the suite of tools available for using surrogate population samples when estimating target population statistics. A number of these tools are described in a paper that Lara Wolfson and I are (hopefully!) about to complete, "Methods for Characterizing Variability and Uncertainty: Bayesian Approaches and Insights" (we have been "about to finish this paper" for quite a long time, covering a few of our recent meetings; hopefully I will bring a copy to the meeting in New York). In particular, estimates from surrogate population samples can serve as priors for the target population, allowing information from (presumably small and limited) site-specific studies to be informed by, and combined with, the previous studies of the surrogate population. Results from multiple surrogate populations can also be used, each given a weight, along with the informationless prior, to determine how much the resulting estimate will be based on each of the surrogate population studies vs. the information in the target population survey itself.
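A minimal sketch of this prior-plus-site-data idea, for the textbook case of a normal mean with known sampling variance (the conjugate normal-normal update): the prior summarizes a surrogate (e.g., national) study, and a small site-specific sample updates it. All numbers are hypothetical.

```python
# Conjugate normal-normal update: prior N(m0, v0) on the mean, data with known
# sampling variance s2. Posterior precision is the sum of the precisions.
def update_normal_mean(m0, v0, xs, s2):
    n = len(xs)
    xbar = sum(xs) / n
    post_prec = 1.0 / v0 + n / s2
    post_var = 1.0 / post_prec
    post_mean = post_var * (m0 / v0 + n * xbar / s2)
    return post_mean, post_var

# Prior from a national (surrogate) study: mean 2.0, uncertainty variance 0.25.
# Small site-specific sample, assumed measurement variance 1.0 (all invented).
site_data = [3.1, 2.7, 3.4, 2.9]
post_mean, post_var = update_normal_mean(2.0, 0.25, site_data, 1.0)
print(round(post_mean, 3), round(post_var, 3))
```

The posterior mean falls between the surrogate-based prior mean and the site sample mean, weighted by their precisions, and the posterior variance is smaller than either source alone would give; a vague prior (large v0) would let the site data dominate, recovering the classical estimate.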
A Specific Comment on the Representativeness Paper:

The discussion on "Summary Statistics Available" in Section 8.1 (page 16) contains what I believe to be an error, when suggesting that standard deviations be averaged across subgroups when approximating a population standard deviation: "In the case of population variance, we recommend calculating the weighted average of the group standard deviations, rather than their variances, and then squaring the estimated population of concern standard deviation to get the estimated population of concern variance." However, neither of these approaches properly accounts for possible differences in the means across the subgroups, which also contribute to the population variance. The correct approach is to compute E[X^2] for each subgroup g:

    E[X_g^2] = E^2[X_g] + Var[X_g]

then E[X^2] for the population:

    E[X_ATP^2] = sum_g P_g E[X_g^2]

and finally, the variance of X for the population:

    Var[X_ATP] = E[X_ATP^2] - E^2[X_ATP]

where E[X_ATP] is computed using the middle equation on page 10.

Issue Paper on Empirical Distribution Functions and Non-parametric Simulation

You appear to have already gathered a lot of thoughtful comments on the two topics addressed in this issue paper. Will any of these respondents be at our meeting? Will they be identified? I have given more thought to Part II (issues related to fitting theoretical distributions) than I have to Part I (empirical distribution functions). I identified strongly with the comments of Respondent #6 in Part II.
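Returning briefly to the variance correction above: it is short enough to code, and contrasting it with the averaged-standard-deviation shortcut quoted from the issue paper shows how badly the shortcut can understate the variance when subgroup means differ. The subgroup proportions, means, and variances below are invented for illustration.

```python
# Subgroups g: (P_g, E[X_g], Var[X_g]); numbers are illustrative only.
subgroups = [(0.5, 10.0, 4.0), (0.5, 20.0, 9.0)]

def population_variance(subgroups):
    """Var[X] = sum_g P_g*E[X_g^2] - (sum_g P_g*E[X_g])^2,
    using E[X_g^2] = E[X_g]^2 + Var[X_g]."""
    mean = sum(p * m for p, m, _ in subgroups)
    second_moment = sum(p * (m * m + v) for p, m, v in subgroups)
    return second_moment - mean * mean

def naive_variance(subgroups):
    """The shortcut criticized above: average the subgroup standard deviations
    and square the result, which ignores between-subgroup mean differences."""
    sd = sum(p * v ** 0.5 for p, _, v in subgroups)
    return sd * sd

print(population_variance(subgroups))  # 31.5: includes the between-mean spread
print(naive_variance(subgroups))       # 6.25: badly understates the variance
```

With subgroup means of 10 and 20, most of the population variance comes from the spread between the subgroup means, which the shortcut simply discards.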
To add slightly to Respondent 6's comments, I note that parametric tests of significance for the fit of a theoretical distribution function almost always reject a particular parametric form as the sample size gets large; real populations invariably exhibit some deviation from a theoretical model, which cannot capture all of the population's behavior and nuances. In these cases, visual comparisons of observed and fitted distributions are essential for determining whether these deviations are in fact important to the problem at hand.

Edward J. Stanek III

Review Comments on "Issue Paper on Evaluating Representativeness of Exposure Factors Data" (March 4, 1998 Report), by Edward J. Stanek

Are questions of differences in populations, questions related to differences in spatial coverage and scale, and questions related to differences in temporal scale, complete? Should other areas be added?

The document defines a population in terms of a set of units (subjects) at a location and time, a definition that is a standard starting point for traditional survey sampling. The definition of the population is important, since the term "representativeness" is being used to describe the relationship between estimates of exposure and the true exposure of subjects in the population (or summary measures of these true exposures). An example of a typical population is (p. 3) "the population surrounding a Superfund site." The population is defined as a "snapshot" of persons in time and space. Although this definition fits the traditional survey sampling paradigm, this definition may be lacking from the standpoint of defining exposure in the context of the public's health.
The photographic-like quality of the definition does not account for the fact that new people may move into the picture, and others may leave after a short time has passed. Thus, while "representativeness" may be assessed for the picture, the picture itself may be limited. As a result, the assessment of representativeness may have limited relevance for exposure and ultimately the public's health. Of course, when one looks at the "snapshot" close to the time it was taken, the differences may be slight. After a longer time period, the differences may be dramatic. This practical concern over defining the "population" is ignored in the report. It is important to introduce a longer time frame and possible changes in exposure when defining exposure in a population. Such definitions are important conceptually, pragmatically, and politically, since they define the target parameters for exposure. Such definitions are accessible to a broad range of interested parties and not limited to statistical or technical experts. They set the stage for decisions on additional data collection, and for technical choices in estimation and modeling. The current document limits the scope of "representativeness" by defining it only in a context that has an established traditional statistical literature. In a simple sense, such a definition may be diagrammed as in Table 1. The idea is that over chronologic time, there will be mobility and other physical changes. Thus, exposure for the first subject (ID=1) may differ between 1998 (E_11) and 1999 (E_12).
Similarly, ID=1 may move in the year 2000, and hence no longer be exposed. Other subjects may move into the area. Subjects will also age, and their exposure may change with age. Of course, the exposure values in Table 1, while potentially observable, are not known. Nevertheless, a consensus on what will constitute such a potentially observable exposure table is the starting point for discussion of "representativeness". This conceptual framework has a rich background (Little and Rubin (1987)). The present document defines the problem in terms of the shaded cells in Table 1. I suggest that the starting point should more closely correspond to a population as defined in Table 1. Establishing the goal first will help prioritize issues such as representativeness, sensitivity, and adjustments. One might dispute this goal by arguing that the problem definition is difficult, exceedingly complex, and, since conceptual, detracts valuable time and effort from what data is known. I would argue that establishing consensus on this definition (while not statistical) should be the starting point for "evaluating representativeness." Such a framework is consistent with both design-based methods and model-based inference (Cassel et al. (1977), Scott and Smith (1969), Meeden and Ghosh (1997)).

Table 1. Potentially Observable Exposure on Subjects in the Defined Spatial Location
(E_ij = exposure of subject i in year j; blank = subject not present in that year)

  Subject ID (i)    1998      1999      2000     ...
  ID=1              E_11      E_12
  ID=2              E_21      E_22      E_23
  ID=3              E_31      E_32      E_33
  ...
  ID=N              E_N1      E_N2      E_N3
  ID=N+1                      E_N+1,2   E_N+1,3
  ID=N+2                                E_N+2,3
  ID=N+3                                E_N+3,3
  Average           M_1998    M_1999    M_2000

Are there ways of formulating questions that will allow a tiered approach to risk assessment (a progression from simpler screening-level assessments to more complex assessments)?

A general strategy for tiering estimation approaches is by ordering the assumptions. With very extensive assumptions, all exposure assessments are easy. For example, assume that everyone at every time in every location has the exact same exposure, and that this exposure can be measured without error. Using these assumptions, a single measure on
Nevertheless, these assumptions represent an extreme which has as an opposite extreme the target "potentially observable" population (which is exceedingly complex). A gradation of assumptions can be formed between the two extremes, with such a framework leading to a tiered approach. The framework asks how important are (or sensitive is the analysis to) population, spatial. and temporal differences between the sample (for which you have data) and the population of interest. What guidance can be provided to help answer these questions? The document addresses the way the "surrogate population" represents the population, how the sample from the surrogate population relates to the surrogate population, and finally, how the measured value relates to the true value for the measured unit. Assuming that the population defined is the potentially observable population of interest, this is a good framework for developing inference. Some guidance can be provided to structurally evaluate the sensitivity of the exposure estimates to analysis decisions. To do so, we build estimates from the data to the surrogate population, and finally to the population. F-59 image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } . WW .'I.11- "!!'!' "I •'" "'''»''" '<4V.Am'.P 15 "" v.fr. f.ft " J. iianifek 111 fabfe 2 represents a framework for successive developmeni of estimates Id fhe palliation. Probability sampling will connect the surrogate data to the surrogate I1"!!"1''1 i :' ,li',ii;'"!,i . "• . 'J..,' , ' Iflll , ' ' ',''.•'" i ' ,,in ,' ,ii '"i , !„ ,|, H. „ '!' ,'" ,! !:' „„ • ,i» „ " »,„ '• ','..;'!"!'»'':,!,: ^ilj. ""."j, "' ,,.i;ij||j' U j, ' - .'„" '" ,;,:!i>| 'i' hl,$ „ '''iff 	,!,ii'' j v n „,, ', 	 'jji, ^.jit i '
^gfjfifation, arid may serve'as the basis; for infeferice to the lower shaded portion of the
'V"i Vii  ;.J   III,;' I'Sil n,;,'"'- ' " '' •,	i	'.' |,* ' ''\f	'.'„	/ " -in; . ' •' I, 'it ,:'!, Li-!,: ..'-il-i	-,'l'ir ill1'!,, I-,:;!!!,,:•,i:*-:, •„!»'.,! J i i li:	'i •,,:,'»:,:,!''' • .... »•'' .li"'< - ,J ."", f	.11	'•"-.'
Surrogate Population. Specifically, the inference consists of estimates of population parameters, and the accuracy (mean squared error) of those estimates. Non-response, limited coverage, etc. may require additional assumptions before inference can be extended to the entire Surrogate Population.
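As a small aside on the bookkeeping: "accuracy (mean squared error)" decomposes as MSE = variance + bias^2, and a coverage problem of the kind just mentioned shows up in the bias term. The simulated population, the coverage cutoff, and the sample sizes below are all artificial:

```python
import random

random.seed(0)

# MSE of an estimator decomposes as MSE = variance + bias^2. Simulated here
# for a sample mean computed from a frame that systematically misses part of
# the population (a caricature of limited coverage; all numbers invented).
population = [random.gauss(10.0, 2.0) for _ in range(100000)]
true_mean = sum(population) / len(population)

covered = [x for x in population if x < 13.0]   # frame excludes high values

estimates = []
for _ in range(2000):
    sample = random.sample(covered, 50)
    estimates.append(sum(sample) / len(sample))

mean_est = sum(estimates) / len(estimates)
bias = mean_est - true_mean
variance = sum((e - mean_est) ** 2 for e in estimates) / len(estimates)
mse = sum((e - true_mean) ** 2 for e in estimates) / len(estimates)
print(round(mse, 4), round(variance + bias ** 2, 4))  # the two agree
```

Bigger samples shrink the variance term but leave the coverage bias untouched, which is why additional assumptions (not just more data) are needed to extend inference to the full surrogate population.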
Improvements in the accuracy of estimates for the surrogate population may be possible via modeling and/or post-stratification. The models developed on surrogate data may provide support and serve as a structure for the assumptions needed to predict exposure in the portion of the surrogate population not covered by the probability sample. For example, models based on surrogate data may reveal a strong dependency of exposure on age and gender, but a weak to null relationship with urban/rural geographic location in one state. Assumptions used to extend estimates may be supported by evidence from the surrogate data, even though they are not directly justified by the probability sampling inferential framework. The range of a sensitivity analysis (for example, varying the assumed urban/rural differences) can be established making use of model-based estimates when extending inference to the non-sampled surrogate population.

Models and assumptions most likely will be the primary source to generate estimates from
the surrogate population to the population of interest. As the distance increases from the
actual data, the role of the models and assumptions will increase. This increased role will
result in the estimates being more sensitive to the assumptions. Much progress is
currently being made in studying issues of sensitivity similar to these issues in
epidemiology, where a similar situation occurs in observational epidemiologic studies (see

recent presentations by Wasserman (1998) and Rotnitzky et al. (1998)). Three-dimensional sensitivity
plots, such as those developed by Rotnitzky et al., provide a way of visually communicating
and identifying the relative importance of assumptions.
Table 2. Conceptual Steps in Developing Inference from Data to the Surrogate Population to the Population of Interest.

  Data From Surrogate Population
        |  (Assumptions Required)
        v
  Surrogate Population
        |  (Assumptions Required)
        v
  Population of Interest
The description of adjustments focuses on adjustments due to time-unit differences. There are empirical ways of dampening short-time variation when estimating longer-time-interval distributions that do not require parametric assumptions (such as the log-normal assumptions illustrated by Wallace et al. (1994)). Such methods (such as empirical Bayes methods) require some assumptions, but the assumptions may be minimal and subject to verification. More research is clearly needed in these areas. This is, however, an active research area that is close to providing answers to practical concerns.
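A sketch of the empirical Bayes idea being alluded to: shrink each person's short-term mean toward the overall mean by a factor estimated from the data, so that the spread of the adjusted values approximates the distribution of long-term (person-level) means rather than of noisy short-term averages. The variance decomposition used, and all of the numbers, are illustrative only (not from Wallace et al.).

```python
# Each row: one person's repeated short-term exposure measurements (invented).
people = [
    [2.0, 3.0, 2.5],
    [5.0, 6.0, 5.5],
    [1.0, 1.5, 0.5],
    [4.0, 3.0, 3.5],
]

means = [sum(xs) / len(xs) for xs in people]
grand = sum(means) / len(means)

# Within-person variance (short-term noise), pooled across people.
within = sum(sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
             for xs, m in zip(people, means)) / len(people)
# Variance of observed person means = between-person variance + noise/n_rep.
obs_between = sum((m - grand) ** 2 for m in means) / (len(means) - 1)
n_rep = len(people[0])
between = max(obs_between - within / n_rep, 0.0)

# Shrinkage factor: fraction of each observed mean's spread that is "real".
shrink = between / (between + within / n_rep) if between > 0 else 0.0
adjusted = [grand + shrink * (m - grand) for m in means]
print([round(a, 2) for a in adjusted])
```

The adjusted values are pulled toward the grand mean, so their spread estimates the long-term between-person distribution; with noisier short-term data the shrinkage is stronger.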

References

Cassel, C-M., Särndal, C-E., and Wretman, J.H. (1977). Foundations of Inference in Survey Sampling. John Wiley and Sons, New York.

Little, R.J.A., and Rubin, D.B. (1987). Statistical Analysis with Missing Data. John Wiley and Sons, New York.

Meeden, G., and Ghosh, M. (1997). Bayesian Methods for Finite Population Sampling. Monographs on Statistics and Applied Probability, Chapman and Hall, New York.

Scott, A.J., and Smith, T.M.F. (1969). "Estimation in multistage surveys," Journal of the American Statistical Association, 64:830-840.

Wasserman, L. (1998). A tutorial on G-estimation. Eastern Regional North American Annual Meeting of the Biometrics Society, Pittsburgh, PA, April 1, 1998.

Rotnitzky, A., Robins, J., and Scharfstein, D. (1998). A G-estimation approach for conducting sensitivity analysis to informative drop-out and non-ignorable non-compliance in a randomized follow-up study. Eastern Regional North American Annual Meeting of the Biometrics Society, Pittsburgh, PA, April 1, 1998.

Alan Stern

Response to Questions on Issues Paper #1 (Representativeness)

Alan Stern, Dr.P.H., DABT
Div. of Science and Research
New Jersey Dept. of Environmental Protection

I believe that the "checklists" are a conceptually sound and thorough guide to approaching the issues of representativeness.
The major problem with the issue of representativeness is not what criteria should be evaluated, but what remedies are available. In my experience, the majority of cases where probabilistic analysis is considered in environmental regulation/standard setting involve choosing a generic distribution to represent an essentially unknown population. That is, default distribution assumptions which can be employed in much the same way that standard point estimates are currently employed in (e.g.) the Superfund Program. Efforts such as the NHANES III project and other data-gathering efforts on national and regional scales often provide data of excellent quality for large-scale populations. Notwithstanding that such data are often structured in a way which can permit information on specific subpopulations to be extracted in a representative fashion, we are rarely in a position to know who those subgroups should be in any given instance. While probabilistic analysis holds out the potential for realistic descriptions of the characteristics of real populations and their exposures, it has been, and, I believe, will continue to be, rare for specific populations exposed at a given location to be characterized (other than possibly by their geographic location) in a way which will allow appropriate subpopulation data to be extracted from national/regional databases. If such populations were characterized and/or population-specific exposure data were collected in a focused study, then the issue of representativeness would become a more practical consideration. On the other hand, if such focused studies are not done, then there is little or no quantitative basis for considering whether national/regional population data are specific to the given population.
Thus, in most cases the external data are likely to be "disjoint" with respect to the population of concern. In the absence of population-specific characterization (either with respect to demographics, or, preferably, with respect to specific exposure), there does not appear to be an objective way of even identifying how the national/regional surrogate data may be biased with respect to the population of concern.

Having acknowledged this practical problem with deriving representative exposure distributions, I am not sure that, from the standpoint of public health and risk-based regulation, it is necessarily wise that the population of concern be precisely characterized. The reason for this is that precise characterizations of populations are (as recognized in the checklists) precise with respect to individuals, and their location in space and time. Such information is only precise for a specific moment of time. Demographic and land-use patterns change over time, and distributional data which are representative for a given population at a point in time may not be representative for the
population at the same location several years or decades later. Risk-based regulatory decisions, on the other hand, are intended to be protective of the exposed population into the indefinite future. Too specific a description of a population of concern may, therefore, make a risk-based regulatory decision unprotective of future populations at the given location. Such considerations seem to argue for more generic tailoring of input exposure distributions to include an intentional component of true uncertainty to address the possible, but unknown, values which might apply to future populations and land uses. It is not entirely clear how this should be addressed in quantitative terms, but as a starting point, it seems necessary for such generic descriptions to include the range of values which could reasonably be anticipated to apply to a generic population at a site. To the extent that such descriptions are biased with respect to the current population and/or land uses, that bias should (as appropriate) be toward including more of the high-risk population than is already present at the site. For example, if the
demographic makeup of the potentially exposed population at a given site were such that there were few young children, the generic input distributions should assume that, at some future time, the population could have a larger proportion of young children. It may not be necessary to assume that the national or regional demographics shift in a radical fashion (although over time such shifts do, indeed, occur), but rather to assume that local demographic idiosyncrasies are short-lived. Thus, if a specific locality or neighborhood is demographically skewed toward families with older children, or without children, it should be assumed that in the future the demographics may shift such that the proportion of young children at the local level of a site reflects the overall state or county proportion. Such assumptions should be based preferentially on analysis of regional population data and, if such data are not available, on analysis of national data. One obvious problem with such an approach is that adjusting current local demographics to current regional demographics to account for future local demographic shifts assumes that regional demographic patterns are more stable than local patterns. This may be true in general, but will not necessarily be true in any given instance.

Tiered Approach

The usual rationale for a tiered approach is that it saves the time and effort which would be needed to conduct population- and/or site-specific analyses. Computational time per se, however, is not usually a limiting factor in such analyses. Site-specific data collection, on the other hand, is a major undertaking and is generally a limiting factor.
Thus, if population-specific data are available and (as above) it is appropriate to base a risk-based regulatory decision on such data, there is no reason to employ a tiered approach to site-specific distributional descriptions. If, as above, regional-specific distributions are more appropriate for risk-based determinations, and such data are available, then, likewise, a tiered approach is not necessary. If, as is usually the case, population-, site-, or region-specific data are not available, and national population-based data are available, such data may be appropriate as the basis for a screening approach. In considering the use of such data in a non-population-specific context, however, it must be asked to what extent the specific characteristics of the national data might be misleading for screening purposes. Specifically, are the details of the national distribution in the extreme tails appropriate, even for screening purposes, for a given subpopulation? Given the screening nature of such an assessment, it may be more appropriate to generate and employ generic screening distributions which use quantitative approximations specifically intended for screening, such as triangular and generalized distributions.
Such distributions can also be applied when appropriate population distributions are not available. These distributions could describe, for example, relative minimum values, estimated 10% values, most likely values, estimated 90% values, and relative maximum values. It is not necessarily clear that such generic distributions would not be more appropriate for screening purposes than national population-based data. Using such generic default screening distributions would have the additional advantage of establishing specific and easily identified rebuttable presumptions which would form the starting point for site-specific modifications. Thus, starting from a default screening distribution, it might not be necessary to generate a complete site-specific distribution in order to move toward site/regional specificity. Rather, consideration of the default distribution may help focus the need for more specific information, and it might be realized that the most significant difference between the default assumption and the actual site/regional-specific distribution lies (e.g.) in the upper tail of the default distribution. Thus, it might be necessary only to collect data appropriate to modifying the 95% value in the default distribution.
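As an illustration of the kind of generic screening distribution discussed above, the sketch below draws a Monte Carlo sample from a triangular distribution defined only by a posited minimum, most likely, and maximum value. The specific numbers are hypothetical placeholders for whatever an assessor would actually posit; nothing here comes from the workshop materials.

```python
import random

random.seed(3)

# Hypothetical screening inputs posited by an assessor (not measured data):
# plausible minimum, most likely, and maximum values for some exposure factor.
low, mode, high = 0.5, 1.4, 4.0

# Draw a Monte Carlo sample from the triangular screening distribution.
draws = sorted(random.triangular(low, high, mode) for _ in range(10_000))

# The upper tail (e.g., the 95% value) is the part most likely to need
# site-specific refinement, as discussed in the text.
p95 = draws[int(0.95 * len(draws))]
print(f"screening 95% value ~ {p95:.2f}")
```

Because the three defining values are explicit, such a default is an easily identified rebuttable presumption: a site-specific challenge need only revise the posited minimum, mode, or maximum.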

APPENDIX G


David E. Burmaster
27 April 1998

Memorandum

To:          Moderator, Participants, and Attendees
Workshop on Selecting Input Distributions for Probabilistic Analyses

Via:          Kate Schalk, ERG

From:        David E. Burmaster

Subject:      Thoughts and Comments After the Workshop in NYC
After much more reading and thinking, I remain staunchly opposed to letting the US EPA and its
attorneys set a minimum value for any or all goodness-of-fit (GoF) tests such that an analyst
may not use a fitted parametric distribution unless it achieves some minimum value for the GoF
test.

In honesty, I must agree that GoF tests are useful in some circumstances, but they are not
panaceas, they do have perverse properties, and they will slow or stop continued innovation in
probabilistic risk assessment. The US EPA must NOT issue guidance, even though it is
supposedly not binding, that sets a minimum value for a GoF statistic below which an analyst
may not use a fitted parametric distribution in a simulation.

Here are my thoughts:

1.     Re Data

For physiological data, many of the key data sets (e.g., height and weight) usually come from
NHANES or related studies in which trained professionals use calibrated instruments to
measure key variables (i.e., height and weight) in a clinic or a laboratory under standard
conditions for a carefully chosen sample (i.e., adjusted for no-shows) from a large population.
These studies yield "blue-chip" data at a single point in time. These data, I believe, contain
small but known measurement errors across the entire range of variability. At the extreme tails
of the distributions for variability, the data do contain relatively large amounts of sampling error.
Even with a sample of n = 1,000 people, any value above, say, the 95th percentile contains
large amounts of sampling uncertainty. In general,  the greater the percentile for variability and
the smaller the sample size, the  greater the (sampling) uncertainty in the extreme percentiles.
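The claim that sampling uncertainty grows with the percentile is easy to check with a small bootstrap sketch. The lognormal "measurements" below are synthetic stand-ins for an NHANES-style sample of n = 1,000, not real data; only the qualitative pattern matters.

```python
import math
import random
import statistics

random.seed(0)

# Synthetic stand-in for a "blue-chip" sample of n = 1,000 measurements
# (roughly lognormal, as many physiological variables are).
sample = [math.exp(random.gauss(4.3, 0.2)) for _ in range(1000)]

def percentile(xs, p):
    """Crude empirical percentile; p in [0, 1]."""
    s = sorted(xs)
    return s[min(len(s) - 1, int(p * len(s)))]

def bootstrap_sd(xs, p, reps=300):
    """Sampling uncertainty of the p-th percentile, by resampling with replacement."""
    estimates = []
    for _ in range(reps):
        resample = [random.choice(xs) for _ in xs]
        estimates.append(percentile(resample, p))
    return statistics.pstdev(estimates)

for p in (0.50, 0.95, 0.99):
    print(f"{p:.0%} percentile: bootstrap SD ~ {bootstrap_sd(sample, p):.2f}")
```

Even with n = 1,000, the bootstrap standard deviation at the 99th percentile is several times that of the median, which is the pattern Burmaster describes.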

For behavioral and/or dietary data, many key data sets (e.g., drinking water ingestion, diet,
and/or activity patterns) often come from 3-day studies in which the human subject recalls
events during the previous days  without the benefit of using calibrated instruments in a clinic or
laboratory and not under standard conditions. Even though the researchers may have carefully
selected a statistical sample from a large population, no one can  know the accuracy or
precision of the "measurements" reported by the subjects. These studies yield  data of much
less than "blue-chip" quality for a 3-day interval. These data, I believe, contain  large and
unknown measurement errors across the entire range of variability. At the extreme tails of the
distributions for variability, the data also contain large amounts of sampling error. For a sample
with n = 1,000, any value above, say, the 95th percentile contains large amounts of sampling
uncertainty above and beyond the large amounts of measurement uncertainty. Again, the
greater the percentile for variability and the smaller the sample size, the greater the (sampling)
uncertainty in the extreme percentiles.

My conclusion from this? With all sample sizes, certainly with n < 1,000, I think the data are
highly uncertain at high percentiles. I think it is inappropriate to eliminate a parametric model
that captures the broad central range of the data (say, the central 90 percentiles of the data)
just because a GoF test has a low result due to sampling error in the tails of the data. (This
observation supports the idea that fitted parametric distributions may outperform EDFs at the
tails of the data.) As Dale Hattis has written, use the process to inform the choice of parametric
models, not a mindless GoF test.
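The parenthetical claim that a fitted parametric model can beat the EDF in the extreme tail is easy to illustrate when the parametric family happens to be correct. In the sketch below the data really are standard normal, so this is a best case for the fitted model, not a general proof; all numbers are synthetic.

```python
import random

random.seed(7)

Z999 = 3.090        # standard normal 99.9% quantile (z-value)
TRUE_P999 = Z999    # true 99.9th percentile of a standard normal population

edf_err = fit_err = 0.0
reps = 200
for _ in range(reps):
    xs = [random.gauss(0.0, 1.0) for _ in range(200)]
    # EDF estimate of the 99.9th percentile: essentially the sample maximum.
    edf_est = max(xs)
    # Parametric estimate: fit mu and sigma, then read off the quantile.
    mu = sum(xs) / len(xs)
    sigma = (sum((x - mu) ** 2 for x in xs) / len(xs)) ** 0.5
    fit_est = mu + Z999 * sigma
    edf_err += abs(edf_est - TRUE_P999) / reps
    fit_err += abs(fit_est - TRUE_P999) / reps

print(f"mean |error| at 99.9th pct: EDF {edf_err:.2f}, fitted {fit_err:.2f}")
```

With n = 200 the EDF cannot reach beyond the sample maximum and systematically undershoots the 99.9th percentile, while the fitted model interpolates the tail from the well-estimated center.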

2.     Re Fitted Parametric Distributions

As is well known:

a 6-parameter model will always fit data better than a 5-parameter model,
a 5-parameter model will always fit data better than a 4-parameter model,
a 4-parameter model will always fit data better than a 3-parameter model, and
a 3-parameter model will always fit data better than a 2-parameter model.

Thus, GoF tests always select models with more parameters over models with fewer
parameters.

This perverse behavior contradicts Occam's Razor, a bedrock of quantitative science since the
13th century.

The venerable Method of Maximum Likelihood Estimation (MLE) offers an approach (not the
only approach) to this problem. First, the analyst posits a set of nested models in which, for
example, an n-parameter model is a special case of an (n+1)-parameter model, and the (n+1)-
parameter model is a special case of an (n+2)-parameter model. Using standard MLE
techniques involving ratios of the likelihood functions for the nested models, the analyst can
quantify whether the extra parameter(s) provide a sufficiently better fit to the data than does
one of the simpler models to justify the computational complexity of the extra parameter(s).
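A minimal sketch of this nested-model likelihood-ratio idea, using the simplest tractable pair: a normal model with the mean fixed at zero (one free parameter) nested inside a normal model with a free mean (two free parameters). The 3.84 cutoff is the standard 5% chi-square value for one extra parameter; the data are synthetic and everything else is illustrative.

```python
import math
import random

random.seed(1)
data = [random.gauss(0.3, 1.0) for _ in range(200)]  # synthetic; true mean is 0.3

def normal_loglik(xs, mu, sigma):
    """Log-likelihood of data xs under a Normal(mu, sigma) model."""
    n = len(xs)
    return (-0.5 * n * math.log(2.0 * math.pi * sigma * sigma)
            - sum((x - mu) ** 2 for x in xs) / (2.0 * sigma * sigma))

n = len(data)

# Restricted model (1 parameter): mean fixed at 0, MLE for sigma only.
sigma0 = math.sqrt(sum(x * x for x in data) / n)
ll0 = normal_loglik(data, 0.0, sigma0)

# Full model (2 parameters): MLEs for both mean and sigma.
mu1 = sum(data) / n
sigma1 = math.sqrt(sum((x - mu1) ** 2 for x in data) / n)
ll1 = normal_loglik(data, mu1, sigma1)

# Likelihood-ratio statistic: ~ chi-square(1) if the restricted model is true.
lr = 2.0 * (ll1 - ll0)
print(f"LR = {lr:.2f}; the extra parameter earns its keep if LR > 3.84")
```

The point of the construction is that the richer model is retained only when its likelihood gain exceeds what chance alone would produce, which is exactly the discipline a raw GoF score lacks.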

3.     Re Continued Innovation and Positive Incentives
to Collect New Data and Develop New Methods

Over the last 15 years, the US EPA has issued innumerable "guidance" manuals that have had
the perverse effect of stopping research and blocking innovation.

In my opinion, our profession of risk assessment stands at a crossroads. The US EPA could
specify, for example, all sorts of numeric criteria for GoF tests, but the casualties would be (i)
continued development of new ideas and methods, especially the theory and practice of
"second-order" parametric distributions and the theory and practice of "two-dimensional"
simulations, and (ii) the use of expert elicitation and expert judgment.

I again urge the Agency to print this Notice inside the front cover and inside the rear cover of
each Issue Paper / Handbook / Guidance Manual, etc. related to probabilistic analyses, and on
the first Web page housing the electronic version of the Issue Paper / Handbook / Guidance
Manual:
This  Issue Paper / Handbook / Guidance  Manual  contains  guidelines and
suggestions for use in probabilistic exposure assessments.

Given the breadth and depth of probabilistic methods and statistics, and given the
rapid development of new probabilistic methods, the Agency cannot list all the
possible techniques that a risk assessor may use for a particular assessment.

The US EPA emphatically encourages the development and application of new
methods in exposure assessments and the collection  of new data for exposure
assessments, and nothing in this Issue Paper / Handbook / Guidance Manual can
or should be construed as limiting the development or application  of new methods
and/or  the  collection of new  data whose power and sophistication  may rival,
improve, or exceed the guidelines contained in this Issue Paper / Handbook /
Guidance Manual.
References

Burmaster & Wilson, 1996
Burmaster, D.E. and A.M. Wilson, 1996, An Introduction to Second-Order Random
Variables in Human Health Risk Assessment, Human and Ecological Risk Assessment,
Volume 2, Number 4, pp. 892-919.

Burmaster & Thompson, 1997
Burmaster, D.E. and K.M. Thompson, 1997, Fitting Second-Order Parametric Distributions
to Data Using Maximum Likelihood Estimation, Human and Ecological Risk Assessment,
in press.

P. Barry Ryan
Unfortunately, EDFs are not readily amenable to analyses that lend a lot of insight (cf. Wallace,
Duan, and Ziegenfus, 1994). If EPA codifies a fixed value, even in the guise of "guidance,"
pretty soon no pdf will be safe from legal wrangling.

We spent a long time at the workshop fussing over definitions of representativeness, sensitivity,
etc., with little focus on the utility of the techniques. EPA may well be in the difficult position of
having to defend everything from a legal perspective. However, the preoccupation with numbers
often comes at the expense of insight. The role of probabilistic assessments is the latter. Our
goal is to understand exposure and its influence on health, not to focus on a specific value of a
GoF test statistic. These techniques should be used by professionals familiar with the nuances
of the problem at hand and the techniques used, their limitations, and strengths. I object to the
cookbook approach to this type of assessment.
I will now step down off my soapbox.
P. Barry Ryan
Exposure Assessment and Environmental Chemistry
Emory University
Atlanta, GA
(404) 727-5528 (Voice)
(404) 757-8744 (Fax)
bryan@sph.emory.edu
Colleagues,

I read with interest the comments forwarded by Dr. David Burmaster regarding the conference
from last week.

I would like to add a few similar words regarding the codification of any specific values for any
specific goodness-of-fit (GoF) tests.

GoF tests, by their nature, are very restrictive in affording acceptance of a distribution. For
example, the Kolmogorov-Smirnov test chooses the largest difference between the observed
data and the theoretical ranking and tests using that. Unusual occurrences in data, minor
contamination by other distributions, etc., can cause rejection of distributions that otherwise
pass the "duck test" (if it walks like a duck...), even if one point looks a little more like a pigeon.
The GoF test will end up rejecting pretty much everything, leaving one with no choice but to use
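Ryan's single-"pigeon" point can be sketched numerically: a hand-rolled Kolmogorov-Smirnov statistic against a fitted normal stays small for a clean normal sample but jumps once one wild point is added, partly because the outlier also corrupts the fitted parameters. The cutoff shown is the classical large-sample 5% value, which is only approximate when parameters are estimated from the data; all numbers are synthetic.

```python
import math
import random

def norm_cdf(x, mu, sigma):
    """Normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ks_stat(xs, mu, sigma):
    """Largest gap between the sample EDF and the fitted normal CDF."""
    s = sorted(xs)
    n = len(s)
    d = 0.0
    for i, x in enumerate(s):
        c = norm_cdf(x, mu, sigma)
        d = max(d, abs((i + 1) / n - c), abs(c - i / n))
    return d

random.seed(2)
clean = [random.gauss(10.0, 2.0) for _ in range(100)]
dirty = clean[:-1] + [100.0]  # one point that looks more like a pigeon

results = {}
for name, xs in (("clean", clean), ("one outlier", dirty)):
    mu = sum(xs) / len(xs)
    sigma = (sum((x - mu) ** 2 for x in xs) / len(xs)) ** 0.5
    results[name] = ks_stat(xs, mu, sigma)
    print(f"{name}: D = {results[name]:.3f} "
          f"(rough 5% cutoff ~ {1.36 / math.sqrt(len(xs)):.3f})")
```

One aberrant point out of a hundred is enough to push the statistic past the cutoff, even though 99% of the sample walks like a duck.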

APPENDIX H

PRESENTATION MATERIALS
image:

countPages += 1
if (countPages == 2) {
var el = document.getElementById("rankTop")
if (el)
if (-1 == -1)
el.innerText = "All"
var el = document.getElementById("rankBot")
if (el)
if (-1 == -1)
el.innerText = "All"
}

i!*
'.Ml!
> '• ill
; , ..... , .......... i, , ...... I.L: ., , 5 fii ...... i ,3 .  lii ........ .1, • i ...... .. a:, ....... , ...... lie ..... • ....... y ........ . ,: ,11;: Jllilllt :, In ..... -a . ....... a: 1. ai ...... >I ..... Hi ...... '.t!:'; t! ...... I  ........ : :„.,.= , >• ,S1 ........ ..... :i. ;:: .; ..... til: ...... ., i .I';. .: I i. A- : ...... :,:' ....... ii ...... ill ..... ;lii! ....... ; iliii-' ......... Miiit ....... . ,„:,!' iliiai- : : :iiiifer£ ...... ,.i ;
;a.ii ifc  :„ -i;,: ..... > ......... I '  ...i.'i fSiii iiiili i aiiilLilii ;' ..... .iilil ..... II
image:

countPages += 1
if (countPages == 2) {
var el = document.getElementById("rankTop")
if (el)
if (-1 == -1)
el.innerText = "All"
var el = document.getElementById("rankBot")
if (el)
if (-1 == -1)
el.innerText = "All"
}

G)
£
O
,2
o>
c/>
O
§
03
Q.-Q
O
CO
CO
CU
Q.Q
DC
c
CD

0)
0)
0)
CO
0)
H-l
image:

countPages += 1
if (countPages == 2) {
var el = document.getElementById("rankTop")
if (el)
if (-1 == -1)
el.innerText = "All"
var el = document.getElementById("rankBot")
if (el)
if (-1 == -1)
el.innerText = "All"
}

•«
• i
, v • jj'1 iiin '•.!,'» 	 , i iiiip't i,,!"!':j,,i!!!" i .'"'"', n,;: , • ;' !'""' "'/iiiiiiH • ,"„";'• ,S*
;, « rte^.i^'v'iJIJ' _C
• :- , L , ;-;i : C co
., . • : 3 C
0 o
Q) CO
Q |
(Q €Z
» , . , .~g~^^ ^—>
1 • ,,' * ,, >• Til t
o
Eo
fl\
VLJ
•• ,' *;•'- "3 'DC
.I •••;-. . .^..- -.^
. : •. t: •.•:'- O c
l>< o
|M|dijJ|H^ 	 :: 	 ; r^)
flip|lw8'~'; fl5

I:S|, I
mill LL
•illiri'SiSllS
!!!! ' Vlk ^1
CO

.•.ifc-i'i^ * 	 i*. 	 .-^;.!.ti 	 	 a 	 .vti 	 ^ 	 .-'urf, 	 	 /
"E
CD

c
0
o
CD
"2
0
0
LL
0
C.
"c
0
E
I^MHH
^/\
^J J
CD
CO
CO
^

i.:. 'if'.ki1

c*5*
00
O)
CO
^
Process" -
0
D)
c

O)
^«J
c
CO
^>

H-2
1 in* ,L, »i I1:,, 'i:'1""!;!1,
Q_
LU
E
o
**—
+2
_"w
HHJ
C
_Q
"o
(f)
O
"c
CD
0)
CD
13
O
CO

^^^^^c
E
0
!>

"

.;. 	 ,, :,.,
CO
0
.Q
H^
O

co
O)
o
ol
"D
C
CO
co"
c:
g
0
DC
co"
0
O
^^J
CO
^^•H
O
CO

:M,^J
CO
+2
"c
0
E
o
o
Q
0
Q
C.
CO
d
0
"o
O'

~' '"\
T3
E
ct

"

«v«l::..

V
CO
be
ji
Q
c
o
o
o_
9taAB
0
DC
"co

C'

o
0
1-

,a,«

1
1

ssues
!
^^ •
tr\ m
1
^J J ^1
0 1
CO I
CO I
< 1
•

.!;;;i,i:,!;;! ;;:i !!!:^i«!.i'i- 	 !ii;X;i 	 iSiitl, 	 I
image:

countPages += 1
if (countPages == 2) {
var el = document.getElementById("rankTop")
if (el)
if (-1 == -1)
el.innerText = "All"
var el = document.getElementById("rankBot")
if (el)
if (-1 == -1)
el.innerText = "All"
}

«E
w

(0$(0 (0 O 0) « o> cc o ^ o2 U-CL CO CD CO _g "co ~p m CO -^ o TJ O O LL • O) • CD ,S2 1 o CO _o "o Q_ Q_ LU E o o CD o: co co CO CO 2 s CO °^ co ^ O) O CD +- o CD CO CO ^ CD O CO O ircL H-3 image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } f'illi' f~ I i'lJS,! 5 PCHJI,1'"'iii,. i!',, ^Bm <D m. iCL CO O CD IH iH-4 £0 O -r1 CZ aeo CO .•<£ •10 «0 ","«! ' ",,;•„,' '!':i :i iniipg! • i '"w,H""' •' :\wi \ image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } o 0) E E .Q (0 CO 0) to o .co CD 13 CO CO _o "6 Q_ 0) 0 0 L DC lit O «P c CD fS "J5 »-a>t: co ;,* o> "S.SJ o _ CO CO = OJ = O "o JQ : co •S'g-fe L O - - *- Q_ CO v, ^ - c c E o < "E 59- C/) H 05 CD : co c co I 0) o CD O O CO o O O g> LLJ CO CD o g OS CD E CD _O "o o_ H-5 image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } •.'••'•', S 1 ;'•' > n«!1!ll:K >, <D O (0 10 IS o o O "4= O 05 13 — - JD CD CO -C JC O = s g> .S CD - O CD O O •- co p> .j c Q> CL CD C J±f CD Q 8 . o <2 — 0 CD 05 S "" 05 O o CD CL CD ^•^ ^» Q Q_ Cd i I ••— /^s ^\ *•— O Q.Q O OT -S ~ O CD O CD O) CD o £ CO ^-» w CO o c: o -Q 05 Q_ LO H-6 image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } C0 CD o c o CO CD CD ••• Q. 
E t i "5 °° ^ CO < _o CO o CD °- LLI 0 o 05 < _O CO O CD 0) 2 o o CO 00 Ifl COLU 05 CO 0) 3 0 O C CO CD CO o5 O CO {£ Ct5 ^ *- CO CO Q) CO O C O "•*-* o CO CD Q5 C _ — - - " CO *= vP C C O ^- 'F -d CO •— O c: co CD = ^ O) CD C O Q < Q< LL A A A H-7 image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } it; ' •\ c o o &> CD -o c: CO CO © 8BPH5 0> o o , • *l ' if,":' ,11'.!,.,, .' ''!" ! it < ';3P^ 1 Q. €0 C O 0 CO 1 Q JO 11 I H-8 Q X" H— C6 0 •BBB 0) 13 Q to JD 03 3 Q S" <D ^iff^ ••••• ••H^^f 'iffff o image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } o Hh* 0) o o CO LL .Q x UJ .= £ jj> & en 0. 0) < o is o S| -2§ co "c C CD O CO •^ CD CD ^cc Q-^ CD CD LJ_ Q. O i wa^^m ^^^^-M O CO CO CD = CO 0 CO DC D_ H-9 image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } C/3 C Q ^ si « «X •>-3* O 'S r"^ &•§ &H Q <D *. 
43 *8 C^g o ^ +^ ^ co c> •FH ^ ^-i a ' IS ^ C ^ •T—I -r-H « h <D CD c: > ' *'"* >-»v" a ^ Og Jg aD^ <D CSI S >T-^ VVJ *-> •rH TAN -lj O ,g '^J- •*-* o G S CD cd Su S ^ |i ^ O d <D OJQ < d o •p-{ ^ S0 Presented onniental Pr> New York !H * ^^ a W 00 rs-H r^ <N <N r— 1 <N ^ C^ < 00 • U> HH • • • r H-10 image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } e a o B O O in O C8 es § «^N w 3 £ o o £< OH OH H-ll image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } Qfi cc fi =5 O •N. Qfi 5- s i* -S JH DC ^ S !- cs e s H-12 image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } H-13 image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } H-14 image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } communicating a & ^ ^ g .g SP-N - ^ •^ »^N 13 O . ^5 »™2 *^^ ce ^ • PN ^^ Poten across WD eg 3 <Dfi C! 3 = 9 H, c o o s_ ^ M — ^^ ^^ ' : ^ - ^2 '•"•^ ^^^ ^^S *'ii111^ o ^ ;/> O H^i> •B H-15 image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } ,n,! 
iwiot i fi'i,''!"", - >» ft 2 O O s fi O, Q 5- -= -fi 53 m ^Q O O = O<a& ^••itf 5£ fi'H O m ica ech m O »l S3 •/*fl ''^ CJ a ;p«i< A a =00 T^1 i — © as - - ! JI IBMj6 image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } ff^« a • • 4> "I 5 0) H-17 image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } ; ii,. ••. "'.:.,!! v::... til rilii,;,:'.;,':.• •:•;„ i/1" :!',,3! •,: i ,' i'•;':'" >,,.•: ii,.!' >•"., v,;•: "!,"::,:,: it,-it&..,;",s:;' i'81 £• . ;:,*,. „:;:.:•>•.-.si , ..IrMiiftliit'.iicatilteifctliaa .."L All..!: liJi'l ili,'.,llil.',i:'llii:iil »!' Hi.!.'!...!", lltai.ill!.'ill l.iiitiM^ BtilLVtUB'.''! Ji»B MHJuJUPHgBIBgaiBiaiiatf '•"' "= '"•' *""'"' I ''!''.'• i;;1!:,:1^ ,;;, [T «i \,^ >» 1*1 * HH gg 2 d> ^ r "•" "•-% O) _ i> 'CS 0 a> "•a r 1 CU «3 O> CO o ••-< CO •»•* ^ <v 03 SH C3 ^= u a I ^ Q • —« & VI •w+ U .^4 CS a 5/5 5C SS a> u a? as r * ce j= 53 U S3 -fi ,2 U g •^ SH ee rs o A PH « o a o> P^ c^ H O .3 2 '53 >• ett o> ^ CQ CO Frt s _z w ••• ^>*i 0> ;! '>: '.:<•!, i !»;• 't-^iiiS)"!::! 
image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } c^» & & 3 <» c« b 0> X OJ S'S es s 03 O s p fl O ES a H-19 image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } o CO CD cd • 1—I CO CD ctf CO CO CO CD CD CD CD CD O ctf CD S -d' 2 ^ — S w> o b£ C co CO CD CO CD CO ?H fi « PH X CD >-» ^ c3 ^ -i—i o 3 S g » CO CD co ^^ ^ PH CO CD Cu CD CD O CD W) CD £> ^ to S CD ^ ^ r-J -TH CO J> "•* PQ g ^ S '^ ^ T3 S CO <U a CD CD CO CD <D 5 ^ O .> _CD T—H ij CL r->. 1T2 »-H M.X S co co CD CO co CD CD CO CD CD CO CO CD O CD CD co CD O CD 11 H-20 image: countPages += 1 if (countPages == 2) { var el = document.getElementById("rankTop") if (el) if (-1 == -1) el.innerText = "All" var el = document.getElementById("rankBot") if (el) if (-1 == -1) el.innerText = "All" } £ * W> v a PN go « . 
[Pages H-21 through H-34: scanned presentation slides and figures; the text is not legible in this scan.]
[Pages H-35 through H-43: scanned slides and figures on empirical distribution functions (EDFs). Legible fragments include the caption "Figure 4. Comparison of Basic EDF and Linearly Interpolated EDF," notes that simulated EDF percentiles equal the sample percentiles and that EDFs underestimate the true mean and variance, and a list of EDF variants: a linearized EDF (interpolating between observations), an extended EDF with tails based on expert judgment, and exponential tails based on the extreme-value behavior of many continuous, unbounded distributions.]

BAYESIAN ANALYSIS OF VARIABILITY AND UNCERTAINTY OF ARSENIC
CONCENTRATIONS IN U.S. PUBLIC WATER SUPPLIES

John R. Lockwood and Mark J. Schervish
Department of Statistics

Patrick L. Gurian
Department of Engineering & Public Policy

Mitchell J. Small
Departments of Engineering & Public Policy and Civil & Environmental Engineering

Carnegie Mellon University

presented at the 7th Annual Meeting of ISEA
Session 1: Approaches to Uncertainty Analysis
Research Triangle Park, NC, November 3, 1997
Sponsor: U.S. EPA OGWDW (usual disclaimer)
H-44

OBJECTIVES

Illustrate use of Bayesian statistical methods
- variability in an exposure factor (arsenic concentration) is represented by a probability distribution model
- uncertainty is characterized by the probability distribution function of the model parameters

Illustrate use of probability distribution model with covariates (explanatory variables)
- allowing extrapolation to different target populations
H-45

METHODOLOGY

Probability model with covariates
- lognormal distribution, with the mean of the log-concentration depending on
  • region,
  • source type (sw vs. gw), and
  • size of utility (population served)
- constant variance of the log-concentration

Bayesian methodology for model parameters
- posterior distribution computed using Markov Chain Monte Carlo
- necessitated by model complexity and BDL (below-detection-limit) data
H-46
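Where the slides contrast a basic (step-function) EDF with a linearly interpolated EDF, the distinction can be sketched in code. This is a generic construction for illustration, not code from the workshop materials; the interpolation convention used here (plotting positions i/(n-1)) is one common choice among several.

```python
import bisect

def edf(sample):
    """Basic EDF: step function F(x) = (# observations <= x) / n."""
    pts = sorted(sample)
    n = len(pts)
    def F(x):
        return bisect.bisect_right(pts, x) / n
    return F

def linear_edf(sample):
    """Linearized EDF: linear interpolation between order statistics."""
    pts = sorted(sample)
    n = len(pts)
    def F(x):
        if x <= pts[0]:
            return 0.0
        if x >= pts[-1]:
            return 1.0
        i = bisect.bisect_right(pts, x) - 1          # left order statistic
        frac = (x - pts[i]) / (pts[i + 1] - pts[i])  # position between them
        return (i + frac) / (n - 1)
    return F

data = [100, 150, 200, 250, 300, 350]
F_step, F_lin = edf(data), linear_edf(data)
print(F_step(160), F_lin(160))
```

The step EDF jumps at each observation, while the linearized EDF rises smoothly between observations, which avoids repeatedly sampling the same few values in a Monte Carlo simulation.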

[H-47: table of arsenic occurrence databases and their sample locations and source types. Legible entries include the National Arsenic Occurrence Survey (NAOS), covering surface and groundwater; the National Inorganics and Radionuclides Survey (NIRS), covering groundwater; and a survey by the Association of California Water Agencies, covering surface and groundwater. Most other entries are not legible in this scan.]
H-47

MODEL SPECIFICATION

Yij = μi + βxij + γsij + εij

where

Yij is the natural logarithm of arsenic concentration in μg/L at the jth source in the ith region

μi is a constant for the ith region, where i ranges over
the seven geographical regions specified in NAOS

xij is the natural logarithm of the population
served by the jth source in the ith region (an indicator of
the size and flow rate of the utility source)

sij is 0 if the jth source in the ith region is a surface water
source and 1 if it is a ground water source

εij represents those sources of random variation
present at the jth source in the ith region but not
captured by the covariates in the model.

H-48
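The regression structure above can be sketched as a simulation. The parameter values here are made up for illustration (only beta and gamma echo the posterior means reported later); the function simply draws one log-concentration from the model Yij = mu_i + beta*x_ij + gamma*s_ij + eps_ij.

```python
import math
import random

random.seed(0)

# Illustrative (made-up) parameter values for the lognormal model.
mu = [-3.2, -3.5, -3.6, -1.8, -1.9, -1.1, -1.5]  # one constant per region
beta, gamma, sigma = 0.21, 0.14, 1.5

def simulate_log_conc(region, population_served, groundwater):
    """Draw one log-concentration (ln ug/L) for a single source."""
    x = math.log(population_served)   # x_ij: ln(population served)
    s = 1 if groundwater else 0       # s_ij: 1 for a ground water source
    eps = random.gauss(0.0, sigma)    # eps_ij: residual variation
    return mu[region - 1] + beta * x + gamma * s + eps

# A ground-water source in region 7 serving 10,000 people:
y = simulate_log_conc(region=7, population_served=10_000, groundwater=True)
print(math.exp(y))  # back-transform to a concentration in ug/L
```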

[Scanned map figure: US geographic regions based on arsenic occurrence. Source: Frey and Edwards, 1997.]
H-49

DISTRIBUTIONAL ASSUMPTIONS

In the model

Yij = μi + βxij + γsij + εij

it is assumed that

μi ~ N(ψ, τ²)   and   εij ~ N(0, σ²)

That is, the μi are sampled from a parent normal
distribution (hierarchical model).

The normality assumption on εij implies that,
conditional on all parameters,

Yij ~ N(μi + βxij + γsij, σ²)

H-50

BAYESIAN METHODOLOGY

Probability model: f(x | θ)

Begin with prior distribution f(θ)

Observe sample X = x

Compute posterior distribution

f(θ | x) ∝ f(x | θ) f(θ)

H-51
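The prior-to-posterior update can be made concrete with a toy discretized example (not the authors' model): a diffuse normal prior on a single log-mean theta, a normal likelihood with known unit variance, and the posterior computed on a grid by multiplying likelihood and prior and normalizing.

```python
import math

def normal_pdf(x, m, s):
    return math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))

# Grid approximation of f(theta | x) proportional to f(x | theta) f(theta).
grid = [i / 100.0 for i in range(-800, 801)]      # candidate theta values
prior = [normal_pdf(t, 0.0, 10.0) for t in grid]  # diffuse N(0, 10^2) prior
data = [-2.1, -1.7, -2.5]                         # toy log-concentrations

post = []
for t, p in zip(grid, prior):
    lik = 1.0
    for x in data:
        lik *= normal_pdf(x, t, 1.0)              # N(theta, 1) likelihood
    post.append(lik * p)

z = sum(post) * 0.01                 # normalizing constant (grid step 0.01)
post = [p / z for p in post]

# With a diffuse prior, the posterior mean sits near the sample mean (-2.1).
post_mean = sum(t * p for t, p in zip(grid, post)) * 0.01
print(round(post_mean, 2))  # → -2.09
```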

PRIOR DISTRIBUTIONS

Without substantive prior knowledge about the parameters
of the hierarchical model, our priors were diffuse zero-mean
normals, N(0, 3²) and N(0, 10²), for ψ, log(σ²), log(τ²), β, and γ.

These parameters are assumed independent a priori,
but are dependent in the posterior.

H-52
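For a model this simple a grid would suffice, but the slides note that model complexity and below-detection-limit data pushed the authors to Markov Chain Monte Carlo. A minimal random-walk Metropolis sampler, again a toy one-parameter stand-in for the authors' sampler, looks like this:

```python
import math
import random

random.seed(1)

data = [-2.3, -1.9, -2.6, -2.0]   # toy log-concentrations

def log_post(theta):
    """Log posterior: diffuse N(0, 10^2) prior plus N(theta, 1) likelihood."""
    lp = -0.5 * (theta / 10.0) ** 2
    lp += sum(-0.5 * (x - theta) ** 2 for x in data)
    return lp

theta, chain = 0.0, []
for _ in range(5000):
    prop = theta + random.gauss(0.0, 0.5)   # random-walk proposal
    # Accept with probability min(1, posterior ratio).
    if math.log(random.random()) < log_post(prop) - log_post(theta):
        theta = prop
    chain.append(theta)

post_mean = sum(chain[1000:]) / len(chain[1000:])   # discard burn-in
print(round(post_mean, 1))
```

The retained draws approximate the posterior; summaries such as means, percentiles, and credible bounds are then read off the chain.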

POSTERIOR ESTIMATES

Parameter   P.M.     P.S.D.
μ1          -3.13    0.65
μ2          -3.50    0.61
μ3          -3.62    0.61
μ4          -1.76    0.57
μ5          -1.84    0.59
μ6          -1.04    0.66
μ7          -1.41    0.62
σ²           2.23    0.21
ψ           -2.27    0.74
τ²           1.76    1.76
β            0.21    0.05
γ            0.14    0.19

(P.M. = posterior mean; P.S.D. = posterior standard deviation)

H-53
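The table's entries can be read back through the model: the median of a lognormal is exp of the mean of the log, so plugging in the posterior means gives a point prediction of the median concentration for a given source. This is a sketch that ignores parameter uncertainty.

```python
import math

# Posterior means from the slide (P.M. column).
mu = {1: -3.13, 2: -3.50, 3: -3.62, 4: -1.76, 5: -1.84, 6: -1.04, 7: -1.41}
beta, gamma = 0.21, 0.14

def median_conc(region, population_served, groundwater):
    """Median arsenic concentration (ug/L) implied by the lognormal model."""
    y = mu[region] + beta * math.log(population_served)
    if groundwater:
        y += gamma                    # ground water shifts the log-mean up
    return math.exp(y)                # lognormal median = exp(log-mean)

# A region-7 ground-water source serving 50,000 people:
print(round(median_conc(7, 50_000, True), 2))  # → 2.72
```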

Figure 2: Scatterplot of γ versus β from a sample of size 30000 from the joint posterior distribution. [scanned figure]
H-54

NATIONAL DISTRIBUTION & UNCERTAINTY

The national distribution of arsenic concentration
measurements is the mixture of all the distributions
from the individual sites:

F_National = (1/N) Σ F_i   (sum over all sites i)

where N is the total number of sites in the nation.

Similarly for our estimates:

F̂_National = Σ w_i F̂_i   (sum over all sampled sites i)

where w_i is a weight indicating how much of the nation
is represented by site i.

However, F̂_i is uncertain due to uncertainty in model
parameters. The posterior uncertainty in F̂_i is
characterized by the many (equally likely) F̂_i^(j) obtained
by evaluating F̂_i with the parameters in MCMC sample j.

We can then compute the mean, cdf, median, 5th
percentile, 95th percentile, etc. of the distribution of
F̂_National.

H-55
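The weighted mixture and its uncertainty can be sketched directly: evaluate the weighted average of site-level lognormal CDFs once per posterior draw, and the spread across draws is the uncertainty in the national CDF at that concentration. The sites, weights, and parameter draws below are made up for illustration.

```python
import math

def lognorm_cdf(c, mean_log, sd_log):
    """CDF of a lognormal distribution evaluated at concentration c."""
    z = (math.log(c) - mean_log) / (sd_log * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))

# Site-level log-means for three sites under three (made-up) posterior
# draws; the sd of the log-concentration is fixed at 1.5 for simplicity.
draws = [
    [-2.0, -1.5, -3.0],
    [-2.2, -1.4, -2.8],
    [-1.9, -1.6, -3.1],
]
weights = [0.5, 0.3, 0.2]   # w_i: share of the nation each site represents

def national_cdf(site_means, c):
    return sum(w * lognorm_cdf(c, m, 1.5) for w, m in zip(weights, site_means))

# One national-CDF value per posterior draw gives an uncertainty
# distribution for F_National(c) at c = 5 ug/L.
vals = sorted(national_cdf(d, 5.0) for d in draws)
print(vals[0], vals[-1])    # spread across draws
```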

Figure 3: Posterior cumulative distribution function of national arsenic occurrence in source water with 90% credible bounds and uncensored NAOS data overlayed. [scanned figure; axes: cumulative proportion versus [As] in micrograms per liter]
H-56

Figure 4: Posterior cumulative distribution function of the proportion of national arsenic occurrence less than 5 μg/L. [scanned figure]
H-57

POSTERIOR ESTIMATES: Alternative Model

Supposing that γ should be positive, we kept all priors
the same as before except took the prior for log(γ) to
be N(0, 10²).
Parameter   P.M.     P.S.D.
μ1          -2.78    0.55
μ2          -3.17    0.51
μ3          -3.27    0.49
μ4          -1.42    0.44
μ5          -1.50    0.48
μ6          -0.71    0.54
μ7          -1.04    0.48
σ²           2.22    0.20
ψ           -1.94    0.65
τ²           1.74    1.75
β            0.18    0.04
γ            0.03    0.07
H-58
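Because the model is lognormal, exp(γ) is the multiplicative effect of ground water (versus surface water) on the median concentration, which makes the two fits easy to compare:

```python
import math

# Posterior means of gamma from the two tables above.
gamma_unrestricted = 0.14   # original diffuse prior on gamma
gamma_positive = 0.03       # prior forcing gamma to be positive

print(round(math.exp(gamma_unrestricted), 2))  # → 1.15 (15% higher median)
print(round(math.exp(gamma_positive), 2))      # → 1.03 (3% higher median)
```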

Figure 7: Scatterplot of γ versus β from a sample of size 30000 from the joint posterior distribution when γ is forced to be positive. [scanned figure]
H-59

Figure 8: Posterior cumulative distribution function of national arsenic occurrence in source water with 90% credible bounds and uncensored NAOS data overlayed. Plot based on posterior when γ is forced to be positive. [scanned figure]

SUMMARY

Bayesian methodology provides a powerful method for
characterizing variability and uncertainty in exposure factors
- the effect of alternative priors can be investigated in a diagnostic manner
- though don't try this at home alone (without a competent statistician)

Probability distribution model with covariates provides
insights, and a basis for extrapolation to other targeted
populations or subpopulations.

H-61

Bayesian Analysis of Variability and Uncertainty of
Arsenic Concentrations in U.S. Public Water Supplies

John R. Lockwood
Mark J. Schervish
Department of Statistics

Patrick L. Gurian
Department of Engineering & Public Policy

Mitchell J. Small
Departments of Engineering & Public Policy
and Civil & Environmental Engineering

Carnegie Mellon University

presented at
Seventh Annual Meeting of the International Society of Exposure Analysis
Research Triangle Park, NC
November 3, 1997
(Session 1. Approaches to Uncertainty Analysis)

Bayesian Analysis of Variability and Uncertainty of
Arsenic Concentrations in U.S. Public Water Supplies

John R. Lockwood1,
Mark J. Schervish1,
Patrick L. Gurian2
and Mitchell J. Small3

The risk of skin and other possible cancers associated with arsenic in drinking water has
made this problem a top priority for research and regulation for the U.S. EPA, as part of
implementation of the Safe Drinking Water Act amendments of 1986 and 1996. To assess
the costs, benefits and residual risks of alternative maximum contaminant levels (MCLs) for
arsenic, it is important to characterize the current national distribution of arsenic concentrations
in the U.S. water supply. This paper describes a Bayesian methodology for estimating
this distribution and its dependence on covariates, including the source region, type (surface
vs. ground water) and size of the source. The uncertainty of the fitted distribution is also
described, thereby depicting the uncertainty in the proportion of utilities with concentrations
above a given MCL. This paper describes the first stage of this assessment, based on a sample
of concentrations from source water drawn by utilities. Subsequent analyses will incorporate
the distribution and effectiveness of current treatment practices for reducing arsenic, and
include available data sets of finished water quality to estimate the arsenic concentration
distribution in water supplied to consumers.

Using arsenic concentration data for source (raw) water reported by 441 utilities from the
National Arsenic Occurrence Survey (NAOS) (Frey and Edwards, 1997), we fit a Bayesian
model to describe arsenic concentrations based on source characteristics. The model allows
for both the formation of a national estimate of arsenic occurrence and the quantification of
the uncertainty associated with this estimate. The specification of the model is

Yij = μi + βxij + γsij + εij

where

Yij is the natural logarithm of arsenic concentration in μg/L at the jth source in the ith region

μi is a constant for the ith region, where i ranges over the seven geographical regions
specified in NAOS

xij is the natural logarithm of the population served by the jth source in the ith region (an
indicator of the size and flow rate of the utility source)

sij is 0 if the jth source in the ith region is a surface water source and 1 if it is a ground water
source

1 Department of Statistics, Carnegie Mellon University.
2 Department of Engineering and Public Policy, Carnegie Mellon University.
3 Departments of Engineering and Public Policy and Civil and Environmental Engineering, Carnegie
Mellon University.
H-63

εij represents those sources of random variation present at the jth source in the ith region
but not captured by the covariates in the model.

Furthermore, we model the values μi as independent normal random variables with mean
ψ and variance τ². The national distribution of arsenic in source water is thus modeled as a
mixture of lognormals with the mean of the log-concentration equal to μi + βxij + γsij and the
standard deviation of the log-concentration equal to σ. The resulting distribution depends
upon the number of utilities in each of the seven regions (i), their service populations x and
the respective numbers drawing water from surface (sij = 0) vs. ground (sij = 1) water
(for now, the sample is assumed to be representative of the national distribution, though
the predicted distribution can be readily modified to reflect a different distribution of the
covariates in the target population).

To characterize the uncertainty of the fitted national distribution, we use vague prior
distributions for the parameters ψ, τ, β, γ, σ and employ the Markov Chain Monte Carlo
methodology (Gilks et al., 1996) to compute and simulate realizations from the posterior
distribution of the parameters. Posterior uncertainty distributions of all quantities of interest
can be calculated from these realizations.

Table 1 lists the posterior means and posterior standard deviations for the fitted model
parameters. The mean values indicate that

• arsenic concentrations are generally higher in the west than in the east (the posterior
means of μ4, μ5, μ6 and μ7 are greater than the posterior means of μ1, μ2 and μ3)

• arsenic concentrations tend to be higher in source waters of larger utilities (the posterior
mean of β is positive)

• arsenic concentrations are higher in ground water than in surface water (the posterior
mean of γ is positive, though there is significant uncertainty in this result since the
posterior standard deviation of γ is greater than the posterior mean)

The uncertainty in the fitted national distribution is characterized by the standard deviations
of the parameters shown in Table 1 and by the covariance of the parameters in the
posterior joint distribution. Figures 1 and 2 illustrate this covariance for two of the parameter
pairs: (β, ψ) and (β, γ), respectively. These covariances are of the type that commonly
arise in parameter estimation; for example, the positive association between higher β (which
results in higher predicted arsenic concentrations) and lower ψ (which corresponds to lower
values of the μi and lower predicted arsenic concentrations) is necessary to maintain the
match to the observed sample values.

The national distribution is synthesized by sampling the joint parameter space (i.e., the
points in Figures 1 and 2 and the associated points for the other model parameters) to
generate many possible distributions. For each, the cumulative distribution function (cdf)
at a particular value of the arsenic concentration (exp(Y)) is computed as the average of
the predicted cdfs for each measurement in the original sample of 441, based on its model
covariates (or, the covariates for each utility in the target population, if these differ from the
sample). The multiple cdfs generated from the parameter space describe the uncertainty
of the national variability distribution. The median of the uncertainty distribution is one
H-64

Table 1: Posterior means and standard deviations of parameters. The regions (subscripts)
are 1=New England, 2=Mid-Atlantic, 3=Southeast, 4=Midwest Central, 5=South Central,
6=North Central, 7=West.

Parameter   Posterior Mean   Posterior Standard Deviation
μ1          -3.18            0.67
μ2          -3.51            0.62
μ3          -3.66            0.63
μ4          -1.78            0.59
μ5          -1.89            0.62
μ6          -1.10            0.67
μ7          -1.47            0.64
σ²           2.17            0.20
ψ           -2.30            0.76
τ²           1.74            1.77
β            0.21            0.05
γ            0.14            0.19
choice for a single estimate of the national distribution. This median distribution is shown
in Figure 3, along with corresponding 5th and 95th percentiles and the observed distribution
of the original data set. The fitted distribution closely matches the observed distribution,
including the result that 37% of the sample is at or below the arsenic detection limit of
0.5 μg/L. The full uncertainty distribution for the proportion of the national population
below one particular value of the arsenic concentration (5 μg/L) is shown in Figure 4, where
this proportion is indicated to range from about 0.79 to 0.87, with a median of 0.83. This
characterizes the uncertainty in the proportion of utilities requiring treatment of their source
water to meet an MCL of 5 μg/L.
Acknowledgment: This work  is sponsored by the  U.S. EPA Office of Ground Water
and Drinking Water,  Standards  and Risk Management  Division. The paper has not been
subject  to EPA peer review, and the views expressed are solely those of the authors.
References

Frey, M. M. and M. A. Edwards (1997). Surveying arsenic occurrence. Jour. AWWA, 89(3),
105-117.

Gilks, W. R., S.  Richardson and D.  J. Spiegelhalter, eds (1996).   Markov Chain Monte
Carlo in Practice. Chapman and Hall, London.
H-65
Figure 1: Scatterplot of ψ versus β from a sample of size 5000 from the joint posterior distribution. [scanned figure]

Figure 2: Scatterplot of γ versus β from a sample of size 5000 from the joint posterior distribution. [scanned figure]
H-66
Figure 3: Posterior cumulative distribution function of national arsenic occurrence in source water with 90% credible bounds and uncensored NAOS data overlayed. [scanned figure]

Figure 4: Posterior cumulative distribution function of the proportion of national arsenic occurrence less than 5 μg/L. [scanned figure]
H-67    U.S. GOVERNMENT PRINTING OFFICE: 1999 - 750-101/00039