EPA
United States
Environmental Protection
Agency
Office of Solid Waste and
Emergency Response
Washington, D.C. 20460
Publication 9285.7-081
May 1992
Supplemental Guidance to
RAGS: Calculating the
Concentration Term
Office of Emergency and Remedial Response
Hazardous Site Evaluation Division, OS-230
Intermittent Bulletin
Volume 1 Number 1
The overarching mandate of the Comprehensive Environmental Response, Compensation, and Liability
Act (CERCLA) is to protect human health and the environment from current and potential threats posed by
uncontrolled releases of hazardous substances. To help meet this mandate, the U.S. Environmental Protection
Agency's (EPA's) Office of Emergency and Remedial Response has developed a human health risk assessment
process as part of its remedial response program. This process is described in Risk Assessment Guidance for
Superfund: Volume I Human Health Evaluation Manual (RAGS/HHEM). Part A of RAGS/HHEM
addresses the baseline risk assessment, and describes a general approach for estimating exposure to individuals
from hazardous substance releases at Superfund sites.
This bulletin explains the concentration term in the exposure/intake equation to remedial project
managers (RPMs), risk assessors, statisticians, and other personnel. This bulletin presents the general intake
equation as presented in RAGS/HHEM Part A, discusses basic concepts concerning the concentration term,
describes generally how to calculate the concentration term, presents examples to illustrate several important
points, and, lastly, identifies where to get additional help.
THE CONCENTRATION TERM
How is the concentration term used?
RAGS/HHEM Part A presents the
Superfund risk assessment process in four "steps":
(1) data collection and evaluation; (2) exposure
assessment; (3) toxicity assessment; and (4) risk
characterization. The concentration term is
calculated for use in the exposure assessment step.
Highlight 1 presents the general equation
Superfund uses for calculating exposure, and
illustrates that the concentration term (C) is one
of several parameters needed to estimate
contaminant intake for an individual.
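As a rough illustration of how C enters the calculation, the sketch below evaluates the Highlight 1 equation, I = C x (CR x EFD / BW) x (1 / AT), in Python. The function name and every input value are hypothetical placeholders chosen only to show the arithmetic; they are not values recommended by this bulletin, and the user must make the units consistent.

```python
def intake(c, cr, efd, bw, at):
    """Intake per the Highlight 1 equation: I = C * (CR * EFD / BW) * (1 / AT)."""
    return c * (cr * efd / bw) * (1.0 / at)


# Hypothetical soil-ingestion inputs (illustrative assumptions only).
c_mg_per_kg = 502.0       # concentration term, mg contaminant per kg soil
cr_kg_per_day = 100e-6    # contact rate: 100 mg soil/day expressed in kg/day
efd_days = 350 * 30       # exposure frequency x duration: 350 days/yr for 30 yr
bw_kg = 70.0              # body weight, kg
at_days = 30 * 365        # averaging time, days

daily_intake = intake(c_mg_per_kg, cr_kg_per_day, efd_days, bw_kg, at_days)
print(f"Intake: {daily_intake:.2e} mg/kg-day")
```

Because C multiplies every other factor, any bias in the concentration term carries through proportionally to the intake estimate.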
For Superfund assessments, the
concentration term (C) in the intake equation is
an estimate of the arithmetic average concentration
for a contaminant based on a set of site sampling
results. Because of the uncertainty associated with
estimating the true average concentration at a site,
the 95 percent upper confidence limit (UCL) of
the arithmetic mean should be used for this
variable. The 95 percent UCL provides reasonable
confidence that the true site average will not be
underestimated.
Why use an average value for the concentration
term?
An estimate of average concentration is used
because:
Supplemental Guidance to RAGS is a bulletin series on risk assessment of Superfund sites. These bulletins serve as supplements to
Risk Assessment Guidance for Superfund: Volume I Human Health Evaluation Manual. The information presented is intended as
guidance to EPA and other government employees. It does not constitute rulemaking by the Agency, and may not be relied on to
create a substantive or procedural right enforceable by any other person. The Government may take action that is at variance with
these bulletins.
-------
Highlight 1
GENERAL EQUATION FOR ESTIMATING EXPOSURE
TO A SITE CONTAMINANT
$$I = C \times \frac{CR \times EFD}{BW} \times \frac{1}{AT}$$
where:
I = intake (i.e., the quantitative measure of exposure in RAGS/HHEM)
C = contaminant concentration
CR = contact (intake) rate
EFD = exposure frequency and duration
BW = body weight
AT = averaging time
(1) carcinogenic and chronic noncarcinogenic
toxicity criteria¹ are based on lifetime
average exposures; and
(2) average concentration is most
representative of the concentration that
would be contacted at a site over time.
For example, if you assume that an exposed
individual moves randomly across an exposure
area, then the spatially averaged soil concentration
can be used to estimate the true average
concentration contacted over time. In this
example, the average concentration contacted over
time would equal the spatially averaged
concentration over the exposure area. While an
individual may not actually exhibit a truly random
pattern of movement across an exposure area, the
assumption of equal time spent in different parts
of the area is a simple but reasonable approach.
When should an average concentration be used?
The two types of exposure estimates now
being required for Superfund risk assessments, a
reasonable maximum exposure (RME) and an
average, should both use an average concentration.
To be protective, the overall estimate of intake
(see Highlight 1) used as a basis for action at
Superfund sites should be an estimate in the high
end of the intake/dose distribution. One high-end
option is the RME used in the Superfund
program. The RME, which is defined as the
highest exposure that could reasonably be expected
to occur for a given exposure pathway at a site, is
intended to account for both uncertainty in the
contaminant concentration and variability in
exposure parameters (e.g., exposure frequency,
averaging time). For comparative purposes,
Agency guidance (U.S. EPA, Guidance on Risk
Characterization for Risk Managers and Risk
Assessors, February 26, 1992) states that an average
estimate of exposure also should be presented in
risk assessments. For decision-making purposes in
the Superfund program, however, RME is used to
estimate risk.²
¹ When acute toxicity is of most concern, a
long-term average concentration generally should not be
used for risk assessment purposes, as the focus
should be to estimate short-term, peak
concentrations.
Why use an estimate of the arithmetic mean
rather than the geometric mean?
The choice of the arithmetic mean
concentration as the appropriate measure for
estimating exposure derives from the need to
estimate an individual's long-term average
exposure. Most Agency health criteria are based
on the long-term average daily dose, which is
simply the sum of all daily doses divided by the
total number of days in the averaging period. This
is the definition of an arithmetic mean.
² For additional information on RME, see
RAGS/HHEM Part A and the National Oil and
Hazardous Substances Pollution Contingency Plan
(NCP), 55 Federal Register 8710, March 8, 1990.
-------
The arithmetic mean is appropriate regardless of the
pattern of daily exposures over time or the type of
statistical distribution that might best describe the
sampling data. The geometric mean of a set of
sampling results, however, bears no logical
connection to the cumulative intake that would
result from long-term contact with site
contaminants, and it may differ appreciably from
and be much lower than the arithmetic mean.
Although the geometric mean is a convenient
parameter for describing central tendencies of
lognormal distributions, it is not an appropriate
basis for estimating the concentration term used in
Superfund exposure assessments. The following
simple example may help clarify the difference
between the arithmetic and geometric mean when
used for an exposure assessment:
Assume the daily exposure for a trespasser
subject to random exposure at a site is 1.0,
0.01, 1.0, 0.01, 1.0, 0.01, 1.0, and 0.01
units/day over an 8-day period. Given
these values, the cumulative exposure is
simply their summation, or 4.04 units.
Dividing this by 8 days of exposure results
in an arithmetic mean of 0.505 units/day.
This is the value we would want to use in
a risk assessment for this individual, not
the geometric mean of 0.1 units/day.
Viewed another way, multiplication of the
geometric mean by the number of days
equals 0.8 units, considerably lower than
the known cumulative exposure of 4.04
units.
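The trespasser example can be checked with a few lines of Python. This is only a sketch of the arithmetic described above, using the standard-library statistics module; the variable names are illustrative.

```python
import statistics

# Daily exposures from the trespasser example, in units/day.
daily_exposure = [1.0, 0.01, 1.0, 0.01, 1.0, 0.01, 1.0, 0.01]
n_days = len(daily_exposure)

cumulative = sum(daily_exposure)
arithmetic_mean = statistics.fmean(daily_exposure)
geometric_mean = statistics.geometric_mean(daily_exposure)

print(f"Cumulative exposure:      {cumulative:.2f} units")
print(f"Arithmetic mean:          {arithmetic_mean:.3f} units/day")
print(f"Geometric mean:           {geometric_mean:.3f} units/day")
print(f"Arithmetic mean x days:   {arithmetic_mean * n_days:.2f} units")
print(f"Geometric mean x days:    {geometric_mean * n_days:.2f} units")
```

Only the arithmetic mean, multiplied back by the number of days, recovers the cumulative exposure.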
UCL AS AN ESTIMATE OF THE
AVERAGE CONCENTRATION
What is a 95 percent UCL?
The 95 percent UCL of a mean is defined
as a value that, when calculated repeatedly for
randomly drawn subsets of site data, equals or
exceeds the true mean 95 percent of the time.
Although the 95 percent UCL of the mean
provides a conservative estimate of the average (or
mean) concentration, it should not be confused
with a 95th percentile of site concentration data (as
shown in Highlight 2).
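The repeated-sampling definition can be illustrated with a small simulation. The sketch below is an assumption-laden illustration, not a site procedure: it draws samples from a hypothetical normally distributed population (true mean 100, standard deviation 20), computes the normal-theory UCL, mean + t(s/sqrt(n)), for each draw, and counts how often the UCL equals or exceeds the true mean. The t value of 1.833 is the one-tailed 95 percent Student-t value for 9 degrees of freedom from standard tables.

```python
import random
import statistics

T_95_DF9 = 1.833  # one-tailed 95% Student-t value, 9 degrees of freedom


def ucl_normal(samples, t_statistic):
    """Normal-theory UCL of the mean: mean + t * s / sqrt(n)."""
    n = len(samples)
    return statistics.fmean(samples) + t_statistic * statistics.stdev(samples) / n ** 0.5


def coverage(true_mean=100.0, true_sd=20.0, n=10, trials=5000, seed=1):
    """Fraction of simulated sample sets whose UCL is at or above the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        samples = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        if ucl_normal(samples, T_95_DF9) >= true_mean:
            hits += 1
    return hits / trials


fraction = coverage()
percentile_95 = 100.0 + 1.645 * 20.0  # 95th percentile of the hypothetical population

print(f"UCL covered the true mean in {fraction:.1%} of trials")
print(f"95th percentile of the population itself: {percentile_95:.1f}")
```

The coverage fraction should land near 95 percent, while the 95th percentile of the population is a different quantity that sits well above the mean regardless of how many samples are taken.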
Why use the UCL as the average concentration?
Statistical confidence limits are the classical
tool for addressing uncertainties of a distribution
average. The 95 percent UCL of the arithmetic
mean concentration is used as the average
concentration because it is not possible to know
the true mean. The 95 percent UCL therefore
accounts for uncertainties due to limited sampling
data at Superfund sites. As sampling data become
less limited at a site, uncertainties decrease, the
UCL moves closer to the true mean, and exposure
evaluations using either the mean or the UCL
produce similar results. This concept is illustrated
in Highlight 2.
Should a value other than the 95 percent UCL be
used for the concentration?
A value other than the 95 percent UCL
can be used provided the risk assessor can
document that high coverage of the true
population mean occurs (i.e., the value equals or
exceeds the true population mean with high
probability). For exposure areas with limited
amounts of data or extreme variability in measured
or modeled data, the UCL can be greater than the
highest measured or modeled concentration. In
these cases, if additional data cannot practicably be
obtained, the highest measured or modeled value
could be used as the concentration term. Note,
however, that the true mean still may be higher
than this maximum value (i.e., the 95 percent UCL
indicates a higher mean is possible), especially if
the most contaminated portion of the site has not
been sampled.
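One way to express the fallback described above is sketched here. The function name and the example numbers are hypothetical, and the sketch assumes the assessor has already determined that additional data cannot practicably be obtained.

```python
def concentration_term(ucl, measured_or_modeled):
    """Return the UCL, or the highest measured/modeled value if the UCL exceeds it."""
    highest = max(measured_or_modeled)
    return min(ucl, highest)


# Hypothetical small data set (mg/kg) and a hypothetical UCL computed from it.
values = [12.0, 40.0, 95.0]
print(concentration_term(150.0, values))  # UCL above the maximum: the maximum is used
print(concentration_term(80.0, values))   # UCL below the maximum: the UCL is used
```

As the text notes, falling back to the maximum does not guarantee that the true mean is covered.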
CALCULATING THE UCL
How many samples are necessary to calculate the
95 percent UCL?
Sampling data from Superfund sites have
shown that data sets with fewer than 10 samples
per exposure area provide poor estimates of the
mean concentration (i.e., there is a large difference
between the sample mean and the 95 percent
UCL), while data sets with 10 to 20 samples per
exposure area provide somewhat better estimates
of the mean, and data sets with 20 to 30 samples
provide fairly consistent estimates of the mean
(i.e., the 95 percent UCL is close to the sample
mean). Remember that, in general, the UCL
approaches the true mean as more samples are
included in the calculation.
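The effect of sample size can be sketched with the normal-theory expression UCL = mean + t(s/sqrt(n)), in which the amount added to the sample mean is t(s/sqrt(n)). The example below holds the standard deviation fixed at a hypothetical 50 mg/kg, which is a simplification, and uses one-tailed 95 percent Student-t values from standard tables.

```python
import math

# One-tailed 95% Student-t values from standard tables, keyed by sample size n
# (degrees of freedom = n - 1).
T_95 = {5: 2.132, 10: 1.833, 20: 1.729, 30: 1.699}

s = 50.0  # hypothetical sample standard deviation, mg/kg

# Amount by which the UCL exceeds the sample mean for each sample size.
margins = {n: t * s / math.sqrt(n) for n, t in T_95.items()}

for n, margin in margins.items():
    print(f"n = {n:2d}: UCL exceeds the sample mean by about {margin:.1f} mg/kg")
```

Under these assumptions the margin shrinks steadily as n grows from 5 to 30, which is consistent with the pattern described above.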
Should the data be transformed?
EPA's experience shows that most large or
"complete" environmental contaminant data sets
-------
Highlight 2
COMPARISON OF UCL AND 95th PERCENTILE
[Figure: distribution curve plotted against concentration, marking the mean and the upper confidence limit (UCL) of the mean.]
As sample size increases, the UCL of the mean moves closer to the true mean, while the 95th
percentile of the distribution remains at the upper end of the distribution.
from soil sampling are lognormally distributed
rather than normally distributed (see Highlights 3
and 4 for illustrations of lognormal and normal
distributions). In most cases, it is reasonable
to assume that Superfund soil sampling data are
lognormally distributed. Because transformation is
a necessary step in calculating the UCL of the
arithmetic mean for a lognormal distribution, the
data should be transformed by using the natural
logarithm function (i.e., calculate ln(x), where x is
the value from the data set). However, in cases
where there is a question about the distribution of
the data set, a statistical test should be used to
identify the best distributional assumption for the
data set. The W-test (Gilbert 1987) is one
statistical method that can be used to determine if
a data set is consistent with a normal or lognormal
distribution. In all cases, it is valuable to plot the
data to better understand the contaminant
distribution at the site.
How do you calculate the UCL for a lognormal
distribution?
To calculate the 95 percent UCL of the
arithmetic mean for a lognormally distributed data
set, first transform the data using the natural
logarithm function as discussed previously (i.e.,
calculate ln(x)). After transforming the data,
determine the 95 percent UCL for the data set by
completing the following four steps:
(1) Calculate the arithmetic mean of the
transformed data (which is also the log of
the geometric mean);
(2) Calculate the standard deviation of the
transformed data;
(3) Determine the H-statistic (e.g., see Gilbert
1987); and
(4) Calculate the UCL using the equation
shown in Highlight 5.
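The four steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the UCL expression used is the H-statistic form given in Gilbert (1987), UCL = exp(mean + 0.5 s^2 + sH/sqrt(n - 1)); the standard deviation is taken as the sample standard deviation (n - 1 denominator); and the H-statistic is looked up in a published table and passed in by the user. The function name is hypothetical. The data are the chromium results from Highlight 7.

```python
import math
import statistics


def ucl_lognormal(concentrations, h_statistic):
    """95% UCL of the arithmetic mean for lognormally distributed data."""
    logs = [math.log(x) for x in concentrations]   # natural-log transformation
    n = len(logs)
    mean_log = statistics.fmean(logs)              # step 1: mean of transformed data
    sd_log = statistics.stdev(logs)                # step 2: sample standard deviation
    # Step 3: h_statistic comes from a table (e.g., Gilbert 1987) for this n and sd.
    exponent = mean_log + 0.5 * sd_log ** 2 + sd_log * h_statistic / math.sqrt(n - 1)
    return math.exp(exponent)                      # step 4: back-transform


# Chromium in soil, mg/kg (Highlight 7 data).
chromium = [10, 13, 20, 36, 41, 59, 67, 110, 110, 136, 140, 160, 200, 230, 1300]

ucl = ucl_lognormal(chromium, h_statistic=3.163)
print(f"95% UCL (lognormal): {ucl:.0f} mg/kg")
```

With unrounded intermediate statistics the result comes out at roughly 500 mg/kg, in line with the 502 mg/kg reported in Highlight 7, which works from the rounded values 4.38 and 1.25.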
How do you calculate the UCL for a normal
distribution?
If a statistical test supports the assumption
that the data set is normally distributed, calculate
the 95 percent UCL by completing the following
four steps:
-------
Highlight 3
EXAMPLE OF A LOGNORMAL DISTRIBUTION
[Figure: distribution curve plotted against concentration, with the mean marked.]
Highlight 4
EXAMPLE OF A NORMAL DISTRIBUTION
[Figure: distribution curve plotted against concentration, with the mean marked.]
-------
Highlight 5
CALCULATING THE UCL OF THE ARITHMETIC MEAN
FOR A LOGNORMAL DISTRIBUTION
$$UCL = e^{\left(\bar{x} + 0.5 s^{2} + sH/\sqrt{n-1}\right)}$$
where:
UCL = upper confidence limit
e = constant (base of the natural log, equal to 2.718)
x̄ = mean of the transformed data
s = standard deviation of the transformed data
H = H-statistic (e.g., from table published in Gilbert 1987)
n = number of samples
Highlight 6
CALCULATING THE UCL OF THE ARITHMETIC MEAN FOR A NORMAL DISTRIBUTION
$$UCL = \bar{x} + t\left(s/\sqrt{n}\right)$$
where:
UCL = upper confidence limit
x̄ = mean of the untransformed data
s = standard deviation of the untransformed data
t = Student-t statistic (e.g., from table published in Gilbert 1987)
n = number of samples
(1) Calculate the arithmetic mean of the
untransformed data;
(2) Calculate the standard deviation of the
untransformed data;
(3) Determine the one-tailed t-statistic (e.g.,
see Gilbert 1987); and
(4) Calculate the UCL using the equation
presented in Highlight 6.
Use caution when applying normal distribution
calculations if there is a possibility that heavily
contaminated portions of the site have not been
adequately sampled. In such cases, a UCL from
normal distribution calculations could fall below
the true mean, even if a limited data set at a site
appears normally distributed.
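For comparison, the normal-distribution calculation, UCL = mean + t(s/sqrt(n)), can be sketched as below. The function name is hypothetical, the standard deviation is assumed to be the sample standard deviation, and the t value of 1.761 is the one-tailed 95 percent Student-t value for 14 degrees of freedom from standard tables. The chromium data from Highlight 7 are used purely to show the mechanics; they are right-skewed, so the normal assumption is not appropriate for them.

```python
import math
import statistics


def ucl_normal(concentrations, t_statistic):
    """95% UCL of the arithmetic mean assuming normally distributed data."""
    n = len(concentrations)
    mean = statistics.fmean(concentrations)     # step 1: mean of untransformed data
    sd = statistics.stdev(concentrations)       # step 2: sample standard deviation
    # Step 3: t_statistic is the one-tailed value from a table for n - 1 degrees of freedom.
    return mean + t_statistic * sd / math.sqrt(n)   # step 4


# Chromium in soil, mg/kg (Highlight 7 data).
chromium = [10, 13, 20, 36, 41, 59, 67, 110, 110, 136, 140, 160, 200, 230, 1300]
T_95_DF14 = 1.761  # one-tailed 95% Student-t value, 14 degrees of freedom

ucl = ucl_normal(chromium, T_95_DF14)
print(f"95% UCL (normal): {ucl:.0f} mg/kg")
```

With these inputs the sketch returns roughly 320 mg/kg, near but not identical to the 325 mg/kg listed in Highlight 8. The bulletin does not show its intermediate values, so the source of the small difference is not clear from the text. Either way, the normal-theory value falls well below the lognormal UCL of 502 mg/kg.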
EXAMPLES
The examples shown in Highlights 7 and 8
address the exposure scenario where an individual
at a Superfund site has equal opportunity to
contact soil in any sector of the contaminated area
over time. Even though the examples address only
soil exposures, the UCL approach is applicable to
all exposure pathways. Guidance and examples for
other exposure pathways will be presented in
forthcoming bulletins.
Highlight 7 presents a simple data set and
provides a stepwise demonstration of transforming
the data assuming a lognormal distribution
and calculating the UCL. Highlight 8 uses the
same data set to show the difference between the
UCLs that would result from assuming normal and
lognormal distribution of the data. These
-------
Highlight 7
EXAMPLE OF DATA TRANSFORMATION AND CALCULATION OF UCL
This example shows the calculation of a 95 percent UCL of the arithmetic mean
concentration for chromium in soil at a Superfund site. This example is applicable only to a
scenario in which a spatially random exposure pattern is assumed. The concentrations of chromium
obtained from random sampling in soil at this site (in mg/kg) are 10, 13, 20, 36, 41, 59, 67, 110, 110,
136, 140, 160, 200, 230, and 1300. Using these data, the following steps are taken to calculate a
concentration term for the intake equation:
(1) Plot the data and inspect the graph. (You may need the help of a statistician for this part
[as well as other parts] of the calculation of the UCL.) The plot (not shown, but similar to
Highlight 3) shows a skew to the right, consistent with a lognormal distribution.
(2) Transform the data by taking the natural log of the values (i.e., determine ln(x)). For this
data set, the transformed values are: 2.30, 2.56, 3.00, 3.58, 3.71, 4.08, 4.20, 4.70, 4.70, 4.91,
4.94, 5.08, 5.30, 5.44, and 7.17.
(3) Apply the UCL equation in Highlight 5, where:
x̄ = 4.38
s = 1.25
H = 3.163 (based on 95 percent)
n = 15
The resulting 95 percent UCL of the arithmetic mean is thus found to equal
$e^{\left(4.38 + 0.5(1.25)^{2} + 1.25(3.163)/\sqrt{15-1}\right)} = e^{6.218}$, or 502 mg/kg.
Highlight 8
COMPARING UCLS OF THE ARITHMETIC MEAN ASSUMING DIFFERENT DISTRIBUTIONS
In this example, the data presented in Highlight 7 are used to demonstrate the difference in
the UCL that is seen if the normal distribution approach were inappropriately applied to this data
set (i.e., if, in this example, a normal distribution is assumed).
ASSUMED DISTRIBUTION   TEST STATISTIC   95 PERCENT UCL (mg/kg)
Normal                 Student-t        325
Lognormal              H-statistic      502
-------
examples demonstrate the importance of using the
correct assumptions.
WHERE CAN I GET MORE HELP?
Additional information on Superfund's
policy and approach to calculating the
concentration term and estimating exposures at
waste sites can be obtained in:
U.S. EPA, Risk Assessment Guidance
for Superfund: Volume I Human
Health Evaluation Manual (Part A),
EPA/540/1-89/002, December 1989.
U.S. EPA, Guidance for Data
Useability in Risk Assessment,
EPA/540/G-90/008 (OSWER
Directive 9285.7-05), October 1990.
U.S. EPA, Risk Assessment Guidance
for Superfund (Part A Baseline Risk
Assessment) Supplemental Guidance/
Standard Exposure Factors, OSWER
Directive 9285.6-03, May 1991.
Useful statistical guidance can be found in many
standard textbooks, including:
Gilbert, R.O., Statistical Methods for
Environmental Pollution Monitoring,
Van Nostrand Reinhold, New York,
New York, 1987.
Questions or comments concerning the
concentration term can be directed to:
Toxics Integration Branch
Office of Emergency and Remedial
Response
401 M Street SW
Washington, DC 20460
Phone: 202-260-9486
EPA staff can obtain additional copies of this
bulletin by calling EPA's Superfund Document
Center at 202-260-9760. Others can obtain copies
by contacting NTIS at 703-487-4650.