The Pennsylvania State University
Department of Statistics
University Park, PA 16802 USA

Ganapati P. Patil

Graduate Ecology Program

CENTER FOR STATISTICAL ECOLOGY
AND ENVIRONMENTAL STATISTICS
Proceedings of

Workshop on Superfund Hazardous Waste:
Statistical Issues in Characterizing a Site:
Protocols, Tools, and Research Needs

February 21-22, 1990
Crystal City
Arlington, VA.
Editors:
Herbert Lacayo
Statistical Policy Branch (PM—223)
U.S. Environmental Protection Agency
401 M Street, S.W.
Washington, DC 20460
Royal J. Nadeau
Environmental Response Branch/OSWER
U.S. Environmental Protection Agency
Woodbridge Avenue
Raritan Depot - Building 10
Edison, NJ 08837
Ganapati P. Patil
Center for Statistical Ecology and Environmental Statistics
Department of Statistics
Penn State University
University Park, PA 16802
Larry Zaragoza
Site Assessment Branch (OS-230)
U.S. Environmental Protection Agency
401 M Street, S.W.
Washington, DC 20460
Printed on Recycled Paper

-------
DISCLAIMER
This report was prepared under contract to an agency of
the United States Government. Neither the United States
Government nor any of its employees, contractors,
subcontractors, or their employees makes any warranty,
expressed or implied, or assumes any legal liability or
responsibility for any third party's use or the results
of such use of any information, apparatus, product,
model, formula, or process disclosed in this report, or
represents that its use by such third party would not
infringe on privately owned rights. Publication of the
data in this document does not signify that the contents
necessarily reflect the joint or separate views and
policies of each co-sponsoring agency. Mention of trade
names or commercial products does not constitute
endorsement or recommendation for use.

-------
This volume is a compendium of the papers and principal commentaries
delivered at this Superfund statistical methods workshop. The workshop was
underwritten by the Statistical Policy Branch, the Agency lead in the development
of statistical techniques and methodologies for the Superfund program.
The purpose of the workshop was to bring together statisticians, Agency
scientists and managers, and people having experience implementing sometimes
complex statistical guidance in the field. The freedom of discussion allowed
in this environment enabled ideas and constructive comments to be exchanged
among the participants; the number of comments on each paper is evidence of
the interest this workshop aroused.
The Comprehensive Environmental Response, Compensation and Liability Act of
1980 (CERCLA) contains a special trust fund financed by taxes on crude oil and
a variety of commercial chemicals; this fund, popularly known as "Superfund," is
used in performing remedial cleanups and in responding to emergencies. Congress
revised the Superfund legislation in 1986 by the Superfund Amendments and
Reauthorization Act (SARA) and included language that requires EPA to attain
certain criteria or standards. The language did not, however, include guidance
on how to determine if attainment has been reached after a remediation action.
The overall theme of the workshop was to investigate and explore methods and
technologies useful in determining attainment of specified standards. In order
to encourage innovative thought, the usual requirements on methodologies of cost-
effectiveness, practicability of use by non-statisticians, and limitations due
to the physical constraints of obtaining high quality data were greatly relaxed.
The material presented at this workshop will ultimately be used to generate
guidance and advisory documents for use by Agency Superfund managers. The views
presented at this workshop are those of the participants and should not be
construed as being the Agency's official position on the subject or policy.
This workshop was the first in a planned series of environmental workshops
and conferences dedicated to bringing innovative statistical methodologies to
analysts, researchers, and environmental managers. Details on future conferences
may be obtained from John Warren, Acting Branch Chief, Statistical Policy Branch,
Office of Policy, Planning, and Evaluation, EPA.

-------
Executive Summary
It has become increasingly important to develop, adapt and/or adopt
sound statistical protocols and procedures to help acquire statistical and
substantive accuracy and precision in Superfund hazardous waste site
characterization activities while maintaining cost effectiveness.
Site characterization issues and approaches have a variety of
dimensions and components that are largely site-dependent. Yet, certain
common denominators emerge. For example, the assessment and mapping of
site attributes of interest can be one component of site characterization.
Generally speaking, statistical approaches to site characterization
should be guided by specific needs and goals of sites and the cleanup
progression. Several phases of site characterization may transpire
throughout a site investigation, from initial discovery of a site, through
intensive pre-remediation sampling and post-remediation evaluation. The
phase of action helps determine the statistical hypotheses and
methodologies of site characterization. For example, before a site is
labeled "hazardous", the burden of proof is on EPA to demonstrate this.
Once the "hazardous" label is applied, the burden of proof is upon the site
remediator to demonstrate cleanliness.
It has become increasingly clear that the Superfund-related substantive
policy, science, and technology issues and the corresponding statistical
approaches to address them are closely intertwined and need much closer
attention than has been possible to date.
The purpose of the workshop was to bring together a group of scientists,
managers, and statisticians involved with Superfund hazardous waste site
characterization and remediation work and to identify and/or formulate
statistical issues and statistical research needs to help address the
statistical protocols implicit in site characterization and remediation work.
The workshop was held on February 21-22, 1990, in the Washington, DC area
at the Crystal City Sheraton, Arlington, VA. The workshop consisted of
expository papers and presentations, preplanned and extempore discussions,
thematic round table luncheons, overall review and the subsequent concluding
panel discussion.
The technical content of the workshop included such topics as
statistical sampling and monitoring, composite sampling, spatial statistics,
encountered data and meta-analysis, risk assessment and ecotoxicology, and
attainment of cleanup standards. The participants, having relevant interest,
concern, and/or responsibility, came from academia, agencies, and
industry.
ii

-------
Purpose, Planning, and Organization
It has become increasingly important to develop, adapt and/or adopt
sound statistical protocols and procedures to help acquire statistical and
substantive accuracy and precision in Superfund hazardous waste site
characterization activities while maintaining cost effectiveness.
Site characterization issues and approaches have a variety of
dimensions and components that are largely site-dependent. Yet, certain
common denominators emerge. For example, the assessment and mapping of
site attributes of interest can be one component of site characterization.
In a broader sense, the implementation of site characterization can be
specific to each site and can embody the detection of the pathways and the
risk of hazardous wastes to human receptors and others.
Generally speaking, statistical approaches to site characterization
should be guided by specific needs and goals of sites and the cleanup
progression. Several phases of site characterization may transpire
throughout a site investigation, from initial discovery of a site, through
intensive pre-remediation sampling and post-remediation evaluation. The
phase of action helps determine the statistical hypotheses and
methodologies of site characterization. For example, before a site is
labeled "hazardous", the burden of proof is on EPA to demonstrate this.
Once the "hazardous" label is applied, the burden of proof is upon the site
remediator to demonstrate cleanliness. The phases of site characterization
need not be statistically or monetarily disconnected. For example, cost
effectiveness might be improved if intensive sampling and careful
statistical design prior to remediation allowed lower remediation costs.
The choice of statistical parameter, the mean or a percentile, used to
characterize a site's contamination level may be influenced by the type of
waste present, its anticipated spatial pattern, and the potential
remediation technology. The choice of parameter, in turn, can establish
the sampling and estimation methodology.
It has become increasingly clear that the Superfund-related substantive
policy, science, and technology issues and the corresponding statistical
approaches to address them are closely intertwined and need much closer
attention than has been possible to date.
The purpose of the workshop was to bring together a group of scientists,
managers, and statisticians involved with Superfund hazardous waste site
characterization and remediation work and to identify and/or formulate
statistical issues and statistical research needs to help address the
statistical protocols implicit in site characterization and remediation work.
The workshop was held on February 21-22, 1990, in the Washington, DC area
at the Crystal City Sheraton, Arlington, VA. The workshop consisted of
expository papers and presentations, preplanned and extempore discussions,
thematic round table luncheons, overall review and the subsequent concluding
panel discussion.
iii

-------
The technical content of the workshop included such topics as
statistical sampling and monitoring, composite sampling, spatial statistics,
encountered data and meta-analysis, risk assessment and ecotoxicology, and
attainment of cleanup standards. The participants, having relevant interest,
concern, and/or responsibility, came from academia, agencies, and
industry.
TOPICS AND BRIEF SUMMARIES
1. SESSION 1: Environmental Monitoring and Statistical Sampling.
1.1 Statistical Sampling and Analysis Issues and Needs for Testing
Attainment of Background-based Cleanup Standards at Superfund Sites:
Preliminary Report [by Richard O. Gilbert and Jeanne C. Simpson]. This
presentation discusses statistical sampling and analysis research issues and
needs for testing whether remediated Superfund sites have attained
site-specific background standards. Some relatively simple sampling designs
and statistical tests are proposed and compared to more complex approaches
that theoretically, in some situations, may be more efficient and powerful.
These comparisons are used to suggest where research resources might be
focused to develop new or improved statistical sampling and analysis tools for
Superfund-site assessment and remediation.
2. SESSION 2: Composite Sampling and Hot Spot Identification.
2.1 Composite Sampling Using Spatial Autocorrelation for Palmerton
Hazardous Waste Site: A Preliminary Report [by Marilyn T. Boswell and
Ganapati P. Patil]. Composite samples, formed by physically mixing individual
samples, have many applications and can reduce the overall cost of analytical
procedures that must be performed on each sample. Typical applications are
estimation or tests of hypotheses about the mean and identification of
individual samples exceeding some action level (hot-spot identification).
The presentation begins with an overview of the techniques and
applications of composite sampling. Following this, the use for hot-spot
identification is covered.
Let c be an action level above which some action such as the removal
of contaminated soil is necessary. Suppose n individual samples are dried
and homogenized and aliquots are composited. If one sample exceeds the
action level c and the rest have negligible pollution, then the composite
sample exceeds c/n. However, if all samples exceed c/n, the composite
sample will exceed c/n. The former case requires action while the latter
case does not. Furthermore, the analytical procedure may have some minimum
detection limit (MDL). If the level of the composite sample is below the
MDL, no pollution is detected. Combining n samples may dilute the
concentration to below the MDL even if one or more samples exceed the action
level c.
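As a rough numerical illustration of the dilution arithmetic just described,
a short Python sketch follows; the action level, detection limit, and aliquot
concentrations are hypothetical and are not taken from the Palmerton data.

# Minimal sketch of the composite-sample screening arithmetic described above.
# The action level c, detection limit MDL, and measurements are hypothetical.

def screen_composite(individual_concs, action_level, mdl):
    """Return (composite_conc, needs_retesting, dilution_risk) for one composite.

    individual_concs : concentrations of the n aliquots that were mixed
    action_level     : c, the level above which action is required
    mdl              : minimum detection limit of the analytical method
    """
    n = len(individual_concs)
    # A well-mixed composite measures (approximately) the mean of its aliquots.
    composite_conc = sum(individual_concs) / n

    # If any single sample exceeded c and the rest were negligible, the
    # composite would still exceed c/n, so c/n is the retesting trigger.
    needs_retesting = composite_conc > action_level / n

    # A single exceedance of c can be diluted below the MDL when c/n < MDL,
    # in which case compositing n samples may mask the hot sample entirely.
    dilution_risk = action_level / n < mdl
    return composite_conc, needs_retesting, dilution_risk


if __name__ == "__main__":
    c, mdl = 500.0, 10.0                      # hypothetical units, e.g. mg/kg
    aliquots = [12.0, 8.0, 650.0, 15.0, 9.0]  # one hot sample among five
    print(screen_composite(aliquots, c, mdl))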
iv

-------
The constraints are identified and methods of retesting to identify all
samples that exceed the action level are covered. Usually, it is thought
that pollution levels must be low for composite sample techniques to be
effective. Results of a preliminary survey show spatial autocorrelation of
pollution may allow the use of composite sample techniques at higher levels
of pollution. These results are illustrated and new directions of research
are discussed.
3. SESSION 3: Spatial Statistics and Composite Sampling.
3.1 Summary of Current Research in Environmental Statistics at the
Environmental Monitoring Systems Laboratory - Las Vegas [by Evan Englund and
George Flatman]. When we seek to identify research issues and define
research needs for Superfund or any other program, one sensible approach is
first to review what is already being done. We can then determine what
additional work is needed, and examine priorities and levels of effort. This
paper describes the program in environmental statistics, geostatistics, and
chemometrics being conducted by the Exposure Assessment Research Division
(EAD) of the Environmental Monitoring Systems Laboratory - Las Vegas
(EMSL-LV). This is not an exhaustive review, as it does not cover the
various statistical projects conducted by other divisions at the laboratory.
The EAD program began in 1980, and pioneered the application of
geostatistical methods for site assessment with the evaluation of lead
contamination in soils surrounding smelter sites in Dallas, Texas (Brown et
al., 1985). "Dallas Lead" has become the classic case study in the field.
The overall objective of the EAD research and development program is to
develop improved strategies for practical, cost-effective environmental
sampling, monitoring, and assessment. The following project titles describe
individual research activities:
(1)	Cost-effective sampling design for site assessment;
(2)	Comparison of interpolative methods;
(3)	Geostatistical simulations of hydraulic head;
(4)	Hypothesis tests for spatial and multivariate data sets;
(5)	Visual display of uncertainty in spatial data;
(6)	Multivariate outlier tests;
(7)	Methods for exploring multivariate data sets;
(8)	Software development; and
(9)	Composite sampling.
3.2 Spatial Statistics, Composite Sampling, and Related Issues in
Site Characterization with Two Examples [by Nicholas C. Bolgiano, Ganapati P.
Patil, and C. Taillie]. Data from two Superfund sites, the Dallas Lead Site
and the Palmerton Site, are examined to assess what sources of variability
are present in the data and how sampling at other sites of similar character
should be designed. The results of the analysis indicate that the spatial
variability in the accumulated heavy metal deposition at each site appears to
be dominated by variability on the scale of the sampled region, by local
v

-------
industrial contamination at the Dallas Lead Site, and by variability of nearby
soil volumes. Sampling of sites with these variability sources should
probably be designed to capture the large-scale trend, to identify the sources
and extent of local contamination, and to reduce the large variability of
nearby soil samples by composite sampling.
4. SESSION 4: Ecological Assessment at Hazardous Waste Sites.
4.1 Quantifying Effects in Ecological Site Assessments: Biological
and Statistical Considerations [by Lawrence A. Kapustka, Mostafa A. Shirazi,
and Greg Linder]. This presentation consists of four parts and covers the
following:
(1)	Biological issues in environmental management and quantifying
effects of environmental chemicals;
(2)	Predicting biological effects from response/error surfaces
derived from toxicological data on single chemicals;
(3)	Framework for modeling biological responses to complex chemical
mixtures; and
(4)	Spatial statistics applications with examples.
5. SESSION 5: Environmental and Ecotoxicological Statistics.
5.1 Some Statistical Issues Relating to the Characterization of Risk
for Toxic Chemicals [by William M. Stiteler and Patrick R. Durkin].
Conversion of Continuous Dose-Response Data to Quantal Form. The evaluation
of the health effects of toxic substances generally involves two different
kinds of data depending on the effect being studied. When the effect is
something like increased liver weight or enzyme activity, a continuous
variable is measured. For an effect like death or tumor development, quantal
data are collected. Different methodologies and approaches to dose-response
modeling have been devised for these two kinds of data.
In some cases, it is desirable to be able to use the same model or
analysis on both types of data. In particular, since many of the dose
response models are designed for quantal data, it would be useful to have a
biologically and statistically sound method for converting continuous
response data to quantal form.
Statistical Properties of Uncertainty Factors Used in Determining
Reference Doses. The current methodology for determining the "safe" human
exposure level for a toxic chemical (noncarcinogenic) involves the
determination of a "reference dose" (RfD). The determination of an RfD
involves the extensive use of "uncertainty factors" (also called safety
factors). These factors might be used for extrapolating from subchronic to
chronic exposure, from "lowest observed adverse effect level" (LOAEL) to "no
observed adverse effect level" (NOAEL), from animal to human, and for taking
into account reproductive or teratogenic effects about which information
vi

-------
might be lacking. More needs to be known about the statistical properties of
these uncertainty factors. In particular, the effect that the number of
available studies has on the process, and the resulting RfD, should be
explored.
Parameter Estimation for Threshold Models. Dose-response models for
noncarcinogenic toxic effects generally incorporate a threshold. The presence
of this threshold in the model can create problems in the estimation of the
model parameters. There can be a confounding effect between the threshold
parameter and the other parameters. If maximum likelihood estimation is
used, there is an abrupt change in the likelihood function at the threshold.
This creates a problem since the threshold is itself being estimated from the
data. New approaches need to be devised for these models.
5.2 Estimation of Concentration-Percent Survival Relationships:
Design Issues [by Ernst Linder]. Concentration-percent survival relationship
models play an important role in ecological assessment of hazardous waste
sites. They are used to extrapolate from 100% site samples to samples of
lower toxicities away from the site. They are the basis for computing acute
and chronic toxicity measures, such as LC50, EC50 and MATC. We discuss
non-symmetric response curves that have recently been proposed, in particular
the family of power logistic response models that includes as a special case
the familiar logistic regression. We show how optimal choices for the
concentration levels change as a more general model is introduced. Designs
for estimating the concentration-percent relationship as well as for
estimating toxicity measures are investigated. We also discuss implementation
issues and show for which situations an optimal design can be achieved for
extrapolating to lower concentration levels.
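The abstract does not reproduce the model equations. As an assumed
illustration only, the sketch below uses one common "power logistic" form,
the logistic response curve raised to a power λ, which reduces to ordinary
logistic regression when λ = 1; the parameterization and all parameter
values are hypothetical and are not taken from Linder's paper.

import math

# Assumed power-logistic form, offered only for illustration (not necessarily
# the exact parameterization used in the paper): a logistic curve raised to a
# power lambda, giving a non-symmetric response curve when lambda != 1.
def power_logistic(conc, b0, b1, lam):
    """Probability of response at concentration `conc` (hypothetical model)."""
    logistic = 1.0 / (1.0 + math.exp(-(b0 + b1 * conc)))
    return logistic ** lam


if __name__ == "__main__":
    # Hypothetical parameters; concentrations in arbitrary units.
    for x in (0.0, 0.5, 1.0, 2.0, 4.0):
        print(x, round(power_logistic(x, b0=-2.0, b1=1.5, lam=0.5), 3))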
6. SESSION 6: Evaluating the Attainment of Cleanup Standards.
6.1 Evaluating the Attainment of Interim Cleanup Standards [by
Ganapati P. Patil and Charles Taillie]. Cleanup standards at hazardous waste
sites may be specified either in absolute terms or relative to pre-remediation
conditions. The latter can be suitable for monitoring the performance of
innovative technologies on an interim basis, and call for a two-stage sampling
design. The presentation discusses computer intensive methods in which the
pre-remediation information is employed to identify non-normal parametric
models which are used to design an efficient second stage sampling program and
corresponding likelihood ratio test. The parametric techniques are compared
with nonparametric approaches such as the Wilcoxon rank-sum test and the
Savage exponential-scores test. The material is illustrated with data from a
Superfund site at which an innovative vacuum extraction technology is being
employed for soil contaminated by volatile organic compounds.
vii

-------
Participants
Ruth Bleyler
U.S. Environmental Protection Agency
Toxics Integration Branch (OS-230)
401 M Street, S.W.
Washington, DC 20460
202-475-9492

Nicholas Bolgiano
Penn State University
Department of Statistics
317 Pond Lab
University Park, PA 16802
814-863-5212

Karl Held

Marilyn T. Boswell
Penn State University
Department of Statistics
317 Pond Lab
University Park, PA 16802
814-863-6164

Evan Englund
U.S. Environmental Protection Agency
Environmental Monitoring Systems Laboratory
P.O. Box 93478
Las Vegas, NV 89193-3468
702-798-2248

George T. Flatman
Applied Earth Science Department
Building, Room 315
Stanford University
Stanford, CA 94303
415-723-0437

Forest Garner
Lockheed Engineering and Sciences Company
1050 E. Flamingo Road #120
Las Vegas, NV 89119
702-734-3209

M. fl»rfr
Burns & Roe Environmental Services, Inc.
601 S. Henderson Road
King of Prussia, PA 19406
215-354-0500

Richard O. Gilbert
Pacific Northwest Laboratory
P.O. Box 999
906 Battelle Boulevard
Richland, WA 99352
509-375-2979

Fred Haeberer
U.S. Environmental Protection Agency
Quality Assurance Management Staff (RD-680)
401 M Street, S.W.
Washington, DC 20460
202-382-5785

Jennifer Haley
Washington, DC 20460
202-475-6705

SRA Technologies, Inc.
4700 King Street, Suite 300
Alexandria, VA 22302
703-671-7171

Barnes Johnson
U.S. Environmental Protection Agency
Office of Solid Waste (OS-311)
401 M Street, S.W.
Washington, DC 20460
202-382-4500

Cynthia Kaleri
U.S. Environmental Protection Agency
Region 6 (6HSR)
1443 Ross Avenue
Dallas, TX 75202
214-653-2287

Lawrence A. Kapustka
U.S. Environmental Protection Agency
Ecotoxicology Branch, Corvallis Environmental Research Laboratory
200 S.W. 35th St.
Corvallis, OR 97333
503-757-4606 (FTS 420-4606)

Herbert Lacayo, Jr.
U.S. Environmental Protection Agency
Statistical Policy Branch (PM-223)
401 M Street, S.W.
Washington, DC 20460
202-382-2714

Ernst Linder
University of New Hampshire
Department of Mathematics
Durham, NH 03824
603-862-2687

Greg Linder
NSI Technology Services Corporation
200 S.W. 35th Street
Corvallis, OR 97333
503-757-4639

Michael Murray
Penn State University
104 Research Building A
University Park, PA 16802
814-863-9021

Wayne Myers
Penn State University
School of Forest Resources
University Park, PA 16802
814-863-8911

Royal Nadeau
U.S. Environmental Protection Agency
Office of Emergency and Remedial Response (OS-220)
401 M Street, S.W.
U.S. Environmental Protection Agency
Environmental Response Branch/OSWER
Woodbridge Avenue
Raritan Depot - Building 10
Edison, NJ 08837
201-321-6740

Dean Neptune
U.S. Environmental Protection Agency
Quality Assurance Management Staff (RD-680)
401 M Street, S.W.
Washington, DC 20460
202-475-9464
viii

-------
Susan Norton
U.S. Environmental Protection Agency
Office of Research and Development (RD-689)
401 M Street S.W.
Washington, DC 20460
202-382-6955
Ganapati P. (G. P.) Patil
Penn State University
Department of Statistics
318 Pond Laboratory
University Park, PA 16802
814-865-9442
A. Polymenopoulos
SRA Technologies, Inc.
4700 King Street, Suite 300
Alexandria. VA 22302
703-671-7171
Ronald Preston
U.S. Environmental Protection Agency
Toxics Integration Branch (OS-230)
401 M Street, S.W.
Washington, DC 20460
202-382-4307
R. Rajagopal
University of Iowa
Department of Geography
302 Jessup Hall
Iowa City, IA 52242
319-335-0160
David J. Schaeffer
University of Illinois
Department of Veterinary Biosciences
2001 South Lincoln Avenue
Urbana, IL 61801
217-244-0154,217-333-2506
M. A. Shirazi
U.S. Environmental Protection Agency
200 S.W. 35th Street
Corvallis, OR 97330
503-757-4578 (FTS 420-4578)
Jeanne Simpson
Pacific Northwest Laboratory
P.O. Box 999
906 Battelle Boulevard
Richland. WA 99352
509-375-6946
Roy Smith
U.S. Environmental Protection Agency
Region III
841 Chestnut Building
Philadelphia, PA 19107
215-597-9800
Mindi Snoparsky
U.S. Environmental Protection Agency
Region III (3HW15)
841 Chestnut Building
Philadelphia, PA 19107
215-597-2365
Mark D. Sprenger
U.S. Environmental Protection Agency
Environmental Response Team (MS-101)
Woodbridge Avenue
Edison, NJ 08837
201-906-6826 (FTS 340-6826)
ix
William M. Stiteler	
Syracuse Research Corporation
Merrill Lane
Syracuse, NY 13210
315-426-3365
Charles Taillie
Penn State University
Department of Statistics
316 Pond Lab
University Park, PA 16802
814-865-5212
Donald P. Trgga	
Viar and Company
300 N.Lee Street
Alexandria. VA 22314
703-683-0885
John Warren		
U.S. Environmental Protection Agency
Statistical Policy Branch (PM-223)
401 M Street, S.W.
Washington, DC 20460
202-382-2683
Jeri Weiss
U.S. Environmental Protection Agency
Region I
JFK Federal Building (HPRI)
Boston. MA 02203-2211
617-565-3715
Llewellyn R. Williams
U.S. Environmental Protection Agency
Environmental Monitoring Systems Laboratory
P.O. Box 93478
Las Vegas, NV 89193-3468
702-798-2103
Linda Wynn
Research Triangle Institute
EPA Quality Assurance Management Staff
1717 Massachusetts Avenue. Suite 102
Washington, DC 20036
202-332-5102
Fining Ztighami	
Oak Ridge National Laboratory
Building 7503 MS-6382
Oak Ridge, TN 37830
615-574-4503
John Zirschky
Dames and Moore
920 Kipling Drive
Atlanta, GA 30318
404-350-0087,404-262-2915

-------
Table of Contents
Preface .......................................................... i
Executive Summary ................................................ ii
Purpose, Planning, and Organization .............................. iii
Participants ..................................................... viii
SESSION 1: ENVIRONMENTAL
MONITORING AND STATISTICAL
SAMPLING
Statistical Sampling and Analysis Issues and Needs for Testing
Attainment of Background-Based Cleanup Standards at Superfund Sites
Richard O. Gilbert and Jeanne C. Simpson	 1
INVITED DISCUSSANTS:
Wayne Myers	 17
COMMENTS BY PARTICIPANTS:
Mindi Snoparsky	 19
SESSION 2: COMPOSITE SAMPLING AND
HOT SPOT IDENTIFICATION
Composite Sampling Using Spatial Autocorrelation for Palmerton
Hazardous Waste Site: A Preliminary Report
Marilyn T. Boswell and Ganapati P. Patil	 20
INVITED DISCUSSANTS:
David J. Schaeffer	 43
R. Rajagopal	 47
Forest Garner and Martin Stapanian	 51
Llewellyn Williams	 53
COMMENTS BY PARTICIPANTS:
Cynthia Kaleri		55
Susan Norton		55
David Schaeffer		55
Roy Smith	.					55
Mindi Snoparsky		56
x

-------
SESSION 3: SPATIAL STATISTICS AND
COMPOSITE SAMPLING
Summary of Current Research in Environmental Statistics at the
Environmental Monitoring Systems Laboratory - Las Vegas
Evan Englund and George Flatman	 57
Spatial Statistics, Composite Sampling, and Related Issues in
Site Characterization with Two Examples
Nicholas C. Bolgiano, Ganapati P. Patil and C. Taillie	 79
INVITED DISCUSSANTS:
Michael M. Murray	115
COMMENTS BY PARTICIPANTS:
On Englund and Flatman:
David Schaeffer	117
Roy Smith	117
On Bolgiano, Patil, and Taillie:
David Schaeffer	117
Roy Smith	117
On Murray:
Mindi Snoparsky		117
SESSION 4: ECOLOGICAL ASSESSMENT
AT HAZARDOUS WASTE SITES
Quantifying Effects in Ecological Site Assessments: Biological and
Statistical Considerations
Lawrence A. Kapustka, Mostafa A. Shirazi, and Greg Linder	113
INVITED DISCUSSANTS:
Mark D. Sprenger	147
Wayne Myers	149
COMMENTS BY PARTICIPANTS:
Susan Braen Norton	151
Mark D. Sprenger	151
xi

-------
SESSION 5: ENVIRONMENTAL AND
ECOTOXICOLOGICAL STATISTICS
Some Statistical Issues Relating to the Characterization of Risk
for Toxic Chemicals
William M. Stiteler and Patrick R. Durkin	152
Estimation of Concentration-Percent Survival Relationships: Design
Issues
Ernst Linder			163
INVITED DISCUSSANTS:
On Stiteler and Durkin:
L. Kapustka, E. Linder, and M. Shirazi	177
On Linder:
L. Kapustka, E. Linder, and M. Shirazi	178
COMMENTS BY PARTICIPANTS:
On Stiteler and Durkin:
H. Lacayo	179
E. Linder	179
On Linder:
H. Lacayo	179
SESSION 6: EVALUATING THE
ATTAINMENT OF CLEANUP STANDARDS
Evaluating the Attainment of Interim Cleanup Standards
G. P. Patil and C. Taillie	180
INVITED DISCUSSANTS:
Herbert Lacayo		227
Barnes Johnson	228
Jeri Weiss	231
COMMENTS BY PARTICIPANTS:
On Patil and Taillie:
Cynthia Kaleri	233
Roy Smith	233
Mindi Snoparsky	233
Jeri Weiss	233
On Lacayo:
David Schaeffer	233
xii

-------
SESSION 7: STATISTICAL ISSUES AND
APPROACHES FOR THE
CHARACTERIZATION AND REMEDIATION:
A DISCUSSION
INVITED DISCUSSIONS:
John Warren	235
Sue Norton	238
Wayne Myers	240
Michael Murray	241
Nicholas Bolgiano	243
John Zirschky	245
Jennifer Haley and Bill Hanson	248
Royal Nadeau	250
G. P. Patil	252
COMMENTS BY PARTICIPANTS:
Mindi Snoparsky	254
Llew Williams	254
xiii

-------
STATISTICAL SAMPLING AND ANALYSIS ISSUES AND NEEDS FOR TESTING ATTAINMENT
OF BACKGROUND-BASED CLEANUP STANDARDS AT SUPERFUND SITES
Richard O. Gilbert
Jeanne C. Simpson
Pacific Northwest Laboratory(a)
P. 0. Box 999
Richland, WA 99352
ABSTRACT
The primary purpose of the Workshop on Superfund Hazardous Waste is to
identify statistical issues and research needs that can form the basis for a
long-term statistical research and training plan for Superfund hazardous waste
site characterization and remediation. This paper discusses issues and needs
that arise when statistical procedures are used to test whether remediated
Superfund sites have attained site-specific background standards. Several
nonparametric tests are discussed (Wilcoxon rank sum, slippage, quantile) as
regards their power to detect non-attainment of background standards. Some
of the important issues that appear to need attention are (1) how to select
site-specific background areas, (2) determine the types of post-remedial-action
concentration distributions that are likely to occur in practice for various
types of remedial actions (to better select the most powerful tests to use),
(3) develop and evaluate the power of multiple-comparison slippage, quantile,
and other robust tests, (4) conduct additional (to those reported here) power
studies of the Wilcoxon, slippage, and quantile tests, for various patterns
of "hot spot" contamination, (5) develop and communicate a unified approach
for deciding when to use geostatistical methods, classical testing methods
as discussed here, or both simultaneously, and (6) determine and evaluate
statistical procedures that are appropriate for testing compliance with non-
constant risk-based standards.
(a) Work supported by the U.S. Environmental Protection Agency under a Related
Services Agreement with the U.S. Department of Energy, Contract
DE-AC06-76RLO 1830. Pacific Northwest Laboratory is operated for the
U.S. Department of Energy by Battelle Memorial Institute.
-1-

-------
1. INTRODUCTION
After the soil at a Superfund site has been remediated, it is necessary
to determine if the remediation effort has been successful. This determination
involves comparing concentrations in soil at the remediated site with a cleanup
standard. The cleanup standard may be based on (1) what is technologically
achievable, (2) a risk assessment, or (3) site-specific background
concentrations. The comparison of residual concentrations with a standard
should be statistically based, using appropriate statistical sampling designs
and tests. This paper discusses statistical tests that may be used when the
standard is based on site-specific background measurements. Following the
presentation of methods and power results in Sections 2 and 3, an assessment
of statistical issues and needs is provided in Section 4.
1.1	DEFINITIONS
Cleanup Unit:	a geographical area of specified size and shape at the
remediated Superfund site for which a separate decision
will be made whether the unit attains the site-specific
background standard for a given soil pollution parameter.
Background Region: the geographical region from which the background area(s)
will be selected.
Background Area: a defined area in the background region from which
background measurements of the soil pollution parameter
of interest will be made to compare with soil measurements
of that parameter in one or more cleanup units at the
remediated Superfund site.
At some Superfund sites a single background area may not be suitable for
all cleanup units because they have different physical, chemical, or biological
characteristics. The number and size of cleanup units in the remediated
Superfund site will depend on many factors including the size of the site,
past and future use of the site, whether the type of remedial action is likely
to leave residual contamination in a known pattern, and geographical features
(hill-sides, ponds, flat land, etc.).
1.2	ASSUMPTIONS
•	The site-specific background area has no contamination from the Superfund
site or other man-made sources.
•	The background area is similar to the cleanup unit with regard to
physical, chemical, and biological soil characteristics.
•	Concentrations of the pollutant parameter do not change over time in the
background area or in the cleanup units after remedial action.
•	The measurements of pollution parameters in the background area and in
cleanup units are statistically independent.
-2-

-------
2.0 COMPARING A SINGLE CLEANUP UNIT WITH BACKGROUND
2.1	CONTAMINATION SCENARIOS
Let Fb(x) denote the distribution function of the background
measurements. Suppose the cleanup unit has undergone remedial action, but
that contamination above background levels may still be present. We consider
two types of contamination scenarios for the cleanup unit:
1.	Concentrations of the pollution parameter in all parts of the cleanup
unit exceed those in the applicable background area. This scenario is
called the shift alternative.
2.	Concentrations in a proportion, e, of the soil in the cleanup unit still
exceed concentrations in the applicable background area. This scenario
is called the mixture alternative. This alternative includes the case
of "hot spots".
For the shift alternative, let the distribution function of the cleanup-
unit measurements be given by
    Fs(x) = Fb(x - Δ) ,                                            (1)

where Δ ≠ 0 and where Δ is in units of standard deviations. For the shift
alternative we test H0: Δ = 0 against Ha: Δ > 0. The Wilcoxon rank sum test
is appropriate for this alternative (Section 2.2). For the mixture
alternative, the distribution function of the cleanup-unit measurements is

    Fs(x) = (1 - ε)Fb(x) + εFc(x) ,                                (2)

where Fc(x) is the distribution of concentrations in the contaminated
portion of the cleanup unit, and we test H0: ε = 0 against Ha: ε > 0. In
this paper we discuss the advantages and disadvantages of the
slippage and compliance tests (Section 2.3), the quantile test (Section 2.4),
and acceptance sampling (Section 2.5).
In this paper we favor nonparametric tests because they (1) do not require
that Fb(x) be a normal distribution, and (2) are minimally affected by the
presence of data below the detection limit.
2.2	WILCOXON RANK SUM TEST FOR A SHIFT ALTERNATIVE
If the remedial action has the potential for leaving cleanup-unit
pollution-parameter concentrations more or less uniformly higher than
background, then a test that is designed to detect such a "shift" is needed.
The Wilcoxon test, which is discussed in many statistical textbooks, e.g.,
Conover (1980), is such a test. The Wilcoxon test is known to have good power
relative to the two-sample t test for a shift alternative (Equation 1) when
the two distributions are normal. Moreover, the Wilcoxon test can have much
greater power than the t test when one or both distributions are highly skewed
-3-

-------
(Conover 1980, p. 225). Also, the Wilcoxon test can be used when there are
a moderate number of data below the detection limit by treating those data as
having a single value less than the smallest measured value.
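As a small illustration of this paragraph (not part of the original paper),
the following Python sketch applies a one-sided Wilcoxon rank sum test, in
its Mann-Whitney form as implemented in SciPy, to hypothetical background and
cleanup-unit data, substituting a single surrogate value below the smallest
detected value for the nondetects as suggested above.

# Sketch only: hypothetical data; nondetects (None) are replaced by a single
# value smaller than every measured value, as suggested in the text above.
from scipy.stats import mannwhitneyu  # Mann-Whitney U == Wilcoxon rank sum

background = [4.1, 5.0, None, 3.8, 4.6, None, 5.2, 4.4]
cleanup    = [5.9, 6.3, 4.9, None, 7.1, 5.4, 6.0, 6.8]

measured = [x for x in background + cleanup if x is not None]
sub = min(measured) / 2.0          # one common surrogate below all detections
bg = [sub if x is None else x for x in background]
cu = [sub if x is None else x for x in cleanup]

# One-sided alternative: cleanup-unit concentrations tend to exceed background.
stat, p = mannwhitneyu(cu, bg, alternative="greater")
print(f"U = {stat}, one-sided p = {p:.4f}")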
For the Wilcoxon test to be of practical use, one must be able to easily
determine the number of measurements to take in the background area and cleanup
unit. If the background area will be compared with a single cleanup unit,
the number of background samples, m, and the number of cleanup-unit samples,
n, should probably be equal. Figure 1 gives values of N = m + n for this
equal sample-size case for α equal to 0.01 and 0.05 for a range of values of
Pr, where
Pr = the probability that a random measurement from
     the cleanup unit will exceed a random measurement
     from the background area.
The curves in Figure 1 were obtained for the shift alternative with normal
distributions. For that case, the relationship between Pr and Δ is

    Δ = Z_Pr (2)^0.5                                               (4)

where Z_Pr is the point on the standard normal distribution where
Prob(Z < Z_Pr) = Pr. The scales of Pr and Δ in Figure 1 were obtained
using Equation (4).
In practice, for shifts of normal distributions, N can be determined from
Figure 1 by specifying the:
•	acceptable Type I error rate (α)
•	acceptable Type II error rate (β) (or equivalently the power, 1 - β)
•	size of Pr or Δ that is important to detect with power 1 - β
The values of N in Figure 1 were computed using the following approximate
formula (from Noether 1987):

    N = (Z_(1-α) + Z_(1-β))^2 / [12c(1-c)(Pr - 0.5)^2]             (5)

for Pr > 0.5, where c = m/N (the proportion of the total number of samples
taken from the background area; c was set at 0.5 for Figure 1) and Z_r is
the point on the standard normal distribution where r = Prob[Z < Z_r].
If a single background area will be compared with several cleanup units,
it may be desirable to have m > n rather than m * n. To determine m and n when
m > n:
1.	Determine N from Figure 1 for the specified Pr (or Δ), α, and β. Denote
this N by N1.
2.	Specify the desired proportion of background measurements, c, where
c > 0.5.
-4-

-------
[Figure 1: curves of the total number of measurements N = m + n versus Pr
(and the corresponding Δ) for power values 0.60 to 0.95, at α = 0.01 and
α = 0.05.]

FIGURE 1. Power (1-β) of the Wilcoxon Rank Sum Test when α = 0.01 and
0.05 for Shift (Δ) Alternatives for Normal Distributions
3.	Find the factor F > 1 from Figure 2 for the specified c.
4.	Compute N = F·N1 = total number of measurements.
5.	Compute m = cN and n = N(1-c).
Figure 2 was constructed by multiplying Equation (5) by the factor
(0.5)(0.5)/[c(1-c)] for specified c > 0.5.
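A minimal sketch of the sample-size calculation in Equation (5), using only
the Python standard library; the α, power, and Pr values below are
illustrative, and the result is rounded up rather than read from Figure 1.

import math
from statistics import NormalDist

def wilcoxon_sample_size(alpha, power, pr, c=0.5):
    """Approximate total sample size N for the Wilcoxon rank sum test
    (Noether 1987, Equation 5); c = m/N is the background proportion."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    z_beta = NormalDist().inv_cdf(power)          # power = 1 - beta
    n_total = (z_alpha + z_beta) ** 2 / (12.0 * c * (1.0 - c) * (pr - 0.5) ** 2)
    return math.ceil(n_total)

if __name__ == "__main__":
    # Illustrative values: alpha = 0.05, power = 0.80, Pr = 0.75, equal split.
    n_equal = wilcoxon_sample_size(0.05, 0.80, 0.75)
    # Unequal allocation (c > 0.5) inflates N by the factor F = 0.25 / [c(1-c)].
    n_unequal = wilcoxon_sample_size(0.05, 0.80, 0.75, c=0.7)
    print(n_equal, n_unequal)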
2.3 THE SLIPPAGE TEST FOR A MIXTURE ALTERNATIVE
The Wilcoxon rank sum test is appropriate when the cleanup-unit
measurements are "uniformly" greater than the background measurements.
However, the power of the Wilcoxon test will be less for the mixture
alternative (Equation 2) than for the shift alternative. An alternative test
to the Wilcoxon test in this case is:
1.	Specify the desired Type I Error level, a.
2.	Count the number, K, of cleanup-unit measurements that exceed the maximum
background measurement. (The maximum background measurement is the
cleanup "standard".)
3.	Reject H0: ε = 0 and accept Ha: ε > 0 if K exceeds the critical value
obtained from the tables in Rosenbaum (1954) for the specified α, m,
and n.
-5-

-------
[Figure 2: the multiplicative factor F versus the proportion of background
measurements, c, for c from 0.50 to 0.90.]

FIGURE 2. Multiplicative Factor, F, for Determining N
for the Wilcoxon Rank Sum Test when m > n
For example, if α = 0.05 and m = n = 20, then the critical value from
Rosenbaum (1954) is 5, i.e., we reject H0: ε = 0 if more than 5 cleanup-unit
measurements exceed the maximum background measurement.
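The exact significance level of this test can also be obtained from the
exchangeability of the combined measurements under H0: the probability that
the k largest of the m + n values all come from the cleanup unit is
C(n,k)/C(m+n,k). The Python sketch below uses this derivation, which agrees
with the significance levels quoted in this paper, though it is offered here
as an illustration rather than as Rosenbaum's (1954) tabulation method.

from math import comb

def slippage_pvalue(m, n, k):
    """P(K >= k) under H0: the probability that the k largest of the
    combined m + n exchangeable measurements all come from the cleanup unit."""
    return comb(n, k) / comb(m + n, k)

if __name__ == "__main__":
    # Significance levels quoted in the text, Table 1, and Figure 3:
    print(round(slippage_pvalue(10, 10, 4), 3))  # 0.043 for m = n = 10, K = 4
    print(round(slippage_pvalue(40, 40, 5), 3))  # 0.027 for m = n = 40, K = 5
    # m = n = 20: rejecting when more than 5 exceedances occur (K >= 6) gives
    print(round(slippage_pvalue(20, 20, 6), 3))  # about 0.010, below 0.05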
The power of this test is given in Table 1 for the case m = n from 10 to
75 for values of ε from 0.05 to 0.65 when Fb(x) and Fc(x) are distributions of
any shape that are "completely" separated (e.g., Δ > 5 for normal
distributions, where Δ = (μ2 - μ1)/σ). We see that for K = 4, α = 0.043,
this test has power greater than 0.80 when m = n = 10 if ε is 0.45 or greater.
In contrast, using the relationship Pr = (1 + ε)/2, we find from Figure 1 that
the Wilcoxon test has power of about 0.7 when α = 0.05. In Table 1, if ε =
0.10, then 60 or more measurements in both the background and cleanup unit
must be used before the power of the slippage test exceeds 0.80.
Figure 3 shows the power of the slippage test when K = 5 for values of ε
from 0.05 to 0.75 when Fb(x) and Fc(x) are not "completely" separated. That
is, the distribution for the cleanup unit is (see Equation 2)

    Fs(x) = (1 - ε)Fb(x) + εFb(x - Δ) ,                            (6)

where Fb(x) is a normal distribution with mean μ1 and variance σ^2, and
Δ = (μ2 - μ1)/σ. Figure 3 gives power curves for Δ = 0.5, 1.0, 1.5, and 4
for α levels of approximately 0.025.
-6-

-------
TABLE 1. Power of the Slippage Test when Fb(x) and Fc(x)
are Completely Separated (see text)

[Table 1 tabulates the power of the slippage test for m = n = 10 to 75
(rows) and ε = 0.05 to 0.65 (columns), together with the corresponding
significance level α.]
[Figure 3: power versus ε (0.0 to 0.7) for Δ = 0.5, 1.0, 1.5, and 4.0.]

FIGURE 3. Power of the Slippage Test for Mixture Alternatives
when K = 5, m = n, α ≈ 0.025; Background and
Cleanup-Unit Measurements are Normally Distributed
-7-

-------
Figures 4 and 5 illustrate cleanup-unit measurement mixture distributions
when 25% (ε = 0.25) of the soil in the cleanup unit is contaminated above
background levels by shifts Δ equal to 1 and 4, respectively. From Figure 4,
the distributions of the background and cleanup unit are seen to be very
similar when Δ = 1. In Figure 3 we see there is very little power to detect
this difference, even when 60 samples are taken in both the background area and
the cleanup unit. However, we see in Figure 5 that a marked difference in
the distributions, especially in the right-hand tails, occurs when Δ = 4. As
seen in Figure 3, there is good power to detect this difference (approximately
0.7 when m = n = 20, 0.97 when m = n = 40, and 1.0 when m = n = 60).
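The mixture densities sketched in Figures 4 and 5 follow directly from
Equation (6); the short Python sketch below evaluates such a density, taking
the background as a standard normal (μ = 0, σ = 1) purely for illustration.

from statistics import NormalDist

def cleanup_mixture_pdf(x, eps, delta, mu=0.0, sigma=1.0):
    """Density of the cleanup-unit mixture in Equation (6):
    (1 - eps) * Normal(mu, sigma) + eps * Normal(mu + delta*sigma, sigma)."""
    background = NormalDist(mu, sigma)
    shifted = NormalDist(mu + delta * sigma, sigma)
    return (1.0 - eps) * background.pdf(x) + eps * shifted.pdf(x)

if __name__ == "__main__":
    # 25% of the cleanup unit contaminated, shifts of 1 and 4 standard
    # deviations (the cases shown in Figures 4 and 5).
    for delta in (1.0, 4.0):
        tail = [round(cleanup_mixture_pdf(x, 0.25, delta), 3) for x in (2, 3, 4, 5)]
        print(delta, tail)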
2.4 QUANTILE TEST FOR A MIXTURE ALTERNATIVE
The nonparametric "quantile test" was developed by Johnson et al. (1987)
to detect changes in a small proportion of a treated population. The test
appears to have considerable potential for testing for attainment of
background standards. The test is simple to use and is a locally most powerful
test for the mixture alternative. It reduces to counting the number of cleanup
unit measurements among the extreme order statistics of the combined background
and site data. The hypergeometric distribution is the null distribution of the
statistic.
The quantile test is conducted as follows:
1.	Select a value of b1 that is greater than 0.5 and less than 1.0.
[If b1 = 0.5 is used, the median test (Conover 1980) is obtained.]
2.	Determine r1, r2, ..., rn, which are the ranks of the n cleanup-unit
measurements in the combined set of m + n background plus cleanup-site
measurements. (Johnson et al. 1987 used average ranks when ties were
present.)
3.	Compute the quantile test statistic Q, where

        Q = Σ (i=1 to n) J[ri/(m+n+1)]                             (7)

where

        J[ri/(m+n+1)] = 1  if b1 < ri/(m+n+1) < 1
                      = 0  otherwise                               (8)
From Equation (8) we see that Q is the number of cleanup-unit
measurements whose standardized rank, ri/(m+n+1), in the combined sample
exceeds b1. Table 2 illustrates the computation of Q for a data set in
Johnson et al. (1987) for which m = n = 10 and b1 = 0.8. For this example,
the quantile test is identical to the slippage test.
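A minimal sketch of the quantile test statistic and its hypergeometric
reference distribution, written for this volume rather than taken from
Johnson et al. (1987); the ranks used below are those of the Table 2 example,
for which Q = 4 and the significance level is 0.043.

from math import comb

def quantile_test(cleanup_ranks, m, n, b1):
    """Quantile test of Equations (7)-(8).

    cleanup_ranks : ranks of the n cleanup-unit measurements within the
                    combined sample of m + n measurements (average ranks
                    may be used when ties are present)
    Returns (Q, p_value); the p-value is hypergeometric, i.e., the chance
    that Q or more of the r "extreme" positions are cleanup measurements.
    """
    N = m + n
    # Number of combined-sample rank positions counted as "extreme" by b1.
    r = sum(1 for rank in range(1, N + 1) if rank / (N + 1) > b1)
    q = sum(1 for rank in cleanup_ranks if rank / (N + 1) > b1)
    p = sum(comb(n, j) * comb(m, r - j) for j in range(q, min(r, n) + 1)) / comb(N, r)
    return q, p

if __name__ == "__main__":
    ranks = [1, 5, 7, 11.5, 14, 16, 17, 18, 19, 20]   # Table 2 example
    print(quantile_test(ranks, m=10, n=10, b1=0.8))    # (4, ~0.043)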
-8-

-------

[Figure 4: probability density versus concentration (μ-4 to μ+6) for the
background distribution and the cleanup-unit mixture distribution.]

FIGURE 4. Density Distribution of Mixture of Two Normal Distributions.
Background has mean μ and variance 1. Cleanup-unit is 75%
of the Background distribution and 25% with shift of 1.

[Figure 5: probability density versus concentration for the background
distribution and the cleanup-unit mixture distribution.]

FIGURE 5. Density Distribution of Two Normal Distributions.
Background has mean μ and variance 1. Cleanup-unit is 75%
of the background distribution and 25% with shift of 4.
-9-

-------
Table 2. Example of Calculating the Quantile Test Statistic Q when
         m = n = 10 and b1 = 0.8 (from Johnson et al. 1987).

     i        ri       ri/(m+n+1)     J[ri/(m+n+1)]
     1         1         0.0476            0
     2         5         0.238             0
     3         7         0.333             0
     4        11.5       0.548             0
     5        14         0.666             0
     6        16         0.762             0
     7        17         0.810             1
     8        18         0.857             1
     9        19         0.905             1
    10        20         0.952             1
                               Quantile Statistic (Q) = 4

Using the hypergeometric distribution, Johnson et al. (1987) showed that for
this example H0: ε = 0 would be rejected at the 0.043 significance level.
We note that the Wilcoxon test can be considered a special case of the
quantile test when the mixing proportion, ε, goes to 1. In that case, the
cleanup-unit distribution is simply shifted to the right by the amount Δ.
Johnson et al. (1987) also develop the nonparametric "mixed normal" test
statistic for the mixture alternative for when Fb(x) and Fc(x) in Equation 2
are normal distributions with the same variance but the mean of Fc(x) is larger
than that of Fb(x). The quantile test has more potential for widespread use
than the mixed normal test because it does not require Fb(x) and Fc(x) to be
normal distributions.
Johnson et al. (1987) performed a small empirical power study of the
quantile test, the mixed normal test, the Wilcoxon test, and the normal scores
test when Fb(x) and Fc(x) are normal distributions with Fc(x) shifted from
Fb(x) by 1, 2, and 3 units, and with mixing proportions, ε, equal to 0.1,
0.2, and 0.3. The power was quite high for the mixed normal and, to a lesser
extent, the quantile test when m = n ≥ 20. For Δ = 2 or 3, the quantile and
mixed normal tests had more power than the Wilcoxon and normal scores tests.
When small numbers of samples are used (m = n = 5 or 10) or when the mixing
proportion is small, all the tests had low power to detect the mixture
alternatives considered.
Figure 6 shows the power of the quantile test compared to the slippage
test for m = n = 40. The distributions are the same as used for the slippage
test in Section 2.3 (Figures 4 and 5). The quantile test looks at the 10
extremes (b1 = 0.875) from the combined sample (N = 80) and rejects the null
hypothesis if 8 or more of the extremes are from the cleanup unit [α = 0.04].
The slippage test looks at the 5 extremes (b1 = 0.9375) from the combined
sample and rejects H0: ε = 0 if all 5 of the extremes are from the cleanup
unit (α = 0.027).
-10-

-------
[Figure 6: power versus ε (0.0 to 0.7) for Δ = 0.5, 1.5, and 4.0, with
separate curves for the slippage test and the quantile test.]

FIGURE 6. Comparison of Quantile and Slippage Tests for
m = n = 40, α ≈ 0.025. Background and
cleanup-unit both have normal distributions.
As seen from Figure 6, the quantile test has greater power than the
slippage test when Δ is relatively small. When Δ is one, the two
distributions are "more distinguishable" when one looks at a larger portion
of the tail (Figure 4). Thus, the quantile test, which looks at twice the
number of extremes, has a power slightly more than 0.2, whereas the slippage
test has a power <0.1 for Δ = 1 and ε = 0.25.
When Δ becomes large, the slippage test has greater power than the
quantile test. When Δ = 4, the two distributions are "more distinguishable"
when one looks at a smaller portion of the tail (more to the right) (Figure 5).
The advantage of the slippage test declines as ε becomes large, in which case
both tests have large power. When ε is small, the greater power of the
slippage test as compared to the quantile test is substantial. For example,
when Δ = 4 and ε = 0.1 the power of the slippage test is 0.5, while the power
of the quantile test is only slightly greater than 0.3.
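The Figure 6 comparison can be approximated by a small Monte Carlo
experiment. The Python sketch below assumes a standard normal background and
the decision rules stated above (slippage: all 5 of the 5 largest combined
values from the cleanup unit; quantile: 8 or more of the 10 largest); it is
meant only to reproduce the qualitative pattern, not the exact curves.

import random

def simulate_power(m, n, eps, delta, n_sims=2000, seed=1):
    """Monte Carlo power of the slippage and quantile rules used in Figure 6.

    Background ~ N(0,1); a proportion eps of cleanup-unit samples is shifted
    upward by delta (the mixture alternative of Equations (2) and (6))."""
    rng = random.Random(seed)
    reject_slip = reject_quant = 0
    for _ in range(n_sims):
        bg = [rng.gauss(0.0, 1.0) for _ in range(m)]
        cu = [rng.gauss(delta if rng.random() < eps else 0.0, 1.0)
              for _ in range(n)]
        labeled = sorted([(x, 1) for x in cu] + [(x, 0) for x in bg])
        top5 = sum(lab for _, lab in labeled[-5:])
        top10 = sum(lab for _, lab in labeled[-10:])
        reject_slip += top5 == 5      # slippage: all 5 extremes from cleanup
        reject_quant += top10 >= 8    # quantile: 8 or more of 10 extremes
    return reject_slip / n_sims, reject_quant / n_sims

if __name__ == "__main__":
    # Rough approximation of the Figure 6 setting, m = n = 40.
    for eps in (0.1, 0.25, 0.5):
        for delta in (1.0, 4.0):
            print(eps, delta, simulate_power(40, 40, eps, delta))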
2.5 TWO TESTS THAT USE AN ESTIMATED BACKGROUND STANDARD AS A CONSTANT
One possible approach to testing for attainment of background standards
is to (1) estimate a selected upper percentile of the background distribution
using m background measurements, (2) assume henceforth that the estimated
percentile is the established (known without error) background "standard",
and (3) conduct tests that are appropriate when the standard is a known
constant. One test that could be used in this case is compliance sampling,
as now discussed.
-11-

-------
2.5.1 Compliance Sampling
The compliance test (Schilling 1978) could be used when the background
standard is considered to be a known constant. One possible strategy is to:
•	Define the background standard to be the maximum of m background
measurements, where m is the minimum sample size required so that the
maximum of the m measurements is a specified percent (say 80%) confidence
limit on a specified upper percentile (say the 95th percentile) of the
background distribution. The background sample size m is read from
Table 5A in Conover (1980). From this point onward, consider the maximum
to be a constant.
•	Specify the maximum allowed fraction, P2, of the soil at the remediated
Superfund site that will be allowed to exceed the background standard
(maximum background measurement).
•	Specify the Type I and Type II Error rates.
•	Use Table 17-2 in Schilling (1982) to calculate the number of cleanup-
unit measurements, n, to collect.
•	If one or more of the n site measurements exceeds the maximum background
measurements, conclude that the unit has not attained the cleanup
standard.
This procedure is simple, but it is flawed: if the site and
background distributions are identical, the probability that one or more of
the cleanup-unit measurements will exceed the maximum background measurement
is greater than the specified α level of the compliance test. (For continuous,
identically distributed measurements this probability is n/(m + n), which is
large unless m is much greater than n.) For this reason, using
the maximum background measurement as if it were a constant is probably not a
good idea. As the number of background measurements becomes large relative
to the number of cleanup-unit measurements, the inflation of α decreases in
magnitude. The slippage test (Section 2.3) may be used instead of the
compliance test.
Rather than use the maximum background measurement as the standard, one
could use an estimated percentile, say the 95th, of the background
distribution. This percentile, p, can be estimated nonparametrically as the
p(m + 1)th order statistic of the m background measurements. However, the α
level of any test conducted assuming the estimated percentile is a constant
will be greater than the specified level.
2.5.2 Acceptance Sampling
Acceptance sampling (Burstein 1971, pp. 69-80) could also be used to
test for compliance when the background standard is a constant, but α will
be greater than specified if the standard is an estimated parameter of the
background distribution. However, the α values will be less affected than
for compliance sampling because one or more measurements in the cleanup unit
are allowed to exceed the background standard without rejecting the null
hypothesis. The quantile test in Section 2.4 appears to be preferable to
acceptance sampling.
-12

-------
3.0 MULTIPLE CLEANUP UNITS AND POLLUTION PARAMETERS
If there are multiple cleanup units and/or pollution parameters, then
multiple statistical tests will be conducted. If each of k independent
statistical tests is performed at the α significance level when all cleanup
units are in compliance with all standards, then the probability that all k
tests will indicate attainment of compliance is p = (1 - α)^k. If α = 0.05 and
k = 25, then p = 0.28, and if k = 100, then p = 0.0059. Hence, as k increases,
the probability approaches 1 that one or more tests will falsely indicate
non-attainment of the standard. This problem has led to the development of
multiple comparison (MC) tests (Hochberg and Tamhane 1987; Miller 1981). In
general, these tests reduce the α level of each individual test so that the
overall "experiment-wise" α level is maintained at a specified level.
However, this approach can severely reduce the power of each individual test.
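The arithmetic in the preceding paragraph, together with a Sidak-style
per-test adjustment offered only as one common way of holding the
experiment-wise level (not necessarily the procedure of the MC references
cited), is sketched below in Python.

def prob_all_tests_pass(alpha, k):
    """P(all k independent tests indicate attainment) when every cleanup
    unit truly complies and each test has false-positive rate alpha."""
    return (1.0 - alpha) ** k

def per_test_alpha(overall_alpha, k):
    """Sidak-style per-test level holding the experiment-wise false-positive
    rate at overall_alpha for k independent tests (one common adjustment)."""
    return 1.0 - (1.0 - overall_alpha) ** (1.0 / k)

if __name__ == "__main__":
    print(round(prob_all_tests_pass(0.05, 25), 2))    # 0.28, as in the text
    print(round(prob_all_tests_pass(0.05, 100), 4))   # 0.0059, as in the text
    print(per_test_alpha(0.05, 25))                   # about 0.002 per test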
When testing for compliance with background standards, MC procedures for
comparing several treatments (cleanup units) with a control (background) may
be considered. These tests include Dunnett's test (Dunnett 1955, 1964), which
is appropriate for normal distributions, and the nonparametric rank test by
Steel (1959). Another approach would be to use simultaneous prediction
intervals for normal distributions (Davis and McNichols 1987, 1988; Gibbons
1987; McNichols and Davis 1988) or the nonparametric procedure by Chou and
Owen (1986). It appears desirable to determine the power of tests by Steel
(1959) and Chou and Owen (1986) for realistic alternative distributions of
residual contamination.
The authors are not convinced that MC procedures should or can be used
when testing for compliance with background standards. The extreme lowering
of power when many tests are made is a serious problem. Also, practical
limitations in field remedial-action activities may preclude being able to
delay testing until several cleanup units or pollution parameters can be tested
simultaneously. Perhaps a better approach is to conduct each test at the
usual nominal a level so that power is maintained. Additional samples could
be collected in those cleanup units for which the test indicated non-
attainment of the background standard.
4.0 RESEARCH ISSUES AND NEEDS
Some suggested research issues and needs are as follows:
1.	Determine the types of distributions, Fs(x), that are likely to occur in
practice for various types of remedial actions. This should be done
because the power of the Wilcoxon, slippage, and quantile tests discussed
here was shown to depend on whether Fs(x) is skewed and if so, on the
amount of skewness. The power of these tests should be evaluated for
the identified distributions. In the absence of information on realistic
distributions, Fs(x), all three tests could be used.
2.	If multiple comparison tests have a role to play, then efforts should be
made to develop slippage and quantile tests and robust procedures (e.g.,
McKean et al. 1989) and to evaluate their power.
-13-

-------
3.	When testing for compliance against background-based standards, the
selection of appropriate background sites is of great importance. There
is a role for statistics in the background-selection
process, and it should be defined as carefully as possible. Indeed,
sampling background to assess spatial variability and average
concentration levels seems important if a scientifically defensible
selection of a background area is needed.
4.	Tests for attainment of risk-based standards usually assume the standard
is a fixed constant. In reality, the risk-based standard has a great
deal of uncertainty. What tests are appropriate for the risk-based-
standards case when this uncertainty can be quantified and used?
"Uncertainty analysis" (Finkel 1990; IAEA 1989) is being used to evaluate
the uncertainty of environmental transport and dose model predictions.
These methods should be used to approximate the uncertainty in risk-based
standards. This uncertainty should be taken into account when performing
statistical tests.
5.	The power results reported here assume samples are collected using simple
random sampling. Additional power studies are needed for when various
patterns of "hot spot" contamination are present. Power should be
evaluated for both random sampling and systematic sampling on a
triangular or square grid.
6.	There is a need to develop and communicate a unified approach for
deciding when to use geostatistical methods, classical testing methods
as discussed here, or both simultaneously. The detection of hot spots
and the estimation of spatial patterns of residual contamination are
different ways of evaluating compliance with background standards.
5.0 CONCLUSIONS
It is clear that the optimal choice of a statistical test to compare
cleanup-units with background requires information about Fs(x). If the
remedial action has "uniformly" reduced contamination levels, but not to
background levels, the Wilcoxon test should be used because it has greater
power than the slippage or quantile tests for the shift alternative. However,
if most of the cleanup unit has been remediated to background levels and only
a few "hot spots" remain, the slippage and quantile tests are preferred over
the Wilcoxon test because of their greater power. A number of research issues
and needs have been suggested in Section 4.
6.0 ACKNOWLEDGEMENTS
The authors wish to thank Rick Bates, Pacific Northwest Laboratory, for
his review and insightful comments on practical aspects of statistical
procedures during the draft stages of this document.
-14-

-------
7.0 REFERENCES
Burstein, H. 1971. Attribute Sampling: Tables and Explanations. McGraw-Hill,
New York.
Chou, Y. and D. B. Owen. 1986. "One-Sided Distribution-Free Simultaneous
Prediction Limits for p Future Samples." Journal of Quality Technology
18:96-98.
Conover, W. J. 1980. Practical Nonparametric Statistics. 2nd ed. Wiley,
New York.
Davis, C. B. and R. J. McNichols. 1988. "Discussion of 'Statistical
Prediction Intervals for the Evaluation of Ground-Water Quality,' by R. D.
Gibbons, Ground Water 25:455-465." Ground Water 26:90-91.
Davis, C. B. and R. J. McNichols. 1987. "One-Sided Intervals for at Least p
of m Observations From a Normal Population on Each of r Future Occasions."
Technometrics 29:359-370.
Dunnett, C. W. 1955. "A Multiple Comparison Procedure for Comparing Several
Treatments with a Control." Journal of the American Statistical Association
50:1096-1121.
Dunnett, C. W. 1964. "New Tables for Multiple Comparisons with a Control."
Biometrics 20:482-491.
Finkel, A. M. 1990. Confronting Uncertainty in Risk Management. A Guide for
Decision-Makers. Center for Risk Management, Resources for the Future, 1616
P Street, N.W., Washington, DC.
Gibbons, R. D. 1987. "Statistical Prediction Intervals for the Evaluation
of Ground-Water Quality," Ground Water 25:455-465.
Hochberg, Y. and A. C. Tamhane. 1987. Multiple Comparison Procedures. Wiley,
New York.
IAEA. 1989. Evaluating the Reliability of Predictions Made Using
Environmental Transfer Models. Safety Series No. 100, International Atomic
Energy Agency, Vienna.
Johnson, R. A., S. Verrill, and D.H. Moore II. 1987. "Two-Sample Rank Tests
for Detecting Changes That Occur in a Small Proportion of the Treated
Population." Biometrics 43:641-655.
McKean, J. W., T. J. Vidmar, and G. L. Sievers. 1989. "A Robust Two-Stage
Multiple Comparison Procedure with Application to a Random Drug Screen."
Biometrics 45:1281-1297.
McNichols, R. J. and C. B. Davis. 1988. "Statistical Issues and Problems in
Ground Water Detection Monitoring at Hazardous Waste Facilities." Ground Water
Monitoring Review 8:135-150.
-15-

-------
Miller, R. G. 1981. Simultaneous Statistical Inference. 2nd ed.
Springer-Verlag, New York.
Noether, G. E. 1987. "Sample Size Determination for Some Common
Nonparametric Tests." Journal of the American Statistical Association
82:645-647.
Rosenbaum, S. 1954. "Tables for a Nonparametric Test of Location." Annals
of Mathematical Statistics 25:146-150.
Schilling, E. G. 1978. "A Lot Sensitive Sampling Plan for Compliance Testing
and Acceptance Inspection." Journal of Quality Technology 10:47-51.
Schilling, E. G. 1982. Acceptance Sampling in Quality Control. Marcel Dekker,
New York.
Steel, R.G.D. 1959. "A Multiple Comparison Rank Sum Test; Treatments Versus
Control." Biometrics 15:560-572.
-16-

-------
Session 1: Environmental Monitoring and Statistical Sampling
DISCUSSION
Wayne L. Myers
Codirector, Office for Remote Sensing of Earth Resources
Penn State Univ.
Univ. Park, PA 16802
Dr. Gilbert and colleagues have provided orthodox counsel on treatment
of one particular Superfund statistical stress syndrome — namely, that of
sampling in cases of background-based criteria for cleanup. Relative to
the scope of the session title, this falls considerably short of
comprehensive coverage. Since it is understating the case to say that the
session title is expansive relative to the allotted time, they should not
be chastised too severely for lack of thoroughness. Nevertheless, the
presentation spans a relatively small subspace of the expanding universe of
potentially relevant information technology. My concern is that this may
be symptomatic of an apparent traditionalism in the practice of statistical
medicine in the context of Superfund. This raises questions regarding what
constitutes appropriate statistical technology for Superfund.
If the presentational material is taken together with commentary and
questions as a group consultation, the agency and contractor
representatives appear to be seeking home remedies and over-the-counter
statistical medication. The statisticians seem to be practicing in terms
of routine prescriptions for generic drugs without (figurative) diagnostic
laboratory tests. This scenario is taking place against a backdrop of high
epidemiological risk to the public at large in the age of advanced
technology diagnostic expert systems like MYCIN. While there may not be a
basis for malpractice suits, negligence could be considered.
The implications of minimal statistical medicine in the context of
Superfund deserve examination. There is little doubt that it simplifies
life for an agency under stress of heavy caseload, but its primary service
is to corporate interests that may be asked to bear financial burden for
remediation and liability for social consequences of contamination. This
arises from underuse of existing information that is suggestive of likely
spatial distribution for the contaminants. Lack of definition for spatial
distribution directs litigation toward average contaminant loads rather
than maximum contaminant loads. The interests of the public would clearly
be best served by sampling most intensively for contaminants in places
where they are likely to be concentrated. Site-specific physiography,
hydrology, and infrastructure are indicative of logical areas of
contaminant accumulation. The magnitude of public investment in Superfund
is obvious. The relative cost of using collateral information to model
dispersal is small in terms of Superfund budget, although not
inconsequential relative to current expenditure on data acquisition.
The modeling approach is considerably more complex, however, and would
extend the time required for remediation.
-17-

-------
The presenters have correctly identified spatial issues as being in need
of additional attention, but have not explored the dimensions of that need
in sufficient depth. It is not just a question of fine-tuning conventional
variogram models in geostatistics. Physically based models of contaminant
injection and transport need to be formulated and geostatistical estimation
approaches superimposed on the models in a manner that takes full account
of site conditions and history. Geographic information systems (GIS)
provide a suitable platform for hosting such models, but the models
themselves are yet to be configured appropriately for the purpose. Expert
systems technology has evolved to the point where it can be engineered to
manage the complexity involved. Work also needs to be done on factoring
the contaminant settings into type groups that are substantively similar
with respect to most efficient assessment scenarios. Such a typology could
serve an important role as a launching pad for initiatives in formal
knowledge base engineering.
It is likewise possible to use knowledge-based systems and symbolic
logic to model the behavior of statistical experts in selecting and
parameterizing distributions, determining sample sizes, selecting tests,
optimizing power of tests, etc. Appropriately configured expert systems
have the potential for much more rapid response in these respects than
human statistical counterparts. Furthermore, they can be replicated almost
instantly and updated endlessly. Superfund provides the ideal forum for
moving now to incorporate modern information technologies into
environmental assessment and monitoring.
-18-

-------
COMMENTS BY PARTICIPANTS
On Gilbert and Simpson
Mindi Saoparsky (U. S. Environmental Protection Agency): How can compliance
or acceptance sampling be incorporated into a ROD? How do we get sites already
in design to use some of these sampling and analysis procedures during the
five-year evaluation stage? What types of documents should be required for
post-design cleanup evaluation?
19

-------
COMPOSITE SAMPLING USING SPATIAL AUTOCORRELATION FOR PALMERTON
HAZARDOUS WASTE SITE: A PRELIMINARY REPORT
M. T. Boswell and G. P. Patil
Center for Statistical Ecology and Environmental Statistics
Department of Statistics
The Pennsylvania State University
University Park, PA 16802
1. INTRODUCTION
A composite sample (also known as a group sample) is formed by taking
a number of individual samples and physically mixing them. Composite
samples are used for different purposes. The goal is to obtain the desired
information contained in the original samples but at reduced cost or
effort. Applications include estimating the mean of a stochastic process,
identifying the individuals with a certain trait, and estimating the
fraction of a population that possesses a given trait.
In some applications, it is hoped that the mean of the process can be
estimated by a single measurement taken on the composite sample, thereby
saving the cost of taking many measurements. Considerable savings in the
overall cost may be realized. What is lost is the ability to estimate the
variability in the measurement process. However, by processing several
composite samples, the variance may be estimated, still at considerable
savings. In applications to Superfund sites, usually every sample with
pollution present or exceeding some action level (trait) is to be
identi fied.
When the total volume of the entire material contained in the samples
is used to form composite samples and when the entire volume of each
composite sample is analyzed, there may not be any cost savings. Instead
of analyzing the entire composite, subsamples or aliquots are drawn and
processed separately. This allows the measurement errors to be taken into
account. Further, several composite samples may be formed from subsamples
or "increments" drawn from the original samples. This allows the variance
of the original observations to be estimated.
Section 2 gives an overview of composite-sample techniques with
application to hazardous waste sites. These techniques assume composite
samples are formed with constant (equal) amounts of material. Much of the
material presented in Section 2 is discussed in Garner, Stapanian and
Williams (1986).
-20-

-------
Composite samples using random amounts of material occur when the
sampling involves a random process, such as towing a net through water.
The amount of water filtered depends on many random conditions, such as
wind speed, wave height and water currents. The proportions of each sample
used to make the composite sample are either fixed (known) or random. (See
Boswell, 1977, 1978a, 1978b; Boswell and Patil, 1987; Elder, 1977; Elder,
Thompson, and Myers, 1980; Rohde, 1976, 1979). Further, t analyses or
tests are done on each subsample. This generalization takes into account
the variability in dividing the original samples into increments before
compositing, variability due to non-perfect mixing of the composite samples
and the selection of subsamples, and, finally, the variability of the test
procedure itself. This material is not covered in this paper.
2. AN OVERVIEW OF COMPOSITE SAMPLING TECHNIQUES
2.1 Composite Samples for the Detection of a Certain Trait
Laboratory procedures (such as blood tests for a disease) for the
detection of a certain trait have used compositing of samples to reduce
costs (see, for example, Feller 1968, p. 239; Garner, Stapanian and Williams
1986). If the problem is to identify every individual with a rare trait,
then compositing several individual samples and retesting the individual
samples only when the composite sample tests are positive, has the
potential of greatly reducing the number of tests required. Feller gives
references to several generalizations to two stage sampling. Also, if the
individual samples are highly correlated, as may be the case in Superfund
sites, the traits need not be quite so rare.
Let I_1, I_2, ..., I_n be the results of n tests, each resulting in a 0 (trait
not present) or a 1 (trait present). Suppose the I_j are independent and
identically distributed with probability P[I_j = 1] = p, which is small (rare
trait). A composite sample formed from these individual samples will
result in I = 1 if the trait is present in any of the individual samples
and will result in I = 0 otherwise. For the composite sample,
P[I = 1] = 1 - P[I_1 = 0, I_2 = 0, ..., I_n = 0] = 1 - (1-p)^n. Suppose, for example,
n = 12 samples are to be combined into a composite sample. If the composite
sample indicates the trait, then it remains to identify which of the twelve
individual samples have the trait. This can be done in several ways. One way
is to test all 12 samples. The number of tests, N, would then be either 1 or
13. The average number of tests is E[N] = 13[1 - (1-p)^12] + 1[(1-p)^12]. If
p = 1/2000, then (1-p)^12 = .994, and the relative cost (RC) is
RC = E[N]/12 = [13(.006) + 1(.994)]/12 = .089
measurements per individual sample. On the average, 1.07 tests must be done
-21-

-------
for 12 samples to determine which samples have the trait. Modifications for a
binary search or other retesting schemes might further reduce the number of
tests required.
In general, for composite samples of size k, the relative cost for a
procedure that tests every sample in a composite sample with the trait is
RC = [(1-p)^k + (k+1)(1 - (1-p)^k)]/k = 1 + 1/k - (1-p)^k.
For p less than about .29, the graph of RC first decreases to a
global minimum, then increases to a local maximum, and then decreases to an
asymptote greater than the minimum. A closed form expression for the
minimum does not exist; however, a good approximation is given in Samuels
(1978). The solution, tabled below, was found numerically. For larger
values of p the graph steadily decreases; thus, composite sampling is not
advantageous for large p. In Superfund sites where pollution exceeds the
action level 30% or more of the time, we see from the table that composite
sampling will not be useful unless, perhaps, strong spatial autocorrelation
exists. After remedial action has reduced the pollution loads, composite
sampling may be very cost effective.
 k      p     RC(%)     k      p     RC(%)     k      p     RC(%)
 1   0.2929    100
 2   0.2928    100      17   0.0037    12      32   0.0010     6
 3   0.1865     79      18   0.0033    11      33   0.0009     6
 4   0.0855     55      19   0.0029    11      34   0.0009     6
 5   0.0505     43      20   0.0026    10      35   0.0008     6
 6   0.0335     35      21   0.0024    10      36   0.0008     6
 7   0.0239     30      22   0.0022     9      37   0.0007     5
 8   0.0179     26      23   0.0020     9      38   0.0007     5
 9   0.0139     23      24   0.0018     8      39   0.0007     5
10   0.0111     21      25   0.0017     8      40   0.0006     5
11   0.0091     19      26   0.0015     8      50   0.0004     4
12   0.0076     17      27   0.0014     7      60   0.0003     3
13   0.0064     16      28   0.0013     7      70   0.0002     3
14   0.0055     15      29   0.0012     7      80   0.0001     2
15   0.0048     14      30   0.0011     7      90   0.0001     2
16   0.0042     13      31   0.0011     7     100   0.0001     2
The values of k are optimal for values of p given in the table.
Also, for a value of p between two tabled values, the smaller of the two
values of k is the optimal value.
For example, if we believed that the pollution load in a region is
such that the fraction p of sites that exceed the action level is about
0.01, then the optimal composite sample size is 11, giving an RC of 0.21,
implying that only about 21 percent of the tests would be required compared
to testing all of the individual samples. Note that if a composite sample
-22-

-------
size of 5 were used, the RC would be smaller than the tabled value of .43
because p is smaller than .0505. The error made by compositing a
non-optimal number of samples is usually small. If p = .01 and a
composite sample size of 5 is used, then RC = 1 + 1/5 - (.99)^5 = .25,
and if a composite sample size of 20 were used, then RC = 1 + 1/20 -
(.99)^20 = .23. The relative cost increases from 0.21 to 0.25.
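These calculations are easy to reproduce. The following sketch (a minimal Python
illustration using only the relative-cost formula above; the function names and the
search range are ours, not the authors') computes RC(k, p) = 1 + 1/k - (1-p)^k and
locates the composite size that minimizes it for a given incidence p.

    def relative_cost(k, p):
        # Expected measurements per individual sample: one composite test per k
        # samples, plus k retests whenever the composite shows the trait.
        return 1.0 + 1.0 / k - (1.0 - p) ** k

    def optimal_k(p, k_max=100):
        # Numerically locate the composite size that minimizes RC for incidence p.
        return min(range(2, k_max + 1), key=lambda k: relative_cost(k, p))

    if __name__ == "__main__":
        p = 0.01
        for k in (5, 11, 20):
            print(f"k = {k:2d}   RC = {relative_cost(k, p):.3f}")
        print("optimal k for p = 0.01:", optimal_k(p))   # prints 11

Running the sketch reproduces the worked example above: RC is about .25 at k = 5,
about .23 at k = 20, and is minimized near k = 11.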
This procedure can be generalized to test for r traits simultaneously.
In order to calculate the cost per individual, it is necessary to know the
probability of 1 or more traits = 1 - ∏(i=1 to r) P[i-th trait is absent].
2.2 Estimation of Incidence of a Trait with Composite Sampling
Any procedure to estimate the prevalence can be used to design
retesting strategies as described in Section 5 below.
Let p be the unknown incidence of the trait. A composite sample of
k individual samples will exhibit the trait with probability 1 - (1-p)^k.
Test a (large) number, n, of composite samples, and let x be the total number
and p̂_n be the fraction of these that exhibit the trait. Then
p̂_n = x/n, where x ~ binomial(n, 1 - (1-p)^k),
and therefore, an estimator of p is
p̂ = 1 - (1 - p̂_n)^(1/k).
The estimator p̂ has a positive bias as an estimator of p,
which can be shown using properties of the binomial distribution and
Jensen's inequality for the expectation of a convex function of a random
variable. In a simulation study to determine the bias and standard deviation
of this estimator, p̂ appears to overestimate p by less than .07p. The
bias, as a percentage of p, appears to decrease as p decreases and as k
and n increase.
Burrows (1987) proposes an alternative estimator
p̃ = 1 - [a(1 - p̂_n) + (1 - a)]^(1/k), where a = 2kn/(2kn + k - 1).
He shows that p̃ has a smaller bias and a smaller standard error than p̂.
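A quick simulation makes the bias of the simple estimator concrete. The sketch below
(a minimal Monte Carlo check, assuming independent individual samples; the incidence,
composite size, number of composites, and seed are arbitrary choices for illustration)
generates composite test results and reports the average bias of p̂ = 1 - (1 - p̂_n)^(1/k).

    import numpy as np

    rng = np.random.default_rng(0)  # arbitrary seed for the sketch

    def estimate_p(x, n, k):
        # Simple estimator: p-hat = 1 - (1 - x/n)**(1/k), where x of the n
        # composites (each formed from k individual samples) exhibit the trait.
        return 1.0 - (1.0 - x / n) ** (1.0 / k)

    def average_bias(p, k, n, reps=20_000):
        # Monte Carlo check of the positive bias discussed above.
        theta = 1.0 - (1.0 - p) ** k            # P[a composite shows the trait]
        x = rng.binomial(n, theta, size=reps)   # positive composites per replicate
        return estimate_p(x, n, k).mean() - p

    if __name__ == "__main__":
        print(f"average bias of p-hat: {average_bias(p=0.01, k=10, n=50):+.5f}")

The Burrows-type adjustment is not included in the sketch; it could be added by
shrinking p̂_n toward zero as in the expression above.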
Note, when testing for a trait is likely to cause embarrassment to an
-23-

-------
individual, confidentiality can be assured by compositing samples from
several individuals and only testing the composite. If the composite has
the trait, then the individual(s) who have the trait are still unknown.
2.3 Composite Sampling to Test for Compliance
When it has been mandated that a polluted site be cleaned up, then it
is desirable to identify cases not meeting the standard. The measurement
of pollution is a continuous variable which must satisfy a certain
criterion. All cases that do not satisfy this criterion are to be
identified. If the compliance rate is high, as should be the case in
Superfund sites after remedial action has been completed, considerable
savings can be realized by the use of composite sampling techniques.
Let c be a criterion value that should not be exceeded. For
example, c might be the maximum allowable pollutant concentration. If the
pollution level exceeds c, then remedial action would be required.
Assume that the analytic procedure is exact (zero variation), and the
measure of the composite sample formed from equal sized increments is the
average of measures of the individual samples. In this case, if any one
sample exceeds c, then a composite sample of k individuals will exceed
c/k. Of course, it is possible that none of the samples will exceed c and
the composite sample concentration will still exceed c/k. If the composite
sample concentration exceeds c/k, then it is necessary to test each
individual sample. On the other hand, if the composite sample
concentration does not exceed c/k, then none of the individual samples need
to be tested.
Let F(x) be the distribution function of the individual samples; then
the k-fold convolution of F, F^(k)(x) = F(x) * F(x) * ... * F(x), is the
distribution function of the sum. The probability that a composite sample
of k individual samples exceeds c/k is 1 - F^(k)(c). The relative cost
per individual (RC) is
RC = 1 + 1/k - F^(k)(c).
As k increases, F^(k)(c) decreases to a limiting value of zero. The
optimal composite size depends on the distribution of the sum.
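When F is not known in closed form, the relative cost RC = 1 + 1/k - F^(k)(c) can be
approximated by simulation. The sketch below is a rough illustration only: the lognormal
concentration distribution, the criterion value c, and the composite sizes are assumptions
chosen for the example, not values taken from the paper.

    import numpy as np

    rng = np.random.default_rng(1)

    def compliance_rc(k, c, sampler, reps=100_000):
        # RC per individual: 1/k for the composite measurement, plus one retest per
        # individual whenever the composite average exceeds c/k (equivalently, the
        # sum of the k constituent values exceeds c).
        sums = sampler((reps, k)).sum(axis=1)
        return 1.0 / k + np.mean(sums > c)      # estimates 1/k + [1 - F^(k)(c)]

    if __name__ == "__main__":
        # Hypothetical post-cleanup concentrations (illustrative assumption only).
        sampler = lambda size: rng.lognormal(mean=0.0, sigma=1.0, size=size)
        for k in (2, 5, 10, 20):
            print(f"k = {k:2d}   estimated RC = {compliance_rc(k, 10.0, sampler):.3f}")

As the sketch shows, increasing k lowers the per-composite cost 1/k but raises the
probability that the composite exceeds c/k, so the optimal k balances the two.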
2.4 Compositing to Reduce Variance in the Presence of Inexact Analytical
Procedures
In the above discussions, the only sources of variability were from
the population and from sampling. The compositing will have no effect on
the standard error of the measurement process.
-24-

-------
If the outcome of the test applied to a sample has an error with mean
zero and variance σ_e^2, and if the population has a variance of σ^2, then the
variance of X̄, the average of n samples, is (σ^2 + σ_e^2)/n.
-------
g_i = g, i = 1, ..., n. Then w_i = 1/n and Z = X̄, and the estimator of the mean
based on the composite sample is exactly the same as that based on the n
samples.
The procedure described above assumes the fractions w_1, w_2, ..., w_n to
be fixed and known. Examples of this kind of compositing include:
1.	Stratified random samples are of this form; a measurement of the
composite sample formed by combining the individual samples
gives the same result.
2.	Compositing of soil samples to obtain the overall or
average fertility of the soil. Other bulk samples
include sampling of fertilizers, coffee beans and
concrete mixes.
3.	Compositing of samples taken from waste sites to obtain
the overall concentration of hazardous material.
4.	Compositing of filtrate (assuming a known amount of
water is filtered) to estimate the abundance of
various plankton species.
2.6 An Example for Power Plant Effect on Environment
Consider a situation presented by Rohde (1979) where a small scale
study was made to see if compositing would be useful in studying the effect
of power plant construction on the environment. For the Crane Power Plant
near Baltimore City, data was collected on the density of plankton before
entrainment. A transect near the intake to the plant was selected starting
in deep water and ending in shallow water near shore. Samples were taken
at six locations along the transect. The first two, in deep water, were
sampled near the top and near the bottom. The density of plankton seemed
to be smaller near the shore and near the bottom. All locations were
sampled with equal effort by pumping water for 10 minutes at a rate of 37.8
gal/minute into a collecting net. The filtrate was washed into a 250 ml
bottle from which 1 ml subsamples were selected. Actual measurements of
the samples and of composite samples are given in the table below. The
second column of the table gives the individual measurements; the last two
columns give the average and the measured density of the corresponding
composite samples. It is clear that about the same
results were obtained from the composite samples as from the average of the
original samples. The average of the eight observations was 203.2
organisms per liter while the density in the composite sample was 209.3
organisms per liter.
-26-

-------
Location          Density               Density of      Density of
                  (organisms/liter)     average         composite
1 top             269.3
1 bottom          252.3                 260.8           253.3
2 top             195.0
2 bottom          139.3                 167.2           170.3
3                 282.3
4                 243.3                 262.8           286.0
5                 126.0
6                 118.0                 122.0           131.3
                                        211.8           219.0
All eight samples                       203.2           209.3
2.7 Heavy Metal Pollution
A study of heavy metal pollution of aquatic environments was reported
by Hueck (1976). In this study, mussels from relatively unpolluted water
were transplanted to various locations under study. After a period of
time, tissue from the mussels was analyzed for heavy metal content. Since
there was a great local variation within a region, pooling of results from
adjacent localities seemed reasonable. Composite samples were formed by
homogenizing the tissue of several specimens. The amount of heavy metal in
the sample was used as an indicator of the pollution of the region.
3. A STUDY OF THE EFFECT OF SPATIAL SCALE ON VARIOUS RETESTING STRATEGIES
3.1 Introduction
Sampling to determine the extent of pollution traditionally involves
taking measurements at every sampling location, often on a grid. Composite
sampling is an alternative approach that forms composite samples from a
number of individual samples, tests the composite sample, and retests
aliquots of the individual samples when the test on the composite sample
indicates that one or more of its constituent samples may be polluted.
Used in this way, composite sampling is most efficient when the overall
contaminant levels are relatively low or when the contamination is
spatially clumped, for otherwise excessive retesting of the constituent
samples will be necessary.
The Center for Statistical Ecology and Environmental Statistics carried
out a simulation study (i) to compare the cost (number of measurements) of
various retesting schemes and (ii) to study how spatial pattern in the data
affects the overall performance of composite sampling in the hot-spot
identification (action level) case.
In the presence/absence case, measurements indicate the presence or
27

-------
absence of contamination, but the contaminant levels are not available or
are not important. The method is to form and test composite samples, and
then to retest aliquots of the individual samples comprising any composite
that tests positive for pollution. A variety of strategies have been
proposed for carrying out the retesting. These include: the classical
Dorfman (1943) scheme, the Sterrett (1957) scheme, the Gill and Gottlieb
(1974) scheme, and a scheme based on entropy (Hwang, 1984).
A hot spot can be defined to consist of contiguous locations at which
the contaminant concentration exceeds a certain level c. This value may be
an action level that would require some remedial action. Instead of
analyzing every sample at every location, compositing combined with a
suitable retesting strategy can be used to determine the particular
locations where pollution exceeds the level c. The method is to form and
analyze composite samples, and then to reanalyze aliquots of the individual
samples of any composite that tests greater than c/k, where k is the number
of samples in the composite. If a composite tests less than c/k then
(barring measurement error) one is assured that every component sample is
below the action level and no retesting is needed. In the contrary case,
some retesting is required because one or more of the component samples
might, though not necessarily, exceed the action level. When required, the
retesting can be carried out according to various strategies as described
below.
It is not possible to physically form composite samples using existing data.
Instead, a conceptual composite sample can be formed, and a value
calculated by averaging. Thus simulations can be carried out on realistic
data. The algorithms necessary to carry out these simulations are of two
types, composite sample forming and retesting. The computer programs to
implement these algorithms are given in Appendix A of Bolgiano, et al.
(1989).
3.2 Retesting Strategies
The four retesting strategies mentioned above are designed for use in
the presence/absence case. Briefly, these strategies are:
1.	The Dorfman scheme exhaustively tests every individual sample
from composites that test positive for pollution.
2.	The Sterrett scheme tests individual samples sequentially until
a sample tests positive for pollution and then forms a composite
sample from the remaining individual samples. The process is
repeated as necessary.
3.	The Gill and Gottlieb scheme divides the individual samples into
two groups, as nearly equal in size as possible, and forms and
tests a composite sample for each group. The process is repeated
as necessary.
-28-

-------
4. The entropy scheme starts with a pool of untested samples from
which composite samples are sequentially formed and tested. When
one of these "original" composite samples tests positive, then a
"secondary" composite sample is formed using one half (or as nearly
as possible) of the individual samples from the "original" sample.
If the "secondary" composite sample tests positive, then the
remaining individual samples from the "original" composite sample
are returned to the pool of untested samples. On the other hand, if
the "secondary" composite sample tests negative, then a composite
sample formed from the remaining individual samples would have to
test positive. These individual samples are treated as belonging to
an original (but smaller) composite sample that tested positive, and
the process continues.
The above schemes were modified for use in the action level case. The
modification of the Dorfman scheme was straightforward. Samples are
tested sequentially until the sum exceeds c (average exceeds c/k); then the
remaining are composited.
The Sterrett scheme was similarly modified so that the individual
samples are tested until the sum of the concentrations exceeds c. The
remaining samples are then composited. This scheme has been further
modified: the remaining pollution in the unretested samples can be
calculated from the retested samples, so the unretested samples need not be
composited. The modified scheme tests samples sequentially until the
remaining pollution is less than c. This modified scheme can also be
considered a modification of the Dorfman scheme.
The modification of the Gill and Gottlieb scheme for the action level case
was straightforward. A composite sample that exceeds the action level c/k
for the composite is split as nearly as possible into two composite samples,
which are tested, and so on.
The entropy scheme proceeds as described in (4) above except that
the composite sample must exceed c/k instead of testing positive for
pollution.
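For concreteness, here is a minimal sketch of the simplest retesting rule described in
Section 3.1 (a Dorfman-type rule: re-measure every constituent of any composite whose
average exceeds c/k). It is not the authors' program (their code appears in Appendix A
of Bolgiano et al. 1989); the data values, action level, grouping of consecutive samples
into composites, and the assumption of negligible measurement error are all illustrative.

    import numpy as np

    def dorfman_action_level(values, k, c):
        # Count measurements: one per composite of k consecutive samples, plus a
        # retest of every constituent whenever the composite average exceeds the
        # action level for the composite (c divided by the number of constituents).
        values = np.asarray(values, dtype=float)
        n_tests = 0
        for start in range(0, len(values), k):
            group = values[start:start + k]
            n_tests += 1                          # measure the composite
            if group.mean() > c / len(group):     # composite exceeds its action level
                n_tests += len(group)             # retest every constituent sample
        return n_tests

    if __name__ == "__main__":
        rng = np.random.default_rng(2)
        data = rng.lognormal(mean=0.0, sigma=0.7, size=200)  # hypothetical site data
        print("exhaustive testing:", len(data), "measurements")
        print("composites of 8:   ", dorfman_action_level(data, k=8, c=8.0), "measurements")

The more elaborate schemes above differ only in how the constituent samples of a
flagged composite are retested; the bookkeeping of measurement counts is the same.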
3.3 Composite Sample Forming Schemes
In order to assess the effect of spatial pattern on the comparative costs
of the retesting strategies, various methods of forming the composite samples
have been examined. These were chosen to determine if information about the
spatial structure can be used to reduce the amount of retesting required in a
composite sampling program.
1.	Random Order. Selecting and compositing observations at random
ignores all spatial structure. Different runs of this algorithm give
different composite samples and different results.
2.	Natural Order. The data were perhaps collected or numbered in
-29-

-------
some systematic manner that reflects spatial structure. Composite
samples are formed using the data in the order received.
3.	Circular Sectors I. Composite samples (CS's) are formed based on
distance from the center of the data. First, a value is fixed for the number,
k, of individual samples in each composite. If k does not evenly divide
the total number of available samples, then the remainder ("left-over"
samples) are formed into a single circular CS located at the center of the
data. Next a circular ring of CS's is formed using just enough CS's so that
the remaining number of CS's can be divided by 4. Finally, circular rings are
formed with four CS's each.
4.	Circular Sectors II. CS's are formed in sectors as above, except
that the "left-over" samples are along the boundary of the region. First a
single circular CS is formed. Then, rings of four CS's are formed. The
"left-over" CS's are in the next ring. Finally the "left-over" samples are
in the last ring.
5.	Circular Sectors III. CS's are formed in sectors as above. The
"left-over" samples are in the central circle. One CS is in the next ring.
Then rings of four composite samples are formed and finally the "left-over"
CS's are in the last ring.
6.	Vertical Strips. A square grid is superimposed upon the study
region so that on the average ten cells would be necessary to form a single
CS. This partitions the space into vertical strips, and CS's are formed
by proceeding up the first strip, down the second strip, and so forth until
all CS's are formed. The "left-over" samples, if any, comprise the last CS.
7.	Horizontal Strips. This is the same as Algorithm 6 above, but
with the roles of horizontal and vertical interchanged.
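To suggest how such forming schemes might be coded, the sketch below implements two
of the schemes listed above in simplified form: random order (scheme 1) and a
vertical-strip ordering loosely patterned on scheme 6 (approximated here by sorting
locations on their x coordinate, then y, rather than by superimposing an explicit
grid). The coordinates and composite size are illustrative assumptions only.

    import numpy as np

    def composites_random_order(n_locations, k, rng):
        # Scheme 1: ignore spatial structure; group locations at random.
        order = rng.permutation(n_locations)
        return [order[i:i + k] for i in range(0, n_locations, k)]

    def composites_vertical_strips(coords, k):
        # Simplified reading of scheme 6: order locations column by column
        # (sort on x, then y) and group consecutive locations into composites.
        order = np.lexsort((coords[:, 1], coords[:, 0]))
        return [order[i:i + k] for i in range(0, len(order), k)]

    if __name__ == "__main__":
        rng = np.random.default_rng(3)
        coords = rng.uniform(0, 100, size=(60, 2))   # hypothetical sampling locations
        print(composites_random_order(60, k=6, rng=rng)[0])
        print(composites_vertical_strips(coords, k=6)[0])

Either list of index groups can then be fed to a retesting routine such as the
Dorfman-type sketch given after Section 3.2.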
3.4 The Output and Discussion
Each of the above composite sampling schemes has been combined
with the retesting schemes, and the routines were run on 4 data sets
using several composite sample sizes. The results also depend on the action
level. Tables of output are given in Appendix B of Bolgiano et al. (1989).
After looking at the output, the graphs given in Section 6 were produced,
using a simulation based on the Dallas lead study, see Isaaks (1984) and
Flatman (1984). Different action levels can be thought of as different
pollution levels. Low action levels correspond to high pollution levels and
high action levels to low pollution. That is, if a data set had the same
spatial pattern with double the pollution, the result would be the same as
using the current data with the action level divided by two. In these graphs,
the horizontal line corresponds to the cost of testing every sample. If a
composite sample scheme falls above this line the cost (in measurements) is
larger than the cost of testing all individual samples. The results of using
the following retesting schemes are compared on each graph.
-30-

-------
1.	The Dorfman scheme
2.	The Sterrett scheme (modified - either Dorfman or Sterrett)
3.	The Gill and Gottlieb scheme
4.	The entropy scheme
The graphs show how the results vary with the pollution level.
Different figures give the results for different composite sample forming
routines. The composite sample forming was designed to study the result of
using spatially correlated data. The first two figures are for random
ordering of the data and correspond to data with no spatial structure.
Figure 3 gives costs using the natural order of the data which may
incorporate some spatial structure. Figures 4, 5, and 6 show various ways
of forming composite samples starting in the center of the data (not
necessarily the center of pollution) and using segments of circular bands
to form the composite samples. Figures 7 and 8 use horizontal and vertical
strips to form composite samples. See Bolgiano, et al. (1989) for more
details. These algorithms are not thought to be optimal in any sense, but
they show that spatial pattern in the data can have a significant effect on
the results.
3.5 Discussion of Current and Future Research
The results of the analysis clearly show that the spatial distribution,
when incorporated into the composite sampling schemes, has an effect. There
was interaction between the composite sample size and the composite sampling
schemes used for this analysis. For the random composite samples with
high pollution levels, the modified Sterrett scheme was best with small
composite sample sizes (2-6). With lower pollution levels, the Gill and
Gottlieb scheme was best with moderate composite sample sizes (8 or more). Note,
the Gill and Gottlieb scheme was a close second in this case.
For composite samples based on circular segments with high pollution
levels, the results varied, but a good choice seems to be the Entropy scheme
with a composite size of 2-4. With lower levels of pollution the choice is
not as clear. The Entropy scheme with a composite sample size of 6 or the
Modified Sterrett scheme with a composite sample of size 6 would be
reasonable.
For composite samples based on rectangular grids, the Modified Sterrett
scheme with a composite sample size of about 6 is a good choice for high and
low levels of pollution. Usually the modified Sterrett scheme gave the lowest
number of tests.
The Gill and Gottlieb scheme and the Entropy scheme can be modified to
account for the information gained from the retested samples about the
unretested samples. When this is done, perhaps they will do as well as the
modified Sterrett scheme.
Just what constitutes an appropriate scheme in these cases needs further
study. Schemes such as those used in this research need further

-------
development. Schemes that use different composite sample sizes in
different regions, and adaptive schemes where the composite sample size
depends on the results of the previous composite samples may result in good
procedures. A combination of exhaustive testing over part of the area and
the use of composite sampling over the rest may be appropriate.
Some of the retesting schemes require sequential operations. It is
not known how the complexity of the retesting schemes will affect the
overall cost savings, nor is it known if complicated laboratory
procedures, involving the sequential forming of composite samples, will
cause significant measurement errors. Indeed, this research has assumed
the measurement error is negligible.
The goal of this research is to be able to recommend a particular scheme
based on the expected pollution levels and the expected spatial patterns.
To achieve this goal, a more extensive study is necessary. Data sets with
different spatial patterns and pollution levels could be simulated and used
for testing the procedures. Additional composite sample forming algorithms
need to be formulated and tested.
-32-

-------
3.6 Graphs of Cost Vs Composite Sample Size
[Figure 1. Composite sampling by random order: cost (number of measurements) versus composite sample size at several action levels.]
[Figure 2. Composite sampling by random order (continued): cost versus composite sample size at additional action levels.]
[Figure 3. Composite sampling by natural order: cost versus composite sample size at several action levels.]
[Figure 4. Composite sampling by circular scheme 1: cost versus composite sample size at several action levels.]
[Figure 5. Composite sampling by circular scheme 2: cost versus composite sample size at several action levels.]
[Figure 6. Composite sampling by circular scheme 3: cost versus composite sample size at several action levels.]
[Figure 7. Composite sampling by grid scheme 1: cost versus composite sample size at several action levels.]
[Figure 8. Composite sampling by grid scheme 2: cost versus composite sample size at several action levels.]
-40-

-------
ACKNOWLEDGEMENTS
The work on this paper has been carried out with partial support of EPA
Research Grant CR 815 273 010 and SRA/EPA Research Contract 40400-S-01 to the
Penn State Center for Statistical Ecology and Environmental Statistics. Our
thanks are due to Karl Held of SRA Technologies, Inc. and to
Herbert Lacayo, Jr. of the EPA Statistical Policy Branch for their support and
encouragement. We are also thankful to our colleagues Nicholas Bolgiano and
Charles Taillie for their interest and several technical discussions.
REFERENCES
Bolgiano, N. C., Boswell, M. T., Patil, G. P., and Taillie, C. (1989).
Task 4: Report on Evaluation of Selected Statistical Methods with
Potential for Addressing Superfund Site Characterization Problems.
Final Interim Report to SRA Technologies, SRA Technologies/EPA Prime
Subcontract to Penn State, Subcontract No. 40400-5-01, October 31,
1989.
Boswell, M. T. (1977). Composite sampling. Invited Paper for Satellite A,
International Statistical Ecology Program, College Station, Texas and
Berkeley, California.
Boswell, M. T. (1978a). Composite sampling. Invited Paper for Satellite B,
International Statistical Ecology Program, Parma, Italy.
Boswell, M. T. (1978b). Composite sampling and its application to the
estimation of plankton density. A chapter of the Final Report,
NEFC-03-7-043-35116, National Marine Fisheries Service, Woods Hole,
Massachusetts.
Boswell, M. T., and Patil, G. P. (1987). A perspective of composite
sampling. Commun. Statist.-Theory Meth., 16(10), 3069-3093.
Dorfman, R. (1943). The detection of defective members of large
populations. Ann. Math. Statist., 14, 436-440.
Elder, R. S. (1977). Properties of Composite Sampling Procedures. Ph.D.
Dissertation, Virginia Polytechnic Institute and State University,
Blacksburg, Virginia.
Elder, R. S., Thompson, W. 0. and Myers, R. H. (1980). Properties of
Composite Sampling Procedures. Technometrics, 22(2), 179-186.
Feller, W. (1968). An Introduction to Probability Theory and Its
Applications, Vol. I, Third Edition. John Wiley & Sons, New York.
Flatman, G. T. (1984). Using geostatistics in assessing lead contamination
near smelters. In Environmental Sampling for Hazardous Wastes,
-41-

-------
G. E. Schweitzer and J. A. Santolucito, eds. American Chemical Society,
Washington, DC. pp. 53-66.
Garner, F. C. , Stapanian, M. A. and Williams, L. R. (1986). Composite
sampling for environmental monitoring. To appear in Proc. Amer. Chem.
Soc. 139, NACS Meeting, Principles of Environmental Sampling, April 1987.
Gill, A. and Gottlieb, D. (1974). The identification of a set by
successive intersections. Information and Control, 24, 20-35.
Hueck, H. J. (1976). Active surveillance and use of bioindicators. In
Principles and Methods for Determining Ecological Criteria on
Hydrobiocenoses. Pergamon Press, New York. pp. 275-286.
Hwang, F. K. (1984). Robust group testing. J. Qual. Tech., 16, 189-195.
Isaaks, E. W. (1984). Risk Qualified Mappings for Hazardous Waste Sites: A
Case Study in Distribution Free Geostatistics. Masters Thesis,
Department of Applied Earth Sciences, Stanford University. 85 pp.
Rohde, Charles A. (1976). Composite Sampling. Biometrics, 32,
273-282.
Rohde, Charles (1979). Batch, bulk, and composite sampling. In
Statistical Ecology Series Vol. 5: Sampling Biological Populations,
R. M. Cormack, G. P. Patil, and D. S. Robson, eds. International
Co-operative Publishing House, Fairland, MD. pp. 365-377.
Sterrett, A. (1957). On the detection of defective members of large
populations. Ann. Math. Statist., 28, 1033-1036.
-42-

-------
COMPOSITE SAMPLING AND SUPERFUND SITES
M. T. Boswell and G. P. Patil
A Discussion by
David J. Schaeffer
University of Illinois
Department of Veterinary Biosciences
2001 South Lincoln Avenue
Urbana, IL 61801
Boswell and Patil consider a dichotomous situation in which a particular trait, say the presence of a
pollutant, is or is not present in the i-th sample. Furthermore, it is assumed that the probability that the trait
exists in the i-th sample, P[I_i = 1] = p, is small (i.e., the trait is rare). This assumption implies that the
occurrence of the rare trait in samples has a binomial distribution with an incidence p, as Boswell and Patil make
explicit in section 2. A composite sample is formed by mixing equal quantities of k independent samples
prior to analysis. If the trait is found in the composite, further testing is required to identify the specific
samples having that trait. The first interest the analyst has, then, is whether it would be more efficient to
analyze each individual sample or to analyze the composite and carry out follow-up analyses. Boswell and Patil
give the answer to that question as the relative cost: RC = 1 + 1/k - (1-p)^k. The authors then ask: (1) Is there
a value of p above which RC > 1? (2) Is there an optimal value for k given p? Using simulation studies, they
show that RC > 1 when p > 0.3, and that certain values of k minimize RC when p < 0.3.
There are four points of their subsequent discussion I now briefly address and for which I identify
additional research needs.
Point 1
In section 2.3 the authors state: "If the composite sample concentration exceeds c/k, then it is neces-
sary to test each individual sample." However, there may be situations or testing conditions for which limited
additional testing is all that is required. Consider the hypothetical situations shown in Figure I. Composites
A and B are each formed from 10 samples of unknown concentration. In composite A, sample 1 has the
constituent at a concentration of 2 units and the other samples do not have it at all. In composite B, each
sample has the constituent at a level of about 0.2 units. The final concentration in each composite exceeds
the "action level" given by c/k = 0.19. For each composite the action level is exceeded, so we need to deter-
mine whether to test each individual sample (Dorfman retesting scheme) or if the necessary information can
be obtained using a second stage of compositing.
When composite A is split into two smaller composites, one of these will not have the constituent
present and the other will have the constituent at twice the concentration found in the original composite. In
contrast, the two composites formed from composite B each have the constituent at a mean level of 0.2 units.
Consequently, by considering the changes in concentration between the first and second stage, an analyst
might be able to distinguish those composites containing samples which exceed the action level from those
which do not contain such samples. The research needs are:
Presented at the EPA Workshop on Superfund Hazardous Waste. Statistical Issues in Characterizing a Site:
Protocols, Tools and Research Needs. February 21-22,1990, Crystal City Sheraton, Arlington, VA.
-43-

-------
(1)	Identify the range of outcomes using representative sample combinations.
(2)	Evaluate the effects of p and k on the outcomes.
(3)	Determine if, and how, a 1 stage resampling can be used to estimate p.
(4)	Identify better criteria for selection of a resampling plan.
Figure I: Hypothetical Composites and Action Level c/k = 0.19
[Composite A: mean = 0.2 (1 sample = 2; 9 samples = 0).  Composite B: mean = 0.2 (10 samples each ≈ 0.2).
Each composite is then retested as two smaller composites of k = 5.]
Point 2
Consider a situation where the cost-effective composite size is k, the action level is c, and one sample
has a concentration > c. Consequently, the concentration in the composite, X, will exceed c/k. If X also
exceeds the detection limit (DL) of the method, then the resampling plans Boswell and Patil discuss may
apply. However, for many pollutants c is near DL, so X < DL, i.e., the analyte concentration is not in the
optimum analytical range. When this occurs, the analytical error can be large relative to the analyte concen-
tration, with consequences discussed by Janardan and Schaeffer (1979).
One way of proceeding is to include a sample of known concentration, S1 > k DL, thereby ensuring
that all composites have a measurable concentration. Values of the analyte significantly below S1/k then
signal problems in the sequence of forming the composite, extracting the analyte, and quantitation (Janardan
and Schaeffer 1979). Also, let the known sample contain a second similar analyte S2, at concentrations [S1]
= [S2]. Assume that the recovery ratio is 1. With this additional information, the analyst can determine the
amount of S1 in the sample even if recovery is poor, since the ratio of [S1] to [S2] > 1 when S1 is present; other-
wise the ratio will be ≈ 1. The questions are:
(1)	Devise optimum strategies for carrying out such a spiking procedure.
(2)	Identify how to use the data to trigger further resampling.
Point 3
An important new contribution of the research is the set of results on the effects of incorporating spatial
information when forming the composite. Figure II shows effect-contours for a site at Rocky Mountain
Arsenal (Thomas et al. 1986). Effect magnitudes, like concentrations, change with location and depth. My
interpretation of the results depicted in Figures 1-8 of Boswell and Patil is that composite resampling
schemes fail when spatial composites are formed. Figure II may help to explain why such failure occurs.
The statistical model used in the resampling schemes assumes that the contaminant will occur in a
sample with frequency p. An estimate of the chance that a composite will contain the analyte is obtained
using binomial distribution theory given in Boswell and Patil. However, Figure II shows that the effect
magnitudes, and consequently the probabilities of showing an effect, change spatially. Under the usual
assumption that toxicity is linearly related to the logarithm of the concentration, the effect contours imply
-44-

-------
similar concentration contours. That is, the probability that a sample will have an analyte concentration
exceeding c changes with location. Therefore, the incidence rate p is not a constant as assumed in the
compositing model but varies with location and is thus represented by a spatially determined vector p = (p_1,
p_2, ...). This changes the probability model from binomial to multinomial or some compound distribution.
Viewed, say, as multinomial incidence rates, the research questions are:
(1)	How does the basic model for RC change?
(2)	How does the change in model alter the sampling strategy?
Point 4
Point 4 follows from point 3. As the concentration of the analyte increases, p increases. Taking
spatial composites will generally result in more than 1 sample having the trait. Consequently, when resam-
pling is based on forming new composites of size k′, a higher proportion of such composites will likely exceed
an action level of c/k′ than when the composites are formed randomly. The data obtained from such
(sub)composites are wasted since, eventually, as p increases, every sample must be tested anyway. It follows that
the Gill and Gottlieb and entropy schemes will fare worse than the Dorfman and Sterrett schemes when spa-
tial composites are formed. Key questions include:
(1)	If the analyst is willing to form both a random composite and a spatial composite, what additional
information is gained from the second sample?
(2)	Do spatial composites have inherently higher (or lower) information content than randomly formed
composites?
(3)	If (2) is true, what is the optimal way to form such composites and under what conditions should they
be used? (For example, are spatial composites appropriate when contamination is low but not
high?)
Janardan, K. G., and D. J. Schaeffer. 1979. Propagation of errors in estimating the levels of trace organics in
environmental sources. Analytical Chemistry 51:1024-1025.
Thomas, J. M., J. R. Skalski, J. F. Cline, M. C. McShane, J. C. Simpson, W. E. Miller, S. A. Peterson, C. A.
Callahan, and J. C. Greene. 1986. Characterization of chemical waste site contamination and determination
of its extent using bioassays. Environmental Toxicology and Chemistry 5:487-501.
-45-

-------
Figure II: Soil Spatial Gradients in Lettuce Seed Mortality (Thomas et al. 1986)
[Upper panel: estimated lettuce seed mortality (based on kriging) for the 0-15 cm soil fraction from the Rocky
Mountain Arsenal; lower panel: the same for the 15-30 cm soil fraction. Axes give distance (m) from the
northeast corner; contour classes range from 10-20% to 75-100% mortality, with the ditch location marked.]
-46-

-------
The Utility of Composite Sampling for Hot Spot
(Contaminant Plume) Identification at Superfund Sites
A Discussion
R. Rajagopal, Professor, Departments of Geography and
Civil & Environmental Engineering
302 Jessup Hall, The University of Iowa
Iowa City, IA 52242
Introduction
This discussion will address some of the key issues that are
often raised when one proposes to use composite sampling as a
screening tool in ground-water quality monitoring at hazardous
waste sites. Several papers on the topic of composite sampling,
based on sound statistical reasons, have been reported in the
literature (Hueck, 1976; Rohde, 1976; Boswell, 1977, 1978a,
1978b; Boswell and Patil, 1987, 1990; Garner, Stapanian, and
Williams, 1986; and Rajagopal and Williams, 1989). The ideas
presented in these papers can be used as a springboard for the
development of practical screening tools for use in ground-water
quality monitoring at hazardous waste sites.
Patterns in the distribution of compounds at waste sites,
development of probes for selected compounds, and the use of such
probes in field screening in combination with potential sample
compositing schemes for use in the laboratory and the field are
all exciting areas for further research. For compositing to be
an economical alternative, integration of efforts from many
fields is essential.
As a rule, in the beginning stages of germinating an idea at
the interface of different fields, issues regarding the idea's
feasibility, economic viability, and applicability are often
raised. The idea of sample compositing in ground-water quality
monitoring at hazardous waste sites is no exception to such a
rule. We should also be aware that for such an idea to come to
fruition, it must eventually pass several tests of implementation
in the field and in the laboratory.
In the development of composite sampling schemes, it is
necessary to address issues related to matrix heterogeneity
(interference due to the presence of multiple contaminants and
their interactions), current QA/QC practices, limitations of
existing technology, cost-effectiveness of monitoring operations,
regulatory requirements and liability laws. Key issues in some of these areas are discussed below.
47

-------
Matrix Heterogeneity
When samples are composited from multiple wells or sources,
the chemistry of the composite may or may not be significantly
altered depending on the nature and the composition of individual
samples. Several issues regarding sample reactivity, volatility,
false positives/negatives, and reporting limits that might hinder
the use of compositing in ground-water quality monitoring will
have to be addressed. There is little published literature on
such problems in the context of sample compositing. Laboratory
and field experiments to verify the severity and extent of such
problems are an important area for further research. The results
of such experiments (feasibility studies) will provide the much
needed facts and figures for the assessment of the scope and
utility of composite sampling as a screening tool in ground-water
quality monitoring at hazardous waste sites.
Detection Limits
Composite sampling will effectively increase the method
detection limit by a factor of n (number of samples composited)
and thus reduce the likelihood of detecting some contaminants.
For many VOCs monitored under the Superfund program such an increase
in detection limits will be unacceptable under current regulatory
practices. If the reporting limits are set very close to or at
method detection limits, the utility of composite sampling will
be minimal. The relationship between reporting limit and method
detection limit is an extremely important area for further
research.
Of course, it might be beneficial not to think exclusively
in terms of an existing scan, current technology, a predefined
reporting limit, a fixed notion of QA/QC, and an inflexible
regulatory framework. The following quote in this regard is
illustrative (Environmental-Monitoring Systems Laboratory-Las
Vegas, US EPA Newsletter, Number 24, August 1988):
... A new state-of-the-art gas chromatography/Fourier
transform infrared (GC/FT-IR) system containing light pipe
technology has improved sensitivity to the 10-50 nanogram
range for trace analysis of environmental samples. Past use
of GC/FT-IR has suffered from sensitivity in the mid-ppb to
low-ppm range. The EMSL-Las Vegas scientists evaluating the
new method expect the GC/FT-IR systems to have many
applications in the Superfund and RCRA programs.
Given such possibilities, it is prudent not to write off the
utility of sample compositing in ground-water quality monitoring
at hazardous waste sites.
-48-

-------
Quality Assurance and Quality Control
One could argue that the labor intensive processes of compo-
siting, analyzing, recompositing and reanalyzing will result in
recommended sample holding times being exceeded. This is a two-
edged sword:
a)	An extract of each sample can be preserved and compositing
can be carried out on the extracts. Thus, compositing of
extracts may make it possible to meet holding times that
sample load of a lab otherwise would preclude, and
b)	Compositing and analyzing schemes should be tailored or
designed to assure that this does not happen.
Another often raised issue is that compositing will result
in a fragmented data set which will greatly reduce the usability
of monitoring data. Again, this statement is based on historical
experiences and a fixed notion of data quality. If the estimation
of individual analytical results from composited results is
computerized, we will have a sound monitoring data
base. For example, if a single well in a network of wells at a
site is contaminated, the analytical results from this well will
be verified by at least two samples at different dilutions,
providing significant quality checks without any additional
costs. Thus, in the context of such an interpretation, the QA/QC
costs will decrease with the compositing approach.
Economics and Efficiency
It is claimed that the cost of composite sampling relative
to exhaustive sampling would be significantly higher because
compositing is labor intensive and laboratory automation could
not be efficiently used. Such concerns are rooted in past
approaches to laboratory practices. Every automated technique of
today was a labor intensive procedure of yesterday. It is quite
likely that the whole process of compositing can also be
roboticized and automated.
Another argument offered against compositing is that it will
require the collection of larger volumes of samples (or
additional bottles). This by itself may not increase the cost of
sample collection significantly. For example, if compositing is
done in the laboratory such costs will be negligible. The cost
of collecting a larger volume of sample or additional bottles
will have to be considered within the context of the overall cost
of field visit, van rental, transport, shipping and storage.
Finally, in the analysis of cost-effectiveness of sample
compositing, in addition to considering the variable cost of
sample collection, storage, and analysis, it might also be useful
to account for certain fixed expenses such as the cost of well
construction, installation, and operation and maintenance.
References
Boswell, M. T. 1977. Composite Sampling. Invited Paper for
Satellite A, International Statistical Ecology Program,
College Station, Texas and Berkeley, California.
Boswell, M. T. 1978a. Composite Sampling. Invited Paper for
Satellite B, International Statistical Ecology Program,
Parma, Italy.
Boswell, M. T. 1978b. Composite Sampling and its Application to
the Estimation of Plankton Density. A Chapter of the Final
Report, NEFC-03-7-043-35116, National Marine Fisheries
Service, Woods Hole, Massachusetts.
Boswell, M. T., and Patil, G. P. 1987. A Perspective of
Composite Sampling. Commun. Statist. - Theory Meth.,
16(10): 3069-3093.
Boswell, M. T., and G. P. Patil. 1990. Composite Sampling and
Superfund Sites. Draft Paper presented at the EPA Workshop
on Superfund Hazardous Waste: Statistical Issues in
Characterizing a Site: Protocols, Tools, and Research Needs.
Arlington, Va.
Garner, F. C., Stapanian, M. A., and Williams, L. R. 1988.
Composite Sampling for Environmental Monitoring.
Chapter 25. In "Principles of Environmental Sampling."
Lawrence H. Keith (Ed.). ACS Professional Reference Book.
American Chemical Society, pp. 364-74.
Hueck, H. J. 1976. Active Surveillance and use of Bioindicators.
In Principles and Methods for Determining Ecological
Criteria on Hydrobiocenoses. Pergamon Press, New York,
pp. 275-286.
Rajagopal, R., and Williams, L. R. 1989. Economics of Sample
Compositing as a Screening Tool in Ground-Water Quality
Monitoring. Ground Water Monitoring Review. 9(1): 186-192.
U. S. Environmental Protection Agency. 1988. Newsletter.
Environmental Monitoring Systems Laboratory - Las Vegas.
Number 24, August 1988.
SESSION 2: COMPOSITE SAMPLING AND HOT SPOT IDENTIFICATION
Forest C. Garner and Martin A. Stapanian
Lockheed Engineering & Sciences Co.
1050 E. Flamingo Rd, #251
Las Vegas, NV 89119
The paper by Boswell and Patil is an important contribution not only because it is an
extension of existing knowledge but also because of its focus on environmental
application and Superfund site characterization. It is this focus that makes the paper
readable by applied scientists, and therefore most useful to them. The examples are
drawn from real environmental data sets, emphasizing the practicality of the techniques.
Traditional applications of compositing are discussed, including locating hot spots,
reducing variance, and estimating probabilities. Each time a new paper is written the
techniques are illustrated more clearly and eloquently. Boswell and Patil also present
new information derived from a simulation of four retesting strategies and seven
compositing schemes, focusing on comparisons of each strategy under each scheme.
The simulation uses data from the Dallas lead study. This study had multiple objectives,
not all of which were compatible with compositing. However, Flatman and others have
already characterized the spatial distribution of lead at this site. Boswell and Patil are
using the original data to determine the applicability of various compositing and retesting
strategies when the concentrations are spatially autocorrelated. This is the first work we
have seen on this very complex problem. How does the spatial distribution of the
pollutant affect the optimal compositing strategy? Unfortunately, it is not clear how to
design an optimal compositing and retesting plan in advance of a study when little is
known of the spatial distribution and hot spot frequency. Before the study begins, the
statistician typically must advise the designers, chemists, and politicians whether or not
to use compositing, and exactly which compositing strategy and retesting scheme are
most appropriate. Clearly, it is a waste of time and resources to make these decisions
after the study is over. Thus, Boswell and Patil's paper will probably stimulate further
research to determine (1) the relationship between optimal compositing and retesting
strategies and the form of the spatial distribution of the pollutant(s), and (2) methods for
finding the optimal strategy from limited preliminary information, such as would be
available from a pilot study.
Additional future research may also include the application of compositing to efficiently
estimate certain components of variance. Compositing can take place in many phases,
including combining aliquots from different samples, combining aliquots from the same
sample, and combining extracts from different aliquots. The different components of
variance resulting from the measurement of each of these types of composites may be
important from the perspectives of quality control or method optimization. Researchers
may wish to derive the most cost-effective estimates of these components.
Other research may include more realistic accommodation of measurement errors, more
realistic cost models, and quality assurance and quality control strategies appropriate
for composite sampling. We would like to emphasize the quality control perspective.
When compositing is used, the quality control procedures of traditional sampling plans
are often inappropriate. For example, the use of double-blind quality control samples
may be a highly effective way to identify problems in a traditional program, but will
generally prove too costly in a compositing scenario where the resulting positive analysis
may result in several reanalyses and hence much greater expense. Additionally, in a
compositing scheme there may be more costly consequences of false positives and false
negatives. This may call for the use of more blanks and low-level calibration checks in
order to control these error rates.
By far the greatest need is for increased awareness and understanding of the
advantages and limitations of compositing. Scientists are sometimes unreasonably
hesitant to recommend or even consider composite sampling. Some concerns are valid
but others reflect the need for more education and especially communication. That is
why papers like Boswell and Patil's are so important.
Session 2: Compositing and Hot Spot Identification
Llewellyn R. Williams
U. S. Environmental Protection Agency
EMSL-LV
P. O. Box 93478
Las Vegas, NV 89193-3478
I am sure glad that this workshop is an indication that there is momentum going
forward in the area of composite sampling, and I appreciate Marilyn's paper.
Again, I share Forest's appreciation that the compositing has been extended into
the spatial realm in this paper. I do have a number of concerns with respect to
areas where we have to be a bit more realistic about how composite sampling is
taken up and how it is implemented. David, was it you, this morning, who
mentioned that our problems do not usually come one analyte at a time? This is
certainly a problem, as all of the statistics you have seen here have been based
on single target analyte problems. How do these apply to Superfund, where
virtually everything is multianalyte, and even if we had a target to go in with,
we have several long lists of analytes that we must analyze for with Superfund
samples. This can be a real problem. A partial answer might come out of a group
like this, associated with composite sampling, in conjunction with
recommendations from some of our panels on multianalyte, multivariate approaches.
This could involve the creation of a single value from the data for all of the
analytes quantified, and the use of this multivariate value to compare control
areas with areas that we have just treated, or between treated and untreated.
This is only one possibility of capturing much more of the information content
resident in the samples we analyze. I don't think we have a chance of convincing
the Superfund program that all we have to look at is a single target analyte at
a particular site. Very difficult. So, if we could approach it in a
multivariate way, I think we have got some chance.
Where do I come at this? Well, I am Director of a Quality Assurance and Methods
Development Division in Las Vegas so I come at it from both the methods side and
the quality assurance side. I am concerned about, from a quality assurance
standpoint, something that Forest mentioned with respect to where to composite.
Do we just composite the field samples, or do we bring the samples back in, do an
extraction on them, and then composite them when they are in the fluid phase?
It is a lot easier to combine solutions than to combine soil samples. And, to
do so in a much more valid sense. I think we have to know what the various costs
are and where the variability is in each one of these segments of our process so
we can put our money where the error is. But, we have all seen situations such
as have been described. I know that Forest had an example that he didn't include
for want of time, but I will anyway. This is with respect to a whole bunch of
samples that were taken for a program years ago for dioxin analyses at relatively
high cost. After 2,000 samples or so had been taken, and all came up blank, we
thought in retrospect that it would have been nice to have been able to take a
look at those samples as a composite and save an awful lot of money. Well, it
turned out that there did exist a method with much better sensitivity than the
one we were using, but at a higher cost. It might indeed have been cost effective
to use that more expensive method on far fewer samples. This is a real
concern. Marilyn indicated that he really didn't want to get into the matter of
dilution with respect to compositing and the relationship of sample concentration
to the composite concentration. But, this is a very real problem both from a
policy standpoint as well as from a technical standpoint. As long as there
exists legislation that says your action level will be e
COMMENTS BY PARTICIPANTS
On Boswell and Patil
Cynthia J. Kaleri (U. S. Environmental Protection Agency): I believe that
composite sampling can be extremely beneficial in various cases; however, the
example of compositing groundwater samples is way off-base!
Data Quality Objectives: Why is groundwater data collected at Superfund
sites? Answer:
(1)	Determine quality of aquifer and physical characteristics ...composites
might work; but foremost,
(2)  Typically to characterize a contaminant plume...only discrete samples can
help conceptualize and properly document the outer boundaries of the plume.
(Several other reasons, also).
Conclusion: The example was not practical for a Superfund Site...but
the concept should be followed up on (soils as an example).
Susan Braen Norton (U. S. Environmental Protection Agency): I have four
practical issues to bring up regarding composite sampling:
1)	The assumption of a homogeneous mixture of samples may not be valid
particularly for soil samples. For example, if a hydrophobic organic chemical
is preferentially partitioning onto organic matter, then, if your aliquot
contains organic matter you'll get a "hit" and vice versa.
2)	Mixing a group of samples that contain volatile or even semivolatile
chemicals may result in the loss of that chemical.
3)	How can we handle it when y/k is below the detection limit?
4)	Long lab analysis turn around time may hinder resampling efforts.
David Schaeffer (University of Illinois): Before the site is cleaned up, we
expect that a given pollutant will be present in samples at levels exceeding c.
Hence, it would seem that the purpose of compositing is to discover those samples
which are not polluted.
After cleanup, we expect most samples to be free of the pollutant. The
purpose of compositing is to discover samples which are contaminated.
Your paper considers the second case. How do the distributions, power curves,
relative cost, etc. change for the first case?
Roy L. Smith (U. S. Environmental Protection Agency): Although the promise
of increased statistical power and decreased cost achieved by compositing is
appealing, practical applications will require resolution of the following
issues:
1)	Where volatile compounds are of concern, compositing may be inappropriate.
The act of compositing may result in substantial loss of these substances,
resulting in an underestimation of contamination.
2)	Compositing has the effect of raising detection limits for analytes. It
is therefore inappropriate for substances (such as vinyl chloride in groundwater)
which already exceed health-based criteria at the detection limit.
3)	Superfund action levels and cleanup goals may be risk-based rather than
concentration-based, and may vary for individual contaminants depending on levels
of other compounds in the sample.
4)	Superfund cleanups are often performed under intense deadline pressure.
Time for retesting individual samples is often unavailable.
Mindi Saoparsky (U. S. Environmental Protection Agency): Composite sampling
should be considered out of the question for ground-water samples.
Hydrogeologists take such special care in the spatial placement (location) and
vertical placement (discrete monitor points) of wells that any compositing of
ground-water samples would not only negate these efforts, but also render the
data unusable for risk assessment and remedial action.
In addition, the savings from ground-water compositing are minimal since most
of the costs incurred regarding ground-water sampling concern well installation
and the use of appropriate sampling equipment (i.e. bladder pumps vs. bailers)
for specific chemicals.
SUMMARY OF CURRENT RESEARCH IN ENVIRONMENTAL STATISTICS
AT THE
ENVIRONMENTAL MONITORING SYSTEMS LABORATORY - LAS VEGAS
Evan J. Englund
and
George T. Flatman
U.S. Environmental Protection Agency
Environmental Monitoring Systems Laboratory
Las Vegas, Nevada 89193-3478
NOTICE
Although the research described in this article has been
supported by the U. S. EPA, it has not been subjected to Agency
review and no official endorsement should be inferred. Mention
of trade names or commercial products does not constitute
endorsement or recommendation for use.
INTRODUCTION
When we seek to identify research issues and define research
needs for Superfund or any other program, one sensible approach
is first to review what is already being done. We can then
determine what additional work is needed, and examine priorities
and levels of effort. This paper describes the program in
environmental statistics, geostatistics and chemometrics being
conducted by the Exposure Assessment Research Division (EAD) of
the Environmental Monitoring Systems Laboratory - Las Vegas
(EMSL-LV). This is not an exhaustive review, as it does not
cover the various statistical projects conducted by other
divisions at the laboratory.
The EAD program began in 1980, and pioneered the application
of geostatistical methods for site assessment with the evaluation
of lead contamination in soils surrounding smelter sites in
Dallas, Texas (Brown et al., 1985). "Dallas Lead" has become the
classic case study in the field.
The overall objective of the EAD research and development
program is to develop improved strategies for practical, cost-
effective environmental sampling, monitoring, and assessment.
The following sections describe individual research activities.
COST-EFFECTIVE SAMPLING DESIGN FOR SITE ASSESSMENT
The objective of this project is to quantify the cost-
benefit relationship between various soil sampling design
parameters such as number, quality, and pattern of samples, and
the outcome of site remediation decisions. Principal
Investigators are Evan J. Englund, Dennis D. Weber, and Nancy
Leviant.
Contaminated soil is a problem at many Superfund sites.
Often large volumes of soil are affected; expensive remediation
methods such as incineration or stabilization lead to very high
estimates of remediation costs. Decisions whether or not to
remediate a site, or which portion of a site to remediate, are
based on data acquired through a sampling program. Incorrect
decisions are potentially very costly to society, either through
unnecessary clean-up costs or through unnecessary long-term
exposure.
A soil sampling plan designed to evaluate a contaminated
site is the net result of a series of design factors (number of
samples, pattern, etc.). Each design decision affects the
quality and cost of a site assessment. Inadequate sampling
incurs the costs associated with poor decisions, while excessive
sampling pays for unnecessary information. The objective of a
sampling plan should be to minimize the overall cost to society.
When samples are used to select the most highly contaminated
portions of a site for remedial action, the procedure necessarily
involves an interpolation step wherein the data from sample point
locations are transformed into an estimation surface over the
entire area. Contour or isopleth maps are frequently used to
display such surfaces. Decisions about areas are then based on
the estimates. Previous studies of sampling design have not
adequately dealt with the relationship between sampling design
parameters and the quality of area-based decisions after
interpolation.
The specific decision considered in this study involves
making estimates of local concentrations for small sub-areas or
grid cells (blocks) within the site, and choosing to remediate
only those blocks whose estimated concentrations exceed the
action level.
Our experimental approach is to use a large surrogate "site
model" data set which exhibits many of the statistically
undesirable characteristics encountered in real environmental
sampling. The model is repeatedly resampled. Ordinary kriging
estimates from each sampling are used to make remedial decisions,
and the quality of the decisions is determined from the model.
We have selected a subset of the larger Walker Lake data set
(Isaaks and Srivastava, 1989) as the site model. The data are
highly positively skewed, discontinuous, and exhibit a spatial
correlation structure. The subset of the Walker Lake data set
chosen for this study contains 19,800 data in a 110x180 array
(Figure 1).
Figure 1. Shaded map of the site model showing 19,800 points.
The site model has been subdivided into 198 square blocks,
each containing 100 data values (Figure 2). Average true block
values have been computed for comparison with block estimates
made from the various samplings.
Figure 2. Shaded map of site model, showing 198 true block
means.
The experimental approach is a 3x3x2 factorial design, with
three different sample sizes, three different sample patterns,
and two levels of sample error. Combinations of these lead to 18
different sample designs, each of which was repeated three times
for a total of 54 samplings.
Sample size refers to the number of samples to be collected
in a given sampling. The three sample sizes to be used are 104,
198, and 308. The three sample patterns to be used are simple
random, cellular stratified, and regular grid. Cellular
stratified sampling involves selecting a randomly located sample
within each grid cell, and is intermediate between the other two
patterns. Sample error represents the cumulative total of all
possible error components included in the collection, handling,
preparation, and analysis of a sample. Two levels of sample
precision are considered in this study - a base level at zero
error, and a high level at a relative standard deviation (RSD) of
32 percent.
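As a check on the design arithmetic, the following minimal Python sketch enumerates the factor levels named above; the labels are taken directly from the text, and the enumeration itself is purely illustrative.

    from itertools import product

    sample_sizes = (104, 198, 308)
    patterns = ("simple random", "cellular stratified", "regular grid")
    error_levels = ("0% RSD", "32% RSD")
    replicates = 3

    designs = list(product(sample_sizes, patterns, error_levels))
    # 3 x 3 x 2 = 18 designs, each repeated 3 times = 54 samplings
    print(len(designs), "designs,", len(designs) * replicates, "samplings")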
Bias was not included as a part of the factorial design.
The effects of adding a constant bias to a set of sample data
were evaluated later by adding the bias to the kriged estimates
and recalculating the decision quality measures.
Decision quality is measured by a (linear) loss function.
The assumption is that society pays a cost for all contaminated
areas, either as a remediation cost for each block cleaned, or as
a less easily defined cost (health effects, ecological damage,
etc.) for each block which remains contaminated. The latter cost
is assumed to be directly proportional to concentration, while
the former is assumed to be constant. To balance these two types
of costs, we assumed that an action level for remediation is
society's best estimate of the breakeven point, where the cost of
cleaning a block equals the cost of not cleaning it.
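The loss function can be written out as a minimal sketch under the stated assumptions (a constant cost per remediated block, a cost proportional to true concentration for each block left contaminated, and the action level taken as the breakeven point); the function names and the unit cost below are illustrative, not part of the study.

    def block_loss(true_concentration, remediate, action_level, cleanup_cost=1.0):
        # Cleaning any block costs the same; leaving a block contaminated costs
        # an amount proportional to its true concentration, calibrated so that
        # the two costs are equal at the action level.
        if remediate:
            return cleanup_cost
        return cleanup_cost * true_concentration / action_level

    def total_loss(true_blocks, estimated_blocks, action_level):
        # Decisions are made from the estimates; losses accrue from the truth.
        return sum(
            block_loss(true, est > action_level, action_level)
            for true, est in zip(true_blocks, estimated_blocks)
        )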
Preliminary results indicate that the number of samples is
the major factor in decision quality. Sampling precision and
pattern had little or no effect. A complex relationship exists
among level of bias, action level, and loss, but in general,
levels of bias less than 20 or 30 percent had relatively small impacts.
These results support the use of field screening and
portable analytical methods, if they are significantly less
expensive than conventional sample collection and laboratory
analysis. The relatively small effect of sample pattern on the
results suggests that for practical purposes, the pattern may be
selected primarily based on cost and convenience.
COMPARISON OF INTERPOLATION METHODS
The objective of this study is to identify the best
interpolation methods for optimizing spatial decision-making, and
to provide guidelines for their use. Principal Investigators are
Evan J. Englund, Dennis D. Weber, and Forest Miller.
In the sampling design experiments, we investigated the
relationship between design factors such as the number or quality
of samples and the quality of remediation decisions. But samples
do not lead directly to decisions. Samples generally represent
only point locations, while remediation decisions involve larger
areas. Spatial interpolation is a necessary intermediate step;
decisions are based on interpolated values, and not directly on
data. This project examines the impact of various interpolation
methods on the quality (as measured by cost-effectiveness) of
spatial decisions.
A multitude of interpolation methods are available, ranging
from completely subjective manual contouring of data, to
completely automated "black box" computer programs. Some of the
more commonly used interpolation methods include the polygon
method (nearest neighbor), inverse distance weighted averaging,
splines, polynomial trend surfaces, triangular irregular network
(TIN), and kriging.
Most of these are actually classes of methods, with a number
of variations available to the investigator. For example,
kriging methods have evolved into a large family of variants,
including ordinary kriging, simple kriging, universal kriging,
disjunctive kriging, indicator kriging, probability kriging, and
multigaussian kriging.
The basic questions we need to ask about interpolation
methods are: "Does it make any difference which method is used?
If so, which is best?" The approach to this question is similar
to that for sampling design. Now, however, we keep the sample
sets constant and vary the interpolation method; decision quality
is measured in the same manner as before.
An initial scoping study has demonstrated that the choice of
interpolation methods is critical to decision quality (Englund,
1990). There is a need for a systematic evaluation of
interpolation methods in order to determine which are most cost-
effective for environmental applications. Two parallel studies
are in progress. The first involves a systematic evaluation of
various off-the-shelf interpolation methods for which software is
currently available and in common use. The second is a search to
identify and test potential alternative interpolation methods
which may not be widely known or available.
The ordinary kriging performance results from the 54 sample
data sets in the sampling design experiment provide a benchmark
for comparing kriging variants and other interpolators. Two
comparisons have been completed to date, with log-kriging
(ordinary kriging with log-transformed data) and with the TIN
interpolation method used in the ARC-INFO GIS. The initial
results indicate that log kriging was slightly superior to
ordinary kriging when the back-transform from the kriged log
values was performed correctly. Both kriging methods produced
significantly better results than TIN.
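The back-transform issue referred to above can be made concrete with a minimal sketch; the half-variance correction shown is the generic textbook form for lognormal kriging and is not necessarily the exact correction used in this study.

    import numpy as np

    def naive_back_transform(kriged_log_value):
        # Simply exponentiating a kriged log value tends to bias the
        # concentration estimate low.
        return np.exp(kriged_log_value)

    def corrected_back_transform(kriged_log_value, kriging_variance):
        # A common correction adds half the kriging variance of the log
        # estimate before exponentiating.
        return np.exp(kriged_log_value + 0.5 * kriging_variance)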
GEOSTATISTICAL SIMULATIONS OF HYDRAULIC HEAD
This project recently investigated the use of geostatistical
simulations of hydraulic head for determining the optimum
locations of monitoring wells and the probability that they would
detect a contaminated plume from a given source location. The
principal investigators were Dennis D. Weber and Dale Easley.
The goal of a ground-water monitoring program is to be able
to detect contamination from a leaking source before it has moved
a great distance, so that environmental impacts and potential
remediation costs can be minimized. In order to detect a
contaminant plume, a monitoring well must first intercept it. If
we know the exact path a contaminant plume would take, then we
could simply place a single monitoring well in its path and take
samples at appropriate intervals. On the other hand, if the path
is unknown, we must completely surround the potential source with
closely spaced wells. We never know enough about an aquifer to
be able to predict flow precisely, but we usually have some
information which limits the possibilities. We may have some
hydraulic head measurements, occasional hydraulic conductivity
measurements from pump tests, geologic descriptions, etc. We
must be able to use this information to maximum advantage.
The assumptions made in this study were that the aquifer
parameters are heterogeneous and that there is a paucity of data.
An approach was developed to use the maximum information
available about the problem as input to the two standard
statistical techniques described below. The result was an
integrated procedure for estimating the Probability Of Plume
Intercept (POPI).
The POPI procedure uses a geostatistical simulation approach
(GEOCONSIM) to account for the stochastic component of the
groundwater system. The procedure generates numerous possible
patterns of hydraulic head consistent with known information.
The simulations are forced to match existing data values at
measured locations, and are also forced to honor the statistical
distribution and the spatial variability of the data. To
generate the hundreds of simulations of hydraulic head on a high
resolution grid, a very fast computer algorithm was developed,
based on a frequency domain technique.
Hydraulic head was selected for simulation because
measurements of this variable are readily available, inexpensive,
and reliable. Variogram models defining the spatial variability
of hydraulic heads are relatively easy to obtain; when data are
scarce, models can be estimated with only minor consequences. It
is recognized, however, that hydraulic heads may not adequately
determine all conditions in the aquifer because they may vary as
a result of external factors such as evapotranspiration, uneven
recharge, etc.
The second phase of the POPI procedure uses parameter
estimation of hydraulic conductivity (PAREST) to force the
hydraulic head simulations to honor the physics of ground-water
flow. The established PAREST parameter estimation software was
automated, adapted, and validated for the POPI procedure. PAREST
is an inverse modeling method which iteratively calculates a set
of hydraulic conductivity values consistent with the set of
simulated hydraulic head values. If PAREST cannot reach an exact
solution, it is allowed to adjust the input head values within
specified limits. These are considered "corrected" values
because they have been forced by the numerical model to conform
to the physics of groundwater flow. If no solution can be found
with minor adjustments of heads, the input values are considered
seriously in error, and the simulation is rejected as
unrealistic.
Hydraulic head alone is insufficient for modeling
contaminant transport, for which we need to estimate hydraulic
conductivity, porosity, and dispersivity. All are difficult or
impossible to obtain in detail for a heterogeneous aquifer.
However, if hydraulic conductivity is assumed to be isotropic,
hydraulic head is sufficient to construct flowlines from a
potential contaminant source. The interceptions of the flowlines
with a given transect form a frequency distribution, from which
we infer the locations at which monitoring wells would be most
likely to detect a contaminant plume (Figure 3).
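As a minimal sketch of the POPI idea (not the actual POPI code), suppose each simulation yields the x-coordinate at which the plume centerline crosses the transect A-B; the fraction of crossings falling within a candidate well's capture width then estimates its probability of intercept. The capture width and the surrogate crossings below are assumed purely for illustration.

    import numpy as np

    def probability_of_intercept(crossing_positions, well_x, capture_width):
        # Fraction of simulated plume centerlines passing within half a
        # capture width of the proposed well location.
        crossings = np.asarray(crossing_positions, dtype=float)
        hits = np.abs(crossings - well_x) <= capture_width / 2.0
        return hits.mean()

    rng = np.random.default_rng(0)
    crossings = rng.normal(loc=50.0, scale=10.0, size=500)  # surrogate simulations
    print(probability_of_intercept(crossings, well_x=50.0, capture_width=5.0))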
The operation of the POPI procedure was demonstrated on a
real-world data set, although no attempt was made to verify the
results or to validate the procedure in general. EAD's work on
this project will be completed upon publication of a journal
article describing the procedure.
[Figure 3 graphic: panels titled "Estimated Probability Distribution of Flow Paths" and "Aquifer System Model"]
Figure 3. Schematic illustration of POPI output. Flow lines
originating from a contaminant source at location S
represent a projection on the surface of centerlines of
contaminant plumes. A histogram of the frequency of
intercepts of the simulated flow lines with transect A-
B provides an estimate of the actual probability of
plume intercept for any proposed well location along
the transect.
HYPOTHESIS TESTS FOR SPATIAL AND MULTIVARIATE DATA SETS
The objective of this project is to develop statistical
tests of hypotheses which are effective at detecting significant
differences in data sets which vary in space and time. The
principal investigator is Leon E. Borgman.
Two areas of research have been pursued during
the past year. These were the development of a two-sample Mann-
Whitney nonparametric test for correlated data, and the
investigation of new procedures based on principal component
analysis, to estimate spatial-temporal covariance functions in
two-, three-, and four-dimensional space.
A major difficulty with nonparametric tests in the context
of spatial-temporal data is the assumption of independence of
observations which is used in the development of the tables for
the critical values. Most environmental data are correlated to
some extent because of spatial position. Earlier research by
Quimby (1986) explored the effect of spatial correlation on the
usual normal and Student's t tests. The extension of these
techniques to nonparametric or distribution-free tests was
indicated by Borgman and Quimby (1988) and derived mathematically
by Borgman (1988). Last year, Bei-Ling Lee explored the
Wilcoxon, paired-comparison test and its power (or type II error)
as part of her masters degree program.
Lilliana Gonzalez has recently implemented nonparametric
procedures for two-sample correlated data (i.e., two-sample
Wilcoxon-Mann-Whitney test). She has developed computer
algorithms for the test and, in the process, solved a non-
uniqueness problem outlined by Borgman (1988).
The covariance function (or a frequently used alternative,
the variogram function) provides a quantitative model of the
correlation structure of data distributed in space and time. The
function must be known, estimated, or assumed before all the
standard geostatistical procedures, such as kriging or
simulation, can proceed. This is fairly easy for one- and two-
dimensional data. Well accepted, standard estimation procedures
exist.
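For readers unfamiliar with those standard procedures, a minimal sketch of the classical (Matheron) empirical variogram estimator for one-dimensional data follows; it is illustrative only and is not the laboratory's implementation.

    import numpy as np

    def empirical_variogram(locations, values, lags, tol):
        # Classical estimator: gamma(h) = 0.5 * average of (z_i - z_j)^2 over
        # all pairs whose separation distance is within tol of the lag h.
        x = np.asarray(locations, dtype=float)
        z = np.asarray(values, dtype=float)
        gammas = []
        for h in lags:
            sq_diffs = []
            for i in range(len(x)):
                for j in range(i + 1, len(x)):
                    if abs(abs(x[j] - x[i]) - h) <= tol:
                        sq_diffs.append((z[j] - z[i]) ** 2)
            gammas.append(0.5 * np.mean(sq_diffs) if sq_diffs else np.nan)
        return gammas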
However, many environmental problems involve four dimensions
(three in space and one in time). The estimation of the
covariance function in 4-D space has unresolved problems. Yet
the resolution of these problems is a necessary prerequisite for
using either parametric or nonparametric, correlated-data, tests
of hypotheses.
A possible new method based on a type of principal component
analysis was investigated during the spring-summer of 1989. The
method is easy to extend to multiple dimensions and builds on
earlier EPA-sponsored research by Hagan (1982) and Taheri (1980)
relative to radial covariance functions with an elliptical base.
Research on this method of estimation is continuing.
The general problems of treating and testing correlated data
measured in a spatial temporal region arise in many applications.
Although the current research has been directed at particular
problems related to the RCRA sites, the new results obtained
should have very wide application to many EPA concerns. In
particular, the techniques of hypothesis testing with correlated
data, estimation of multidimensional covariance functions, and
rapid conditional simulation of space-time grids of data should
have substantial applicability to problems such as long-term
climatic change, global warming, and acid rain. The opportunity
arose in the summer of 1989 to examine some acid rain data, and a
limited research effort is underway to assess the applicability
of these procedures.
VISUAL DISPLAY OF UNCERTAINTY IN SPATIAL DATA
The objective of this project is to develop practical
methods for generating multiple realistic simulations of
spatially distributed variables which are consistent with all
existing data, both "hard" and "soft". Differences between maps
of these simulations will provide graphic displays of the degree
of uncertainty of our knowledge, and identify areas where
additional information is critical. The principal investigator
is Andre G. Journel.
Traditional interpolation techniques, including kriging and
spline fitting, result in smooth maps that can give a severely
biased image of actual spatial variability and underestimate the
actual proportion of extreme values (either high or low). These
biases due to smoothing increase as the data become more sparse,
a case not uncommon in environmental applications.
This is the same problem encountered with linear regression.
A regression line provides the "best" estimate of variable Y,
given a value for variable X, but the distribution of estimated
Y's will be too smooth. The estimated Y is merely the estimated
mean of the distribution of possible Y values associated with the
given X. To understand the uncertainty of our estimate of Y, we
need to know the distribution of all possible Y values, not just
the mean. With simple textbook distributions, we can compute
confidence limits which adequately describe the uncertainty. In
the more complex, real-world data sets we encounter in
environmental applications, it might be more appropriate to
simulate a set of, say, 100 possible Y values for a given X.
From this we can not only derive confidence limits, but we can
also use the simulated values to conduct a sensitivity analysis
by examining the potential consequences for each of the
possibilities.
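The regression analogy can be written out as a minimal sketch; the fitted slope, intercept, and residual standard deviation below are assumed values, and a simple normal error model is used purely for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    slope, intercept, residual_sd = 2.0, 1.0, 0.5   # assumed fitted model
    x_new = 3.0

    y_mean = intercept + slope * x_new               # the usual smooth estimate
    y_sims = y_mean + rng.normal(0.0, residual_sd, size=100)  # 100 possible Y values

    # Confidence limits and sensitivity analyses can be read off the simulations.
    print(y_mean, np.percentile(y_sims, [2.5, 97.5]))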
Conditional simulation extends this concept to spatial data
sets, generating realistic simulations that not only honor data
values at their locations but also reproduce the complexities of
the spatial variability and frequency distribution of the known
data. Conditional simulations allow generating not only one map
but a whole suite of alternative, equiprobable maps. Available
data alone cannot distinguish among these alternative maps; thus,
the differences between them provide a direct visualization
(stochastic imaging) of uncertainty. For example, if a
particular trend or pattern is seen on some maps and not on
others it is deemed unreliable. A particular remediation
process, or more generally any transfer or loss function (such as
that used in the sampling design study), can be applied to each
alternative map. The result is not one but a distribution of
predicted response values, providing an assessment of sensitivity
to uncertainty.
The indicator simulation technique being developed in this
project will have two significant advantages over existing
conditional simulation techniques. First, it uses models of
spatial variability for a number of different values of the
measured variable. This permits simulation of more complex
phenomena, such as aquifer conductivities, where the spatial
pattern of high values may differ significantly from that of low
values. Second, the indicator data transform provides a way to
combine data of varying qualities in a consistent and
quantitative manner. This may permit more effective use of such
data as non-detects, low-precision field screening measurements,
indirect measurements obtained from remote sensing or geophysics,
and possibly even subjective information such as a high-medium-
low classification. The versatility of this technique has great
potential for many types of environmental applications.
ISIM3D is a software package for generating alternative,
equiprobable, stochastic images of the spatial distribution of a
continuous variable in 2 or 3 dimensions. The algorithm used is
that of sequential indicator simulation (SIS) (Journel and
Alabert, 1988). It consists of simulating at each node of a
specified grid a series of K indicator values indicating in which
of (K+l) classes the continuous simulated values fall. Then a
Monte Carlo simulation from within-class distributions yields the
required simulated value for the continuous variable. At each
node the simulated value is made conditional not only to prior
data but also to all previously simulated values, thus ensuring
reproduction of the K indicator covariance models.
In the indicator approach the model of spatial variability
can consist of any number of indicator covariance functions
characterizing the probability to find an attribute value in any
given class, say Z(x) > z_10, given that a nearby datum value has
been found to be within any other class, say Z(x+h) is in the
range (z_i, z_i+1]. These different indicator covariances can
differentiate the patterns of spatial continuity of different
classes of Z-values, for example, the low-to-median soil
concentrations of a toxic metal may be randomly distributed due
to pervasive urban background pollution, whereas high
concentration values may display distinct trends due to the
process of transport from a particular primary source.
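A minimal sketch of the indicator coding itself follows (the thresholds are illustrative); a separate covariance or variogram model can then be fitted to each indicator column.

    import numpy as np

    def indicator_transform(z, thresholds):
        # indicator[i, k] = 1 if z[i] <= thresholds[k], else 0
        z = np.asarray(z, dtype=float)
        cuts = np.asarray(thresholds, dtype=float)
        return (z[:, None] <= cuts[None, :]).astype(int)

    z = np.array([0.2, 1.5, 3.0, 7.5, 0.9])
    print(indicator_transform(z, thresholds=[1.0, 5.0]))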
Figures 4-6 provide an example of ISIM3D output graphs for
a 2-dimensional run using surrogate data. Figure 4 shows a cross
section of an aquifer (dark zone) for which hydraulic
conductivities will be simulated. The known conditioning data
are located along five drillholes. The black segments of the
holes represent low-conductivity shales, while white represents
high-conductivity sand. Note that most of the shales lie at the
bottom of the aquifer, except for the fourth hole from the left.
Figure 5 gives three equally probable alternative images of
the sand/shale distribution in that section as simulated by
ISIM3D, which honor the data from the five holes. The range of
horizontal correlation is large (exceeding half the dimension of
the section presented); thus spatial uncertainty is relatively
low as can be seen from the similarity between the three
simulated images.
In Figure 6, a similar group of alternative images displays
horizontal aquifer conductivity data simulated from corresponding
conditioning data from the five holes. These three conductivity
fields could be used to simulate the transport of contaminants,
say from a source at the left of the section to a drinking water
well at the right of the section. ISIM3D allows constructing the
input to a Monte Carlo simulation that, for example, would yield
the probability distributions of first arrival time and full
passage of the contaminant plume at the location of the drinking
water well.
Figure 4. Conditioning data located along five holes (Dark black
is shale, white is sand).
Figure 5. Three simulations of the sand/shale sequences.
Figure 6. Three simulations of the conductivity field.
MULTIVARIATE OUTLIER TESTS
The objective of this project is to develop useful state-of-
the-art statistical tools for identifying outliers in
multivariate data sets. The test(s) should be useful for both
statistically rigorous rejection of unacceptable results in
quality assurance programs, as well as for rapid screening of
large complex data sets to identify suspect or interesting
measurements. The principal investigators are George T. Flatman,
Forest C. Garner, and Martin A. Stapanian.
Most EPA data sets are multivariate (i.e., more than one
variable is measured for a given specimen of water, soil or other
experimental medium). Modern analytical techniques such as GC-MS
and ICP make it possible to measure many variables in one
analysis. In addition, regulatory pressure is increasing the
list of mandatory compounds which must be measured.
All major EPA data sets are scrutinized for unusual results
as part of a rigorous quality assurance/quality control (QA/QC)
program. Errors due to transcription, measurement, analysis, or
contamination may be found and corrected when the proper QA/QC
methods are used. For these situations we need a rigorous
statistical outlier test to identify erroneous values.
Experimenters frequently encounter data that are unusual
relative to the remainder of the data set. It is important that
these outliers are identified, because when they are due to
errors, they may influence descriptive and inferential statistics
to the point where information is obscured. On the other hand
outliers are sometimes valid measurements which are of special
interest because they reflect a significant change in the state
of the system being measured.
Tests for such outliers in univariate data are much more
numerous and well-developed than are multivariate tests (Barnett
and Lewis, 1984). A multivariate outlier might not be extreme in
any of its components. Therefore, univariate tests might be
wholly inappropriate for multivariate data. When univariate
outlier tests are applied to multivariate data sets, much
information may be ignored, thus reducing power. Univariate
tests concentrate on data that are of unusual magnitude in a
single variable. They are oblivious to correlations which may be
present among the variables. Alternatively, multivariate tests
consider both magnitude and correlation structure.
Figure 7 illustrates how univariate outlier tests may be
ineffective when multivariate data are considered. It shows an
X-Y plot of the percentage recoveries of two isomers of
dichlorobenzene from fortified environmental water samples. Here
we introduce two terms: "magnitude" outlier and "trend" outlier.
Point A appears to be consistent with the relationship between
the recoveries of the two isomers, but is excessive in magnitude.
Point B, on the other hand, is not consistent with the trend or
pattern of recoveries of the two isomers and is an outlier even
though it is not excessive in magnitude. Most univariate tests
would identify point A and not point B as an outlier. Tests
based on regression, such as those developed by Tietjen, Moore
and Beckman (1973) and Lund (1975), would identify point B and
not point A as an outlier. Such tests are based on models which
are not appropriate when none of the variables is controlled. A
multivariate test ideally would identify both trend and magnitude
outliers.
Figure 7. Example magnitude outlier (A) and trend outlier (B).
Many multivariate outlier identification procedures, such as
graphical techniques, are often informal, not rigorously defined,
or otherwise limited due to conceptual or logistical constraints
(Barnett and Lewis 1984). Such tests are often difficult to
automate, employ vague assumptions, lack consistency of results
between researchers and cannot specify error rates. Some
powerful, rigorous multivariate outlier tests have been proposed,
but are difficult to use from a practical standpoint. They may
lack published critical values or algorithms for computing test
statistics. This project is to test two such methods, Mardia's
kurtosis and generalized distances, and if possible develop them
for practical use.
Mardia's kurtosis (Mardia 1970, 1974) has several desirable
properties as an outlier test. However, critical values for data
sets with more than four dimensions have not been published. A
complete set of critical values for three and four dimensions is
only available from the Royal Statistical Society Library (Mardia
1970, 1974). A similar test, which also lacks published critical
values for multivariate data sets, is based on the generalized
distance. Siotani (1959) provided an interesting approach for
finding the distribution of maximum generalized distances, using
Taylor series approximations. However, critical values for more
than two variables for instances in which the covariance matrix
is unknown were not provided. Gnanadesikan and Kettenring
(1972) proposed the test for identifying outliers in multivariate
data sets. The operating characteristics of both tests have not
been published.
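A minimal sketch of a generalized (Mahalanobis) distance screen of the kind discussed above follows; a chi-square cutoff is used here purely as an illustrative approximation, whereas the project computes exact critical values by simulation.

    import numpy as np
    from scipy import stats

    def generalized_distances(data):
        # Squared Mahalanobis distance of each row from the sample mean.
        x = np.asarray(data, dtype=float)
        mean = x.mean(axis=0)
        cov_inv = np.linalg.inv(np.cov(x, rowvar=False))
        centered = x - mean
        return np.einsum("ij,jk,ik->i", centered, cov_inv, centered)

    def flag_outliers(data, alpha=0.01):
        x = np.asarray(data, dtype=float)
        d2 = generalized_distances(x)
        cutoff = stats.chi2.ppf(1.0 - alpha, df=x.shape[1])  # rough approximation
        return d2 > cutoff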
Generating the critical values for both tests is difficult,
and requires a high-speed inexpensive computer and advanced
software. Mardia and Kanazawa (1983) provided a technique for
generating approximations to the critical values for Mardia's
kurtosis. However, the accuracies of these approximations had
not been determined previously.
We have developed computer programs which (1) rigorously
determine the accuracy, in terms of Type I error rates, of the
approximations of the critical values for Mardia's kurtosis, (2)
compute exact critical values for Mardia's kurtosis and the
generalized distance through simulation, and (3) compute
confidence intervals about the exact critical values.
Two papers on this work are being prepared for publication.
In the first (Stapanian, et. al., 1990a), it was concluded that
the observed error rates of the approximations of the critical
values of Mardia's kurtosis converged to the intended rates as
the number of observations increased. However, the observed
error rate often differed from the intended rate by a factor of
two or more, especially when the number of observations relative
to the number of dimensions (variables) was low. It is up to the
user to determine if the approximations of the critical values
are acceptable.
The second paper (Stapanian, et. al., 1990b) reports
simulated critical values (for up to 25 dimensions and 500
observations) for both Mardia's kurtosis and the generalized
distance. Nonparametric confidence intervals for the critical
values are given. The operating characteristics of both tests
are described. The two tests are compared with respect to
sensitivity to outliers and masking. Recommendations to applied
statisticians for the use of these tests and interpretation of
the results are given.
SCOUT, a computer program which implements the outlier tests
along with principal components analysis and 3-D graphics, is
being prepared for release.
METHODS FOR EXPLORING MULTIVARIATE DATA SETS
The objective of this project is to evaluate the potential
applicability of new or relatively unknown multivariate data
exploration methods for the identification of significant
patterns and relationships in complex environmental data sets.
The principal investigator is Donald E. Myers.
In the study described above, we investigated methods for
detecting outliers in multivariate data sets. Here we look at a
more general question, namely, "How can we determine what
information is significant or interesting in a large data set
containing dozens or even hundreds of measured variables?"
When we have a data set with only two variables, and want to
examine the relationship between them, we can compute statistics
such as a linear correlation coefficient to describe a
mathematical relationship between the variables. If the two
variables are perfectly correlated, that is, one can be explained
entirely as a function of the other, then we can reduce the data
set to one variable (expressed as a linear combination of the
two) without losing any information.
Alternately, we can plot an X-Y scatterplot and look for
patterns in the data. Two variables may have a low correlation
coefficient and still be related in interesting ways.
Multivariate exploration methods generally involve some
combination of the two approaches. The basic approach is to
reduce the dimensionality, or number of significant variables, as
much as possible, and then to search for interesting patterns in
the remainder. The number of significant variables is often of
interest in itself, as an indicator of the number of possible
causative factors or sources.
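As a generic illustration of this dimension-reduction step (the project itself examines other, newer methods), a minimal principal-components sketch follows.

    import numpy as np

    def principal_components(data, n_keep):
        # Project centered data onto its leading principal axes and report the
        # proportion of variance each retained component explains.
        x = np.asarray(data, dtype=float)
        centered = x - x.mean(axis=0)
        u, s, vt = np.linalg.svd(centered, full_matrices=False)
        scores = centered @ vt[:n_keep].T
        explained = s**2 / np.sum(s**2)
        return scores, explained[:n_keep]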
The current study is examining three specific multivariate
methods: Correspondence Analysis, Linear Dependency Analysis, and
Exploratory Projection Pursuit. The three methods are of
interest both because of special facets of the methods and
because of their relative newness. Except for Correspondence
Analysis, which has only recently been incorporated into one of
the standard software packages, none of the three has been widely
available. Each incorporates or utilizes correlations or
intervariable dependencies in a slightly different way.
Projection Pursuit in particular allows the identification of
non-linear correlations.
The methods are being tested on three EPA data sets: the
1982 National Human Adipose Tissue Sample BroadScan Survey
(NHATS) ; Eastern Lake Survey (ELS); and Hazardous Ranking System
(HRS). Each of these data sets is multivariate in several
respects, and there is a need to reduce or eliminate redundancy.
Since 1982, the principal investigator together with his
graduate students have utilized and made improvements to a
Correspondence Analysis program which originated as a batch mode
code published by David et al (1977). Changes have included
transferring it to the microcomputer and adding a number of
diagnostics as well as the analysis of supplementary variables.
This program will be incorporated into the Geo-EAS II package.
Computer Sciences Corporation personnel have added the Screen
Management Utilities to provide the same user interface used in
Geo-EAS; on-screen 3-D graphics are being added to allow better
visualization of results. This program has received extensive
testing on the ELS and NHATS data sets and should be ready for
final approval and release by EPA in the near future.
The Linear Dependency Analysis algorithm was first given by
Kane et al. (1985) and a batch-mode FORTRAN code was published by
Ward et al. (1985). As noted by the authors, the code was
computationally demanding on a VAX. The principal investigator
has access to the CYBER 205/ETA-10 at the John Von Neumann Center
at Princeton via a high speed satellite link. The code has been
substantially re-written and a version has been prepared for use
on this machine.
As described by Friedman (1987), Exploratory Projection
Pursuit is a method for searching for dependencies not given in
terms of the first two moments. The essential core of the code
was obtained from Friedman and has been ported to the CYBER 205.
As obtained from Friedman, the program did not include diagnostics
or graphics; some have been added and others are in progress.
SOFTWARE DEVELOPMENT
The objective of this effort is to encourage the wider
application of state-of-the-art statistical methods to
environmental problems by providing practical, easy-to-use
microcomputer software. The principal investigators are Evan J.
Englund, and Nancy Leviant.
Computer software is the primary technology-transfer
mechanism for statistical research and development. The ideal
outcome of a research project in environmental statistics is
development of a problem-solving tool that can be effectively
used by the people facing the problem.
The Geo-EAS (Geostatistical Environmental Assessment
Software) package and User's Guide (Englund and Sparks, 1988) was
designed to provide a basic set of geostatistical tools for
practical environmental applications. The package was released
in the fall of 1988. Designed for use on IBM-PC compatible
computers, the principal functions are the production of 2-
dimensional grids and contour maps of interpolated (kriged)
estimates from sample data. Other functions include data
preparation, univariate and bivariate statistics, data maps,
variogram analysis, and correspondence analysis.
The first phase of Geo-EAS is being distributed in
executable form on diskettes which contain the 13 Geo-EAS
programs plus the example data files referred to in the tutorial
section of the Geo-EAS User's Guide. More than 1200 copies of
this package have been distributed to date. During the past
year, a number of "bugs" have been reported in the initial
Version 1.1. These have been corrected, and Version 1.2 has
recently been released. The source code and Programmer's Guide
have also been released.
Several new programs have been developed to add functions to
the Geo-EAS system. These will be released in the future as
Geo-EAS II. The additional functions include contours for non-
gridded data, cross-hatch plots, correspondence analysis,
variograms for large data sets, and 3-D Graphs.
Planned software development includes transferring Geo-EAS
to a UNIX environment on a Sun workstation and to VMS on a VAX
machine; and adding probability kriging (PK) software. Over the
longer term, we plan to develop software based on the results of
the on-going research reported in the previous sections. These
may include SCOUT, ISIM-3D, POPI, multivariate data exploration
methods, and space-time hypothesis tests.
COMPOSITE SAMPLING
A research project was conducted on the most cost-effective
methods for estimating the proportion of individuals possessing a
characteristic, such as a disease, antibody, or chemical
compound. When it is not necessary to identify the individuals
that are positive for the characteristic in such experiments,
sample compositing (group testing) may be applied. Sufficient
test material from the appropriate number of individuals is
taken, aliquots are combined, and the resulting composites are
analyzed. Substantial savings in analytical costs may occur when
a carefully planned compositing technique is used.
This work resulted in a paper which has been accepted to the
Journal a£ Official Statistics (Gamer, Stapanian, Yfantis and
Williams, in press). In this paper,a method was developed for
finding the optimal number of test portions that should be
combined into a composite for a given proportion of individuals
possessing the characteristics. The paper also determined the
cost savings resulting from carefully planned experiments to
estimate these proportions when sample compositing is used.
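For illustration, the analytical effort implied by a compositing design can
be computed directly for the simplest scheme. The sketch below assumes
classical Dorfman-style group testing with an error-free assay; it is offered
only to illustrate the cost trade-off, not as the optimization developed in
the Garner et al. paper, and the function names are ours.

    # Minimal sketch: expected analyses per individual under Dorfman-style
    # group testing with a perfect assay (an assumption made for illustration).
    def expected_tests_per_individual(p, k):
        """p: prevalence of the characteristic; k: number of aliquots composited."""
        if k == 1:
            return 1.0
        # One analysis of the composite; if it is positive (probability
        # 1 - (1 - p)**k), each of the k members is analyzed individually.
        return 1.0 / k + 1.0 - (1.0 - p) ** k

    def optimal_composite_size(p, k_max=100):
        """Composite size minimizing the expected analyses per individual."""
        costs = {k: expected_tests_per_individual(p, k) for k in range(1, k_max + 1)}
        k_best = min(costs, key=costs.get)
        return k_best, costs[k_best]

    if __name__ == "__main__":
        for p in (0.01, 0.05, 0.10):
            k, cost = optimal_composite_size(p)
            print(f"prevalence {p:.2f}: composite size {k}, "
                  f"{cost:.3f} analyses per individual "
                  f"({100 * (1 - cost):.0f}% savings over individual testing)")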
REFERENCES
Barnett, V. and Lewis, T., 1984, Outliers in Statistical Data,
New York, J. Wiley.
Bei-Ling Lee, 1988, Power of the Geostatistical Wilcoxon Matched
-Pairs Signed-Ranks Test, Plan B Paper for M.S. Degree,
Statistics Department, University of Wyoming, Laramie,
Wyoming, 81 pp.
Borgman, L.E. and Quimby, W.F., 1988, Sampling for Tests of
Hypothesis When Data Are Correlated in Space and Time,
Principles of Environmental Sampling (L.H. Keith, Ed.) The
American Chemical Society, 25-43.
Borgman, L.E., 1988, New Advances in Methodology for Statistical
Tests Useful in Geostatistical Studies, Mathematical
Geology, 20(4):383-403.
Brown, K.W., Mullins, J.W., Richitt, E.P., Flatman, G.T., Black,
S.C., and Simon, S.J., 1985, Assessing Soil Lead
Contamination in Dallas, Texas, Environmental Monitoring and
Assessment 5:137-154.
David, M., Dagbert, M. and Beauchemin, Y., 1977, Correspondence
Analysis, Q. Colorado School of Mines, 71(1).
Englund, E. J. and Sparks, A. R., 1988, Geo-EAS (Geostatistical
Environmental Assessment Software) User's Guide,
EPA/600/4-88/033, U.S. Environmental Protection Agency.
Englund, E.J., 1990, A Variance of Geostatisticians, Mathematical
Geology 22:417-455.
Friedman, J.H., 1987, Exploratory Projection Pursuit, J. Amer.
Stat. Assn. 82(397):249-266
Garner, F.C., Stapanian, M.A., Yfantis, E.A., and Williams, L.R.,
1990, Probability Estimation with Sample Compositing
Techniques, Journal of Official Statistics (in press).
Gnanadesikan, R. and Kettenring, J.K., 1972, Robust Estimates,
Residuals and Outlier Detection with Multiresponse Data,
Biometrics 28:81-124.
Hagan, Randy L., 1982, Application of Spectral Theory and
Analysis in Mining Geostatistics and Statistical Linear Wave
Theory, Ph.D. Thesis, Statistics Department, University of
Wyoming, Laramie, Wyoming, 338 pp.
Isaaks, E. H., and Srivastava, R. M., 1989, An Introduction to
Applied Geostatistics, Oxford University Press, New York,
561 pp.
Journel, A.G. and Alabert, F., 1988, Focusing on Spatial
Connectivity of Extreme-valued Attributes: Stochastic
Indicator Models of Reservoir Heterogeneities, SPE paper
#18324, accepted for publication in SPE Formation
Evaluation.
Journel, A. G. and Huijbregts, Ch. J., 1978, Mining
Geostatistics, Academic Press, London, 600 pp.
Journel, A.G., 1989, Fundamentals of Geostatistics in Five
Lessons, Short Course in Geology, Vol. 8, American
Geophysical Union, Washington DC, 40 p.
Kane, V. , Ward, R.C. and Davis, G.J., 1985, Assessment of Linear
Dependencies in Multivariate Data. SIAM J. Sci. Stat. Comp.
6:1022-1032.
Leczynski, B., Mack, G.A., and Bemer, T., 1988, 1978 Population
Estimates from Fiscal Year 1982 Specimens: National Human
Adipose Tissue Survey Broad Scan Analysis, NHATS-SS-09 Final
Report.
Lund, R.E., 1975, Tables for an Approximate Test for Outliers in
Linear Models, Technometrics 17:473-476.
Mardia, K.V., 1970, Measures of Multivariate Skewness and
Kurtosis in Testing Normality and Robustness Studies,
Biometrika 57:519-530.
Mardia, K.V., 1974, Applications of Some Measures of Multivariate
Skewness and Kurtosis in Testing Normality and Robustness
Studies, Sankhya 36:115-128.
Mardia, K.V. and Kanazawa, M., 1983, The Null Distribution of
Multivariate Kurtosis, Commun. Statist. Theor. Meth. 12:569-
576.
Siotani, M., 1959, The Extreme Value of the Generalized Distances
of the Individual Points in the Multivariate Normal Sample,
Ann. Inst. Statist. Math. 10:183-203.
Stapanian, M.A., Garner, F.C., Fitzgerald, K.E., Yfantis, E.A.,
and Williams, L.R., 1990, Two Tests for Outliers in
Multivariate Data: Critical Values and Power Curves, (in
preparation, submitted to EPA for review).
Stapanian, M.A., Garner, F.C., Fitzgerald, K.E., and Flatman,
G.T., 1990, Accuracies of Three Approximations of Critical
Values of Mardia's Kurtosis (in preparation, submitted to
EPA for review).
Taheri, S.M., 1980, Data Retrieval and Multidimensional
Simulation of Mineral Resources, Ph.D. Thesis, Statistics
Department, University of Wyoming, Laramie, Wyoming, 265 pp.
Tietjen, G.L., Moore, R.H. and Beckman, R.J., 1973, Testing for a
Single Outlier in Simple Linear Regression. Technometrics
15:717-721.
Ward, R.C., Davis, G.J. and Kane, V.E., 1985, An Algorithm for
Assessing Linear Dependencies in Multivariate Data.
ACM-Trans. Math. Software, 11:170-182.
SPATIAL STATISTICS, COMPOSITE SAMPLING, AND RELATED ISSUES
IN SITE CHARACTERIZATION WITH TWO EXAMPLES
N. C. Bolgiano, G. P. Patil, and C. Taillie
Center for Statistical Ecology and Environmental Statistics
Department of Statistics
The Pennsylvania State University
University Park, PA 16802
ABSTRACT
Data from two Superfund sites, the Dallas Lead Site and the Palmerton
Site, were examined in order to assess what sources of variability were
present in the data and how sampling at other sites of similar character
should be designed. The results of the analysis indicate that the spatial
variability in the accumulated heavy metal deposition at each site appears to
be dominated by variability on the scale of the sampled region, by local
industrial contamination at the Dallas Lead Site, and by variability of nearby
soil volumes. Sampling of sites with these variability sources should
probably be designed to capture the large-scale trend, to identify the sources
and extent of local contamination, and to reduce the large variability of
nearby soil samples by composite sampling.
1. INTRODUCTION AND BACKGROUND
Superfund site-characterization requires that estimation of
contaminants be statistically and substantively accurate and precise while
maintaining cost-effectiveness of sampling and remediation. Costs are
incurred by society through exposure to hazardous wastes and by expensive
assessment and cleanup at hazardous waste sites. The purpose of this study
was to retrospectively examine data from two Superfund sites, the Dallas
Lead Site and the Palmerton Site, in order to examine what components of
variability were present in the data and to explore sampling schemes
efficient in producing data for accurate prediction of contaminants if
sites with similar variability are sampled in the future.
We approximate large-scale trend in the chemical concentration data by
fitting a bicubic spline function (Hayes and Halliday, 1974) and we judge
the presence or absence of a small-scale stochastic process contributing to
data variation by the presence or absence of autocorrelation in the spline fit
residuals. By choosing the number of knot points to be less than the number
of data points, the bicubic spline fit smooths, rather than interpolates, the
data. Decomposing spatial variability into a large-scale trend and a
small-scale stochastic process can be problematical if the trend model and the
covariance structure of the stochastic process are indeterminate (Armstrong,
1984). A common approach to approximating the trend is to fit polynomials in
local data neighborhoods. It would seem that as the data neighborhood size
decreases, the polynomial fit to these data might increasingly tend to fit
variation contributed by a small-scale stochastic process. Therefore, an
appeal of approximating the trend by noninterpolative bicubic splines is that
the fit is made in one operation to all the data, and small-scale variation
might not be fit if there is a high degree of smoothing. Subsequent to the
data decomposition, a crossvalidation study was performed in order to examine
how a response surface predicted by kriging might differ if the actual
sampling had been less intense.
1.1 The Two Sites
Data from the Dallas Lead Site and the Palmerton Site were chosen
for this exploratory analysis because of common features of these sites
and prior investigations: the similarity of the contamination processes,
of the sampling schemes, and of the extensiveness of prior statistical
analyses. The processes sampled at both sites were of heavy metal
accumulation from the fallout of air-borne particles emitted from point
sources. The sampling schemes were designed to provide information about
the data autocorrelation structure so as to enable response surface
prediction via kriging, composite sampling was utilized at both sites,
and special samples were taken to estimate variability due to certain
sources.
1.1.1 The Dallas Lead Site. The Dallas Lead Site, located in Dallas,
Texas, consisted of three areas sampled during 1982, with subsequent
assays of the soil lead content. These areas were in the vicinities of
the Dixie Metal Company smelter (DMC area) and of the RSR Corporation
smelter (RSR area), and in a region thought to be relatively uncontaminated
from smelter lead fallout (reference area). The areas were sampled in a
grid design, with grid squares measuring 228.6 m (750 ft) on a side. This
intersample distance was selected to be approximately 2/3 of the range of
autocorrelation estimated from prior monitoring. Four individual soil
cores, each 2 cm in diameter and 7.5 cm in depth, were collected from the
major compass points of the perimeter of a 10 m diameter circle and com-
posited. After the soil samples were bulked, dried, sieved, and mixed, the
soil lead content was measured on a 5 g aliquot of the approximately 160 g
sample. At some locations, duplicate samples were taken to assess the
variation of nearby composite samples. In addition, aliquots from three
equal portions of some samples were assayed in order to assess subsampling
and measurement error (Brown and Black, 1983; Brown et al., 1985).
1.1.2 The Palmerton Site. The Palmerton Site is centered around two
zinc smelters located near the town of Palmerton, Pennsylvania, in the
eastern part of the state. The first zinc smelter, which later became
known as the West Plant, began operation in 1898. The town of Palmerton
developed to the east of this smelter. Beginning in 1913, a second
smelter, which became known as the East Plant, began operation at a
location just east of Palmerton. Heavy metal-containing dust was emitted
from stacks, with the peak emissions probably occurring between 1949,
when a process called sintering began, and 1954 when emission controls
were installed. In 1980, the West Plant was shut down and primary zinc
smelting at the East Plant was modified to secondary refining.
Additional pollution control devices installed during 1967-1980 have
reduced heavy metal emissions to levels within National Ambient Air
Quality Standards. The contaminants of concern have been the heavy
metals cadmium (Cd), copper (Cu), lead (Pb), and zinc (Zn). Heavy metals
in the soil, groundwater, and surface water caused the site to be placed
on EPA's National Priorities List.
A Remedial Investigation was begun on off-plant property by Gulf and
Western, the former plant owner. This off-plant area consisted of
approximately 50 square miles of land which was thought to have been
contaminated over the years. As part of this remedial investigation,
soil sampling was undertaken to assess the extent and severity of heavy
metal contamination in the Palmerton area soils (R. E. Wright Associates,
1988).
The Palmerton Site was sampled in two phases during 1985-1986. Soil
samples were assayed for Cd, Cu, Pb, and Zn concentrations, though
measurements of Cu concentrations were not made on second phase samples
since first phase Cu levels were thought to be acceptably low. The initial
phase sampling design incorporated features of both grid and transect
designs. The grid, covering a large portion of the Palmerton residential
area between the two smelters, was designed to capture the data
autocorrelation structure, using a 366 m (1200 ft) range of autocorrelation
extrapolated from an analysis of the Dallas Lead Site data. A grid spacing of
122 m (400 ft), or 1/3 of the anticipated autocorrelation range, was selected.
The transect pattern was designed to delineate the overall trend in heavy
metal concentrations, while the second phase sample locations were located
to fill in gaps noted from first phase data (Figure 1). The lengths of
transects were calculated from process model predictions of heavy metal
accumulation. Gaussian dispersion models, based upon historical wind data
and topography, predicted that the majority of the deposition was expected
on the valley floor, oriented in the southwest to northeast direction.
Large deposition was also expected from air passing over Blue Mountain and
through the Lehigh River gap in the mountain, to the south of the smelters.
Differences in the spatial patterns of the different heavy metals were
anticipated, as most of the Pb was thought to have been emitted from the
West Plant, while most of the Cd and Zn emissions and all of the Cu
emissions were thought to have been emitted from the East Plant. Except
near the grid, the spacing of sample locations on the transects was either
345 m or 366 m (Starks et al., 1986, 1987; R. E. Wright Associates, 1988).
Soil sampling at the Palmerton Site was performed by compositing
four individual soil cores in the first sampling phase and nine
individual soil cores in the second sampling phase. In the first
sampling phase, individual soil cores taken at each of the major compass
points on a 6 m diameter circle were composited. In the second sampling
phase, individual cores from the four major compass points on a 6 m
diameter circle, from the four minor compass points on an inner 4.25 m
diameter circle and from the joint center point of both circles were
composited. Each soil core was 1.9 cm in diameter and 15 cm in depth.
After the soil samples were bulked, dried, sieved, and mixed, heavy metal
concentrations (ppm) were measured on 5 g aliquots (Starks et al., 1986;
Brown et al., 1989).
Soil samples were also taken in order to allow estimation of certain
variance components. Individual soil samples were not composited at ten
first phase locations so as to measure their variation and the variance
reduction from compositing. Duplicate composite samples were taken 0.5 m
apart for assessing the variation of nearby composite samples. Some
composite samples were split, for duplicate chemical analyses, so as to
estimate variation induced by subsampling and measurement (Starks et al.,
1986). The magnitude of the duplicate sample variability relative to
measurement error led to the decision to increase the composite sample
size in the second sampling phase from that used in the first sampling
phase (Starks et al., 1987).
Geostatistical analyses of the Palmerton Site data were previously
reported. No autocorrelation was judged to exist in the data of each sampling
phase after removal of trend fit by second-order polynomials in moving
neighborhoods. Kriging predictions were made using pure nugget
autocorrelation models (Starks et al., 1987; Brown et al., 1989).
1.2 The Sampling Model
The following model (Cressie, 1988) appears to have been implicitly
assumed in designing the Dallas Lead Site and the Palmerton Site sampling
schemes.
    Z(x) = μ(x) + ξ(x) + δ(x) + ν,                                        (1)

where Z(x) is the process of measured contaminant concentration at
location x, μ(x) is a large-scale deterministic trend, ξ(x) is a
small-scale stochastic process with Var(ξ) = σ_ξ² Σ, δ(x) is a micro-scale
stochastic process with Var(δ) = σ_δ² Δ, and ν represents measurement
errors, which are independent with Var(ν) = σ_ν². Σ and Δ are correlation
matrices.
The small-scale stochastic process is regarded as occurring on a spatial
scale larger than the minimum inter-sample distance (or at least the
smallest distance for which there is sufficient information), but much
smaller than the entire region. The micro-scale stochastic process is
regarded as occurring within the spatial scale of the minimum inter-sample
distance.
When the process, Z, is sampled, variability due to measurement is
added. This can be due to errors in recording the exact sampling
location or the soil volume extracted, inability to thoroughly mix the
composite sample, or inexactness in laboratory procedures.
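As a rough illustration of model (1), the following sketch simulates a
one-dimensional transect with an assumed large-scale trend, an autocorrelated
small-scale component, a micro-scale component, and independent measurement
error. The variance parameters are arbitrary choices made for illustration,
not estimates for either site.

    # Minimal sketch of Z(x) = mu(x) + xi(x) + delta(x) + nu on a 1-D transect.
    # All variance parameters are placeholders, not site estimates.
    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 10.0, 201)                    # locations (km)

    mu = 3.0 + 1.5 * np.exp(-((x - 5.0) / 2.0) ** 2)   # large-scale trend

    # Small-scale process: zero mean, exponential correlation (range 1 km).
    d = np.abs(np.subtract.outer(x, x))
    cov_xi = 0.20 * np.exp(-d / 1.0)
    xi = rng.multivariate_normal(np.zeros(x.size), cov_xi)

    delta = rng.normal(0.0, np.sqrt(0.05), x.size)     # micro-scale process
    nu = rng.normal(0.0, np.sqrt(0.005), x.size)       # measurement error

    z = mu + xi + delta + nu
    print("variance of simulated Z:", round(z.var(), 3))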
The model assumes stationarity at least within small data
neighborhoods. Three levels of stationarity can be distinguished. Strong
stationarity requires that the joint distribution of Z(x_1), Z(x_2), ...,
Z(x_n) is equivalent to the distribution of Z(x_1+h), Z(x_2+h), ..., Z(x_n+h)
for each vector h.
Second order stationarity requires that the means, E[Z(x)], and
variances, Var(Z(x)), exist, are constant and do not depend upon x, and also
that the covariance between Z(x_1) and Z(x_2) exists and depends only upon the
vector h = x_2 - x_1 joining x_1 and x_2. Under second order stationarity, the
correlogram, ρ(h) = Corr(Z(x), Z(x+h)), is related to the semivariogram, γ(h),
by

    2γ(h) = E[Z(x) - Z(x+h)]² = 2σ²[1 - ρ(h)],

where σ² = Var(Z(x)).
1.3 Composite Sampling
Composite sampling becomes a useful technique for reducing the
response surface prediction error when the support (i.e. area or volume)
that is used for sampling differs from that used for prediction. For
practical purposes, the sample support is usually much smaller than the
prediction support. For example, the prediction support might be the size
of a soil block removed in a single pass of a bulldozer (Englund, 1987).
The soil sampled for contaminant assays is typically much smaller than such
a soil block that might be remediated. Inference is desired on the spatial
scale of the estimation support, while minimizing the effect of
heterogeneity present on the scale of the sample support. Composite
sampling can decrease the micro-scale variance and thus the overall
response surface prediction variance (Starks, 1986). Composite sampling to
minimize the effect of such micro-scale variability upon inferences
targeted for a larger scale was applied to the Dallas Lead and Palmerton
Sites (Brown and Black, 1983; Starks et al., 1987). The effectiveness of
composite sampling in reducing micro-scale variation was also examined in this
study.
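A back-of-the-envelope version of this variance reduction, assuming that the
micro-scale deviations of the individual cores are uncorrelated and that
subsampling and measurement error are unaffected by compositing, is sketched
below; the variance values are placeholders rather than the pooled estimates
reported later in Table 2.

    # Minimal sketch: approximate variance of a measurement on a composite of
    # n cores, assuming independent micro-scale deviations among the cores.
    def composite_variance(micro_var, meas_var, n_cores):
        return micro_var / n_cores + meas_var

    micro_var, meas_var = 0.30, 0.005        # ln(ppm) scale, illustrative only
    for n in (1, 4, 9):
        print(f"{n} cores: variance {composite_variance(micro_var, meas_var, n):.4f}")
    # The gain from 1 -> 4 cores is much larger than from 4 -> 9 cores,
    # mirroring the diminishing returns noted later for the Palmerton data.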
2. METHODS
2.1 The Data Analysis
The data were first examined for inconsistencies. Then, some
features of the data were examined through frequency distributions,
contour plots, identification of outliers, plots of variance versus mean,
sample semivariograms, and variance component estimates from individual,
duplicate, and split sample data. Decomposition of the data into
large-scale trend versus small-scale stochastic process was explored
through fitting trend models via a bicubic spline algorithm of Dierckx
(1981) and examining residual semivariograms.
The sample semivariogram was calculated by

    γ̂(h) = (1 / (2 n(h))) Σ_{i=1}^{n(h)} [Z(x_i) - Z(x_i + h)]²,          (2)

where Z(x_i) is an observation at point x_i, Z(x_i + h) is an observation a
distance h away, and n(h) is the number of pairs of observations h units
apart. Intervals of h were determined so that n(h) was at least 30, a
minimum suggested by Journel and Huijbregts (1978).
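A direct implementation of (2) for irregularly spaced data, binning pair
distances so that each lag class contains at least 30 pairs as suggested
above, might look like the following sketch; the function and variable names,
and the synthetic data, are ours.

    # Minimal sketch of the omnidirectional sample semivariogram in (2).
    import numpy as np

    def sample_semivariogram(coords, z, bin_edges, min_pairs=30):
        """coords: (n, 2) locations; z: (n,) data values (e.g. ln(ppm))."""
        diff = coords[:, None, :] - coords[None, :, :]
        dist = np.sqrt((diff ** 2).sum(axis=-1))
        sq = 0.5 * (z[:, None] - z[None, :]) ** 2
        iu = np.triu_indices(len(z), k=1)              # count each pair once
        dist, sq = dist[iu], sq[iu]

        lags, gammas = [], []
        for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
            in_bin = (dist >= lo) & (dist < hi)
            if in_bin.sum() >= min_pairs:              # keep well-filled lag classes
                lags.append(dist[in_bin].mean())
                gammas.append(sq[in_bin].mean())
        return np.array(lags), np.array(gammas)

    if __name__ == "__main__":
        rng = np.random.default_rng(1)
        xy = rng.uniform(0.0, 4.0, size=(150, 2))      # synthetic locations (km)
        vals = rng.normal(size=150)                    # synthetic ln(ppm) data
        lags, gam = sample_semivariogram(xy, vals, np.linspace(0.0, 3.0, 13))
        print(np.column_stack([lags, gam]).round(3))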
2.2 The Spline Model
The data locations are in a closed rectangular domain
D = [a,b] x [c,d]. Here, we consider locations as coordinate pairs
(x,y). The spline function is

    s(x,y) = Σ_{q=-k}^{g} Σ_{r=-l}^{h} c_{q,r} M_{q,k+1}(x) N_{r,l+1}(y),

where the c_{q,r} are coefficients and the M_{q,k+1}(x) and N_{r,l+1}(y)
are normalized B-splines, defined on the knots

    λ_{-k} ≤ ... ≤ λ_0 = a < λ_1 < ... < λ_g < b = λ_{g+1} ≤ ... ≤ λ_{g+k+1},
    μ_{-l} ≤ ... ≤ μ_0 = c < μ_1 < ... < μ_h < d = μ_{h+1} ≤ ... ≤ μ_{h+l+1}.

On any subrectangle

    D_{i,j} = [λ_i, λ_{i+1}] x [μ_j, μ_{j+1}],   i = 0,1,...,g,  j = 0,1,...,h,

s(x,y) is given by a polynomial of degree k in x and l in y.
All derivatives

    ∂^{i+j} s(x,y) / ∂x^i ∂y^j   for 0 ≤ i ≤ k-1 and 0 ≤ j ≤ l-1

are continuous in D. The bicubic splines have the property that

    M_{q,k+1}(x) = 0  if x < λ_q or x > λ_{q+k+1},
    N_{r,l+1}(y) = 0  if y < μ_r or y > μ_{r+l+1}.

For data values z_r at points (x_r, y_r), r = 1,2,...,m, and
with positive weights w_r, the spline function is fit so that
a measure of the goodness of fit,

    E(c) = Σ_{r=1}^{m} w_r² [z_r - s(x_r, y_r)]²,

and a measure of the lack of smoothness in s,

    G(c) = Σ_{i=1}^{g} Σ_{r=-l}^{h} [ Σ_{q=-k}^{g} a_{i,q} c_{q,r} ]²
         + Σ_{j=1}^{h} Σ_{q=-k}^{g} [ Σ_{r=-l}^{h} b_{j,r} c_{q,r} ]²,

where the coefficients a_{i,q} and b_{j,r} express the discontinuity jumps
of the k-th and l-th order derivatives of s across the interior knots λ_i
and μ_j, satisfy the constraints:
    Minimize  G(c)
    subject to  E(c) ≤ S,

where S is a specified smoothing parameter.
In practice, it is suggested that the choice of S be in the range of
n ± √(2n) if the weights w_r = 1/σ, where σ is the anticipated error
standard deviation. The suggested maximum number of knot points is
k + 1 + √(n/2), where k is the polynomial order. As S decreases, more
knot points are added and a closer fit is realized. The number of knot
points and their positions are determined automatically by sequentially
adding knots at those locations where the fit is poorest (Dierckx,
1981).
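Dierckx's surface-fitting routines are distributed with FITPACK, which SciPy
wraps, so an approximate analogue of this procedure can be sketched as below,
with the smoothing factor s playing the role of S. This is an illustration on
synthetic data under those assumptions, not a reproduction of the analysis
reported here.

    # Minimal sketch of smoothed bicubic spline surface fitting via FITPACK
    # (scipy.interpolate.bisplrep / bisplev).  The data are synthetic.
    import numpy as np
    from scipy import interpolate

    rng = np.random.default_rng(2)
    x = rng.uniform(0.0, 4.0, 200)                     # easting (km)
    y = rng.uniform(0.0, 4.0, 200)                     # northing (km)
    z = np.log(50 + 400 * np.exp(-((x - 2) ** 2 + (y - 2) ** 2))) \
        + rng.normal(0.0, 0.5, 200)                    # synthetic ln(ppm) surface

    sigma = 0.5                                        # anticipated error std. dev.
    w = np.full(200, 1.0 / sigma)                      # weights w_r = 1/sigma
    s = 200.0                                          # smoothing factor (cf. S)

    # kx = ky = 3 gives a bicubic spline; knots are placed automatically.
    tck = interpolate.bisplrep(x, y, z, w=w, kx=3, ky=3, s=s)

    grid = np.linspace(0.0, 4.0, 41)
    surface = interpolate.bisplev(grid, grid, tck)     # fitted surface on a grid
    resid = z - np.array([interpolate.bisplev(xi, yi, tck) for xi, yi in zip(x, y)])
    print("knots in x:", len(tck[0]), " residual variance:", round(resid.var(), 3))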
2.3 A Crossvalidation Study of Sampling Intensity
A crossvalidation analysis was performed in order to examine how the
predictive power of data changed with different sampling intensities.
This was performed by resampling the data at intensities lower than the
realized intensity and assessing how well data omitted one at a time
could be predicted by the resampled data. Observations were randomly
assigned to one of two groups: a subset regarded as resampled points to
be used in calculating kriging predictions and a subset omitted in the
calculation of kriging predictions. The selection of data was performed
by first stratifying the data to insure that there was some degree of
systematic coverage of the sampled site, and then by permuting the
observations within strata and choosing permuted points in order until
the desired density was realized. All observations, from both the
resampled subset and its complement, were predicted from the resampled
subset, however. In each of 100 runs per selected sampling intensity,
each datum in the nonresampled subset was predicted from the resampled
data and each resampled datum was predicted from the remaining resampled
points. The measure of prediction performance was the mean squared error
(MSE) of crossvalidation,
    MSE of crossvalidation = (1/n) Σ_{i=1}^{n} (Z_i - Ẑ_i)²,              (7)

where Z_i is datum i, i = 1,...,n, and Ẑ_i is the predicted value of
Z_i. This statistic was also calculated for the case of all observations
being resampled. A measure of intersample distance of the resampled
points was calculated by the median minimum distance of points to neighbors.
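The resampling scheme can be sketched as follows. For brevity the
stratification step is omitted and a simple inverse-distance predictor stands
in for kriging, so this is only an illustration of the MSE-of-crossvalidation
calculation in (7), not the procedure actually used; all names are ours.

    # Minimal sketch: thin the data to a given fraction, predict every
    # observation from the retained subset (leaving the point itself out),
    # and compute the MSE of crossvalidation as in (7).
    import numpy as np

    def idw_predict(coords_obs, z_obs, point, power=2.0, n_neighbors=15):
        d = np.sqrt(((coords_obs - point) ** 2).sum(axis=1))
        d = np.maximum(d, 1e-9)                        # guard exact co-location
        idx = np.argsort(d)[:n_neighbors]
        wts = 1.0 / d[idx] ** power
        return float(np.sum(wts * z_obs[idx]) / wts.sum())

    def mse_of_crossvalidation(coords, z, keep_fraction, rng):
        n = len(z)
        keep = rng.permutation(n)[: int(round(keep_fraction * n))]
        errors = np.empty(n)
        for i in range(n):
            subset = keep[keep != i]                   # leave point i out
            errors[i] = z[i] - idw_predict(coords[subset], z[subset], coords[i])
        return float(np.mean(errors ** 2))

    if __name__ == "__main__":
        rng = np.random.default_rng(3)
        xy = rng.uniform(0.0, 4.0, size=(200, 2))      # synthetic locations (km)
        vals = np.sin(xy[:, 0]) + 0.3 * rng.normal(size=200)
        for frac in (1.0, 0.6, 0.4):
            print(frac, round(mse_of_crossvalidation(xy, vals, frac, rng), 4))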
3. RESULTS
3.1 Data Inconsistencies
Several apparent inconsistencies in both data sets were amended. A
single observation of 0 parts per million (ppm) Pb from the Dallas Lead
Site RSR area was omitted since it was considered suspect. In the Dallas
Lead Site data, there were 43 sets of two observations and 11 sets of
three observations having within-set intersample locations less than 0.91
m (3 ft) apart. Eight of the 11 sets of three observations corresponded
to measurements on 8 of the 11 split samples previously reported (Brown
et al., 1984, 1985). The other three split samples appeared to have been
represented by their means or by single split sample measurements.
The remaining 46 sets were considered to be duplicate sample measurements.
The concentrations and coordinates of the observations within sets were
averaged. In the Palmerton Site data, a discrepancy in the Zn data was traced
to the first phase Zn concentrations being reported in units of parts per
10^5 and the second phase observations in ppm.
3.2 Frequency Distributions and Summary Statistics
Frequency distributions of the measured concentrations indicated positive
skewness. Coefficients of variation were high, especially for the Dallas Lead
Site DMC and RSR Pb concentrations. The mean Pb concentration from the Dallas
Lead Site reference area was lower than the mean Pb concentrations from the
DMC and RSR areas. Of the Palmerton Site data, the mean Cd concentration was
lower than the mean Pb concentration, which was lower than the mean Zn
concentration (Figure 2, Table 1).
3.3 Contour Plots
Contour plots of the data indicated that the highest heavy metal
concentrations generally occurred in the center of the sampled regions, except
for the Dallas Lead Site reference area data for which no contamination focus
was noted. The contours of the Dallas Lead Site DMC and RSR area data tended
to be more circular than the contours of the Palmerton Site data, which tended
to be more elliptical. There were also contours of high concentration in the
Dallas Lead Site DMC and RSR area data located apart from the smelter
location, which was near the center of the sampled site (Figure 3).
3.4 Identification of Outliers
Outliers, or observations inconsistent with neighbors, were identified in
order to minimize noise in assessing semivariograms and to understand what
processes might be occurring at a local level. Isaaks and Srivastava (1988)
found that noise in sample semivariograms of Dallas Lead Site data was reduced
after omitting outliers.
Outliers from the Dallas Lead Site were previously identified
(George Flatman, personal communication, Appendix A). These were mostly
associated with industrial sites and coincide with the high local
contours of the Dallas Lead Site data. The outliers constituted some of
the largest measured Pb concentrations from Dallas Lead Site samples.
Of the DMC area data, the largest and 4 of the top 9 Pb measurements
were outliers. The largest and 6 of the top 7 RSR area Pb measurements
were outliers, while the largest and 4 of the top 8 reference area
measurements were outliers.
Outliers from the Palmerton Site were identified by noting
inconsistencies of measured concentrations with neighboring points. During
the first sampling phase, the occurrence of previous disturbance of the soil
by human activity was recorded, and the 31 of 100 first phase grid points that
were labeled as being from disturbed locations were omitted in the previous
geostatistical analysis. Some of these observations appeared to be consistent
with neighbors and were retained in this study. Some of the outliers
identified here were much lower in concentration than their neighbors and were
labeled as being from disturbed locations. Outliers were also identified as
those observations with spline model absolute residuals > 2.5, when S was 500
(Appendix A).
3.5 The Relationship Between the Variance and the Mean
Natural logarithm transformations were previously applied to the Dallas
Lead Site and the Palmerton Site data, prior to kriging, for the purpose of
inducing stationarity in the variance. The effect of this transformation upon
stabilizing the variance of duplicate samples had been examined for the
Palmerton Site data (Starks et al., 1987). Since there was much variation in
the duplicate samples of size two, we did not utilize the duplicate sample
data for the purpose of examining the relationship between mean and variance.
Instead, the sample data, excepting outliers, were subjectively grouped into
subsets of similar data value (Figure 4) and sample statistics were
calculated from data within subsets. This stratification was intended to
limit the effect of trend upon the variance-mean relationship. There appeared
to be a linear relationship between the sample standard deviation and the
sample mean and a natural logarithm transformation removed much of this
dependence (Figure 5).
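The grouping diagnostic can be illustrated on synthetic, positively skewed
data: group the observations into subsets of similar magnitude and compare
the standard deviation against the mean on the raw and ln-transformed scales.
The sketch below uses simulated lognormal data, not the site data.

    # Minimal sketch of the variance-mean diagnostic on synthetic data.
    import numpy as np

    rng = np.random.default_rng(4)
    ppm = rng.lognormal(mean=5.0, sigma=0.8, size=400)     # skewed "ppm" data

    order = np.argsort(ppm)
    for g in np.array_split(order, 8):                     # 8 subsets by value
        raw, logged = ppm[g], np.log(ppm[g])
        print(f"mean {raw.mean():8.1f}  sd {raw.std(ddof=1):8.1f}  "
              f"ln-mean {logged.mean():5.2f}  ln-sd {logged.std(ddof=1):5.2f}")
    # For lognormal-like data the raw standard deviation grows roughly in
    # proportion to the mean, while the sd of ln(ppm) is roughly constant.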
3.6 Sample Semivariograms
Sample semivariograms were calculated after transforming the measured
heavy metal concentrations by ln(ppm) to induce stationarity in the variance
and after omitting outliers. Semivariograms in the directions of 0, 45, 90,
and 135 degrees were calculated, except for the reference area data, for which
it was difficult to attain enough distance groups of size >30 to discern
patterns in the directional semivariograms.
Differences were observed in semivariogram patterns of the Dallas
Lead Site data. The Dallas Lead Site reference area data semivariogram
was roughly constant, while the semivariograms of the DMC area and RSR area
data increased, then leveled off, with increasing distance. There did not
appear to be a strong directional effect (Figure 6).
The Palmerton Site data semivariograms exhibited differences according to
direction, with the slightest increase with increasing distance noted in the 0
degree directional semivariogram. At distances less than 3 km, the greatest
increase with increasing distance was observed in the 90 degree directional
semivariogram. The 45, 90, and 135 degree directional semivariograms of the
first phase data were higher than the corresponding second phase data
semivariograms. This may largely be due to the sample locations of data and
not reflective of the differences in composite sample sizes. The similarities
in the 0 degree directional semivariograms and in the nuggets of the different
sampling phase data suggest that a substantial variance-reducing effect of
increasing the composite sample size from four to nine cannot be noted from
these sample semivariograms (Figure 6).
3.7 Pooled Sample Variances of Individual, Duplicate, and Split Samples
Sample variances of the individual, duplicate, and split sample sets were
pooled, after transforming the data by In(ppm), in order to: (1) assess the
effectiveness of composite sampling, and (2) allow estimates of micro-scale
variation and of subsampling and measurement error.
The effectiveness of composite sampling in reducing micro-scale variation
was noted by the decrease in the sizes of the individual and duplicate sample
variances of the Palmerton Site data as the composite sample size increased.
Though the relative differences in variance estimates were comparable between
the changes in composite sample sizes, the absolute magnitude decreased as the
composite sample size increased, indicating decreasing returns upon an
increase in the composite sample size. The pooled individual sample variance
was sensitive to a single individual soil sample exhibiting unusually low
heavy metal concentrations. Omission of this observation lowered the
estimated variance reduction by compositing (Table 2).
Further observations about the pooled sample variances were made.
Duplicate sample variances were larger than split sample variances. The split
sample variances of the second phase Palmerton Site data were lower than the
split sample variances of the first phase data, a decrease that Brown et al.
(1989) thought might be due to learning on the part of the analysts. Similar
duplicate and split sample variance estimates were noted between the Dallas
Lead Site Pb data and the Palmerton Site first phase Pb data (Table 2).
3.8 Decomposition of Spatial Variability into Regional Trend and Stochastic
Process
Cubic spline models were fit to the Dallas Lead Site and the Palmerton
Site data in order to decompose the spatial variability into regional trend
and a stochastic process. Several levels of S were selected to supply
varying degrees of fit, after omission of the identified outliers. Apparent
discontinuities associated with outliers caused rank deficiency in the design
matrix as S decreased. Weights of 2.0 were used, derived from an anticipated
error standard deviation of 0.5.
Smooth fits to the trend in the Dallas Lead Site data and in the
Palmerton Site data were provided by spline models with S values of 300 and
500, respectively. However, as S decreased, the spline models fit more of
the spatial variability, as indicated by increased complexity in contour plots
of fitted values and by lower sills of residual semivariograms. Sills were
considerably lowered by spline models with relatively few knots as compared
to semivariogram sills when no trend was removed (Figure 6). Omnidirectional
semivariograms were calculated for the Dallas Lead Site data as there was no
indication of anisotropy in these data. The Palmerton Site semivariogram was
collapsed over sampling phase, since the difference in the nugget from
increasing the composite sample size from four to nine appeared to be small
relative to noise in the semivariogram. This judgement was made from viewing
the semivariograms with no trend removed and from the pooled duplicate sample
variances. As S decreased, the semivariograms of spline fit residuals
tended to exhibit a pure nugget pattern, with little difference according to
direction (Figure 7).
The proportions of variability in the log(ppm) data attributable to
different sources were calculated from the spline model results and the
pooled sample variances. The proportions of variability due to combined
subsampling and measurement error, to micro-scale variation, and to
the combined effect of large-scale trend and local discontinuities
associated with outliers were calculated as
    σ̂_m² / σ̂²,     (σ̂_n² - σ̂_m²) / σ̂²,     (σ̂² - σ̂_n²) / σ̂²,

respectively, where σ̂_m², σ̂_n², and σ̂² are the estimated measurement error,
nugget, and sample variances of the ln(ppm) data, respectively.
The estimated proportion of variability due to subsampling and measurement
error was small, being less than 1%. The proportion of variability due to
micro-scale variation was larger, in the range of 15-25%, while the proportion
of variability due to both the large-scale trend and to outliers was the
largest, in the range of 75-84% (Table 3).
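Evaluating these expressions is straightforward; the sketch below uses the
Dallas Lead DMC area estimates from Table 3 as input and reproduces the
tabulated percentages.

    # Minimal sketch of the variability proportions defined above.
    def variability_proportions(total_var, nugget_var, meas_var):
        """Return (trend + outliers, micro-scale, measurement) shares."""
        trend = (total_var - nugget_var) / total_var
        micro = (nugget_var - meas_var) / total_var
        measurement = meas_var / total_var
        return trend, micro, measurement

    trend, micro, meas = variability_proportions(1.315, 0.313, 0.00528)
    print(f"trend/outliers {100 * trend:.1f}%, micro-scale {100 * micro:.1f}%, "
          f"measurement {100 * meas:.1f}%")     # approx. 76.2%, 23.4%, 0.4%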
3.9 Crossvalidation Study Results
The purpose of this study was to assess the spatial scale necessary to
capture the large-scale trend. Therefore, the identified outliers were
omitted from the crossvalidation analysis, since they probably could not be
predicted well from nearby data.
Omnidirectional spherical semivariograms were fit to the Dallas Lead
Site data and to the Palmerton Site data and were used in calculating the
kriging predictions. A linear anisotropic semivariogram provided a good fit
to the Palmerton Site semivariogram data, but when the data were thinned below
approximately half of the full data set, the kriging predictions with this
model were unstable. When more than half of the data was resampled, the
predictions based upon the spherical semivariogram and upon the linear
anisotropic semivariogram were very similar. Neighborhoods of points with
highest correlation were used in calculating kriging predictions, with
neighborhood sizes chosen to be 20 for the Dallas Lead Site data and 15 for
the Palmerton Site data. A Lagrange multiplier for a constant mean was used
in the calculation of kriging coefficients as it provided a lower MSE of
crossvalidation than kriging predictions calculated without Lagrange
multipliers.
The MSE of crossvalidation decreased in a nonlinear pattern with
increasing percentages of data in the resampled subset. There was relatively
little loss in prediction accuracy with up to about 60 percent omitted from
the data subset. For example, when the resampled subset constituted 42-45
percent of the nonoutlier data set, the increase in the MSE of crossvalidation
was just 10-17 percent over the MSE of crossvalidation when all nonoutliers
were resampled. The median minimum intersample distance among resampled
points exhibited a similar pattern with increasing resampling percentages
(Figure 8). Standard errors of the mean for the 100 runs were 0.0005-0.007
and 0.0002-0.006 for the MSE of crossvalidation and for the median minimum
intersample distance, respectively.
Standardizing intersample distances by estimated ranges of large-scale
autocorrelation presented a different view of the scale of sampling from that
anticipated in the sampling design. If the range of autocorrelation was
estimated from omnidirectional spherical semivariogram models fitted to the
data with only outliers omitted, then the median minimum intersample distances
were 11-15% and 4-5% of the estimated ranges of autocorrelation, respectively
(Figure 8). Calculating the intersample distance on the scale of the
large-scale process thus gave a sampling scale much less than the 1/3-2/3 of
the range of autocorrelation thought to have been required for sampling a
small-scale stochastic process.
The capability of estimating semivariograms was not lost when much of the
data was omitted, though there may have been too few points to reliably
estimate a semivariogram when only 17-19% of the Dallas Lead Site nonoutlier
data were resampled. The sample semivariogram patterns when data were omitted
in resampling were similar to the semivariogram patterns of all nonoutlier
data (Figure 9).
4. DISCUSSION
This retrospective analysis of data collected at the Dallas Lead and the
Palmerton Superfund Sites examined the nature of variability in the data and
sampling considerations for sites possessing the variability scales
hypothesized for these data. The size of variability sources and the spatial
scale on which they occur are important factors in designing an efficient
sampling scheme and accurately predicting a contaminant concentration response
surface. Being able to allocate resources in light of anticipated sizes of
variability sources can contribute to cost-effective sample design (Provost,
1984).
Traditionally, the sampling approach to geostatistics has implicitly
assumed that there exists a stochastic process nested within the regional
trend that can be captured by a systematic design (Yfantis et al., 1987;
Flatman et al., 1988). However, the Dallas Lead Site data and the Palmerton
Site data did not appear to exhibit small-scale variability. Instead, the
variability appeared to consist of a large-scale trend with discontinuities
caused by either local contamination processes or by local soil disturbance,
by variability occurring on a micro-scale, and by a very small measurement
error. The crossvalidation results implied that sampling of these large-scale
processes might have been suitably achieved with a larger scale of sampling
than the scale that was utilized.
One purpose of sampling a small-scale stochastic process is to achieve
stationarity in the data, or at least to be able to assume approximate
stationarity in local data neighborhoods. The importance of the stationarity
assumption may depend upon the use for which the response surface prediction
is intended and the importance of an accurate variance estimate. The kriging
predictions may be less sensitive to stationarity violations than are kriging
variance estimates, as the kriging weights are likely to be similar using
different, but reasonable, semivariogram models, while the kriging variance
estimate depends upon the assumed semivariogram. However, for these data, it
appears that calculation of kriging coefficients by other than a pure nugget
correlation model requires that the data correlation structure be modeled from
the large-scale trend. Other authors have modeled the correlation structure
of the large-scale trend and performed kriging using this estimated
autocorrelation (see, for example, Cressie, 1989).
If the data correlation structure is to be estimated from data sampled on
the large scale, then the guidelines for sampling on the small scale may not
apply. The grid locations at the Dallas Lead Site and at the Palmerton Site
were arranged to be 2/3 and 1/3, respectively, of the anticipated range of
autocorrelation. Sampling this scale of a small-scale stochastic process is
usually suggested for geostatistical studies (Flatman et al., 1988). If a
systematic sampling scheme is designed to capture the regional trend, the
intersample distance might be much smaller than 1/3-2/3 of the large-scale
autocorrelation range.
If local contamination processes are important components of the entire
contamination process, as was evident at the Dallas Lead Site, then sampling
can be designed to detect hotspots (see Chapter 10 in Gilbert, 1987), as well
as to capture the large-scale trend. However, at the Dallas Lead Site, the
local hotspots were largely associated with industrial sites, and sampling of
such locations might be planned rather than being randomly encountered. At
the Palmerton Site, the outliers constituted a lower proportion of the data
set and they were most frequently lower in concentration than their neighbors,
presumably because the soil had been previously disturbed.
Prediction of the extent of hotspot contamination around the local
industries would likely be overestimated if the autocorrelation structure of
the large-scale process were employed for interpolation (Gensheimer et al.,
1986). Perhaps the best data-based solution to predicting the extent of local
hotspot contamination is realized by further sampling near those locations.
Spatial variability can be modeled by deterministic or stochastic
components, or a combination of types, as in (1). Deterministic variability
can be viewed as being a product of underlying mechanistic factors. For
example, the large-scale variability at the examined sites might have
resulted from a physical process that is a function of gravity, air flow, and
contaminant particle size. Stochastic variability can be considered to have
arisen from processes of unknown causes independent of spatial location.
However, as a process is studied in greater detail, underlying causal
mechanisms may become apparent, and so the designation of deterministic or
stochastic is often applied relative to our scale of reference (Wilding and
Drees, 1983; Cressie, 1988). The modeling of variability as deterministic or
stochastic may depend, in part, upon the degree of model explanation versus
model empiricism (Lehmann, 1990) that is sufficient for site characterization.
Modeling the spatial process as a stochastic process is the assumption of
geostatistics and that approach has been valuable for mapping. However,
future site characterization needs may require models of a more deterministic
nature. At the Dallas Lead and the Palmerton Sites, it appears that most of
the spatial variability might be regarded as deterministic by our scale of
reference. Considering the spatial variability in a more deterministic nature
could allow answers to questions that have been posed for both sites.
Consideration of human health remediation at the Dallas Lead Site required
knowledge of how both smelters and motor vehicles contributed to soil lead
contamination (Carra, 1984). At the Palmerton Site, differences in lead
variability from cadmium and zinc variability might be attributable to the
additional contribution of lead from motor vehicles (Starks et al., 1987).
Much of the pattern of heavy metal contamination at the examined sites seems
to be related to spatial features, such as the alignment of the Palmerton Site
contamination contours with the ridge and valley topography, and the lead
level decrease in the Trinity River floodplain at the Dallas Lead Site DMC
area. Spatial data analysis and response surface prediction that combine
aspects of empiricism and determinism are likely to be topics of future
research.
The Palmerton Site data provided an opportunity to examine the efficacy
of composite sampling upon reducing micro-scale variation. Compositing
appeared to have been effective in reducing the micro-scale variation, as
indicated by the sizes of variance component estimates from individual,
duplicate, and split samples, though it appears that increasing the composite
sample size from four to nine may have diminished the variance reduction
returns compared to increasing the composite sample size from one to four.
Compositing of individual soil samples may be desired when soil sampling by
small cores and when the micro-scale variability is similar to that at the
Palmerton Site.
Cost-effective sampling is likely to be achieved when the sampling design
reflects knowledge about the sizes of variability components in the process of
interest and the spatial scales on which they occur. If the regional trend
and variation among nearby individual samples contribute significantly to
spatial variability, then the choices of a sampling scale in measuring the
important features of the regional trend and the use of composite sampling to
minimize the micro-scale variation might be important to achieving
cost-effectiveness in hazardous waste site characterization.
FIGURE 1. Sample location patterns of the soil samples taken at the Palmerton
Site. The coordinate space has apparently been rotated from its geographical
space, so that the easting coordinate is aligned with the ridge and valley
orientation. [Phase 1 and phase 2 sample locations are plotted by easting (km)
and northing (km).]
FIGURE 2. Sample frequency distributions of the Dallas Lead Site and the
Palmerton Site ppm measurements. Categories with zero observations have been
omitted.
TABLE 1. Descriptive statistics of the Dallas Lead and Palmerton Site
heavy metal concentrations. The statistics n, X, S, and
CV = 100*(S/X) represent the sample size, sample mean, sample standard
deviation, and coefficient of variation, respectively.

Site                          Metal     n       X        S      CV    Range
Dallas Lead reference area    Pb         88   125.2    123.1     98   16.0-703
Dallas Lead DMC area          Pb        180   429.9    999.7    233   24.4-10400
Dallas Lead RSR area          Pb        180   364.2    854.5    235   11.2-6060
Palmerton                     Cd        413    42.5     47.7    112   1.29-364
Palmerton                     Pb        413   207.1    205.6     99   7.30-1730
Palmerton                     Zn        413  3126.    4267.     137   146-40000
FIGURE 3. Contour plots of the Dallas Lead Site and the Palmerton Site
ppm measurements. The contour line increment was 100, 500, and 500 ppm
for the Dallas Lead Site reference area, DMC area, and RSR area data,
respectively, and it was 50, 250, and 2500 for the Palmerton Site Cd, Pb, and
Zn data, respectively. These plots were made by predicting concentrations at
grid locations using 1/distance^2 weighting of the 8 closest data points using
the SURFACE II algorithm (Sampson, 1984).
[FIGURE 4. Grouping of the Dallas Lead Site DMC area, RSR area, and Palmerton
Site sample locations into subsets of similar data value; easting (km) versus
northing (km).]

[FIGURE 5. Sample standard deviation versus sample mean for the grouped Dallas
Lead Site and Palmerton Site data on the log(ppm) scale.]

[FIGURE 6. Sample semivariograms of the ln(ppm) Dallas Lead Site reference,
DMC, and RSR area data and of the Palmerton Site data, plotted against
distance (km).]
TABLE 2. Pooled sample variances of individual, duplicate, and split
sample ln(ppm) data.

Dallas Lead Site
Sample        Area        Number     Variance
Component                 of Sets    Estimate
Duplicate     reference    9         0.0083
Duplicate     DMC         18         0.0157
Duplicate     RSR         19         0.0396
Split         combined    11         0.00528

Palmerton Site
Sample        Sampling    Number     Variance Estimate
Component     Phase       of Sets    Cd         Pb         Zn
Individual    1           10         0.766      0.584      0.336
Individual*   1           10         0.275      0.341      0.195
Duplicate     1           10         0.0690     0.0730     0.0989
Duplicate     2           11         0.0116     0.00876    0.0168
Split         1           10         0.00275    0.00453    0.00380
Split         2            7         0.000934   0.00193    0.00291

*After omitting a single individual sample of anomalously low
concentrations.
FIGURE 7. Contour plots of spline model fits and sample semivariograms
of spline fit residuals for the Dallas Lead Site DMC and RSR area data
and for the Palmerton Site data. [Panels show spline fits to the DMC Area Pb
and RSR Area Pb data for several values of S (150-300) and to the Palmerton
Cd, Pb, and Zn data for S = 500, 400, and 300, together with the corresponding
residual semivariograms; contour plots are easting (km) versus northing (km),
and semivariograms are plotted against distance (km).]
TABLE 3. The estimated components and proportions of variability
attributable to different sources in the Dallas Lead Site and the
Palmerton Site data.

                                    Variance Component Estimate
Site                      Metal     Total(1)   Nugget(2)   Measurement Error(3)
Dallas Lead - DMC Area    Pb        1.315      0.313       0.00528
Dallas Lead - RSR Area    Pb        1.277      0.314       0.00528
Palmerton                 Cd        1.238      0.199       0.00275
Palmerton                 Pb        0.803      0.201       0.00453
Palmerton                 Zn        1.266      0.206       0.00380

                                    Percent of Variability Estimate
Site                      Metal     Trend(4)   Micro-scale   Measurement
Dallas Lead - DMC Area    Pb        76.2       23.4          0.4
Dallas Lead - RSR Area    Pb        75.4       24.2          0.4
Palmerton                 Cd        83.9       15.9          0.2
Palmerton                 Pb        75.0       24.5          0.6
Palmerton                 Zn        83.7       16.0          0.3

(1) Sample variance of ln(ppm).
(2) Estimated by semivariogram of spline model residuals collapsed over
    distance (S=200 for Dallas Lead DMC and RSR area data, S=300 for
    Palmerton Site data).
(3) Includes error of subsampling.
(4) Includes local discontinuities of outliers.
FIGURE 8. The crossvalidation results: (A) MSE versus the percent of data
resampled, (B) median minimum intersample distance versus the percent of data
resampled (Palmerton Site Cd, Pb, Zn data were nearly identical so only the Cd
data is presented), and (C) MSE versus the median minimum intersample
distance scaled by the range of autocorrelation estimated by omnidirectional
spherical semivariogram models. Values are the means of 100 runs except for
the case of 100% of the data being resampled.
FIGURE 9. Sample semivariograms for different percentages of resampling for
the Dallas Lead Site DMC and RSR area data and for the Palmerton Site data.
The data are single resamples of the nonoutlier data and the smooth curve is
the spherical semivariogram fit to the 100% resampled data.
ACKNOWLEDGEMENTS
The work on this paper has been carried out with partial support of
EPA Research Grant CR 815 273 010 and SRA/EPA Research Contract 40400-S-01 to
the Penn State Center for Statistical Ecology and Environmental Statistics.
Our thanks are due to Karl Held of SRA Technologies, Inc. and to Herbert
Lacayo, Jr. and N. Phillip Ross of EPA Statistical Policy Branch for their
support and encouragement. We are also thankful to our colleague Marilyn T.
Boswell for his interest and several technical discussions.
REFERENCES
Armstrong, M. (1984). Problems with universal kriging. Mathematical
Geology, 16, 101-108.
Brown, K. W., Beckert, W. F. , Black, S. D., Flatman, G. T., Mullins, J.
W. , Richitt, E. P., and Simon, E. J. (1984). Documentation of
EMSL-LV Contribution to Dallas Lead Study. U.S. EPA 600/4-84-012.
Brown, K. W., and Black, S. C. (1983). Quality assurance and quality control
data validation procedures used for the Love Canal and Dallas lead soil
monitoring programs. Environmental Monitoring & Assessment, 3, 113-122.
Brown, K. W., Flatman, G. T., Englund, E. J., Shoener, E., Starks, T. H. ,
Rohde, S. C., Schnell, M. H., Fisher, N. J., Sparks, A. R., and
Gruber, D. K. (1989). Documentation of EMSL-LV contribution to
Palmerton, PA. Superfund remedial investigation. U.S. EPA.
Brown, K. W., Mullins, J. W. , Richitt, E. P., Flatman, G. T., Black, S. C.,
and Simon, S. J. (1985). Assessing soil lead contamination in Dallas,
Texas. Environmental Monitoring and Assessment, 5, 137-154.
Carra, J. S. (1984). Lead levels in blood of children around smelter sites in
Dallas. In Environmental Sampling for Hazardous Wastes, G. E. Schweitzer
and J. A. Santolucito, eds. American Chemical Society, Washington, DC.
pp. 53-66
Cressie, N. (1988). Spatial prediction and ordinary kriging. Math. Geol.,
20, 405-421.
Cressie, N. (1989). Geostatistics. Amer. Statistician, 43, 197-202.
Dierckx, P. (1981). An algorithm for surface-fitting with spline functions.
IMA J. Numerical Analysis, 1, 267-283.
Englund, E. (1987). Spatial autocorrelation: Implications for sampling and
estimation. In ASA/EPA Conference on Interpretation of Environmental
Data: III. Sampling and Site Selection in Environmental Studies, May
14-15, 1987, W. Liggett, ed. EPA-230-08-88-035, USEPA, Office of
Planning and Evaluation, Washington, D.C., pp 31-39.
Flatman, G. T., Englund, E. J., and Yfantis, E. A. (1988). Geostatistical
approaches to the design of sampling regimes. In Principles of
Environmental Sampling, L. Keith, ed. American Chemical Society, pp.
73-97.
Gensheimer, G. J., Tucker, W. A., and Denahan, S. A. (1986).
Cost-effective soil sampling strategies to determine amount of soils
requiring remediation. In Proceedings of the National Conference on
Hazardous Wastes and Hazardous Materials, March 4-6, 1986, Atlanta,
GA. pp. 76-79.
Gilbert, R. 0. (1987). Statistical Methods for Environmental Pollution
Monitoring. Van Nostrand Reinhold, New York.
Hayes, J. G., and Halliday, J. (1974). The least-squares fitting of cubic
spline surfaces to general data sets. J. Inst. Math. Appl., 14, 89-103.
Isaaks, E. H., and Srivastava, R. M. (1988). Spatial continuity measures for
probabilistic and deterministic geostatistics. Math. Geol., 20, 313-341.
Journel, A. G., and Huijbregts, C. J. (1978). Mining geostatistics.
Academic Press, London. 600 p.
Lehmann, E. L. (1990). Model specification: The views of Fisher and Neyman,
and later developments. Statistical Science, 5, 160-168.
Provost, L. P. (1984). Statistical methods in environmental sampling. In
Environmental Sampling for Hazardous Wastes, G. E. Schweitzer and J.
A. Santolucito, eds. American Chemical Society, Washington, DC. pp.
79-96.
R.E. Wright Associates, Inc. (1988). Palmerton Zinc Off-Site Study Area. Draft
of Remedial Investigation and Risk Assessment. Report prepared for Gulf
+ Western.
Sampson, R. J. (1984). SURFACE II Graphics System. Kansas Geological
Survey, Lawrence, KS.
Starks, T. H. (1986). Determination of support in soil sampling. Math.
Geology, 18, 529-537.
Starks, T. H., Brown, K. W., and Fisher, N. J. (1986). Preliminary
monitoring design for metal pollution in Palmerton, Pennsylvania.
In Quality Control in Remedial Site Investigation: Hazardous and
Industrial Solid Waste Testing, 5th Vol., C.L. Perket, ed. ASTM
925. ASTM, Philadelphia, PA. pp 57-66.
Starks, T. H., Sparks, A. R., and Brown, K. W. (1987). Geostatistical
analysis of Palmerton soil survey data. Environmental Monitoring
and Assessment, 9, 239-261.
Wilding, L. P., and Drees, L. R. (1983). Spatial variability and pedology.
In Pedogenesis and Soil Taxonomy. I. Concepts and Interactions. L. P.
Wilding, N. E. Smeck, and G. F. Hall, eds. Elsevier, N.Y. pp. 83-116
Yfantis, E. A., Flatman, G. T., and Behar, J. V. (1987). Efficiency of
kriging estimation for square, triangular, and hexagonal grids.
Math. Geology, 19, 183-205.
APPENDIX A
Dallas Lead Site and Palmerton Site Outliers
Dallas Lead Site Outliers
Area        Sample     (x,y) coordinates    ppm Pb    Comment
            Number     (ft)
reference   05523      (5256,3167)            324     Residential area
            05522      (6291,2803)            703     Residential area
            05506      (2139,6762)             16     Church lawn
            05858      (4928,6720)            334     Industrial area
            05504      (4935,7476)            394     Industrial area
DMC         01472      (5754,10869)          2950     Industrial area
            01499      (9203,9831)           2190     Industrial area
            01433      (11434,7567)         10400     Industrial area
            05069      (1595,6555)           2880     Hospital grounds
            01210      (8977,9292)           1060     Industrial area
RSR         05727      (3407,1203)           6060     Industrial area
            05032      (3667,10521)          5090     Residential area
            05563      (11318,6454)          2560     Industrial area
            00867      (11314,6503)          3390     Industrial area
            01430      (6436,5966)           4070     Industrial area
            06417*     (4533,5444)           3970     Industrial area
            06418*     (4532,5443)           7190     Industrial area
            05136      (1499,4580)            856     Industrial area
            05139      (2125,4671)            384     Industrial area
            05728      (2985,4656)            332     Industrial area
            00848*     (4496,3048)            827     Industrial area
            00847*     (4496,3048)            715     Industrial area
            05770      (4456,5438)            290     Industrial area

* - duplicate samples
Palmerton Site Outliers(1)

Sample    (x,y) coordinates      Metal ppm                Phase   Comment
Number    (thousands of ft)      Cd      Pb      Zn
BU30      (18.9576,22.3850)      2.14    32.5     410     1       low, disturbed
CD44      (22.7808,16.5992)      14.2    47.8     750     1       low, SE transect
BS34      (18.3543,20.8544)      22.5    89.     1610     1       low, Palmerton Hosp.
BN34      (16.3961,20.6209)      21.6    84.0    2195     1       low, disturbed
CS72      (29.1240,5.4517)       34.2    171.    1900     2       high, S. of Blue Mt.
BM31      (15.9516,25.7348)      66.7    394.    2800     2       high, N. of Palmerton



Palmerton Site Outliers(2)

Sample    (x,y) coordinates     Metal ppm                      Phase    Comment
Number    (thousands of ft)     Cd         Pb         Zn
AE20      (2.1702,26.4413)       39.40     590.0 +    1740              near highway
AH17      (3.4338,27.5655)       26.20     151.0      2900 +    1
AK12      (4.7963,29.4709)       29.50     770.0      2400 +    2       near highway
AO23      (6.0750,24.6304)        6.75      12.4 -     780      2
AR37      (7.3880,19.3162)        6.41 -   131.0       660 -
AS24      (7.6098,24.5743)        8.48       7.3 -     650
BN40      (16.4576,18.1245)     266.00     930.0     40000      1
BP36      (17.0910,19.9278)      18.20 -    92.0      6900      1       grid, disturbed
BT27      (18.9800,23.4898)       6.18 -    77.0       430      1       grid
BV30      (19.7373,22.3875)      12.80     161.0       680 -    1       grid
BV36      (19.8445,20.0746)      98.90      49.9 -    9400      1       grid
BZ28      (21.2817,22.9746)       9.52 -    46.0 -     660      1       S. of H. School
CA41      (21.5464,18.1092)     364.00    1730.0     13700 +    1
CB26      (22.0942,24.0434)      76.10     308.0       290 -    1
CD59      (22.9612,10.4366)       2.75 -    27.5       360
CE32      (23.2569,21.4739)      18.60 -    92.0      1320
CX34      (31.1226,20.7715)     179.00     640.0     22400 +    1
EF29      (45.0831,22.0612)       5.82 -    72.0       280
EI34      (46.3059,19.9951)      24.10      69.0      8600 +    1

(1) Defined prior to spline fit.
(2) Defined subsequent to spline fit (-, + denote outliers of low or high
concentration, respectively).
DISCUSSION TOPICS FOR SPATIAL STATISTICS AND COMPOSITE SAMPLING
SESSION 3
M.R. MURRAY
THE PENNSYLVANIA STATE UNIVERSITY
SOIL AND ENVIRONMENTAL CHEMISTRY LABORATORY
104 RESEARCH BUILDING A
UNIVERSITY PARK, PA 16802
I.	Comments on Stationarity:
The paper presented by Nick Bolgiano et al. serves to demon-
strate that inducing stationarity on nonstationary data is often
a difficult task. The decision as to whether to remove a trend
and/or to stabilize the variance prior to spatial modeling re-
quires an appreciation of the population under study, and a large
degree of personal judgment. In using geostatistics, we must make
assumptions that can be neither proven nor disproven in the pure
sense. At best, all we can do is to document our decision and to
explain unambiguously how the analysis was performed. From this
perspective, geostatistics should be treated as a heuristic
approach to spatial modeling. This is not to say, however, that
geostatistics should be discouraged as a tool in spatial model-
ing. Indeed, experience has shown that in many situations the
geostatistical approach to spatial modeling outperforms other
methods!
II.  Comments on the use of ordinary kriging at hazardous waste
sites:
The paper presented by Nick Bolgiano et al. briefly discusses the
use of alternative interpolation methods that could be used in
place of ordinary kriging. The point was made that when the
assumptions of stationarity are in question, the benefits of
kriging are questionable. Indeed, this point is well taken. If
one desires a contour map of the contaminant concentration over
the site, then perhaps a simpler procedure would suffice (e.g.,
inverse distance squared weighting). It must be remembered, how-
ever, that all spatial interpolation methods have assumptions
(whether deterministic or stochastic) and all of these procedures
have limitations as to their usefulness. Thus, the user must
choose the most appropriate method for the job at hand.
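A minimal sketch (in Python, with hypothetical coordinates and concentrations) of the inverse distance squared weighting alternative mentioned above; the estimate at an unsampled point is a weighted average of the observations, with weights proportional to 1/d^2:

import numpy as np

def idw_squared(xy_samples, z_samples, xy_target, eps=1e-12):
    # Inverse distance squared weighting: estimate z at xy_target
    # from values z_samples observed at locations xy_samples.
    d2 = np.sum((xy_samples - xy_target) ** 2, axis=1)   # squared distances
    if np.any(d2 < eps):                                  # target coincides with a sample
        return z_samples[np.argmin(d2)]
    w = 1.0 / d2                                          # weights proportional to 1/d^2
    return np.sum(w * z_samples) / np.sum(w)

# Hypothetical soil-lead values (ppm Pb) at four sampled locations (ft)
xy = np.array([[0.0, 0.0], [100.0, 0.0], [0.0, 100.0], [100.0, 100.0]])
z = np.array([324.0, 703.0, 394.0, 290.0])
print(idw_squared(xy, z, np.array([40.0, 60.0])))

Unlike kriging, the weights depend only on distance, so no variogram model (and hence no stationarity assumption) is required; the price is that no estimation variance accompanies the estimate.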
III.  Comments on the use of advanced geostatistical techniques at
hazardous waste sites:
If we accept the definition of risk as a measure of the
probability that a receptor is exposed to a danger or hazard, then a
measure of risk in a spatial framework could be very useful. The idea of
risk qualified mapping has proven to be a very beneficial tool in
the remediation processes at hazardous waste sites. While sever-
al kriging approaches could be used to develop risk qualified
maps, probability kriging (PK), developed by Journel and his
students at Stanford University, is, in my opinion, the most
appealing. Probability kriging allows one to develop posterior
cumulative distribution functions for unknown points within the
study site. Furthermore, PK utilizes, via cokriging, indicator
transformations of the original data and the uniform ranking of
the original data. Because both transformations can only take on
values between 0 and 1, PK is very robust against data outliers.
Furthermore, PK has been shown to be insensitive to nonstationar-
ity. This last point is purely a heuristic argument, since the
estimation of a local distribution function calls for strong
stationarity! In line with the development of PK, Journel and his
students have recently developed a conditional simulation ap-
proach called sequential indicator simulation (SIS). This ap-
proach allows one to produce an array of equiprobable maps of the
phenomena under study. Such maps provide a stochastic imaging of
uncertainty which should prove invaluable in the risk assessment
process.
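As a minimal sketch (in Python, with a hypothetical cleanup threshold and hypothetical concentrations), the two data transformations that probability kriging passes to cokriging can be computed as follows; the cokriging step itself is not shown:

import numpy as np

def indicator_transform(z, threshold):
    # Indicator coding: 1 if the observation does not exceed the
    # threshold, 0 otherwise.
    return (z <= threshold).astype(float)

def uniform_rank_transform(z):
    # Map each observation to its normalized rank in (0, 1].
    ranks = np.argsort(np.argsort(z)) + 1   # 1..n, smallest value gets rank 1
    return ranks / len(z)

# Hypothetical soil concentrations (ppm) and a hypothetical threshold
z = np.array([324.0, 2950.0, 703.0, 160.0, 6060.0, 394.0])
print(indicator_transform(z, threshold=500.0))   # [1. 0. 0. 1. 0. 1.]
print(uniform_rank_transform(z))                 # values in (0, 1]

Because both transformed variables are confined to the interval [0, 1], extreme concentrations cannot dominate the estimation, which is the robustness property noted above.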
IV. Suggestions for future research:
The annual report of research presented by Drs. Englund and
Flatman is a testimony to the significant contribution that the
EMSL group is making in the area of spatial statistics. One area
that I feel needs further research efforts is the incorporation of
additional information within geostatistical analysis. The use of
"soft" data or "fuzzy" data sets along with "hard" (precise) data
within the kriging framework could prove to be very cost effec-
tive, especially where the cost of chemical analysis is high. Recent
development of procedures such as soft kriging, Bayesian kriging,
and fuzzy set geostatistics warrants further investigation as to
their usefulness in the hazardous waste setting.
COMMENTS BY PARTICIPANTS
On Englund and Flatman
David Schaeffer (University of Illinois): 1) Considering your model data,
in particular the spatial structure, I'm wondering whether the M-dimensional time
series models of Leo Aroian et al. could be modified so that "time" is replaced
by "block" and the spatial dimensions correspond to sub block values, depths, or
other types of spatial or contaminant gradients. Examples of the latter are
geomorphology, groundwater flow, or some k = [k1, k2, ..., kn] set of concentrations
for other pollutants (that is, the gradient of the ki-th pollutant relative to
other pollutants). 2) Apply your methods to 2 or more contaminants. There is
no a priori reason to expect the various contaminants to be spatially correlated
with each other. On this assumption, certain cells of the grid would be + for
pollutant 1, and others + for pollutant 2, etc. Eventually, one might have all
grid cells + for some contaminant. Are you contemplating developing an index or
some other aggregation method to make an overall cleanup decision?
Roy Smith (U. S. Environmental Protection Agency): The validity of the
conclusions depends on the degree to which the test data represented actual site
conditions. The use of elevation data interpolated from contour maps may not
accurately represent chemical concentrations. The work would be more convincing
if the 'true' data population was assembled from chemical concentration data.
On Bolgiano, Patil, and Taillie
David Schaeffer (University of Illinois): Have you compared your results with
results from standard or dispersion models of the type EPA uses for air quality,
etc.? Could you combine these deterministic models with the
stochastic ones for detrending? For example: 1) use the dispersion model to
estimate expected concentrations; 2) compute residuals as the differences between
observed and predicted values; 3) carry out kriging on the residuals.
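A minimal sketch of this three-step combination (in Python; the dispersion_trend function is a hypothetical stand-in for an actual deterministic dispersion model, and the spatial interpolation of the residuals, which would be kriging, is passed in as a function):

import numpy as np

def dispersion_trend(xy):
    # Hypothetical stand-in for a deterministic dispersion-model
    # prediction of the expected concentration at each location.
    return 500.0 * np.exp(-np.linalg.norm(xy - np.array([50.0, 50.0]), axis=1) / 75.0)

def detrend_then_interpolate(xy_obs, z_obs, xy_new, interpolate):
    # 1) predict the trend at the data locations, 2) form residuals,
    # 3) interpolate the residuals (e.g., by kriging), then add the
    # trend back at the new locations.
    resid = z_obs - dispersion_trend(xy_obs)
    resid_new = np.array([interpolate(xy_obs, resid, p) for p in xy_new])
    return dispersion_trend(xy_new) + resid_new

# Example call, with any point interpolator (such as the idw_squared
# sketch earlier) standing in for kriging:
# zhat = detrend_then_interpolate(xy_obs, z_obs, xy_new, idw_squared)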
Roy L. Smith (U. S. Environmental Protection Agency): Selecting strata a
posteriori on the basis of the same data used in the analysis seems likely to
bias the results. Strata should have been selected using different data than
those used for kriging.
On Murray
Mindi Snoparsky (U. S. Environmental Protection Agency): I must reiterate
the problems associated with compositing!
Research money should be spent on the use of statistics in determining the
attainment of cleanup standards.
Compositing may be appropriate only in very specific areas of consistent metal
contamination that have homogeneous soils and geology (this is rare).
QUANTIFYING EFFECTS IN ECOLOGICAL SITE ASSESSMENTS:
BIOLOGICAL AND STATISTICAL CONSIDERATIONS
Lawrence A. Kapustka 1
Mostafa A. Shirazi 1
Greg Linder 2
1	U.S. EPA
2	NSI Technology Services Corporation
Environmental Research Laboratory
200 S.W. 35th Street
Corvallis, OR 97333
INTRODUCTION
The United States Environmental Protection Agency is respon-
sible for developing methods to control potentially harmful
chemicals entering the environment. The determination of harm is
often based on laboratory toxicity tests of chemicals using plants
and animals. The vast number of toxicity tests of single chemicals
that have been produced to date provide an extensive data base to
help determine potential harm of chemicals to humans and the
environment.
Within the conterminous states, the US has approximately
30,000 known hazardous waste sites (see Fig. 1). Many of these
are both large and located in environmentally sensitive settings.
In recent years, there has developed a growing awareness of the
importance of biological and ecological assessment in the US EPA
Remedial Investigation/Feasibility Study (RI/FS) process for
Superfund sites. Unfortunately, there is a tendency to collect
large amounts of information on numerous topics and then wonder
what it means. There is a great need to develop better study
designs and better statistical tools to improve our ability to
characterize environmental conditions, and to bring together the
reductionist and holistic approaches. Classical ecological studies
do not offer acceptable models to work from, primarily because the
sets of objectives for ecological site assessments must be made
substantively narrower than those of typical ecological studies.
At Corvallis, we have been working on various toxicity test
methods and on site characterization/assessment methods for some
time. Slightly more than two years ago, ERL-Corvallis began to
address ways of integrating methods to achieve meaningful ecolog-
ical assessments of hazardous waste sites. ERL-C has published
the first guidance document on ecological site assessment
methodologies (Warren-Hicks et al., 1989). The conceptual approach
to site assessment has been discussed in subsequent papers (Murphy
and Kapustka, 1989; Kapustka and Linder, 1989).
Examples of research activities at Corvallis are presented
below as distinct chapters even while they are related segments of
the same problem. Ideally, an actual example of the application of
our site assessment approach would help to integrate the chapters.
Plans are currently underway to apply these concepts to hazardous
waste sites for demonstrating the utility of various methods. The
chapter titles are:
BIOLOGICAL/ECOLOGICAL PERSPECTIVES (Kapustka)
TOXICOLOGICAL PERSPECTIVE (Kapustka)
MATHEMATICAL ANALYSIS OF TOXICITY DATA (Shirazi)
STATISTICAL APPROACHES TO ECOLOGICAL ASSESSMENT (Linder).
CONCLUSION (Kapustka)
Fig. 1. Distribution of hazardous waste sites in the conterminous
United States. (Map: Site Distribution.)
BIOLOGICAL/ECOLOGICAL PERSPECTIVE
When exposed to stressors, organisms potentially have many
responses. Through behavioral, morphological, physiological, and
genetic avenues, organisms can actually or effectively avoid a
stress. For instance, animals are capable of detecting some
chemical contaminants in food items; given an option to select
between "clean" and contaminated foods, they can avoid exposure.
Many microorganisms display chemotaxis as a means of directional
movement in response to chemicals. Plants often respond to
chemicals by differential growth rates, thus roots may have slower
growth into a contaminated mass of soil than roots on the same
plant extending into "clean" soil; again a means of avoidance.
Once an organism is exposed, there may exist biochemical and
physiological means of minimizing or avoiding the stress (e.g.,
detoxification, elimination, etc.). Longer term responses (days
to weeks) include acclimation. A stress that may be lethal in an
acute exposure may produce no detectable response if administered
gradually. Finally, over time periods measured in generations,
population genetics may come into play. Sensitive genotypes give
way to resistant genotypes; the effect may be no change in
population numbers or distribution, but implications for biologi-
cal fitness may be profound. Fundamentally, this is not different
from changes in communities where sensitive species are displaced
by resistant species; however, we are more accustomed to recogniz-
ing such changes.
Understanding the biology of stress encompasses some very
interesting and complex issues, some of which are encumbered by
semantics. For instance, we tend to associate stress with a
negative response. Increase in growth is seldom thought of as
negative. Yet the classical dose-response curve is characterized
by a range of low doses that stimulate growth followed by a gradual
change in sign of the response that eventually reaches death (e.g.,
auxin effects on plants as illustrated by the herbicide 2,4-D).
Even in such cases, the increased growth (stimulation if you will)
is not beneficial to the organism. Rather, it is a sublethal
response that lowers the organism's fitness. If confronted with a
different stress, the organism is less capable of coping.
Continued exposure to stress (a single stressor or combination of
stressors imposed simultaneously or sequentially) becomes
cumulative.
Ecology is an integrative discipline which draws upon diverse
sources of information (e.g., chemical, physical, geological,
biological, etc.) to describe the interactions of organisms,
populations, communities, and ecosystems with each other and their
surroundings. The purpose of an ecological assessment of a
hazardous waste site (HWS) is to determine if an adverse ecologi-
cal effect has occurred as a consequence of the materials present
at the site. The information gathered in the ecological assess-
ment should provide valuable insights into spatial distribution,
risk modeling, and evaluation of remediation options.
Hazardous waste sites have restricted access due to legal-
proprietary and human health risk considerations. Restricted
access imposes significant constraints on ecological assessment
and is the foremost reason for the paucity of ecological informa-
tion on existing sites. Precautions necessary to insure worker
safety add significantly to the cost of collecting site data.
Sample handling, chain of custody, and QA/QC requirements add
further to the special costs of assessing HWSs. Collectively,
these conditions lead to restricted, sometimes incomplete, data
sets upon which decisions must be made. Throughout a project, the
site assessment process must provide information that can feed into
critical decisions. These include determining the
o magnitude and extent of current impact,
o causality/weight of evidence,
o estimation of future impacts,
o merits of remediation options.
Consequently, it is exceedingly important that careful planning be
done to insure that the proper information is obtained in the
correct fashion. Sampling design and statistical assumptions must
be considered early on to achieve effective and efficient use of
resources.
TOXICOLOGY PERSPECTIVE
Chemical and biological interactions associated with expo-
sures to environmental contaminants may be evaluated according to
various assessment strategies. For example, both chemically-based
and toxicity-based approaches have made significant contributions
to ecological assessments for hazardous waste sites (Parkhurst et
al., 1989). But unifying chemical and biological response data
requires that statistical techniques not only be appropriate and
adequate for the task, but be carefully interpreted as part of the
site assessment.
An ecological assessment may be considered an integrated
evaluation of biological effects derived through measurements of
toxicity and exposure (US EPA, 1988a; 1989b). As complex interre-
lated functions, toxicity and exposure assessments yield estimates
of hazard (Fig. 2) which are associated with environmental
contaminants in various matrices sampled at a site. Depending upon
the environmental matrix being tested, toxicity assessments may be
derived using test methods which evaluate freshwater, marine, or
terrestrial biota.
Figure 2. Toxicity and exposure assessments as constituents of
hazard evaluation. (Diagram: Toxicity Assessment + Exposure
Assessment -> Hazard Evaluation.)
In evaluating adverse biological endpoints, toxicity assess-
ment methods frequently yield information regarding acute responses
elicited by environmental contaminants, e.g., dichotomously
distributed variables like mortality data. Additionally, toxicity
evaluations may yield continuously distributed variables such as
growth, as well as quantal response data such as teratogenic or
genotoxic endpoints, which may suggest longer-term biological
effects associated with subacute and chronic exposures to complex
chemical mixtures characteristic of most hazardous waste site
exposures. Resident populations of plants, microbes and animals
may respond to environmental contaminants if exposure occurs.
Within the context of ecological assessment, adverse biological
effects potentially associated with these exposures may suggest
various applications for toxicity testing throughout the site
assessment process (Athey et al., 1987; Warren-Hicks et al., 1989).
For example, in terrestrial and wetland habitats, toxicity
estimates for contaminated soil may be derived from phytotoxicity
tests as well as animal and microbial test systems. Linkages
between chemical contaminants and adverse ecological effects,
however, require not only toxicity evaluations of representative
species but adequate chemical analyses for deriving strong
inferences regarding causality (Parkhurst et al., 1989; Stevens et
al., 1989). When toxicity assessments are combined with (1)
chemical analyses which evaluate pertinent site samples and (2)
field surveys which measure ecological endpoints, higher level
biological organization (e.g., populations and communities) may be
evaluated during the site assessment process. Ecological data,
then, as well as toxicological and chemical information, are
prerequisites for developing and implementing sound ecological
management practices. Linkages among these information subsets
(chemical, toxicological, and ecological) may be established using
statistical and mathematical methods.
MATHEMATICAL ANALYSIS OF TOXICITY DATA
Hazardous waste sites often release complex mixtures of toxic
chemicals to the environment, but biological effects data on
mixtures are far less abundant than on single chemicals. The
feasibility of relating the combined effect of a mixture to the
effects of individual chemicals in the mixture has been a long
range goal in toxicology. The achievement of this goal is an
important element of applying the vast quantities of existing
singly-tested chemical toxicity data to hazard assessment.
Toxicity tests with different chemicals are known to produce
a wide range of dose-time-response surfaces. Shirazi and Lowrie
(1990) successfully parameterized the response surfaces of 470
industrial chemicals to fathead minnows. The parameters of the
model are used as scalers to produce commensurate concentration
and time coordinates and to provide continuous measures of class
boundaries for different modes of biological response. These same
model parameters were used by Shirazi and Linder (1990) for
predicting toxicities of mixtures of narcotic chemicals from
parameters of singly tested chemicals. The database and a theore-
tical extension of these studies are given here to demonstrate the
utility and the power of parameterization in making use of existing
and future data for hazard assessment.
The model describes the response surface R as a function of
the concentration C, the time t, and four parameters a, b, k, and r,
such that

    R(C,t) = exp[-(kC)^a (rt)^b].

The scale and rate factors k and r are measures of toxic rates of
action for concentration and time, respectively, and the form
factors a and b quantify the strategies of response to increasing
concentration and duration of exposure.
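A minimal sketch (in Python) of evaluating this response surface; the parameter values below are illustrative only and are not taken from Table 4:

import numpy as np

def response_surface(C, t, a, b, k, r):
    # R(C,t) = exp[-(kC)^a * (rt)^b]; R falls from 1 toward 0 as
    # concentration and exposure time increase.
    return np.exp(-((k * C) ** a) * ((r * t) ** b))

# Illustrative values near the reported mean form factors
a, b = 4.58, 1.16
k, r = 10 ** -2.0, 10 ** -1.9               # illustrative scale and rate factors
C = np.array([10.0, 50.0, 100.0, 200.0])    # concentrations (ppm)
print(response_surface(C, t=96.0, a=a, b=b, k=k, r=r))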
The parameters of the equation, calculated from test data by
Shirazi and Lowrie (1990) for fathead minnows, are presented in
Table 4. The capitalized scale factor K applies to the data at
the end of the 96-hr test. The last column in the table is a constant
obtained from fitting the data with the above model. The calcula-
tion procedure and additional interpretations of the results are
detailed in Shirazi and Lowrie (1990). In addition to model
parameters, the table contains the octanol/water partition
coefficient (log P), the molecular weight (Mw), and the chemical
registry number (CAS). The mean values of the form factors are
a = 4.58 and b = 1.16. There is a negative correlation of log P with
log LC50 (-0.73) and a positive correlation with Mw (0.61). Log
P is uncorrelated with the form factors a and b.
The relationship between Log P, Mw, and Log LC50 is given in
Table 1. This summary presentation of the interrelationships is
obtained from fitting normal trivariate probability functions to
the data. The table shows that toxicity increases with increasing
values of Mw and log P and decreases with decreasing values of log
P and Mw.
Shirazi and Linder (1990) showed that the scale factor KM of
a mixture of two or more narcotic chemicals has a simple rela-
tionship with the scale factors K1, K2, etc. of the singly-tested
chemicals when mixed in proportions f1, f2, etc. of chemicals 1, 2, etc.
The relationship is:

    1/KM = f1/(K1 ft) + f2/(K2 ft) + ...,

where f1 + f2 + ... = ft. This relationship was verified with
data for a one-dimensional response curve:

    R(C) = exp[-(KC)^a].

The same relationship may be presumed to hold for k and the rate
factor r in a two-dimensional system.
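A minimal sketch (in Python, with hypothetical component scale factors) of the mixture rule as written above; for proportions summing to ft = 1 it reduces to a weighted harmonic mean of the component scale factors:

import numpy as np

def mixture_scale_factor(K, f):
    # 1/KM = sum_i f_i / (K_i * f_t), where f_t = sum_i f_i.
    K = np.asarray(K, dtype=float)
    f = np.asarray(f, dtype=float)
    ft = f.sum()
    return 1.0 / np.sum(f / (K * ft))

# Hypothetical equal-proportion mixture of two narcotic chemicals
print(mixture_scale_factor(K=[10 ** -1.5, 10 ** -2.5], f=[0.5, 0.5]))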
The relationships between mixture form factors and the form
factors of singly-tested chemicals are not direct but can be
calculated from the one- or the two-dimensional equations of the
response relationship with respect to two dimensionless coordinates
for concentrations (kC) and (KC) and time (rt) according to the
procedure proposed by Shirazi and Linder (1990).
Examples of mixture form factors am for a one-dimensional
response and equal proportions of two, three, and four chemicals
in a mixture are presented in Table 2. Likewise, examples of
mixture form factors am and bm for a two-dimensional response and
equal proportions of two chemicals in a mixture are presented in
Table 3. The relationships are highly nonlinear for form factors
that are widely different. For example, a1 = 0.4 and a2 = 8.0
yields am = 1.08 (Table 2). An approximate linear relationship of
mixture form factors and the singly-tested form factors can be
obtained from an arithmetic average only when all form factors are
nearly equal.
The emphasis in these calculations is upon a systematic
compilation, interpretation, and analysis of test data to facili-
tate their use, to extend and expand their utility in environmental
management, and to support the design of new tests. The work is
one of many steps that are being taken to use biological data for
assessing environmental problems of hazardous waste.
The problems of using toxicity tests for assessing biologi-
cal impact of complex chemicals in a real environment are many even
when the focus is the analysis of dose-response tests. For
example, the measured response may not be monotonic and symmetri-
cal with respect to the mean. The response surface may be multi-
modal, containing one or more stimulatory and inhibitory segments
in the same test. The measured response may be weak compared with
variability due to experimental error, making the determination of
various modes difficult even when they are expected on biological
grounds. The data may contain measured extreme-value outliers that
are valid responses given the potential biological sources of
nonuniformity in the populations of test organisms. These
situations present difficulties that require special, if at times
unconventional, analyses. The conventional approach of condensing
a whole dose-time-response curve to a single median concentration
endpoint is inadequate and could at times be misleading.
STATISTICAL APPROACHES TO ECOLOGICAL ASSESSMENT
Because waste sites and reference sites are nonrandom samples,
most classical approaches to statistical analysis (e.g., hypothesis
testing and analysis of variance) may not be the methods of choice
in ecological assessments for hazardous waste sites (Stevens,
1989). Unless these potential flaws in quantitative analysis are
addressed, hazardous waste site assessment should rely on tech-
niques which are more appropriately identified as being exploratory
data analysis in character. Various statistical methods may be
applied and yield a framework wherein chemical, toxicological, and
ecological information become integrated. These component parts,
then, become building blocks within the site assessment process.
Depending upon the effort invested in gathering site information,
the resulting data should yield a framework for an ecological
assessment for a hazardous waste site. The chemical, toxicologi-
cal, and ecological information collected for a site may be
balanced or weighted among these component parts, depending upon
site-specific characteristics. Historically, for example,
chemically-based methods were the primary assessment tools applied
to hazardous waste site evaluations, regardless of whether the
concerns regarded human health or ecological effects. Causal
linkages between adverse biological responses and contaminant
presence were assumed, and were based largely on extrapolation from
laboratory-derived single-compound toxicity evaluations to field
settings most frequently characterized by complex chemical mixture
exposures. However, if toxicity-based criteria and ecological
survey data were considered complementary components to chemical
analyses during site assessment, then statistical methods could
integrate these component data sets (Fig. 3). Management decisions
regarding the environmental hazard associated with chemical
contaminants at the site could be developed using an integrated
assessment strategy and would not rely exclusively on chemical
analyses; for most environmental hazard assessments, toxicity-
based criteria have become increasingly important owing to the
complex chemical mixtures characteristic of environmental exposure.
Toxicity assessments which evaluate adverse effects through
Figure 3. Statistical or quantitative methods should be used to
integrate sources of information (toxicity, chemical, and ecolog-
ical) into the site assessment. (Diagram: Toxicity data, Chemical
data, and Ecological data -> Statistical or Quantitative
Integration -> Contribution to Site Assessment.)
measurement of biological endpoints (Parkhurst et al., 1989) and
field surveys which measure ecological endpoints indicative of
higher level structure and function (Bromenshenk, 1989; Kapustka,
1989; LaPoint and Fairchild, 1989; McBee, 1989) contribute to the
environmental hazard assessment process and enhance resource
management during all phases of the evaluation process. For
establishing these critical linkages among chemical, toxicologi-
cal, and ecological information, the quantitative methods most
appropriate for these integrations may be suggested by the data
collections themselves, and may include various methods which have
found past applications in applied ecology and environmental impact
assessment (Cairns et al., 1979; Clarke, 1986).
Multivariate analysis. Independent of the applications apparent
within the context of contaminant ecology, applied multivariate
techniques (e.g., direct gradient analysis, ordination, and
classification) have had a recurring role in ecological research,
and have been used within a variety of settings, including
terrestrial and aquatic habitats (freshwater, estuarine, and
marine, as well as freshwater and estuarine wetlands); histor-
ically, a wide variety of ecological endpoints (e.g., populations
and communities) have been the primary focus in these applications
which have classically evaluated vegetation or microbial and animal
populations or communities which were subjected to naturally-
occurring "stressors" (e.g., temporal and spatial habitat varia-
tion; environmental perturbations such as fire) or anthropogenic
sources of habitat alteration. Many compilations and reference
texts are available and provide starting points for evaluating the
past record of these techniques (Capen, 1981; Gauch, 1982; Gilbert,
1987; Orloci et al., 1979). Their application to hazardous waste
site assessment may be estimated from a review of the applied
literature, and these approaches should be adequate, if judged
pertinent to site assessment during the early stages of work plan
development.
Time series analysis. While the time constraints of hazardous
waste site work may preclude long-term studies on any one site,
various methods drawn from statistical time series analysis may be
applicable to site evaluation, particularly since the site has
been, and will continually be, "changing" with time. Indeed, the
potentially dynamic character of waste sites, particularly those
considered from their initial "discovery and listing" through
various stages of "clean up and restoration," suggest various time
series techniques (e.g., trend analysis) which may repeatedly
contribute to a specific site assessment during its "life history."
Additionally, the historical information which is available for a
particular site may afford the opportunity to conduct a variety of
techniques drawn from time series analysis; the application of time
series analysis has found wide application within basic ecological
research, and numerous references are available which should be
considered within the setting of hazardous waste site assessment
(Anderson, 1976; Box and Jenkins, 1970; Cormack and Ord, 1979;
Fuller, 1976; Shugart, 1978).
Geostatistical analysis. Recently, the description and inter-
pretation of spatial distributions for waste site contaminants have
increasingly been applied to exposure assessments (Flatman, 1984;
Journel, 1984; US EPA, 1988b), and the coincidence in patterns
which may be apparent between contaminant and toxicity distribu-
tions has been tentatively applied toward linking these measures
within a site assessment (Linder et al., 1989). Within waste site
settings, applied geostatistical analysis has found applications
in soil and sediment evaluations; while primarily applied to
mapping exercises for plotting contaminant distributions within
landscape settings, the roles of variogram analysis and kriging may
be of greater value beyond that contribution which is required in
developing contaminant distribution maps (Clark, 1979; Journel and
Huijbregts, 1978).
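A minimal sketch (in Python, with hypothetical coordinates and values) of the empirical, omnidirectional semivariogram that underlies variogram analysis; fitting a model variogram and the subsequent kriging step are not shown:

import numpy as np

def empirical_semivariogram(xy, z, bin_edges):
    # Average 0.5*(z_i - z_j)^2 over all sample pairs whose separation
    # distance falls in each lag bin (omnidirectional).
    n = len(z)
    i, j = np.triu_indices(n, k=1)                # all distinct pairs
    h = np.linalg.norm(xy[i] - xy[j], axis=1)     # pair separation distances
    sqdiff = 0.5 * (z[i] - z[j]) ** 2
    gamma = np.full(len(bin_edges) - 1, np.nan)
    for b in range(len(bin_edges) - 1):
        sel = (h >= bin_edges[b]) & (h < bin_edges[b + 1])
        if sel.any():
            gamma[b] = sqdiff[sel].mean()
    return gamma

# Hypothetical: 50 random sample locations with a weak spatial trend
rng = np.random.default_rng(0)
xy = rng.uniform(0, 1000, size=(50, 2))
z = 0.002 * xy[:, 0] + rng.normal(0, 0.5, size=50)
print(empirical_semivariogram(xy, z, bin_edges=np.arange(0, 600, 100)))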
Environmental sampling and study design. Regardless of the
statistical methods used in evaluating chemical, toxicity, and
ecological data collected for a site, the most critical problems
which should be considered in the site work plans revolve about
field sampling and its design and implementation. Without
adequate, well-designed field sampling plans, the subsequent data
analysis could become a secondary issue, particularly within the
context of litigation. Various references have been compiled which
address the problems of field sampling within an ecological context
(Barrett and Nutt, 1979; Cormack et al., 1979; Green, 1979; Krebs,
1989), and recent efforts to delineate these issues within an
applied context have considered hazardous waste sites specifically
(Keith, 1988; Schweitzer and Santolucito, 1984; US EPA, 1989b).
CONCLUSION. Ecological assessments for hazardous waste sites
should include acute toxicity tests which most frequently measure
mortality, and short-term tests which measure biological endpoints
other than death. Toxicity assessment tools, then, may yield
information regarding acute biological responses elicited by site
samples as well as suggest longer-term biological effects (e.g.,
genotoxicity or teratogenicity) potentially associated with
subacute and chronic exposures to complex chemical mixtures
characteristic of hazardous waste sites (Kapustka and Linder, 1989;
Murphy and Kapustka, 1989). Toxicity evaluation methods which
contribute to site assessment should reflect site-specific demands
implicit to the ecological assessment process; however, toxicity
tests are but one component of an ecological assessment for a
hazardous waste site. Strongest inferences regarding the
coincidence of contaminants and biological response may be derived
from sampling plans which consider both toxicity and chemical
characterization, yet an ecological assessment must also consider
field components early in site evaluation. This becomes particu-
larly important when field sampling is considered, since integra-
tion of toxicity assessments (be those in situ or laboratory-
generated), chemical analyses, and field assessments requires a
well-designed sample plan to establish linkages among toxicity,
site-sample chemistry, and adverse ecological effects. Spatial
statistical techniques like kriging are finding increased applica-
tions in linking toxicity with other elements of site evaluation
(e.g., field-sample chemistry). Through kriging, for example,
areal distributions for site-specific toxicity and chemistry data
sets may be derived; then, "distribution maps" for toxicity and
chemistry data may be overlaid. Patterns of coincidence apparent
in these distributions may then suggest linkages among toxicity,
site contaminants, and adverse ecological effects. Similarly,
multivariate techniques, particularly direct gradient and cluster
analysis, appear quite relevant to hazardous waste site assess-
ment. The applied ecological research literature presents numerous
case histories frequently developed from studies concerned directly
with habitat alteration consequent to anthropogenic activities
(e.g., mining and agricultural practices, as well as aquatic impact
assessments for effluent discharges into lotic systems), and these
methods may be pertinent to site assessment for aquatic or
terrestrial sites. Time series analysis, while not having a
history in waste site assessment, offers numerous techniques which
would appear appropriate to site assessments; these methods may be
particularly significant, if the entire "life history" of the
hazardous waste site is considered during the early phases in work
plan development.
LITERATURE CITED
Anderson, O.D. 1976. Time series analysis and forecasting: the
Box-Jenkins approach. Butterworths, London, England.
Athey, L.A., J.M. Thomas, J.R. Skalski, and W.E. Miller. 1987.
Role of acute toxicity bioassays in the remedial action
process at hazardous waste sites. EPA/600/8-87/044. U.S.
Environmental Protection Agency, Environmental Research
Laboratory, Corvallis, OR.
Barrett, J.P., and M.E. Nutt. 1979. Survey sampling in the
environmental sciences. COMPress, Inc., Wentworth, New
Hampshire.
Box, G.E.P., and G.M. Jenkins. 1970. Time series analysis:
forecasting and control. Holden-Day, San Francisco,
California.
Bromenshenk, J. 1989. Terrestrial invertebrate surveys. In W.
Warren-Hicks, B. Parkhurst, and S. Baker, Jr. (eds.).
Ecological assessment of hazardous waste sites. EPA/600/3-
89/013. U.S. Environmental Protection Agency, Environmental
Research Laboratory, Corvallis, OR.
Cairns, Jr., J., G.P. Patil, and W.E. Waters (eds.). 1979.
Environmental biomonitoring, assessment, prediction, and
management—certain case studies and related quantitative
issues. International Cooperative Publishing House, Fairland,
MD.
Capen, D.E. (ed.). 1981. The use of multivariate statistics in
studies of wildlife habitat. USDA Forest Service, General
Technical Report RM-87. Rocky Mountain Forest and Range
Experiment Station, Fort Collins, Colorado.
Clarke, R. (ed.). 1986. The handbook of ecological monitoring.
Oxford Science Publications, Clarendon Press, Oxford, England.
Clark, I. 1979. Practical geostatistics. Elsevier Applied
Science, New York, NY.
Cormack, R.M., G.P. Patil, and D.S. Dearborn (eds). 1979.
Sampling biological populations. International Cooperative
Publishing House, Fairland, MD.
Cormack, R.M., and J.K. Ord (eds.). 1979. Spatial and temporal
analysis in ecology. International Cooperative Publishing
House, Fairland, MD.
Flatman, G.T. 1984. Using geostatistics in assessing lead
contamination near smelters. In G.E. Schweitzer and J.A.
Santolucito (eds.), Environmental sampling for hazardous waste
sites. American Chemical Society, Washington, D.C.
Fuller, Wayne. 1976. Introduction to statistical time series.
Wiley and Sons, New York, NY.
Gauch, Jr., H.G. 1982. Multivariate analysis in community
ecology. Cambridge University Press, Cambridge, England.
Gilbert, R.O. 1987. Statistical methods for environmental
pollution monitoring. Van Nostrand Reinhold, New York, NY.
Green, R.H. 1979. Sampling design and statistical methods for
environmental biologists. John Wiley and Sons, New York, NY.
Journel, A.G. 1984. New ways of assessing spatial distributions
of pollutants. In G.E. Schweitzer and J.A. Santolucito
(eds.), Environmental sampling for hazardous waste sites.
American Chemical Society, Washington, D.C.
Journel, A.G., and C.J. Huijbregts. 1978. Mining geostatistics.
Academic Press, New York, NY.
Kapustka, L. 1989. Vegetation assessment. In W. Warren-Hicks,
B. Parkhurst, and S. Baker, Jr. (eds.), Ecological assessment
of hazardous waste sites. EPA/600/3-89/013. U.S. Environmen-
tal Protection Agency, Environmental Research Laboratory,
Corvallis, OR.
Keith, L.H. (ed.). 1988. Principles of environmental sampling.
American Chemical Society, Washington, D.C.
Krebs, C.J. 1989. Ecological methodology. Harper and Row,
Publishers, New York, NY.
LaPoint, T., and J. Fairchild. 1989. Aquatic surveys. In W.
Warren-Hicks, B. Parkhurst, and S. Baker, Jr. (eds.).
Ecological assessment of hazardous waste sites. EPA/600/3-
89/013. U.S. Environmental Protection Agency, Environmental
Research Laboratory, Corvallis, OR.
LeGendre, L., and P. LeGendre. 1983. Numerical ecology. Elsevier
Scientific Publishing Company, New York, NY.
Linder, G., M. Bollman, W. Baune, K. DeWhitt, J. Miller, J. Nwosu,
S. Smith, D. Wilborn, C. Bartels, and J.C. Greene. 1989.
Toxicity evaluations for hazardous waste sites: an ecological
assessment perspective. In Proceedings of the Fifth Annual
Waste Testing and Quality Assurance Symposium, Office of Solid
Waste and Emergency Response, Washington, D.C.
Kapustka, L.A., and G. Linder. 1989. Hazardous waste site
characterization utilizing in situ and laboratory bioassess-
ment methods. pp. 85-93. In Davis, W.S., and T.P. Simon
(eds.), Proceedings of the 1989 Midwest Pollution Control
Biologists Meeting, Chicago, IL. US EPA Region 5, EPA-905-
89-007.
Ludwig, J.A., and J.F. Reynolds. 1988. Statistical ecology. John
Wiley and Sons, New York, NY.

McBee, K. 1989. Field surveys: terrestrial vertebrates. In W.
Warren-Hicks, B. Parkhurst, and S. Baker, Jr. (eds.).
Ecological assessment of hazardous waste sites. EPA/600/3-
89/013. U.S. Environmental Protection Agency, Environmental
Research Laboratory, Corvallis, OR.
Murphy, T., and L. Kapustka. 1989. Capabilities and limitations
of approaches to in situ ecological evaluation. In Proceed-
ings of symposium on in situ evaluation of biological hazards
of environmental pollutants. Plenum Press, New York.
Orloci, L., C.R. Rao, and W.M. Stiteler (eds.). 1979. Multi-
variate methods in ecological work. International Cooperative
Publishing House, Fairland, MD.
Parkhurst, B., G. Linder, K. McBee, G. Bitton, B. Dutka, and C.
Hendricks. 1989. Toxicity tests. In W. Warren-Hicks, B. Parkhurst,
and S. Baker, Jr. (eds.), Ecological assessment of hazardous
waste sites. EPA/600/3-89/013. U.S. Environmental Protec-
tion Agency, Environmental Research Laboratory, Corvallis, OR.
Schweitzer, G.E., and J.A. Santolucito (eds.). 1984. Environmen-
tal sampling for hazardous waste sites. American Chemical
Society, Washington, D.C.
Shirazi, M.A., and L.N. Lowrie. 1990. A probabilistic statement
of the structure-activity relationship for environmental risk
analysis. Arch. Environ. Contam. Toxicol. 19, 597-602.
Shirazi, M.A., and G. Linder. 1990. An analysis of biological
response to chemical mixtures. Toxicologist 10(1), p. 83.
Stevens, D., G. Linder, and W. Warren-Hicks. 1989. Data inter-
pretation. In W. Warren-Hicks, B. Parkhurst, and S. Baker,
Jr. (eds.), Ecological assessment of hazardous waste sites.
EPA/600/3-89/013. U.S. Environmental Protection Agency,
Environmental Research Laboratory, Corvallis, OR.
Shugart, Jr., H.H. (ed.). 1978. Time series and ecological
processes. SIAM, Philadelphia, PA.
Stevens, D. 1989. Field sampling design. In W. Warren-Hicks, B.
Parkhurst, and S. Baker, Jr. (eds.), Ecological assessment
of hazardous waste sites. EPA/600/3-89/013. U.S. Environmen-
tal Protection Agency, Environmental Research Laboratory,
Corvallis, OR.
U.S. EPA. 1988a. Review of ecological risk assessment methods.
EPA/230/10-88/041. U.S. Environmental Protection Agency,
Office of Policy Planning and Evaluation, Washington, D.C.
U.S. EPA. 1988b. GEO-EAS (Geostatistical Environmental Assess-
ment Software): User's Guide. EPA/600/4-88/033. U.S.
Environmental Protection Agency, Environmental Monitoring and
Systems Laboratory, Las Vegas, NV.
U.S. EPA. 1989a. Interim Final Risk Assessment Guidance for
Superfund—Environmental Evaluation Manual. EPA/540/1-
89/001A. US EPA Office of Emergency and Remedial Response,
Washington, DC.
U.S. EPA. 1989b. Methods for evaluating the attainment of cleanup
standards: soils and solid media. Statistical Policy
Branch, Office of Policy, Planning, and Evaluation,
Washington, D.C.
Warren-Hicks, W., B. Parkhurst, and S. Baker, Jr. (eds.). 1989.
Ecological assessment of hazardous waste sites. EPA/600/3-
89/013. U.S. Environmental Protection Agency, Environmental
Research Laboratory, Corvallis, OR.
Table 1. The probability of log LC50 for given log P and Mw, summarizing the
fathead minnow toxicity data. (Body of table: rows are log P from -2.00 to
7.00 in steps of 0.50; columns are Mw from 25 to 375 in steps of 25; each
cell gives the fitted log LC50 and its probability.)
Table 2. Mixture form factor am obtained from mixing equal proportions of 2,
3, and 4 chemicals using Equation 2. (Body of table: panels for selected
values of a3, a4, and a5; within each panel, rows and columns are the
component form factors a1 and a2 over 0.40, 0.80, 1.00, 2.00, 4.00, 6.00,
and 8.00, and each cell gives the resulting mixture form factor am.)
Table 3. Mixture form factors am and bm obtained from mixing equal
proportions of two chemicals with form factors a1, b1 and a2, b2, using
Equation 1. (Body of table: panels for selected values of b1 and b2; within
each panel, rows and columns are a1 and a2, and each cell gives the mixture
form factors am and bm.)
Table 4. Response surface properties of 470 industrial chemicals tested on fathead minnows. [For each chemical the table lists Chemical Name, CAS No., log P, Mw/100, logLC50, log K, log k, a, log I, b, and c; the multi-page tabulation could not be recovered from the scanned original.]

-------
Session 4: Ecological Assessment at Hazardous Waste Sites
DISCUSSION
Mark D. Sprenger
U.S. Environmental Protection Agency
Environmental Response Team (MS-101)
Woodbridge Avenue
Edison, NJ 08837
The Environmental Response Team (ERT) responsibilities include assistance to on-
scene coordinators and remedial project managers (OSCs and RPMs) in conducting
environmental assessments at hazardous waste sites. The following comments are
directed at the difficulties which have been found in the interpretation of
environmental data and the application of statistical techniques to real world
sites.
In conducting an environmental assessment, the objective is to conduct good
science and to apply this within the context of the site; this includes
appropriate data collection and interpretation. The endpoint is to make
scientifically sound statements as to the distribution and magnitude of
contamination, and the impact or potential impacts to the environment related
to the site. It must be clearly understood that within the Superfund program
work conducted must be directly linked back to the site and that the objective
is site management decisions. These factors may put severe limitations on the
data that can be generated.
As with any study, the statistical analyses to be conducted should be considered
prior to the data collection. However, this is frequently not the case at waste
sites, as much data is collected with no initial intent of statistical
evaluation. An example of this is samples collected to determine the presence
of contaminants at the site; the initial use is not to determine the extent of
contamination, nor is it intended to be in any way representative of the magnitude
of contamination. The data is, therefore, biased. A second example of data
biasing is the sampling of specific areas based upon knowledge of the fate and
transport of contaminants. In order to assess the impacts of contaminants on a
stream, for example, determinations of the level of contamination in sediment and
water are conducted. In addition, biological parameters may be determined, such
as toxicity, bioaccumulation or community level impacts. These determinations
are not being done randomly within the stream being studied. Depositional areas
along the length of the stream are sampled, assuming that the contaminants will
accumulate on the fine particle size organic material found in the depositional
areas, and that many of the chemical and biological parameters of interest co-
vary. In many streams, the areas which are not depositional are erosional and,
therefore, inappropriate for evaluation of impacts due to contamination. The
length of the stream is utilized to establish a concentration gradient of site
contaminants. From this discussion it is clear that contaminant distribution or
-147-

-------
spatial impact results in no evaluation of variability within each sampling
location. An example of sample size limitations includes a mammal accumulation
study. A series of sampling locations were selected, and mammals collected for
contaminant accumulation measurements. Due to the proximity of sampling
locations only a limited number of trappings could be done before inducing the
migration of organisms from one area to another. This resulted in uneven sample
sizes and a reduced data set for each sampling location.
Despite the limitations which are forced upon the environmental studies conducted
at waste sites, definitive answers or statements are required by the RPM or OSC.
The conclusion that additional samples or work are needed is often not viable,
as the demand for site action requires a decision to be made.
-148-

-------
Session 4: Ecological Assessment at Hazardous Waste Sites
DISCUSSION
Wayne L. Myers
Forest Biometrics
Penn State Univ.
Univ. Park, PA 16802
Biota occur in almost infinite variety and with many dimensions of
temporal dynamics. Furthermore, development of quantitative detail on the
resident biota of an area that transcends the plot level is both time
consuming and expensive. To complicate matters further, biological
compensatory adaptation works to buffer impacts and alter patterns of
expression over time. Perhaps the ultimate difficulty is that the
occurrence of something "unusual" in any one species on any one site at any
one time is, in fact, not so unusual. Therefore, generalized data
collection is certain to be an expensive way to document ecological effects
of contaminants; and is also likely to be inconclusive. For the same
reasons, transforming encountered data into information that speaks to a
contamination issue is an uncertain proposition at best.
The most effective way to build a case for impact attributable to a
contaminant is to have prior information (i.e., a hypothesis) as to the
manner of likely expression, and to target data collection on the response
variables of the hypothesized-based impact model. Linkages are critical to
constructing evidence from data. Therefore, the impact model should
constitute a composite hypothesis that is elaborated in terms of linkage
between multiple elements of biotic communities and key features of their
environment. While one unusual occurrence may not be convincing, a set of
unusual occurrences in a predicted pattern becomes difficult to explain
away through chance or unspecified other influences. The evidence becomes
particularly persuasive when a coherent spatial pattern emerges that
encompasses the distribution/transport of suspected causal agents. Thus,
the concern for capturing spatial distribution of contaminants that was
raised in discussion of Session 1 becomes reinforced.
The impact model that constitutes the composite hypothesis which drives
data collection and analytical synthesis must have a scientific foundation
that is anchored in laboratory work and strengthened by in-situ biological
tests. Field case studies are important but lack sufficiency due to the
shifting constellations of variables that enter into field situations. In
the absence of such a scientific foundation, spatial/temporal co-occurrence
of contaminants with biological/ecological anomalies is suggestive but
remains open to question.
What emerges is the pressing need for development and organization of
knowledge bases in support of ecological assessment that are contaminant
specific. Ad hoc assessment provides a poor basis for litigation, and no
amount of sophisticated statistical gymnastics will change this.
Concurrence is given to the enthusiasm of the presenters for multiple endpoints.
However, multiple endpoints must be tied together by integrative analysis.
Otherwise, multiple endpoints constitute little more than loose ends.
-149-

-------
Knowledge-based information technologies and statistical techniques for
spatial synthesis are essential ingredients of comprehensive capability
for ecological impact assessment. Much research and development in these
areas remains to be done, but the advanced technologies already available
in the form of expert systems and GIS have not been effectively harnessed.
Thus, the potential marginal utility of new research will not be realized
fully until a comprehensive program of technological integration is
undertaken. There may never be a better programmatic platform for such
an undertaking than that provided by Superfund.
-150-

-------
COMMENTS BY PARTICIPANTS
On Kapustka, Shirazi, and Linder
Susan Braen Norton (U. S. Environmental Protection Agency): When a battery
of bioassays is run, a habit of some researchers is to point out only the
results that support their hypothesis (e.g., organisms a, d, and f died when
exposed to site media). What methods can be used to incorporate all of the
results of the battery? Alternatively, can we deal with this on a "weight of
evidence" type scheme?
Mark D. Sprenger (U. S. Environmental Protection Agency): Statistical issues
which need to be addressed in "real world" field situations include:
(1)	the potentially biased design for sampling location selection. Samples
may be biased due to physical/chemical parameters (TOC and grain size in
soil/sediments) and by known sources of contaminants;
(2)	it must be realized that the data generated and their interpretation must be
defensible (for cost recovery reasons);
(3)	we frequently are faced with unbalanced, non-normally distributed data
with small sample sizes, due to the nature of the site;
(4)	most hazardous waste sites are small, restricting design options and
sample size;
(5)	alternate sources of contaminants or observed effects frequently exist;
(6)	hazardous waste sites are scientific, social, and political; this must
be considered in directing statistical research;
(7)	environmental "replication" (pseudoreplication) must be addressed.
-151-

-------
SOME STATISTICAL ISSUES RELATING TO THE CHARACTERIZATION
OF RISK FOR TOXIC CHEMICALS
William M. Stiteler and Patrick R. Durkin
Chemical Hazard Assessment Division
Syracuse Research Corporation
Syracuse, New York 13210
INTRODUCTION
The evaluation of the level of hazard posed by chemical
contamination at toxic waste sites involves a number of
statistical issues. One important component in the hazard
evaluation process for noncarcinogens is an index or
reference dose describing what might be considered as a safe
or acceptable level of exposure. In this paper some issues
relating specifically to this reference dose are discussed.
BACKGROUND
A distinction is generally made between carcinogens and
noncarcinogens on the basis of the underlying biological
mechanisms by which the toxic effect takes place. In
particular, with a noncarcinogenic chemical, which will be
referred to as a "toxicant", it is generally accepted that
some "threshold" level exists below which exposure will
cause no ill effect. With carcinogens, on the other hand,
it is generally agreed that no such threshold is
appropriate. While there is not universal agreement on this
point, as a rule the approach with cancer risk assessment is
to assume that any amount of the chemical will result in a
response from some portion of the population.
Because of the threshold theory associated with
toxicants, the risk assessment methodology has emphasized
the identification of a "safe" level of exposure. The
current approach is to determine what is known as a
"reference value" (RfD), (formerly known as "acceptable
daily intake" or ADI). The RfD is based on human data if
available, but most often on experimental animal data. The
idea is to establish a level (dosage) at which no effect is
observed (i.e., the threshold) and then to extrapolate to the
human population by applying uncertainty factors. The
reader interested in a more complete description of the
process is referred to Dourson and Stara (1983).
Specifically, the process involves establishing, from
the available experimental data, what is known as a "no
observed adverse effect level" (NOAEL). Generally this
involves finding a dose greater than zero for which no
statistically significant difference in the response can be
-152-

-------
determined. This NOAEL is considered as a "lower bound" on
the true threshold.
On the other side of the threshold, the lowest dose at
which there is an adverse effect is established. This
"lowest observed adverse effect level" (LOAEL) estimates a
dosage considered to be in excess of the threshold dose.
The true threshold is then bracketed by the NOAEL on
the left and the LOAEL on the right. In keeping with the
need for conservatism in this kind of risk assessment, the
NOAEL might be taken as a surrogate threshold.
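A minimal sketch of this bracketing is given below; the dose groups, response counts, and the use of a one-sided Fisher's exact test as the significance test are assumptions made only for illustration, not part of the RfD guidance.

```python
from scipy.stats import fisher_exact

# Hypothetical quantal bioassay: responders out of n animals in each dose group.
doses      = [0, 5, 10, 25, 50]        # mg/kg/day; control group first
responders = [1, 2, 3, 9, 14]
n_animals  = [20, 20, 20, 20, 20]

control = (responders[0], n_animals[0] - responders[0])

noael, loael = None, None
for dose, x, n in zip(doses[1:], responders[1:], n_animals[1:]):
    table = [list(control), [x, n - x]]
    # One-sided test of whether the response rate at this dose exceeds control.
    _, p = fisher_exact(table, alternative="less")
    if p >= 0.05:
        noael = dose      # highest dose so far with no statistically significant effect
    else:
        loael = dose      # lowest dose with a statistically significant effect
        break

print(f"NOAEL = {noael} mg/kg/day, LOAEL = {loael} mg/kg/day")
# The (unknown) true threshold is then taken to lie between the NOAEL and the LOAEL.
```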
To arrive at a human RfD the NOAEL is divided by a
series of uncertainty (or "safety") factors. The first is a
factor, usually ten, designed to protect "sensitive" humans.
This is applied even if a NOAEL is determined from long-term
human exposure data. The reasoning is that the data are
representative of "average" healthy humans and may not be
indicative of the response of some subpopulation of weak or
sensitive individuals.
A second uncertainty factor of ten is applied if the
NOAEL is not based on human data. A third uncertainty
factor of ten is applied if the NOAEL is determined from
short-term or subchronic exposures to the toxicant.
If a NOAEL is not available, a LOAEL might be
substituted. In that case, an additional uncertainty factor
is used. This uncertainty factor is generally assigned a
value between 1 and 10 depending on the sensitivity of the
adverse effect.
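The arithmetic of the adjusted-NOAEL approach is simply division of the experimental NOAEL by the product of the applicable factors. A short sketch follows; the NOAEL value and the particular factors applied are hypothetical.

```python
def rfd_from_noael(noael, human_variability=10, animal_to_human=10,
                   subchronic_to_chronic=10, loael_to_noael=1):
    """Adjusted-NOAEL reference dose: NOAEL divided by the product of uncertainty factors."""
    total_uf = (human_variability * animal_to_human *
                subchronic_to_chronic * loael_to_noael)
    return noael / total_uf, total_uf

# Hypothetical case: NOAEL of 50 mg/kg/day from a subchronic animal study.
rfd, uf = rfd_from_noael(50.0)
print(f"total uncertainty factor = {uf}, RfD = {rfd} mg/kg/day")   # 1000 and 0.05
```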
The final RfD is thus an extrapolation obtained by
dividing an experimentally derived number by a factor
ranging anywhere from 1 to 10,000. While there is some
experimental support for the uncertainty factors described
above (Dourson and Stara 1983), it cannot be denied that
there is a certain element of arbitrariness in the process.
Some empirical investigations of the distribution of the
various uncertainty factors have generally indicated that
they tend to be overly protective (Anatra et al., 1986;
Chambers et al., 1988; Durkin et al., 1988).
The issue is of particular relevance to risk assessment
applied to hazardous waste sites. The hazard index is based
on a comparison of the exposure level to the RfD for each of
the chemicals present. There is a clear possibility then,
given the magnitude of some of the uncertainty factors, and
the degree of reliability associated with them, that the
hazard index could be driven by "uncertainty".
The NOAEL/LOAEL approach to determining a RfD has also
been criticized for another reason (Crump 1984). The NOAEL
(or LOAEL) is most often based on a statistical test of
significance. Specifically the observed response at some
dose level is compared with the observed response for a
control group with the null hypothesis being that there is
no difference. The ability of a test to detect a difference
is directly related to the size of the sample.
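The point can be made concrete with a quick power calculation; the incidence rates below are arbitrary, and the normal approximation to a one-sided two-proportion test is used only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def power_two_proportions(p0, p1, n, alpha=0.05):
    """Approximate power of a one-sided two-sample test of proportions (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    p_bar = (p0 + p1) / 2
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n)            # standard error under H0
    se_alt = sqrt(p0 * (1 - p0) / n + p1 * (1 - p1) / n)   # standard error under H1
    return NormalDist().cdf(((p1 - p0) - z_alpha * se_null) / se_alt)

# Power to detect a rise in incidence from 5% (control) to 25% (dosed group).
for n in (10, 20, 50, 200):
    print(n, round(power_two_proportions(0.05, 0.25, n), 2))
# With small groups the test rarely flags the effect, so the dose is
# declared a "no observed adverse effect level" more often.
```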
-153-

-------
These concerns about the adjusted NOAEL method have led
to recommendations for alternative methodologies for
determining a RfD. Crump (1984) and Dourson et al. (1985)
have suggested using a dose-response model to determine a
"benchmark dose" (BD) which was defined by Crump as "a lower
statistical confidence limit for the dose corresponding to a
specified increase in level of health effect over the
background level".
The first step in the benchmark method, as described by
Dourson et al. (1985), is to fit a dose-response model and
calculate the lower 95% confidence limit on the dose
associated with 10% risk. This value serves as a substitute
LOAEL. Assuming that the dose response model is based on
the assumption of lifetime exposure and that less than
lifetime data has been converted by an appropriate dose
transformation, one of the uncertainty factors of 10 is
taken care of. The animal to human extrapolation is then
accomplished by dividing the benchmark dose by the cube root
of the ratio of human weight (70 kg) to animal weight. This
leaves the other two uncertainty factors, those designed to
protect the sensitive subpopulation of humans and to
extrapolate from a LOAEL to a NOAEL, to be applied as
before.
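A rough sketch of these steps is given below, assuming the fitted dose at 10% extra risk and its standard error are already available from the dose-response fit; the normal-theory lower limit, the rat body weight, and the remaining factors are illustrative assumptions, not the exact procedure prescribed by Dourson et al.

```python
def benchmark_rfd(ed10, ed10_se, animal_weight_kg,
                  human_weight_kg=70.0, sensitive_uf=10, loael_to_noael_uf=10):
    """Sketch of the benchmark-dose route to a reference dose.

    ed10    -- fitted dose giving a 10% increase in risk over background
    ed10_se -- its standard error (treated as approximately normal here)
    """
    bd = ed10 - 1.645 * ed10_se                      # lower 95% confidence limit: surrogate LOAEL
    scale = (human_weight_kg / animal_weight_kg) ** (1.0 / 3.0)
    bd_human = bd / scale                            # animal-to-human: cube root of weight ratio
    return bd_human / (sensitive_uf * loael_to_noael_uf)

# Hypothetical fit: ED10 = 12 mg/kg/day with standard error 2, from a 0.35-kg rat study.
print(round(benchmark_rfd(12.0, 2.0, animal_weight_kg=0.35), 4), "mg/kg/day")
```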
Perhaps a somewhat more objective approach to picking
this last uncertainty factor between 1 and 10 is available
with the benchmark method in that the slope of the dose-
response curve can be used as a guide (Dourson et al. 1985).
The slope at the 10% risk point is taken as an indication of
how rapidly the dose-response curve is dropping toward the
threshold. If the slope is steep a value of one is used.
If the slope is shallow a value of ten is used. A moderate
slope suggests an intermediate value.
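Read as a rule of thumb, the slope guidance might look like the following sketch; the numeric cutoffs are invented for illustration, since the paper gives only qualitative guidance.

```python
def loael_to_noael_uf(slope_at_bd):
    """Steeper dose-response at the benchmark point -> smaller extrapolation factor."""
    if slope_at_bd >= 0.05:       # steep: risk drops off quickly below the benchmark dose
        return 1
    if slope_at_bd <= 0.005:      # shallow: the threshold may lie well below the benchmark dose
        return 10
    return 3                      # moderate slope: an intermediate value
```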
This benchmark methodology shows promise as an
alternative to the (NOAEL/LOAEL)/uncertainty-factor approach to
estimating a safe level of exposure. It gets around the
disturbing fact that anyone with an interest in establishing
the NOAEL at a high level would be rewarded for keeping the
sample small in order to make the statistical test less
powerful. Because the benchmark dose is a lower confidence
limit on dose associated with a fixed risk level, small
samples would be expected to result in a larger NOAEL on the
average. Overall, good experimental design and large
samples would be more likely to produce rewards with the
benchmark method.
STATISTICAL ISSUES
Regardless of whether the current RfD methodology is
continued or is replaced by an alternative such as the
benchmark method, there are several statistical issues that
should be addressed. Three specific issues that will be
-154-

-------
outlined here are: (1) determination of statistical
properties of uncertainty factors, (2) estimation of
parameters in threshold models, and (3) methods for
converting continuous response data to dichotomous.
1.	Uncertainty Factors
Uncertainty factors really play two roles. They are
used to extrapolate, as for example from animal species to
human. They also are used for a different, albeit related,
purpose and that is as a "safety" factor. They are supposed
to provide a conservative cushion in the face of uncertain
knowledge about the various extrapolation factors. The goal
should be to elevate the level of knowledge and
understanding to the point where risk assessors would feel
comfortable using the term "extrapolation" factors.
2.	Threshold Models.
If mathematical modeling is to play a role in the
hazard assessment methodology for toxicants, it will be
necessary to deal with thresholds.
In theory, it is a simple matter to modify an existing
dose-response model, such as those used in cancer risk
assessment, to incorporate a threshold. For example, Crump
(1984) gave several examples, including the following
modification of the multistage model:
P(d) = c + (1 - c){1 - exp[-q1(d - d0) - ... - qk(d - d0)^k]}   for d >= d0
     = c                                                        for d < d0
where 0 <= c <= 1, d0 >= 0, and qi >= 0 for i = 1, 2, ..., k.
In the above model, the parameter c represents the
background response rate and the parameter d0 represents the
threshold.
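A direct transcription of this model into code may make the structure clearer. The following Python sketch (parameter values are hypothetical) evaluates P(d) for the threshold-modified multistage model defined above.

import math

def multistage_with_threshold(d, c, d0, q):
    """P(d) for the threshold-modified multistage model: the background rate c
    below the threshold d0, and the multistage form in (d - d0) above it.
    q is the list of coefficients q1, ..., qk (assumed nonnegative)."""
    if d < d0:
        return c
    s = sum(qi * (d - d0) ** (i + 1) for i, qi in enumerate(q))
    return c + (1.0 - c) * (1.0 - math.exp(-s))

# Hypothetical parameter values, for illustration only.
print(multistage_with_threshold(8.0, c=0.10, d0=5.0, q=[0.07, 0.002]))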
Since this model represents a straightforward modification of the multistage model with only one new parameter, d0, to be estimated, it might appear to require
only a simple modification of the algorithm currently in use
for determining the maximum likelihood estimates of the
other parameters. This is not the case. A careful look at
the likelihood function will reveal the problem.
Typical dose-response data consist of a control group and one or more dose groups. Suppose there are M nonzero doses and let n0, n1, ..., nM represent the numbers of animals in each of the M+1 groups at doses d0 = 0, d1, ..., dM. Let x0, x1, ..., xM represent the number of responders in each of the M+1 groups. Then, assuming independence, the likelihood
function is given by:

L(Q; x) = P(d0)^x0 [1 - P(d0)]^(n0 - x0) · P(d1)^x1 [1 - P(d1)]^(n1 - x1) · ... · P(dM)^xM [1 - P(dM)]^(nM - xM)

where x is the vector of responses composed of x0, x1, ..., xM and Q is the vector of unknown parameters brought into the likelihood by way of the assumed model P(d). For the multistage model with threshold described above, the vector Q would consist of c, d0, q1, ..., qk. The difficulty in estimating these parameters by maximizing the likelihood function results from the fact that for those groups for which d <= d0, P(d) is equal to the (unknown) constant c.
For doses greater than the threshold P(d) takes on the
multistage form.
As a result, the likelihood function is made up of an unknown number of terms of the form c^x (1 - c)^(n - x). It is necessary to know when the abrupt change in the likelihood
takes place in order to maximize it. In other words, the
likelihood can be maximized only after the threshold is
known (or estimated).
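For concreteness, the following Python sketch evaluates the binomial log-likelihood described above for an arbitrary model P(d); the data are hypothetical. One workable (though not the only) way around the difficulty just noted is to fix a candidate threshold, maximize this function over the remaining parameters, and repeat over a grid of thresholds.

import math

def log_likelihood(p_of_dose, doses, n, x):
    """Binomial log-likelihood L(Q; x) over the M+1 dose groups, dropping the
    constant binomial coefficients; p_of_dose maps a dose to P(d)."""
    ll = 0.0
    for d, ni, xi in zip(doses, n, x):
        p = min(max(p_of_dose(d), 1e-12), 1.0 - 1e-12)   # guard against log(0)
        ll += xi * math.log(p) + (ni - xi) * math.log(1.0 - p)
    return ll

# Hypothetical data: a control group plus two dose groups of 50 animals each.
doses, n, x = [0.0, 10.0, 20.0], [50, 50, 50], [5, 20, 40]
print(log_likelihood(lambda d: 0.10 + 0.035 * d, doses, n, x))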
A simple example will illustrate
another aspect to this problem. This
is the fact that the selection of the
"wrong" model for P(d) will lead to
errors in the selection of a
threshold. The statistical estimation
issue can be put aside temporarily by
assuming that each dose group in an
experiment is composed of a very large
number of animals (say several
thousand) so that chance variation in
the proportions of responders can be
neglected. This is reasonable since the standard deviation of a binomial sample proportion is at most 0.5/√n. With an enormous sample size any reasonable model would be expected to pass through, or at least very close to, all data points. Suppose that
doses of 0, 10, and 20 are administered to three groups and
that the proportion of responders is .10, .40, and .80
respectively.
These data are shown in Figure 1, with the three points labeled as 1, 2, and 3 for the three doses of 0, 10, and 20 respectively. Point (1) indicates that the background rate is 0.10, and the horizontal dashed line passing through that point represents that part of the model. The remaining two points, (2) and (3), indicate a substantial added risk at higher doses.

Figure 1. Dose-Response Points

If the response is assumed to be linear in the added-risk region, the two points can be connected with
a straight line as in Figure 2. The
equation of the line passing through
these points is:
P(d) = 0.04d
The point where this line
intersects the horizontal background
line determines the threshold, d0.
The solution is d0 = 2.5.
Figure 2. Linear Threshold Model
Suppose now that a one-hit model
is used in the added risk part of the
curve instead of the straight line.
There is one, and only one, equation of the one-hit form
that fits through the points (2) and (3). This curve is
shown in Figure 3. This time the threshold is determined by
the point where the one-hit curve intersects the horizontal
background line. This happens at d0 = 6.31.
Now suppose that a multistage
model is assumed for these data.
There is a two-stage model, P(d) = 1 - exp[-(q0 + q1·d + q2·d^2)], which fits through all three of the points exactly, if the parameters are taken as follows:

q0 = 0.105361
q1 = 0.005889
q2 = 0.003466
Figure 3. One-Hit with Threshold
This two-stage model is shown in Figure 4, along with the one-hit threshold model previously fit to these data. If the one-hit model, which is the same as a multistage model with one stage, is used, there is necessarily a threshold with these data at d0 = 6.31. If a two-stage model is used, it can fit
through all three points without a threshold. But the two-
stage model is more "flexible" than a one-stage. It can be
made to come down to a threshold at some arbitrarily
selected point, say d0 = 5.0, and still fit through points
(2) and (3). In fact the two-stage threshold model which
does that is given by:
P(d) = p                                                   for d < 5
     = p + (1 - p){1 - exp[-q1(d - 5) - q2(d - 5)^2]}      for d >= 5

where p = 0.10 is the background rate, q1 = 0.071504, and q2 = 0.001918.
A graph of this model is shown in Figure 5.
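The numbers quoted in this example can be checked directly; the short Python sketch below (not part of the original paper) recovers the linear-model threshold of 2.5, the one-hit threshold of 6.31, and confirms that the quoted two-stage parameters pass through all three data points.

import math

# Linear added-risk model P(d) = 0.04 d meets the background rate 0.10 at:
print(0.10 / 0.04)                                   # 2.5

# One-hit model 1 - exp(-(a + b*d)) through (10, 0.40) and (20, 0.80):
b = (math.log(0.6) - math.log(0.2)) / 10.0           # 0.10986
a = -math.log(0.6) - 10.0 * b                        # -0.5878
print((-math.log(0.9) - a) / b)                      # threshold, approximately 6.31

# Two-stage model 1 - exp(-(q0 + q1*d + q2*d^2)) with the quoted parameters:
q0, q1, q2 = 0.105361, 0.005889, 0.003466
for d in (0.0, 10.0, 20.0):
    print(round(1.0 - math.exp(-(q0 + q1 * d + q2 * d * d)), 3))   # 0.1, 0.4, 0.8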
Figure 4. One-Hit Threshold and Two-Stage
If there is in fact a
threshold, it is clear that it
cannot be located from these
data alone. If the model is
constrained, say to a linear
or one-hit, then the location
of a threshold can be
extracted from these data -
but with different values.
For a more flexible model, the
location of a threshold, if it
even exists, is indeterminate
with these data. When
sampling variation is
considered, the situation
becomes even more hopeless.
The solution to this
dilemma is to construct the
model in two steps, with the
first step being the
estimation of the threshold
(and associated background).
This is not without problems,
however. It requires
sequentially comparing
response rates, at
increasingly higher doses,
against the response rate for
the controls until a
significant difference is
detected. The spacing of the
doses in an experiment is
critical. If all doses are
taken below the unknown
threshold, the experiment will yield little information
about its location. The same is true if the low dose in the
experiment exceeds the threshold. Even if a difference is
detected between response rates at two experimental doses
the location of the threshold has only been narrowed down to
an interval which could be quite wide depending on the
spacing used in the experiment.
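One possible realization of this sequential comparison, assuming a one-sided Fisher's exact test and equal group sizes (the text does not prescribe a particular test), is sketched below in Python with hypothetical data.

from scipy.stats import fisher_exact

def first_significant_dose(doses, n, x, alpha=0.05):
    """Compare each dose group with the control group (index 0) in order of
    increasing dose and return the first dose whose response rate is
    significantly elevated; the threshold then lies somewhere between this
    dose and the last non-significant one."""
    for d, ni, xi in zip(doses[1:], n[1:], x[1:]):
        table = [[xi, ni - xi], [x[0], n[0] - x[0]]]
        _, p_value = fisher_exact(table, alternative="greater")
        if p_value < alpha:
            return d
    return None

# Hypothetical experiment with doses 0, 2, 5, 10, 20 and 50 animals per group.
print(first_significant_dose([0, 2, 5, 10, 20], [50] * 5, [3, 4, 5, 12, 30]))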
Figure 5. Two-Stage with Threshold
3. Conversion of Continuous Data.
The nature of the effect being observed in a risk
assessment problem determines whether the response data
collected is of the discrete or continuous type. In some
cases it is the presence or absence of some condition such
as a lesion or tumor. In other cases it is a measured
response such as liver weight, or enzyme activity. There
may be both continuous and discrete response data available
for the same chemical.
With the current RfD methodology this has not been a
problem. To determine a NOAEL and/or LOAEL the data could
be analyzed with something like a contingency table analysis
or Fisher's exact test if discrete, or by a t-test or
analysis of variance if continuous. In either case the
question as to what dose produces a response significantly
different from the control group could be answered.
The possibility of using an alternative such as the
benchmark method based on dose-response modeling introduces
a problem. Sometimes the response is discrete and sometimes
it is continuous. One solution would be to use different
models depending on the nature of the response. This is the
solution favored by Crump (1984). There seems to be no
simple way to determine whether or not this is a
satisfactory solution. There would be no objection to using
two different dose-response models for the same response
variable if it was determined that they both gave the same
results for any inference they might be used for. While two
models can be compared in meaningful ways for a common response variable, any comparison of different models would be confounded if the response variables are also different.
Anatra (1985) proposed a method
for converting continuous data to
dichotomous form. This method is
based on two assumptions: (1) the
continuous response follows a normal
distribution for the control group as
well as for each of the treated
groups, and (2) that some arbitrary
proportion (say 5%) at one extreme end
of the control group can be considered
as "responders". It is the latter
assumption that establishes a
definition of what constitutes a
responder in the treated groups. Any
animal with a value of the continuous variable more extreme
than the most extreme 5% of the control group is considered
to be a responder. The determination of whether the right
or left tail of the distribution for the control group is
used to establish this definition of a response depends on
whether increased exposure to the toxicant results in an
increase in the continuous response variable (e.g. increased
liver weight) or a decrease in that variable (e.g. decreased
body weight). Figure 6 illustrates this procedure. The
shaded area under the normal curve for
the treated group represents the proportion of the treated
group considered to be responders.
The Anatra method has been applied to several data sets
and appears to generally give good results. It is, of
course, difficult to quantify the performance because the
method itself defines what a responder is.
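In the normal-distribution setting assumed by the method, the conversion can be written in a few lines. The following Python sketch (the function name and numbers are illustrative and are not taken from Anatra (1985)) computes the proportion of a treated group counted as responders when the most extreme 5% of the control group defines a response.

from scipy.stats import norm

def anatra_responders(mu_c, sd_c, mu_t, sd_t, tail=0.05, upper=True):
    """Proportion of the treated group counted as responders: treated animals
    more extreme than the most extreme `tail` fraction of the control group.
    Both groups are assumed to be normally distributed."""
    if upper:
        cutoff = norm.ppf(1.0 - tail, loc=mu_c, scale=sd_c)
        return norm.sf(cutoff, loc=mu_t, scale=sd_t)
    cutoff = norm.ppf(tail, loc=mu_c, scale=sd_c)
    return norm.cdf(cutoff, loc=mu_t, scale=sd_t)

# Hypothetical liver-weight example: the treated mean is shifted upward.
print(anatra_responders(mu_c=10.0, sd_c=1.0, mu_t=12.0, sd_t=1.0))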
Figure 6. Anatra Method (control and treated distributions)
One situation where the
method clearly does not give
satisfactory results is
illustrated in Figure 7. In this
situation the treated group has a
smaller variance than the control
group. As a result, the
relationship between the control
and treated curves is such that
the proportion of responders for
the treated group, as defined by
the Anatra method, would actually
be smaller than the proportion of
responders in the control group.
Clearly there is some level of
response in the treated group,
because the mean of the treated
group has been shifted to the right. The difference in the
variances is itself evidence of some effect on the treated
group. Another, although perhaps less serious, difficulty
with the Anatra method has to do with the fact that the
proportion of the control group considered to be responders
is an arbitrary decision.
Both of these problems might be taken care of with a
simple modification. The modification that appears to solve
the problems is to define the proportion of responders in a
treated group as "the excess proportion of the treated group
taking a value of the continuous variable more extreme than
the mean of the control group". This definition is
illustrated by Figure 8.
The shaded area in Figure 8
indicates the proportion of the
treated group to be considered as
responders. Note that a portion
of the treated group is allowed
to take large values of the
continuous variable without being
identified as responders as long
as the proportion does not exceed
the corresponding proportion
indicated by the control group.
Conversely, some of the
individuals with relatively low
values of the continuous response
variable are considered to be
responders to the extent that
they occur in excess of the
control group. One obvious advantage of this definition is
that it does not require that any arbitrary proportion of
the control group be identified as responders.
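Under the same normality assumption, the modified definition reduces to a simple calculation: the fraction of the treated group beyond the control mean, less the 50% of the control group that lies beyond its own mean. A Python sketch (illustrative only) is given below; note that the control-group variance no longer enters.

from scipy.stats import norm

def excess_responders(mu_c, mu_t, sd_t):
    """Responders under the modified definition: the excess proportion of the
    treated group lying beyond the control mean, relative to the 50% of the
    control group beyond its own mean ('more extreme' taken as 'larger')."""
    p_treated_beyond = norm.sf(mu_c, loc=mu_t, scale=sd_t)
    return max(p_treated_beyond - 0.5, 0.0)

# Unequal-variance case of Figure 7: the modification now gives a positive rate.
print(excess_responders(mu_c=10.0, mu_t=12.0, sd_t=1.0))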
Figure 7. Anatra Method with Unequal Variances

Figure 8. Modification of Anatra Method

In a way, this approach treats the entire control group as an estimate of the "background" population. It is
generally accepted in risk
assessment that a response rate
at some specified dose should be
"adjusted" for background. This
adjustment is made by calculating
either "added risk" or "extra
risk" and is usually done after a
model is fit to the data. This
is reasonable with dichotomous
response data because there is an
explicit definition of what
constitutes a response, and the
probability that it occurs at
zero dose (the background) can be
incorporated into the model as a
parameter which is estimated from
the data. This proposed
modification of the Anatra method
can be viewed as an adjustment for background made prior to
using the data to fit a dose-response model. In this case
it would be necessary to assume the extra risk or added risk
form of any dose-response model.
The proposed modification of the Anatra method should
also give reasonable results when the treated and control
groups have different variances. Figure 9 illustrates the
proportion of responders indicated by the modified
definition in the situation where the treated group has a
smaller variance than the control group and the mean of the
treated group has been shifted to the right. The Anatra
method in its original form would indicate that the treated
group has fewer responders than the control group.

Figure 9. Modified Anatra Method for Unequal Variances
REFERENCES
Anatra, M. 1985. A Method for the Estimation of Incidence
from Continuous Response Data: Proposed Modification in
Health Score Evaluation. Prepared for American Management
Systems, Inc., Federal Consulting Group, Arlington, VA.
Anatra, M., S. Bosch, P. Durkin, D. Gray, and D. Hohreiter.
1986. Investigation of the Uncertainty in the Acceptable
Daily Intake (ADI). SRC TR-86-274. Syracuse Research
Corporation, Syracuse, NY.
Chambers, N., P. Crockett, P. Durkin and W. Stiteler. 1988.
Uncertainty Associated with Reference Dose in the Absence of
Reproductive/Teratological Data. SRC TR-88-274. Syracuse
Research Corporation, Syracuse, NY.
Crump, K. 1984. A New Method for Determining Allowable Daily Intakes. Fund. and Appl. Tox. 4: 854-871.
Dourson, M. and J. Stara. 1983. Regulatory History and
Experimental Support of Uncertainty (Safety) Factors. Reg.
Tox. and Pharm. 3: 224-238.
Dourson, M., R. Hertzberg, R. Hartung and K. Blackburn. 1985. Novel Methods for Estimation of Acceptable Daily Intake. Tox. and Ind. Health 1(4): 23-41.
Durkin, P., C. Eisenmann, W. Stiteler, P. Crockett and P.
Goetchius. 1988. Uncertainty Associated with the Reference
Dose (RfD) in Extrapolation from Subchronic to Chronic and
From LOAELS to NOAELS. SRC TR-88-293. Syracuse Research
Corporation, Syracuse, NY.
ESTIMATION OF CONCENTRATION - PERCENT
SURVIVAL RELATIONSHIPS: DESIGN ISSUES
by
Ernst Linder, Department of Mathematics
University of New Hampshire, Durham, NH 03824
ABSTRACT
Concentration - percent survival relationship models play an important role in ecological
assessment of hazardous waste sites. They are used to extrapolate from 100% site samples to
samples of lower toxicities away from the site. They are the basis for computing acute and
chronic toxicity measures, such as LC50, EC50 and MATC. We discuss non-symmetric
response curves that have recently been proposed, in particular the family of power logistic
response models that includes as a special case the familiar logistic regression. We show how
optimal choices for the concentration levels change as a more general model is introduced.
Designs for estimating the concentration - percent relationship as well as for estimating toxicity
measures are investigated. We also discuss implementation issues and show for which situations
an optimal design can be achieved for extrapolating to lower concentration levels.
Presented at the EPA workshop on Superfund Hazardous Waste: Statistical Issues in Characterizing a Site, on February 21-22, 1990 in Crystal City, Arlington, VA.
1. Introduction
Acute and chronic toxicity tests are performed on samples collected at a hazardous waste site (HWS). Results from these tests, combined with results from chemical analyses, are then compared to field surveys. A possible link between hazardous wastes and adverse ecological responses can be established by this approach.
Among the endpoints that are derived from acute and chronic toxicity tests are:
(1)	percent survival of test organisms in 100% site sample,
(2)	concentration - percent survival relationship,
(3)	estimates of LC50, EC50, MATC, etc.
In situ exposure data together with (1) provide information on the toxicity of ambient
concentrations of hazardous chemicals. The probable sources and causes of toxic effects are
assessed by comparison to survey data. The concentration-percent survival relationship (2) can
be used to extrapolate toxicity to areas farther away from the HWS or to toxicity data of sites
with decreasing concentrations. Estimates of acute and chronic toxicity (3) are most useful for
comparison of toxicity among different samples or sites. Toxicity tests for ecological
assessment at a HWS are discussed in more detail by Parkhurst et al. (1988).
We discuss in the following several issues related to the design of laboratory toxicity experiments for obtaining a concentration-percent survival relationship. Our discussion pertains
to aquatic toxicity testing. For assessing toxicity at a HWS, the experiment involves a complex
chemical mixture unlike the more typical one-chemical experiment that is performed for most
bio-assays. Also, once a 100% solution is obtained from the HWS, only dilutions are possible.
In other words, we can only observe responses at smaller concentrations of the toxic mixture.
Some well-known optimal designs for estimation of a concentration-response relationship need
to be modified for this particular situation.
There is a rich literature on optimal designs for concentration-response models. Design
considerations for the probit and logistic models are addressed from the frequentist viewpoint in
Abdelbasit and Plackett (1983), Wu (1985), Tamhane (1986), and Kalish (1988). Robust
estimation is considered in several of these papers, as well as in Miller and Halpern (1980).
Bayesian approaches to designs have also been developed (Freeman (1970), Kuo (1983),
Tsutakawa (1972, 1980), and Chaloner and Larntz (1988)). Design derivations are model dependent. Most of the literature on design issues pertains to a two-parameter scale-location family of symmetric concentration-response curves such as the probit and logistic models.
Designs for asymmetric models have not been studied widely. A popular extended four-
parameter model for concentration-response curves was proposed by Prentice (1976). This
family includes as special cases the logistic and the probit model. A special case of Prentice's family of curves is the three-parameter, so-called power logistic model, which has been studied further by Gaudard et al. (1990a). The same authors also investigated design issues related to
this model (Gaudard et al. (1990b)).
In Section 2, we discuss some of the modeling questions. It is expected that several
competing models will be considered for assessment at a HWS. Optimal model choice provides
better estimates and more powerful tests for comparisons of toxic responses between sites. We
review approaches for obtaining optimal designs for the logistic model and discuss robustness
issues in Section 3. The power logistic model serves as a vehicle to shed light on how designs
are derived and evaluated for more general models in Section 4. We hope that these derivations
can serve as guidelines for similar approaches for different models chosen according to the
specifics of a particular site. Questions about implementation of these designs for dilution
experiments for ecological assessment will be addressed in Section 5. We finish with a
discussion and references in Sections 6 and 7.
2. Model - Choice
Concentration-response relationships are estimated from experiments where n_i subjects are exposed to concentration levels x_i, for i = 1, ..., k. The proportion p_i of subjects that respond is recorded at each concentration. In acute toxicity tests, the response is usually death from exposure. The random variable n_i p_i follows a binomial distribution with probability of response p(x_i) = F(u_i; μ, β), where u_i = β(x_i - μ) and -∞ < u_i < ∞.
One of the models that we have studied extensively (Gaudard et al. (1990a,b)) is the so-called power logistic model, abbreviated PLM, first proposed by Prentice (1976), in which

F(u) = 1 / (1 + exp(-u))^m ,   for some m > 0.
This family of curves is, in some sense, the simplest true generalization of the logistic model.
The power logistic models are reasonably tractable analytically and computationally and provide
a good context for the study and illustration of design robustness. Figure la displays the shapes
of F(u) for varying values of the parameter m. The response curve shifts to the right as m
increases. In other words, the tolerance is stochastically larger for large m. As m increases, the
skewness of the tolerance distribution also increases, while the variance and kurtosis decrease.
Table 1 displays the moments of the tolerance distribution for various values of m. If
concentration - response data behave exactly in the opposite way as described above, an
alternative version of the PLM is obtained by transforming the standardized concentration to -u,
resulting in
F(u) = 1 - 1 / (1 + exp(u))^m ,   for some m > 0.
This class of models, called the type 2 class, is exhibited in Figure 1b. The discussions that
follow are about the type 1 class of curves. Similar discussions with the obvious sign changes
follow, of course, for Type 2 models.
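For reference, the two classes can be evaluated directly; the short Python sketch below (illustrative, not from the paper) implements the type 1 and type 2 response curves.

import math

def plm_type1(u, m):
    """Type 1 power logistic response curve F(u) = (1 + exp(-u))**(-m)."""
    return (1.0 + math.exp(-u)) ** (-m)

def plm_type2(u, m):
    """Type 2 curve obtained by the sign change described above."""
    return 1.0 - (1.0 + math.exp(u)) ** (-m)

# m = 1 recovers the ordinary logistic model; larger m shifts the curve right,
# so the response at u = 0 decreases as m increases.
for m in (0.2, 1.0, 5.0):
    print(m, round(plm_type1(0.0, m), 3))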
It is interesting to note that concentrations that are effective for a small proportion of subjects, such as EC5 or EC10, vary considerably for small changes of m when m < 1. When m > 1, small changes in m affect right-tail effective concentrations such as EC90; in this case the sensitivity is not as severe as in the former case. Designs for concentration-response models are commonly stated in terms of the responses as follows: choose a concentration that induces a response with probability p. While we find that most designs, expressed in this manner, are fairly robust with respect to misspecification of the parameter m, it turns out that the resulting concentrations towards the end of the tails are highly sensitive to changes in m. In particular, the most common design, the D-optimal design, is altered drastically if the wrong value of m is
We propose to use maximum likelihood estimation procedures for model fitting. An iterative technique, such as the Newton-Raphson method, is required to solve the likelihood equations in the three parameters μ, β, and m. Because of the dependence of m on mean, variance, skewness and kurtosis, it is often difficult to find good starting values that would guarantee convergence of the iterative procedure. Iterative fitting can be achieved with less complication in the Edgeworth-series distribution model (Singh (1987)) and the Stukel model (Stukel (1987)). For the PLM, Gaudard et al. (1990a) suggest first estimating the skewness and kurtosis of the tolerance distribution using their respective moment estimates and then using the resulting most plausible value of m as a starting value for the estimation procedure. Similarly, estimates of the
Table 1: Mean, variance, skewness and kurtosis for the PLM for various values of m.
Note: β = 1, μ = 0 (from Gaudard et al., 1990b).

   m     mean   variance   skewness   kurtosis
  0.2   -4.71     27.95      -1.69       4.81
  0.4   -1.98      8.92      -1.12       3.05
  0.5   -1.38      6.59      -0.86       2.39
  0.6   -0.96      5.29      -0.62       1.93
  0.8   -0.39      3.95      -0.26       1.41
  1.0    0.00      3.29       0.00       1.20
  1.2    0.29      2.91       0.19       1.15
  1.5    0.61      2.58       0.38       1.19
  2.0    1.00      2.29       0.58       1.33
  2.5    1.28      2.14       0.70       1.47
  3.0    1.50      2.04       0.77       1.59
  4.0    1.83      1.93       0.87       1.76
  5.0    2.08      1.87       0.92       1.87

mean = μ + [ψ(m) - ψ(1)]/β ;   variance = [ψ'(m) + ψ'(1)]/β² ;
skewness = [ψ''(m) - ψ''(1)] / [ψ'(m) + ψ'(1)]^(3/2) ;
kurtosis = [ψ'''(m) + ψ'''(1)] / [ψ'(m) + ψ'(1)]² ,
where ψ(m) = d ln Γ(m)/dm and primes denote derivatives.

Figure 1: Response curves for the power logistic model (PLM) with exponent values
m = 0.2, 0.5, 1, 2, 5. Figure 1a: Type 1 class. Figure 1b: Type 2 class.
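The entries of Table 1 can be reproduced from the digamma-function formulas in its footnote. The Python sketch below (using scipy's polygamma routine, with β = 1 and μ = 0 as in the table) returns the four moments for a given m.

from scipy.special import polygamma

def plm_tolerance_moments(m):
    """Mean, variance, skewness and kurtosis of the PLM tolerance distribution
    from the formulas in the footnote of Table 1 (beta = 1, mu = 0).
    polygamma(k, z) is the k-th derivative of the digamma function psi(z)."""
    psi = lambda k, z: float(polygamma(k, z))
    mean = psi(0, m) - psi(0, 1.0)
    var = psi(1, m) + psi(1, 1.0)
    skew = (psi(2, m) - psi(2, 1.0)) / var ** 1.5
    kurt = (psi(3, m) + psi(3, 1.0)) / var ** 2
    return mean, var, skew, kurt

# Reproduces the m = 1 row of Table 1: 0.00, 3.29, 0.00, 1.20.
print([round(v, 2) for v in plm_tolerance_moments(1.0)])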
mean and the variance of the tolerance distribution lead to plausible values of μ and β that can be used as starting values. Prentice (1976) and Gaudard et al. (1990a) present data examples where model fits of the PLM are superior to those of the logistic model and comparable to those of the four-parameter models mentioned above.
3. Optimal Designs for the Logistic Model
Introduction: The nonsequential experimental design problem for a concentration-response experiment is to choose distinct concentration levels x1, ..., xk and the numbers of independent binary response trials n1, ..., nk to take at these concentration levels, subject to Σ ni = n. A design is said to be optimal if it optimizes a statistical inference criterion.
Optimal designs for nonlinear models, unlike for linear models, depend on the values of the parameters. However, the goal of the experiment is to estimate these same parameters, hence
they are unknown at the design stage. This apparent dilemma is usually resolved by applying
two-stage or sequential procedures. In the first stage, a preliminary small experiment is
conducted that encompasses a wide range of concentrations. As a result, the range of response
proportion is likely to vary between 0 and 1. The estimates from this first experiment, often
called range-finding experiment, can be used to implement an optimal design in the second
stage. Sequential designs comprised of more than two stages are particularly useful for
estimating effective concentrations, such as EC25, EC50.
D-Optimal Designs: Several criteria have been proposed for optimal designs, such as D-optimality, A-optimality, E-optimality and G-optimality. These criteria are defined by functions of the elements of the Fisher information matrix I(θ) = -E[∂² ln f(x|θ)/∂θ²], where f(x|θ) is the likelihood function of the parameter vector θ, and E denotes expectation. As a general reference, see Silvey (1980). D-optimality is achieved by maximizing the determinant of the Fisher information matrix, which is equivalent to minimization of the volume of an asymptotic confidence ellipsoid for the parameters. It is well known that D-optimal designs are minimal designs, in the sense that, for a two-parameter scale-location model, the design is supported on only two design points (i.e. concentrations). Kalish and Rosenberger (1978) derive various optimal designs for the logistic model. D-optimality is achieved when concentrations are chosen that induce p = 0.176 and p = 0.824 proportions of responses.
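The 0.176/0.824 result is easy to confirm numerically. The Python sketch below (not from Kalish and Rosenberger (1978)) restricts attention to symmetric two-point designs at standardized doses -u and +u, for which det(I) for the logistic model is proportional to [p(1-p)]²(2u)², and maximizes over u.

import numpy as np
from scipy.optimize import minimize_scalar

def neg_det_info_logistic(u):
    """Negative det(I), up to constant factors, for a symmetric two-point
    logistic design at standardized doses -u and +u."""
    p = 1.0 / (1.0 + np.exp(-u))
    return -((p * (1.0 - p)) ** 2) * (2.0 * u) ** 2

res = minimize_scalar(neg_det_info_logistic, bounds=(0.1, 5.0), method="bounded")
u_opt = res.x
print(round(u_opt, 2), round(1.0 / (1.0 + np.exp(-u_opt)), 3))   # about 1.54 and 0.824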
Designs for Estimating ECq: Estimates of toxicity threshold concentrations, such as LC50, EC50, LD25, etc., are obtained from a fitted concentration-response relationship of the form F(β(x - μ)) simply by substituting the estimated parameters into the formula ECq (= EC100q) = F⁻¹(q)/β + μ. These measures of toxicity play an important role for assessment at a HWS. Often, toxicity is compared in terms of LC50, EC50 and others among different samples and sites (Miller et al. (1985), Greene et al. (1988)). Toxicity threshold concentrations are essential
for risk assessment and regulation of toxic chemicals (Peltier and Weber (1985), Barnthouse et
al. (1986), Linder et al. 1986)).
If the sole purpose of an analysis is to estimate ECq, the experimental designs that are
optimal for this situation are different from the ones discussed above. If we knew the exact
concentration-response relationship, common sense would tell us to choose the concentration
x=ECq which would result in a perfect measure of ECq. This, in fact, is the optimal design for
estimating ECq as long as q is not too extreme, such as when q<0.1 or q>0.9 (Wu, 1988). In
reality, however, the uncertainty about the shape and location of the relationship forces the
experimenter to "search" for a plausible estimate of ECq. In terms of statistical design, this
problem is often phrased using the widely accepted criterion of minimum asymptotic variance of
the estimator of ECq, denoted by ÊCq. Optimality of the one-point designs quickly breaks down when the model is slightly invalid. While efficiency studies address this issue, they usually do
not provide clear guidelines for the design when initial estimates are poor.
Optimal designs with more than three concentrations have been studied from a Bayesian
point of view by Tsutakawa (1980) and Chaloner and Larntz (1988). Several sequential
estimation procedures for ECq have been proposed, such as the Up-and-down method, the
Robbins - Monro procedure (Robbins and Monro (1951)) and, more recently, a procedure based
on parametric models (Wu (1985)).
Bayesian Designs: Experimenters frequently are reluctant to carry out minimal designs as prescribed by optimality theory in cases where the initial estimates of the parameters are not very reliable. Such cases typically arise in biological in-vivo assays and ecological assessment
situations where natural variability between experiments is considerable. Experiments are thus
carried out in fewer stages (usually two) with more than two concentration levels (often six) at
which the responses of the subjects are tested. This design is more likely to provide reasonable
estimates since response ratios are very unlikely to be all ones or zeros.
Bayesian designs offer great potential for the situation where initial estimates are not
very reliable. A prior distribution for the parameters is assumed that adequately reflects the
uncertainty from preliminary experiments. Because of the difficulties for implementation,
optimal design results are used as guidelines rather than strict prescriptions, and hence, small
changes in the prior distribution do not substantially change the nature of the resulting setup of
the experiment.
Tsutakawa (1980) derives optimal Bayesian designs for estimating ECq for the logistic
model, where k=3 or k=6 concentrations are equally spaced and n=120 or n=480 subjects are
allocated at equal proportions to the concentrations. A prior normal distribution for ECq and a prior gamma distribution for β are assumed. The width of the spacings between concentrations generally increases as the uncertainty (prior variance) of ECq increases. Chaloner and Larntz (1988) solve the Bayesian design problem for the logistic model for arbitrary values of k and ni. They assume independent uniform prior distributions for the parameters μ and β.
Similarly to Tsutakawa (1980), the design changes mostly as the prior variability of μ changes. As the range of the uniform prior for μ increases, the optimal design changes from a two-point, to
a three-point to a six-, seven- or eight-point design. The latter have concentrations that are
nearly equally spaced and have nearly even allocation of subjects, which is the design that is
chosen on intuitive grounds by many experimenters. On the other hand, the resulting two-point
designs closely resemble the D-optimal designs, while the three-point designs include an
additional concentration in the center but with a smaller allocation.
4. Optimal Designs for the PLM
For the PLM, D-optimal designs can be derived numerically when m is known (Gaudard et al. (1990b)). Table 2 lists D-optimal response probabilities for the PLM. These probabilities appear to be robust to changes of m, but they lead to quite different concentration values ui, particularly when m is small. Gaudard et al. (1990b) also investigate robustness of the D-optimal design with respect to misspecification of μ and β, and compute 80% efficiency regions of the "p1 × p2 design plane".
Table 2: D-optimal designs for the power logistic model (PLM) for various values of m.

   m      p1      p2      u1      u2
  0.2   0.206   0.876   -7.90    0.06
  0.4   0.229   0.854   -2.66    0.73
  0.6   0.221   0.841   -2.43    1.10
  0.8   0.192   0.832   -1.93    1.35
  1.0   0.176   0.824   -1.54    1.54
  1.5   0.149   0.811   -0.94    1.90
  2.0   0.138   0.803   -0.55    2.16
  2.5   0.122   0.798   -0.28    2.36
  3.0   0.114   0.794   -0.06    2.53
  4.0   0.104   0.788    0.27    2.79
  5.0   0.098   0.785    0.53    3.00
Efficiency considerations for estimating ECq led Gaudard et al. (1990b) to suggest that
both concentrations should be chosen so that the true unknown ECq lies between these
concentrations with high probability. The asymptotic variance of ÊCq decreases as both concentrations approach ECq, but a minimum value does not exist. Sensitivity analyses show
that it is better to space the two concentrations farther apart, to be certain to capture the unknown
ECq in the middle. If, on the other hand, both concentrations are either smaller or larger than
ECq, the asymptotic variance of ECq increases drastically. The danger of having potentially
sharply increasing variances is greatly reduced when the design is augmented by a third
concentration, chosen close to ECq.
For HWS assessment, a choice will have to be made in favor of one of the many possible
methods. This choice might be site-specific. It will depend on several factors, such as maximal sample sizes, time constraints, available information about the concentration-response relationship, and the degree of accuracy required (quality control) for measures of toxicity.
5. Modifications for Dilution Experiments
Optimal designs are usually implemented in several stages, the reason being that, since
they are formulated in terms of a known concentration-response relationship, the uncertainty
about that relationship (i.e. the model) requires collection of some information. Implementation
is carried out in different ways depending on the situation of the research problem. The
two-stage approach to ecological assessment at hazardous waste sites, as described in
Warren-Hicks et al. (1988), is conducive to effective implementation of optimal statistical
designs. We illustrate, in the following, some initial approaches to implementation of designs
that we find relevant for ecological assessment at a HWS.
The first measurement of responses to toxicity of a mixture of chemicals at a HWS is
typically made in a 100% site sample, observing percent survival of a test species. The obvious
question related to D-optimal designs is how to choose a second concentration with resulting
response probability p given the observed response pmax. Note that p < pmax, since experiments can be conducted only at decreased concentrations of the hazardous waste. Values
of p that maximize the determinant of the Fisher information matrix (det(I)) in the PLM, hence
are D-optimal in this simple situation, are given in Table 3 for several choices of m. Notice that
the least extreme values for p occur for the models with m between 0.5 and 1. Also, sample
sizes are assumed to be equal for the two concentrations.
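The following Python sketch (a reconstruction, not the GAUSS code used for the paper) finds the lower response probability for a given pmax by maximizing det(I) for a two-point design with equal group sizes. For a two-parameter scale-location model, det(I) is proportional to the product, over the two design points, of f(u)²/[p(1-p)], times (u1 - u2)²; this proportionality is our own working assumption. The printed value agrees with the m = 1, pmax = 0.9 entry of Table 3.

import numpy as np
from scipy.optimize import minimize_scalar

def second_response_prob(pmax, m):
    """Lower response probability maximizing det(I) for a two-point PLM design
    (equal group sizes) when the upper point is fixed at response pmax."""
    F    = lambda u: (1.0 + np.exp(-u)) ** (-m)                         # PLM cdf
    f    = lambda u: m * np.exp(-u) * (1.0 + np.exp(-u)) ** (-(m + 1.0))
    Finv = lambda p: -np.log(p ** (-1.0 / m) - 1.0)
    u2 = Finv(pmax)

    def neg_det(u1):
        p1 = F(u1)
        return -((f(u1) ** 2 / (p1 * (1.0 - p1)))
                 * (f(u2) ** 2 / (pmax * (1.0 - pmax))) * (u1 - u2) ** 2)

    res = minimize_scalar(neg_det, bounds=(u2 - 15.0, u2 - 1e-3), method="bounded")
    return float(F(res.x))

print(round(second_response_prob(0.9, m=1.0), 3))   # 0.214, as in Table 3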
Table 3: Response probabilities that maximize the determinant of the Fisher information
matrix in the power logistic model (PLM) for given higher probability pmax.

  pmax   m = 0.2   m = 0.5   m = 1   m = 2   m = 5
   0.9    0.228     0.263    0.214   0.166   0.125
   0.8    0.162     0.193    0.166   0.132   0.101
   0.7    0.127     0.147    0.133   0.108   0.083
   0.6    0.101     0.113    0.106   0.088   0.068
   0.5    0.080     0.087    0.083   0.070   0.055
   0.4    0.061     0.065    0.063   0.055   0.044
   0.3    0.044     0.046    0.045   0.040   0.033
The question of existence of D-optimal designs is more realistically addressed as follows:
Given an observed response proportion pmax at a 100% site sample, is it possible to find
D-optimal designs using only lower concentrations than that corresponding to pmax? We study
this issue by computing det(I) for three design concentrations, one of which corresponds to pmax.
The other two concentrations correspond to response proportions p1 and p2. We report the values of p1, p2 and det(I) when det(I) achieves a local maximum for various values of pmax and m in the PLM, in Tables 4, 5 and 6. In Table 4, the sample sizes at the three concentrations are equal; in Table 5, the sample size at pmax is three times larger; and in Table 6, the sample size at pmax is five times larger. Two situations arise: D-optimal responses are both either smaller or
larger than pmax or they lie on both sides of pmax. In the first case, D-optimal designs are
achieved if both concentrations are chosen at the same level. For responses p < pmax, the
optimal responses are the values in Table 3 from the two-point design. These results are
independent of the sample sizes, hence they are identical in all three tables. D-optimal responses
in the second case occur when concentrations are chosen one on each side of ECpmax. Here, the
results depend on the sample sizes.
In Tables 4 to 6, only the D-optimal cases where both design points lie below ECpmax can be realistically considered for obtaining concentration-response curves involving lower concentrations of the mixture of toxic chemicals collected at the HWS. These cases are marked with a (*) in the tables. The calculations show that increasing the sample size of the 100% site sample has a strong positive effect. In Tables 5 and 6, pmax values as low as 0.5 still allow us to find D-optimal designs with lower concentrations. Models with lower values of m generally are more restrictive: fewer cases arise with D-optimal responses below pmax.
Table 4: D-optimal three-point designs for the power logistic model with one response
probability (pmax) fixed. The table gives the values p1, p2 and det(I) where det(I) is at a
local maximum. Note: n1 = n2 = n3. (* denotes dilution designs.)

          ------ m = 0.2 ------    ------- m = 1 -------    ------- m = 5 -------
  pmax     p1     p2    det(I)      p1     p2    det(I)      p1     p2    det(I)
  0.9    0.228  0.228   0.086*    0.214  0.214   0.371*    0.102  0.728   0.896
  0.8    0.193  0.910   0.087     0.174  0.838   0.401     0.098  0.777   0.938
  0.7    0.174  0.912   0.085     0.160  0.865   0.396     0.092  0.815   0.931
  0.6    0.165  0.906   0.084     0.147  0.869   0.392     0.086  0.832   0.914
  0.5    0.164  0.898   0.084     0.136  0.861   0.390     0.079  0.835   0.898
  0.4    0.170  0.890   0.085     0.131  0.853   0.392     0.073  0.831   0.891
  0.3    0.184  0.882   0.086     0.135  0.840   0.396     0.070  0.821   0.898
Table 5: D-optimal three-point designs for the power logistic model with one response
probability (pmax) fixed. The table gives the values p1, p2 and det(I) where det(I) is at a
local maximum. Note: n1 = n2; n3 = 3n1. (* denotes dilution designs.)

          ------ m = 0.2 ------    ------- m = 1 -------    ------- m = 5 -------
  pmax     p1     p2    det(I)      p1     p2    det(I)      p1     p2    det(I)
  0.9    0.228  0.228   0.086*    0.214  0.214   0.371*    0.125  0.125   0.822*
  0.8    0.162  0.162   0.079*    0.166  0.166   0.399*    0.100  0.100   0.938*
  0.7    0.150  0.937   0.059     0.133  0.133   0.359*    0.083  0.083   0.903*
  0.6    0.132  0.924   0.058     0.106  0.106   0.293*    0.068  0.068   0.797*
  0.5    0.923  0.923   0.067     0.107  0.893   0.270     0.055  0.055   0.655*
  0.4    0.907  0.907   0.077     0.894  0.894   0.293     0.058  0.864   0.604
  0.3    0.891  0.891   0.084     0.867  0.867   0.359     0.869  0.869   0.636
Table 6: D-optimal three-point designs for the power logistic model with one response
probability (pmax) fixed. The table gives the values p1, p2 and det(I) where det(I) is at a
local maximum. Note: n1 = n2; n3 = 5n1. (* denotes dilution designs.)

          ------ m = 0.2 ------    ------- m = 1 -------    ------- m = 5 -------
  pmax     p1     p2    det(I)      p1     p2    det(I)      p1     p2    det(I)
  0.9    0.228  0.228   0.086*    0.214  0.214   0.371*    0.125  0.125   0.822
  0.8    0.162  0.162   0.079*    0.166  0.166   0.399*    0.100  0.100   0.938
  0.7    0.127  0.127   0.057*    0.133  0.133   0.359*    0.083  0.083   0.903
  0.6    0.939  0.939   0.055     0.106  0.106   0.293*    0.068  0.068   0.797
  0.5    0.923  0.923   0.067     0.098  0.912   0.249     0.055  0.055   0.655
  0.4    0.907  0.907   0.077     0.894  0.894   0.293     0.052  0.875   0.551
  0.3    0.891  0.891   0.084     0.867  0.867   0.359     0.869  0.869   0.636
6. Discussion
The design of experiments with respect to statistical criteria of optimality is often neglected in research and analysis. Well-designed studies, however, produce more reliable estimates and lead to more powerful conclusions, even in rather noisy systems and environments.
Modern computing power enables us to quickly evaluate a large array of designs and to "tailor"
an experiment to the specifics of a particular problem.
Implementation of designs that are found to be optimal in theory is often difficult. We
have seen that two-stage approaches for ecological assessment at hazardous waste sites can be
taken advantage of for implementation of optimal designs for estimating concentration-response
relationships and measures of toxicity such as LC50, EC50, EC25, etc. However, designs that are based on a single best guess of the parameters should be implemented cautiously if large
variability exists between experiments or if initial estimates are poor. Bayesian designs appear
useful in such situations for the following reasons:
-	The uncertainty about the first best guess can be nicely incorporated in the model via a suitable
prior distribution for the parameters. In some cases, alternative parametrizations might be
necessary.
-	Unlike in certain estimation problems, the experimental framework that is borne out of a
Bayesian design is not very sensitive to changes in the prior.
-	Certain Bayesian designs for estimation of concentration-response relationships conform to the
popular intuitive approaches closer than corresponding optimal frequentist designs.
There are numerous issues in optimal Bayesian design theory for concentration-response
experiments that call for further study. Optimal Bayesian designs should be developed for extended models of the logistic and probit, such as the PLM, or even for more general four-parameter models. In spite of our claim to the contrary, the effect of changing prior parameters
on the design needs to be examined more closely.
The choice of the design for hazardous waste site remediation efforts depends on several
factors, such as:
-	time constraints
-	preliminary information on the toxicities
-	maximal affordable sample sizes
-	purpose of the estimation procedure
-	how realistically the implementation can be carried out.
Obviously, no single method will be universally optimal for all situations. Preliminary
calculations regarding the feasibility of certain designs will need to be made, such as those
presented in Section 5. Modern computing power permits us to conduct such calculations in a
expedient fashion. For example, all numerical computations of this paper were carried out using
the mathematical software GAUSS on a personal computer. A optimization routine was used to
compile the tables. The calculation of each entry (triplet of numbers) in Tables 4,5 and 6 took
about 30 seconds on a 386 PC. The programming effort for Tables 4, 5 and 6 was minimal,
with about 20 lines of code.
ACKNOWLEDGMENT
The author would like to thank the colleagues of the study group on "Design Issues for
Quantal Response Models" at the University of New Hampshire, Marie Gaudard, Marvin Karson
and Siu-Keung Tse for their collaboration which part of this paper is based on, and for their
inspired comments and support.
REFERENCES
Abdelbasit, K. M. and Plackett, R. L. (1983). Experimental design for binary data. J. Amer.
Statist. Assoc., 78, 90-98.
Barnthouse, L. W. and Suter, G. W. II (1986) (eds). Users's Manual for Ecological Risk
Assessment. ORNL, 6251, Oak Ridge National Laboratory, Oak Ridge, TN 37831.
Chaloner, K. and Larntz, K. (1988). Optimal Bayesian design applied to logistic regression experiments. Preprint. School of Statistics, U. of Minnesota.
Copenhaver, T. W., and Mielke, P.W. (1977). Quantit analysis: A quantal assay refinement.
Biometrics, 33, 175-186.
Finney, D. J. (1978). Statistical Methods in Biological Assay. 3rd edition. New York: Hafner.
Freeman, P. R. (1970). Optimal Bayesian sequential estimation of the median effective dose.
Biometrika, 57, 79-89.
Gaudard, M. A., Karson, M., Linder, E., and Tse, S.K. (1990a). The power logistic model for
quantal response analysis. Preprint: Dept. Mathematics, University of New Hampshire,
Durham, NH 03824. (Submitted for publication).
Gaudard, M. A., Karson, M., Linder, E., and Tse, S.K. (1990b). Efficient designs for estimation
in the power logistic quantal response model. Preprint: Dept. Mathematics, University of
New Hampshire, Durham, NH 03824. (Submitted for publication).
Greene, J.C., Warren-Hicks, W.J., Parkhurst, B.R., Linder, G.L., Bartels, C.L., Peterson, S.A.
and Miller, W.E. (1988). Protocols for Acute Toxicity Screening of Hazardous Waste
Sites, Final Draft. U.S. Environmental Protection Agency, Corvallis, OR. 145 pp.
Kalish, L. A. (1988). Efficient design for estimation of median lethal dose. Tech. Report. Dana-Farber Cancer Institute and Dept. of Biostatistics, Harvard S. of Public Health.
Kalish, L. A. and Rosenberger, J. L. (1978). Optimal designs for the estimation of the logistic function. Tech. Report Nr. 33. The Pennsylvania State University.
Kuo, L. (1983). Bayesian bioassay design. Annals of Statist. 11, 886-895.
Linder, E., Patil, G. P., Suter, G.W.II, and Taillie C. (1986). Effects of toxic pollutants on
aquatic resources using statistical models and techniques to extrapolate acute and chronic
effects benchmarks. In Oceans 86. Proceedings: Volume 3: Monitoring Strategies
Symposium. Marine Technology Society. Washington D.C.
Miller, R. G. and Halpern, J. W. (1980). Robust estimators for quantal bio-assay. Biometrika, 67, 535-542.
Miller, W.E., Peterson, S.A., Greene, J.C., and Callahan, C.A. (1985). Comparative toxicology
of laboratory organisms for assessing hazardous waste sites. Journ. Environ. Qual. 14:
569-574.
Morgan, B.J.T. (1985). The cubic logistic model for quantal assay data. Appl. Statist., 34:
105-113.
Parkhurst, B.R., Linder, G.L., McBee, K., Bitton, G., Dutka, B.J., and Hendricks,
C.W.(1988). Toxicity tests, in Ecological Assessments of Hazardous Waste Sites: A
Field and Laboratory Reference Document, eds: Warren-Hicks W., Parkhurst, B.R. and
Baker, S.S. U.S. EPA and Kilkelly Environmental Associates, Raleigh, NC. 27622.
Peltier, W. and Weber, C.I. (1985). Methods for Measuring the Acute Toxicity of Effluents to
Aquatic Organisms. Third ed. EPA/600/4-85/013. Env. Monitoring and Support Lab.,
ORD, US EPA, Cincinnati, OH.
Prentice, R. L. (1976). A generalization of the probit and logit methods for dose response
curves. Biometrics, 32, 761-768.
Robbins, H. and Monro, S. (1951). A stochastic approximation method. Ann. Math. Statist., 22: 400-407.
Silvey, S. D. (1980). Optimal Design. Chapman and Hall, London and New York.
Singh, M. (1987). A non-normal class of distribution functions for binary dose-response curves. J. Appl. Statist. 14: 91-97.
Tamhane, A. C. (1986). A survey of literature on estimation methods for quantal response
curves with a view toward application to the problem of selecting the curve with the
smallest q-quantile (EDIOOq). Comm. Statist. A, Vol 15, 2679-2718.
Tsutakawa, R. K. (1972). Design of experiment for bio-assay. J. Amer. Statist. Assoc. 67,
584-590.
Tsutakawa, R. K. (1980). Selection of dose levels for estimating a percentage point of a logistic
quantal response curve. Appl. Statist., 29, 25-33.
Warren-Hicks, W., Parkhurst, B.R. and Baker, S.S. (1988). Ecological Assessments of
Hazardous Waste Sites: A Field and Laboratory Reference Document. E652/11/21/88-F.
U.S. EPA and Kilkelly Environmental Associates, Raleigh, NC. 27622.
Wu, C. F. J. (1985). Efficient sequential designs with binary data. J. Amer. Statist. Assoc., 80,
974-984.
Wu, C. F. J. (1988). Optimal design for percentile estimation of a quantal response curve. Pp. 213-223 in: Optimal Design and Analysis of Experiments, edited by Dodge, Y., Fedorov, V. V. and Wynn, H.P., Elsevier.
by Kapustka, Linder, Shirazi
Comment for discussion of "Some Statistical Issues Relating
to the Characterization of Risk for Toxic Chemicals,"
by W.M. Stiteler and P.R. Durkin
Starting with the NOAEL/LOAEL/Threshold methodology, the paper briefly
reviews the current approach to calculating a "safe" level of exposure to
toxic chemicals and shows that the methodology is fraught with uncertainty.
Then the authors proceed to review the statistical problems associated with an
alternative approach known as "Benchmark dose." These problems, too, are
shown to be numerous.
The paper is well written. It succeeds in frustrating the reader by outlining the seemingly endless problems that stand in the way of coming up with a "safe" number based purely on statistical grounds. The various statistical approaches may improve the calculations by a factor of 2, only to be overridden by safety factors of 100-fold or more applied at the policy level. It is apparent that policy and statistical problems can be reduced by integrating the biology into the problem.
There are to date countless numbers of biological experiments with
chemicals and species of all kinds that remain under-utilized. Organizing,
classifying, and interpreting these data can produce more definitive classes
of dose-response curves, providing information on the mode of biological
response, narrowing the confidence band and giving more support as a basis for policy. The number may not be associated with any measure of error other than
the local standard deviation of the response. The question is: how badly does this crude approach miss the mark, given the optimization approach of this paper? It would be helpful to generate some discussion of this problem. The critical question is whether the theoretical construct can be or will be accepted by the practitioners. In any case, there is a clear need to relate the two a little better.
by Kapustka, Linder, Shirazi
Comment on "Estimating of Concentration-Percent Survival
Relationships: Design Issues," by Ernst Linder
The author considers the problem of biological tests with a 100% site
sample and asks the question: What dilution can best produce a desired
response, say 50% survival of test organisms? He provides an example of an
answer to the question by using a power logistic function whose parameters are
evaluated with an optimal design strategy. He correctly maintains that the
outcome from such an optimal design approach produces a more reliable estimate of
the concentration.
The mathematics and the solution approach are well explained and referenced, but can be understood only with a prior background in the area. The approach will not be easy for a typical practitioner to understand. Many practitioners have considerable knowledge of dose-response relationships, and their insight often helps them in producing toxicity estimates. The paper could be made more useful by providing examples based on actual data that compare the conventional concentration approach with the optimal design approach over the entire dose-response relationship.
Bioassays typically require range-finding tests, and these tests are ideally simple, so that a wide range of dilutions (concentrations) can be tested relatively easily to establish the probable shape of the dose-response curve. Variabilities are often great and are observed only in the response variable. The variability of the toxicity endpoint is never directly measured and must be inferred from the measured response. A lay person may do an "eyeball" fit or a regression to come up with a toxicity endpoint. Current approaches are appropriate for 50 years past, when there were inadequate data to do better.
COMMENTS BY PARTICIPANTS
On Stiteler and Durkin
Herbert Lacayo (U. S. Environmental Protection Agency): This presentation
clearly demonstrates that one needs more dose groups and more animals per group
in order to tie down a threshold. So, if an industry group (i.e. a registrant) claims a threshold, then they can submit as large a study as necessary to do so, not just the standard 4 dose groups with 50-70 animals per group.
Ernst Linder (University of New Hampshire): The threshold models suggested
raise some interesting questions regarding the design. As you mentioned,
sequentially closing in on the threshold would be preferable. But, what if this
is technically not feasible? Should several doses be chosen near what one perceives to be close to the possible threshold and a few at the higher end of the dose spectrum? I think this calls for some mathematical investigation. The
scenarios are numerous. Any optimal designs derived for these models need to be
scrutinized with respect to applicability and implementation. A joint effort
between toxicologists, scientists, experimenters and statisticians will help
achieve progress and success in this area.
On Linder
Herbert Lacayo (U. S. Environmental Protection Agency): It must be made very
clear that any new model is not just a little better than what is used. It must
be very much better or people won't accept the change. Even a good improvement
is difficult to implement.
EVALUATING THE ATTAINMENT OF INTERIM CLEANUP STANDARDS
G. P. Patil and C. Taillie
Center for Statistical Ecology and Environmental Statistics
Department of Statistics
The Pennsylvania State University
University Park, PA 16802
1. INTRODUCTION
Cleanup standards at hazardous waste sites are ordinarily given as absolute concentrations on a chemical-by-chemical basis. This paper deals with a cleanup standard of a different type in which it is required that the total contaminant mass be reduced by a specified percentage during the first year of remediation. The standard is of an interim character only, intended to validate that the innovative remedial technology is performing satisfactorily.
Evaluating the attainment of such a standard requires a pre-remediation
baseline sample as well as a second interim sample taken after one year of
operation. Although this is a familiar two-sample problem, the data are
generally so skew as to preclude the use of normal theory methods. In this
paper, we demonstrate how likelihood ratio tests can be used to carry out the
evaluation. In particular, we show how the information in the baseline sample
can be employed to validate the chi-square approximation to the likelihood
ratio null distribution, and how adjusted critical points can be obtained if
the chi-square approximation proves to be inadequate. The baseline
information is also used to obtain the operating characteristics of the test
which can be helpful in selecting a sample size for the second, interim,
sampling episode. The key step consists of using the baseline data to
identify a parametric distributional model which is then used to form the
likelihood function. It is also necessary to adapt the likelihood ratio
procedure to the one-sided hypothesis tests that arise in this context.
The proposed methods are illustrated with data from a hazardous waste
site on the National Priorities List, at which the Consent Decree
stipulates a 70 percent interim reduction in the contaminant mass. For
this site, we find that the gamma distribution provides an excellent fit to the baseline data and that the chi-square approximation is quite satisfactory even for small samples.
The final three sections of the paper consider nonparametric rank tests as an alternative to parametric modeling. We find, for local alternatives only, that rank tests can be quite competitive in terms of power with the likelihood ratio test, provided the available data are employed to select a near-optimal rank procedure. Methods for making such a selection are discussed.
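As a minimal sketch of the key step (forming a likelihood ratio statistic from a gamma likelihood), the Python code below tests whether a single, hypothetical interim sample has a gamma mean equal to 30 percent of an assumed baseline mean, treating the shape as a nuisance parameter and referring the statistic to a chi-square distribution with one degree of freedom. The paper's actual procedure is a two-sample comparison adapted to a one-sided hypothesis, with the chi-square approximation checked against the baseline data and the critical point adjusted if necessary; none of that refinement is reproduced here, and all names and numbers are illustrative.

import numpy as np
from scipy.stats import gamma, chi2
from scipy.optimize import minimize_scalar

def loglik_gamma(data, shape, mean):
    """Gamma log-likelihood parametrized by shape and mean (scale = mean/shape)."""
    return float(np.sum(gamma.logpdf(data, a=shape, scale=mean / shape)))

def lr_test_mean(data, mean0):
    """Likelihood ratio statistic for H0: the gamma mean equals mean0, against
    an unrestricted mean, profiling out the shape parameter in both fits."""
    def profile(mean):
        res = minimize_scalar(lambda a: -loglik_gamma(data, a, mean),
                              bounds=(1e-3, 1e3), method="bounded")
        return -res.fun
    full = profile(float(np.mean(data)))   # the gamma MLE of the mean is the sample mean
    null = profile(mean0)
    stat = 2.0 * (full - null)
    return stat, float(chi2.sf(stat, df=1))

# Hypothetical interim sample tested against a 70% reduction from an assumed
# baseline mean of 10 ppk-ft (target mean 3 ppk-ft).
rng = np.random.default_rng(1)
interim = rng.gamma(shape=0.8, scale=5.0, size=25)
print(lr_test_mean(interim, mean0=0.3 * 10.0))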
2. TYSON'S SITE
The site in question is an abandoned sandstone quarry along the Schuylkill River northwest of Philadelphia that was used for septic and chemical waste disposal during the 1960s and early 1970s. The site was ranked number 25 on the NPL and is badly contaminated with a large suite of VOCs including Tetrachloroethene, Toluene, Chlorobenzene, Ethyl Benzene, Xylenes, and 1,2,3-Trichloropropane. The contaminants have affected the groundwater and the highly fractured sandstone bedrock beneath the site. However, this paper is concerned only with contamination of the topsoil at the site, which consists of two discrete regions (former East Lagoon and former West Lagoon). Details are given for the East Lagoon; the statistical analysis for the West Lagoon is similar except that the results are less definitive due to a smaller sample size.
Originally, it had been planned to remediate the site by soil
removal. This was later replaced by an innovative vacuum extraction
technology that had apparently proved successful in Puerto Rico (see
Resources, 1988).
In addition to final chemical-by-chemical cleanup standards, the Consent Decree stipulates an interim standard requiring removal of at least 70 percent of the total contaminant mass during the first full year of operation. The interim standard was apparently imposed because of the innovative nature of the vacuum extraction technology. In order to evaluate attainment of the interim standard, two sampling episodes were planned:
(a) Initial Episode, intended to establish a baseline for future comparison. This episode was carried out during the summer and fall of 1988 during installation of the extraction wells. The data from this episode are analyzed in this paper.

(b) Interim Episode, to be carried out after one year of operation.
In statistical parlance, evaluating the attainment of the interim
standard calls for a "two-sample hypothesis test." A somewhat novel
feature is that one of the samples was already available while the second
sampling episode had yet to be carried out (when this paper was prepared)
so there is a question of sampling design as well as hypothesis testing.
One of the goals of this paper is to illustrate how the data from the
first episode can be used to develop the operating characteristics (power
curves) of the tests which would then be helpful in choosing the sample
size for the second sampling episode.
During the first episode, soil samples were taken, usually at 5 ft.
depth intervals, from a core in the near vicinity of each extraction well.
Analysis of the depth profiles reveals a large variation with depth as
well as a "seaming" effect in which much of the contaminant mass occurs
in one or more layers at varying distances below the surface.
We decided not to treat the samples of a given core as independent
observations but instead to combine these values into a single number
intended to represent the total contaminant mass of the core. This number
was obtained by averaging the sample values and then multiplying by the
depth-to-bedrock of the core (the units of measurement are ppk-ft, where
"ppk" stands for parts per thousand). There were two primary reasons for
combining the data in this manner. First, we did not feel that we could
adequately model or account for the "seaming" effect mentioned above,
which was quite pronounced. Second, and more importantly, for the samples
to be treated as independent would require that each sample come from a
separate core. Multiple samples from a single core is a situation known
as "cluster sampling."
To summarize, then, the "contaminant mass" values are subject to
three primary sources of variability:
(a) spatial heterogeneity owing to the horizontal locations of the
different cores;
(b) sampling effect reflecting vertical heterogeneity, since only a
tiny fraction of each core was actually sampled and analyzed; and
(c) analytical error.
Examination of the data indicates that the sources (a) and (b) are very
large.
Figure 1a shows the well locations in the East Lagoon (the numbers
are well-identifiers without numerical significance). Figure 1b shows the
contaminant mass as measured at each well location, rounded down to the
nearest integer. The spatial heterogeneity is evident from the Figure,
with the eastern half of the Lagoon being nearly contaminant-free. Since
there would be little purpose in sampling this region during the second
sampling episode, we have excluded this region from the subsequent
analysis, leaving N=32 contaminant mass values in the data base (Figure
1c).
It should be pointed out that in addition to the official EPA data,
separate samples were collected and analyzed on-site by the subcontractor
responsible for installing the extraction wells. In a few instances,
labelled "Anomalous" in Figure 1, the two contaminant mass values were
markedly different. Examination of the depth profiles revealed that the
discrepancies were due to the seaming effect, in which one contractor
happened to collect a sample from the seam while the other contractor
missed the seam. The values in Figure 1b are the pairwise maxima of the two
sets of contaminant masses. Except for Figure 1, the rest of this paper
examines only the official EPA data.
FIGURE 1. Tyson's East Lagoon: (a) well locations; (b) contaminant mass
(ppk-ft) measured at each well location; (c) wells selected for analysis.
Anomalous data points, the fence and road, and the near-surface bedrock
area are marked.
3. IDENTIFYING AN APPROPRIATE PARAMETRIC MODEL
In this section we look for a parametric statistical distribution
that adequately summarizes the N=32 contaminant values described in the
previous Section. This parametric model will then form the basis for
developing an appropriate likelihood ratio hypothesis test.
Figure 2 (upper half) shows the histogram of the contaminant values.
Notice that the distribution is J-shaped, is extremely skew, and has a
single observation in the extreme right tail. A direct application of
normal theory to these data would be clearly inappropriate. The next most
attractive possibility is the lognormal distribution, which can be assessed
by comparing the logged data to the normal curve. See the lower half of
Figure 2, where the units on the horizontal axes are standard deviations
from the mean. The normal curve matches the data quite well in the
left-hand tail but fares poorly in the middle portion and upper reaches of
the distribution. The latter is the most important part of the
distribution for assessing cleanup effectiveness.
Shenton and Bowman (1977) and Bowman and Shenton (1981) have
developed a formal goodness-of-fit test of normality which is based upon
the joint distribution of skewness and kurtosis. This is illustrated in
Figure 3, which depicts a number of contours corresponding to different
sample sizes. When sampling from a normal distribution, the
skewness-kurtosis pair should fall inside the appropriate sample-size
contour for 90 percent of the samples. Consequently, a skewness-kurtosis
pair that falls outside the contour is evidence against normality at the
10 percent significance level. The skewness-kurtosis pair of interest
here is labelled EA in Figure 3 and falls well outside the N=32 contour.
So we can reject the normal distribution at the 10 percent significance
level. Figure 4 shows that normality can be rejected even at the 5
percent level.
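The skewness-kurtosis pair itself is easy to compute; the following is a minimal sketch (the Bowman-Shenton contours themselves are not reproduced, and the data values shown are placeholders, not the actual 32 observations).

```python
# Compute the (|skewness|, kurtosis) pair that is compared with the
# Bowman-Shenton contours of Figures 3 and 4.  `data` is a placeholder.
import numpy as np
from scipy import stats

data = np.array([67.0, 45.0, 62.0, 224.0, 218.0, 800.0])   # placeholder values only

sqrt_b1 = stats.skew(data, bias=True)                 # third standardized moment
b2 = stats.kurtosis(data, fisher=False, bias=True)    # fourth standardized moment (not excess)
print(abs(sqrt_b1), b2)                               # compare with the appropriate contour
```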
FIGURE 2. Tyson's East, all chemicals, selected wells (n = 32). Upper
panel: histogram of the contaminant mass values. Lower panel: histogram of
ln(contaminant mass) with the fitted normal curve; the horizontal axis is
in standard deviations from the mean.
FIGURE 3. Joint 90 percent contours of |skewness| and kurtosis when
sampling from a normal population, for various sample sizes; samples
should fall within the indicated contour 90 percent of the time. Plotted
points: WA = West Lagoon, all chemicals (n=20); EA = East Lagoon, all
chemicals (n=32); WJ = West Lagoon, joint chemicals (n=20); EJ = East
Lagoon, joint chemicals (n=32).
FIGURE 4. Joint 95 percent contours of |skewness| and kurtosis when
sampling from a normal population, for various sample sizes; samples
should fall within the indicated contour 95 percent of the time. Plotted
points as in Figure 3.
Two statistical distributions that can potentially capture the
J-shapedness of Figure 2 are the Pareto distribution and the gamma
distribution. These have probability density functions as follows:
(a) Pareto distribution

    f(x) = βθ / (1 + θx)^(β+1) ,     x > 0 ,

where β = shape parameter and θ = scale parameter.

(b) Gamma distribution

    f(x) = λ^k x^(k-1) exp(-λx) / Γ(k) ,     x > 0 ,

where k = shape parameter and λ = scale parameter.

The gamma has a J-shape only when 0 < k < 1, while the Pareto has a
J-shape for all β > 0. Also notice that the Pareto density is always
finite at x=0 while the gamma becomes infinite at x=0 for 0 < k < 1.
The histogram in Figure 2 is more suggestive of an infinite value at the
origin.
We have fitted both the gamma distribution and the Pareto
distribution to the data using the method of maximum likelihood. The
estimated parameter values are as follows:

    Pareto:   β = 0.329720,   θ = 0.992309                          (1)

    Gamma:    k = 0.324794,   λ = 0.003307                          (2)

The fitted Pareto and gamma curves are superimposed on the histograms
in Figure 5. It is evident that the Pareto is unsatisfactory, while the
gamma curves match the histograms across the entire range of the data about
as well as can be expected with only 32 observations.
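As a hedged sketch of how such fits can be reproduced with standard software (scipy's gamma and lomax distributions use scale parameters equal to 1/λ and 1/θ respectively, and the data file name below is a placeholder):

```python
# Sketch of the maximum likelihood fits; `x` holds the 32 contaminant mass
# values (the file name is hypothetical).  Estimates are converted back to
# the (k, lambda) and (beta, theta) notation of the text.
import numpy as np
from scipy import stats

x = np.loadtxt("east_lagoon_mass.txt")               # placeholder data file

k_hat, _, scale_g = stats.gamma.fit(x, floc=0)       # location fixed at zero
lambda_hat = 1.0 / scale_g

beta_hat, _, scale_p = stats.lomax.fit(x, floc=0)    # lomax = Pareto of the second kind
theta_hat = 1.0 / scale_p

print("gamma:  k =", k_hat, " lambda =", lambda_hat)
print("Pareto: beta =", beta_hat, " theta =", theta_hat)

# Predicted right-tail frequencies, N * Pr(X > x0), as in the table that follows
N = len(x)
for x0 in (400, 500, 600, 700):
    print(x0,
          N * stats.lomax.sf(x0, beta_hat, loc=0, scale=scale_p),
          N * stats.gamma.sf(x0, k_hat, loc=0, scale=scale_g))
```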
FIGURE 5. The histograms of Figure 2 (Tyson's East, all chemicals,
selected wells, n = 32) with the fitted gamma and Pareto densities
superimposed; the lower panel is on the ln(contaminant mass) scale and
also shows the normal curve.
Since the right tail of the distribution is important for assessment
purposes, we have calculated the predicted right-tail frequencies as
follows:

              Expected Frequency > X
      X          Pareto        Gamma
     400          4.45          1.87
     500          4.13          1.20
     600          3.90          0.78
     700          3.70          0.51
Since there is exactly one observation above X=400 with that observation at
about X=800, the gamma has a right hand tail that is reasonable in light of
the data while the right-hand tail for the Pareto is much too heavy.
A final strategy for assessing goodness-of-fit is to embed the
distributional family of interest (here the gamma) into a larger family
(i.e., one with more parameters), fit the larger family to the data, and
see how closely the result matches that for the original family. In the
case of the gamma, a convenient larger family is the 3-parameter beta
distribution of the second kind, whose pdf is as follows:

(c) Beta distribution

    f(x) = [Γ(α+β) / (Γ(α)Γ(β))] θ^α x^(α-1) (1 + θx)^-(α+β) ,     x > 0 ,

where α = first shape parameter, β = second shape parameter, and
θ = scale parameter.

The beta includes the Pareto as the special case when α=1. Also, when
β → ∞ and θ → 0 with βθ → λ > 0, the beta distribution approaches
the gamma with parameters k=α and λ. When we fitted the beta
distribution using an iterative maximum likelihood routine and printed the
results of each iteration, the values of β were found to steadily
increase, those of θ steadily decreased, and the routine terminated when

    α = 0.324957,   β = 6372276,   θ = 5.19158 × 10^-10,   βθ = 0.003308 .

Comparison with the fitted gamma parameters (see equation 2),

    k = 0.324794,   λ = 0.003307 ,

is rather convincing evidence in favor of the gamma.
4. SETTING UP THE NULL HYPOTHESIS AND
DEVELOPING THE LIKELIHOOD RATIO (LR) TEST
A basic issue is whether the null hypothesis should assume that
the interim standards have been achieved, or that they have not been
achieved. That is: should the burden of proof rest with the EPA or with
the subcontractor? The answer seems to be a matter of some disagreement,
at least for final cleanup standards (see EPA, 1989, for one point of
view). But for interim standards, we think that a strong case can be made
for assuming the standards are met unless there is good evidence to the
contrary. The situation is analogous to quality control in a
manufacturing plant where one would not shut down without good reason.
Thus, we set up the hypotheses as follows:
    H₀: interim standards met
    Hₐ: interim standards not met.

More formally, let μ and μ̄ be the population means during the initial
and interim sampling episodes, respectively. Then the hypotheses are

    H₀: μ̄ ≤ 0.3 μ
    Hₐ: μ̄ > 0.3 μ .

Finally, introducing a new parameter ρ = μ̄/μ to represent the ratio of
population means, we get

    H₀: ρ ≤ 0.3
    Hₐ: ρ > 0.3 .                                                    (3)
We propose carrying out the test with a variant of the likelihood ratio
procedure. We assume that a parametric model has been identified to
describe the data. The previous section identifies the gamma family for
Tyson's East, but for now let the argument be general and let
φ = (φ₁, ..., φ_p) represent the parameters needed to describe the parametric
model. To each possible value of φ there corresponds a value of the
likelihood function, and we denote the maximum of the likelihood by
L̂(full). The "hat" simply means maximum and the "full" means that the
maximum is taken over all possible values of the parameters (the full
parameter space). The population means μ and μ̄ are also functions of
the parameters, and then so is ρ = ρ(φ). Substituting the maximum
likelihood estimate φ̂ gives the corresponding estimate ρ̂ = ρ(φ̂) under the
full model.

If we require that ρ(φ) take on the value specified in the null
hypothesis, i.e., that ρ(φ) = 0.3, then φ can no longer range over the
full parameter space but is restricted to a subspace whose dimension is
one less than that of the full parameter space. Let L̂(restricted) denote
the maximum of the likelihood over that restricted subspace. Certainly,

    L̂(restricted) ≤ L̂(full)     and     0 ≤ L̂(restricted)/L̂(full) ≤ 1 .

When the above ratio is close to unity, the evidence from the data is
consistent with the hypothesis that ρ = 0.3, while small values of the
ratio are evidence against it. The likelihood ratio test statistic is

    LR = -2 ln[ L̂(restricted) / L̂(full) ] ,

and we propose rejecting H₀: ρ ≤ 0.3 at significance level α provided both
of the following conditions hold:

(i) ρ̂ > 0.3, where ρ̂ is the likelihood estimate of ρ under the full
model; and

(ii) LR > χ²(1, 2α), where χ²(1, 2α) denotes the 2α-critical point of
the chi-square distribution with one degree of freedom. The 2α is used
instead of α because the test is one-sided.
Straightforward modifications of the usual theory of likelihood
ratio testing show that the above procedure is valid provided the sample
sizes are large. More important than the asymptotic theory is the
question of whether the above chi-square approximation is reasonably
accurate for the small sample sizes that are encountered in practice. The
next Section answers this question in the affirmative, at least for the
gamma distribution.
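To make the procedure concrete, here is a minimal sketch of the one-sided likelihood ratio test for the two-sample gamma model with a common shape parameter. The use of a general-purpose optimizer, the starting values, and the simulated data are illustrative assumptions; the closed-form reduction actually used for the computations is developed in Appendix A.

```python
# Minimal sketch of the one-sided LR test under a two-sample gamma model with
# common shape k.  x = baseline sample, y = interim sample, rho0 = null ratio
# of means (0.3 here).  The data and starting values are hypothetical.
import numpy as np
from scipy import optimize, stats

def neg_loglik(params, x, y, rho0=None):
    # params = (log k, log lam) for the restricted model, plus log lam_bar for
    # the full model; under the restriction, lam_bar = lam / rho0.
    if rho0 is None:
        logk, loglam, loglam_bar = params
    else:
        logk, loglam = params
        loglam_bar = loglam - np.log(rho0)
    k, lam, lam_bar = np.exp(logk), np.exp(loglam), np.exp(loglam_bar)
    return -(stats.gamma.logpdf(x, k, scale=1.0 / lam).sum() +
             stats.gamma.logpdf(y, k, scale=1.0 / lam_bar).sum())

def lr_test(x, y, rho0=0.3, alpha=0.05):
    k0 = np.mean(x) ** 2 / np.var(x)                 # moment starting value for k
    start = [np.log(k0), np.log(k0 / np.mean(x)), np.log(k0 / np.mean(y))]
    full = optimize.minimize(neg_loglik, start, args=(x, y))
    restr = optimize.minimize(neg_loglik, start[:2], args=(x, y, rho0))
    LR = 2.0 * (restr.fun - full.fun)
    rho_hat = np.mean(y) / np.mean(x)                # MLE of rho for the gamma model
    crit = stats.chi2.ppf(1.0 - 2.0 * alpha, df=1)   # 2*alpha: one-sided test
    return LR, rho_hat, (rho_hat > rho0) and (LR > crit)

rng = np.random.default_rng(0)                       # hypothetical data, for illustration
x = stats.gamma.rvs(0.32, scale=1 / 0.0033, size=32, random_state=rng)
y = stats.gamma.rvs(0.32, scale=0.25 / 0.0033, size=30, random_state=rng)
print(lr_test(x, y))
```

The decision rule implements conditions (i) and (ii) exactly as stated above; the unadjusted chi-square point can later be replaced by the small-sample critical values of Section 5.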
5. VALIDITY OF THE CHI-SQUARE APPROXIMATION
We now assume that the N=32 observations from the initial sampling
episode follow a gamma distribution with parameters k and λ. We also
assume that a hypothetical number N̄ of observations to be made during
the interim sampling episode would also follow a gamma distribution with
the same index parameter k but with a possibly different scale parameter
λ̄. Since the mean of the gamma distribution is μ = k/λ, it follows
that the parameter ρ = μ̄/μ introduced in the previous section is simply
the ratio ρ = λ/λ̄ of the scale parameters. Also, the full parameter space
consists of the three parameters (k, λ, λ̄), while the restricted parameter
space (ρ = 0.3) is two-dimensional.
The adequacy of the chi-square approximation was checked by simulating
the test 10,000 times under the null hypothesis for various second sample
sizes N̄. The realized significance levels at the nominal (target) levels
of 0.05 and 0.10 were as follows:
              Target Level
  N̄          .05       .10
10	.0471	.0924
15	.0486	.0939
20	.0513	.0983
30	.0517	.100
40	.0576	.111
50	.0557	.109
60	.0554	.110
70	.0624	.117
80	.0573	.114
90	.0565	.113
100	.0605	.116
200	.0605	.113
With 10,000 replications, the simulation error is about ±0.004 for the 0.05
target level and about ±0.006 for the 0.10 target level, both with 95 percent
confidence. Overall, the chi-square provides an excellent approximation,
although there is a general tendency for the realized level to rise from
slightly below the targeted level to slightly above the targeted level as N̄
increases from 10 to 200.
Although the approximation here is sufficiently accurate that no
correction is really called for, Table 1 shows how the chi-square critical
value could be adjusted to better achieve the targeted significance level.
Table 1a gives the actual number of rejections out of 10,000 replications for
various tentative critical values. With a target level of 0.05 we would hope
to see 500 rejections out of the 10,000 replications, and these are
indicated by an asterisk. The simulation error is such that the asterisk
could move up or down by about four rows in the Table. The critical level
corresponding to each asterisk (i.e., to each value of N̄) can be read from
the first column. These are recorded in Table 1b. Finally, a formula that
crudely approximates the relation in Table 1b was found to be

    0.05 Critical Value = 2.99 - 4/N̄ .                               (4)
TABLE 1

Table 1a. Rejection frequencies out of 10,000 replications.

                            Size of Second Sample (N̄)
Critical
 Value    10   15   20   30   40   50   60   70   80   90  100  200
 2.56    523  533  552  554  621  616  606  664  623  637  655  671
 2.59    509* 521  541  546  613  600  594  657  611  621  644  654
 2.62    489  515  530  537  598  582  581  646  596  602  630  633
 2.66    481  504* 525  531  583  574  572  638  584  589  621  624
 2.69    475  493  516  520  577  563  558  628  577  571  611  611
 2.72    464  482  507  516  570  553  550  622  565  561  596  597
 2.76    452  471  503* 506  560  543  535  612  563  550  589  580
 2.79    440  465  489  497* 552  537  527  598  554  536  571  572
 2.82    428  453  482  483  542  529  521  586  548  530  560  565
 2.86    421  442  471  474  529  522  514  575  536  517  549  558
 2.89    414  431  463  459  517  514  505* 566  529  506* 538  549
 2.92    407  423  448  451  504* 500* 490  558  516  491  525  539
 2.96    398  412  444  443  489  492  482  542  506  484  516  526
 2.99    390  399  426  434  477  482  470  537  499* 469  502* 509
 3.03    377  393  421  425  464  473  465  518  489  456  488  498*
 3.06    371  383  416  415  453  465  458  504* 479  447  475  486
 3.10    364  372  408  404  441  453  453  493  470  440  469  479

Table 1b. Empirical 0.05 critical values for various values of N̄.

   N̄      Critical Value
   10         2.59
   15         2.66
   20         2.76
   30         2.79
   40         2.92
   50+        2.99
TABLE 2

Table 2a. Rejection frequencies out of 10,000 replications.

                            Size of Second Sample (N̄)
Critical
 Value     10    15    20    30    40    50    60    70    80    90   100
 1.49    1017  1038  1069  1106  1221  1232  1215  1274  1246  1243  1278
 1.51    1006* 1025  1050  1083  1199  1205  1190  1253  1227  1225  1261
 1.54     993  1004* 1041  1065  1173  1181  1169  1242  1201  1204  1246
 1.56     971   987  1023  1050  1157  1165  1156  1227  1185  1181  1221
 1.59     950   972  1008  1035  1142  1139  1140  1210  1170  1163  1198
 1.61     942   960   998* 1018  1127  1120  1116  1194  1159  1146  1183
 1.64     926   943   987  1006* 1111  1097  1097  1176  1140  1128  1165
 1.66     908   926   969   989  1087  1069  1080  1160  1124  1109  1147
 1.69     894   911   945   971  1073  1046  1057  1142  1108  1087  1129
 1.72     876   898   930   959  1062  1029  1036  1124  1090  1063  1110
 1.74     861   880   911   941  1047  1015  1011  1102  1066  1051  1088
 1.77     846   872   900   925  1031   995*  991* 1088  1054  1027  1071
 1.80     835   848   881   900  1005*  978   971  1069  1033  1010  1061
 1.82     819   839   872   887   986   970   955  1048  1015   997* 1042
 1.85     803   825   857   875   967   945   943  1026   992*  983  1016
 1.88     787   813   840   857   946   927   929  1002*  976   967   999
 1.90     772   799   827   842   929   916   919   985   960   954   988

Table 2b. Empirical 0.10 critical values for various values of N̄.

   N̄      Critical Value
   10         1.51
   15         1.54
   20         1.61
   30         1.64
   40+        1.85
Although (4) is rather ad hoc, it is convenient in computer programs and was
found to be excellent in achieving the target significance level of α = 0.05
(see the next Section). It should be emphasized that (4) has been validated
only for Tyson's East Lagoon, i.e., N=32 and k = 0.325; see equation (2).
Table 2 repeats the derivation but for the target level of 0.10. An
approximate formula for the critical value is

    0.10 Critical Value = 1.85 - 4/N̄ .                               (5)
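For convenience, the two crude formulas can be wrapped up and compared with the asymptotic chi-square points χ²(1, 2α) that they adjust; the small sketch below is our own wrapper, not part of the original computations.

```python
# Adjusted critical values from formulas (4) and (5), compared with the
# asymptotic chi-square points chi2(1, 2*alpha) that they replace.
from scipy import stats

def adjusted_critical_value(n_bar, alpha):
    # empirical formulas from the text; validated only for Tyson's East
    # (N = 32, k = 0.325)
    if alpha == 0.05:
        return 2.99 - 4.0 / n_bar          # formula (4)
    if alpha == 0.10:
        return 1.85 - 4.0 / n_bar          # formula (5)
    raise ValueError("formulas are given only for alpha = 0.05 and 0.10")

for n_bar in (10, 20, 30, 50, 100):
    print(n_bar,
          round(adjusted_critical_value(n_bar, 0.05), 3), round(stats.chi2.ppf(0.90, 1), 3),
          round(adjusted_critical_value(n_bar, 0.10), 3), round(stats.chi2.ppf(0.80, 1), 3))
```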
6. OPERATING CHARACTERISTICS FOR THE LR TEST
Simulation was used to determine the power of the proposed test at the
0.05 significance level and again at the 0.10 level. Technical details
can be found in Appendix A. Results are shown in Table 3 and Table 4.
The first row in each Table corresponds to the null value ρ = 0.3. The
close match between the values tabulated in the first row and the target
significance level is indicative of the accuracy of the formulas (4) and
(5) for the critical values. Figure 6 shows the power curve when
α = 0.05 and N̄ = 30, and Figure 7 shows the power for the alternative
ρ = 0.5 as a function of the second sample size.
As might be expected with an initial sample size of N=32, the power
of the test is not particularly high. For example, if the vacuum
extraction technology is operating at only 40 percent efficiency (ρ = 0.6)
instead of the desired 70 percent, and if the second sample size is N̄ = 30,
then the test detects this fact with a .45 probability at the α = .05
level and with a .605 probability at the α = .10 level. It should also be
noticed that the power tends to level off once the second sample size
reaches the N̄ = 30 or N̄ = 40 value (Figure 7). Additional sampling during
the second episode produces very little return in terms of the power.
(Note that N̄ refers to the number of cores, not the number of soil
samples.) Of course, this is because no amount of sampling during the
second episode
TABLE 3. Power of the likelihood ratio test at the 0.05 significance level.

Alternative                  Second Sample Size (N̄)
    ρ        10     20     30     40     50     60     75    100
   .30     .048   .050   .051   .049   .052   .049   .052   .050
   .35     .082   .089   .094   .101   .097   .098   .102   .106
   .40     .121   .138   .152   .166   .163   .167   .177   .181
   .45     .167   .198   .222   .239   .248   .256   .266   .273
   .50     .215   .265   .299   .320   .337   .344   .361   .374
   .55     .263   .333   .376   .400   .428   .437   .462   .480
   .60     .312   .397   .450   .479   .516   .527   .553   .570
   .65     .358   .457   .517   .552   .596   .608   .637   .658
   .70     .399   .512   .580   .621   .663   .680   .708   .733
   .75     .440   .564   .636   .679   .725   .743   .768   .794
   .80     .479   .611   .686   .733   .777   .796   .821   .843
   .85     .515   .654   .729   .779   .815   .839   .863   .882
   .90     .548   .691   .768   .819   .854   .874   .893   .913
TABLE 4. Power of the likelihood ratio test at the 0.10 significance level.

Alternative                  Second Sample Size (N̄)
    ρ        10     20     30     40     50     60     75    100
   .30     .104   .106   .098   .101   .100   .105   .102   .107
   .35     .158   .167   .173   .177   .180   .188   .190   .197
   .40     .213   .239   .260   .269   .279   .293   .292   .306
   .45     .271   .319   .351   .370   .387   .398   .398   .419
   .50     .331   .400   .437   .463   .486   .499   .510   .530
   .55     .388   .477   .524   .549   .578   .597   .609   .640
   .60     .441   .547   .605   .628   .659   .680   .695   .724
   .65     .491   .606   .674   .699   .729   .750   .769   .800
   .70     .535   .662   .728   .759   .786   .809   .827   .856
   .75     .574   .707   .774   .808   .835   .854   .871   .897
   .80     .610   .747   .816   .848   .877   .886   .905   .929
   .85     .643   .785   .850   .880   .905   .916   .930   .948
   .90     .674   .812   .878   .905   .926   .933   .951   .963
FIGURE 6. Power curve of the likelihood ratio test versus the alternative
ρ, for α = 0.05, N = 32, and second sample size N̄ = 30.

FIGURE 7. Power of the likelihood ratio test versus the second sample size
(20 to 100), for α = 0.05, ρ = 0.5, and N = 32.

can overcome the baseline noise that results from only N=32 observations
in the initial episode.
Sections 5 and 6 have assumed that both sampling episodes can be
summarized by gamma distributions with the same index parameter k. In
order to carry out the simulations, it is necessary to have a definite
numerical value for k during the second episode, and with no data yet
available there is little choice except to assume that k will have the
same value as during the first episode.
In fact there are many factors that might cause k to change at the
second episode (e.g., different sampling protocols or different choice of
depth levels for the physical samples). Accordingly, once the
second-episode data are in hand, the likelihood ratio test should not be
performed without first checking that k is the same for the two episodes.
If there is any indication that k has changed, then there is no
difficulty in modifying the likelihood ratio test to take this into
account. The computations must be done by computer anyway, so the test
with unequal k is basically no more difficult than that with equal k.
It simply means that the performance of the test may be somewhat different
from the expected performance as indicated in Table 3 and Table 4.
7. NONPARAMETRIC TESTS
The previous sections have described a parametric strategy which
consists of using the available data to identify a parametric distributional
model and applying the likelihood ratio procedure for hypothesis testing.
Such an approach suffers from several disadvantages. First, the method
requires considerable human effort as well as a good deal of computer time.
More importantly, the method may not be robust to departures from the
supposed distributional model. Such departures, which may be inevitable in
the face of small sample sizes, can result in a loss of efficiency (lowered
power curve) and even a loss of validity (incorrect significance level). In
this section, we consider the application of nonparametric (rank) tests to
evaluating the attainment of cleanup standards. Besides guaranteeing
validity, these tests often have good power characteristics compared with
parametric competitors provided that one applies an appropriate rank
procedure. As it turns out there are systematic techniques available for
selecting an appropriate rank test in light of the data.
First we consider the situations in which nonparametric tests might be
applied. In general, one can identify at least three classes of cleanup
standards:
(1)	numeric or risk-based standards;
(2)	background standard in which the remediated site is compared with a
supposedly clean region; and
(3)	interim standards of the type applicable at Tyson's Site.
Numeric standards call for a one-sample test while the other two types of
standards involve two-sample tests. The parametric approach described
previously can be applied in any of these three situations provided one is
able to identify a satisfactory parametric model. But, for the most part,
rank tests can only be applied to two-sample problems. It is true that there
are one-sample rank procedures (for example, the Wilcoxon signed rank test),
but these depend heavily on assumptions such as symmetry that are unlikely to
be satisfied in practice.
The null hypothesis for a two-sample rank test always asserts (or
includes) equality of the two distributions, and some adaptation is required to
handle hypotheses like those encountered at Tyson's Site:

    H₀: μ̄ ≤ 0.3 μ
    Hₐ: μ̄ > 0.3 μ .

Here one need only multiply the observations in the first sample by 0.3 and
then do a one-sided test for equal means. Henceforth, we routinely assume
that such adjustments have been made.
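A two-line illustration of this adjustment, with made-up numbers (any two-sample rank test can then be applied to the adjusted data):

```python
# Hypothetical illustration: rescale the baseline sample by the null ratio so
# that "interim standard exactly met" corresponds to equality of distributions.
import numpy as np

baseline = np.array([310.0, 95.0, 480.0, 220.0])    # made-up baseline values (ppk-ft)
interim = np.array([70.0, 40.0, 150.0])             # made-up interim values (ppk-ft)

adjusted_baseline = 0.3 * baseline                   # null: interim ~ 0.3 * baseline
```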
In what follows, we first consider two specific rank tests (Wilcoxon
test and Savage test) and compare their power performance with that of the
likelihood ratio test in the context of Tyson's Site. Then we take up the
general question of linear rank tests and describe how the available data can
be used to select an optimal such test (asymptotically locally most
powerful). For Tyson's Site, the optimal linear rank test is given
approximately by the Savage procedure.
8. WILCOXON AND SAVAGE TESTS
The Wilcoxon two-sample test is the best known of the rank procedures
and can be found in any nonparametrics textbook. It is a minor disgrace that
the Savage test does not receive similarly wide treatment. Brief discussion
of the test is included in Hajek (1969), Hajek and Sidak (1967) and Lehmann
(1975) and a fuller treatment is given by Hsieh (1988). A censored form of
the test has also received wide treatment in the survival analysis literature
(see Kalbfleisch and Prentice, 1980).
We begin by illustrating the procedures with two entirely hypothetical
samples:

    Sample 1:  22, 29, 35, 42
    Sample 2:  18*, 36*, 45*
where we use a star to designate observations from the second sample. The
first step in any rank procedure is to pool the two data sets, then arrange
the combined data in increasing order and replace observations by increasing
ranks, all the while retaining the stars:
    Pooled data:  18*, 22, 29, 35, 36*, 42, 45*
    Ranks:         1*,  2,  3,  4,  5*,  6,  7* .

The Wilcoxon test statistic is the sum of the starred ranks,

    T = 1* + 5* + 7* ,

and rejects H₀ when T is large.
The Savage test statistic is slightly more complicated. First, consider
the harmonic series in reverse order, starting with the reciprocal of the
combined sample size,

    1/7 + 1/6 + 1/5 + 1/4 + 1/3 + 1/2 + 1/1 ,

and compute the partial sums:

    a₁ = 1/7
    a₂ = 1/7 + 1/6
    a₃ = 1/7 + 1/6 + 1/5
    a₄ = 1/7 + 1/6 + 1/5 + 1/4
    a₅ = 1/7 + 1/6 + 1/5 + 1/4 + 1/3
    a₆ = 1/7 + 1/6 + 1/5 + 1/4 + 1/3 + 1/2
    a₇ = 1/7 + 1/6 + 1/5 + 1/4 + 1/3 + 1/2 + 1/1 .

These partial sums are indexed by the possible ranks (1 through 7) and are
known as the Savage scores. The Savage test statistic is the sum of the
starred Savage scores,

    T = a₁ + a₅ + a₇ ,

and rejects H₀ when T is large.
Exact critical values for the Wilcoxon test are tabulated in Wilcoxon et
al. (1970) for sample sizes as large as 50. The only tables we have seen
for the Savage test are in Hajek (1969), and these are only for sample sizes up
to 10. However, approximate critical values can be obtained for each test
using suitable normal approximations (see Hajek, 1969).
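A minimal sketch of the two statistics for the hypothetical samples above; the variable names and the use of scipy's rank routine are our own.

```python
# Wilcoxon and Savage statistics for the small worked example.  Sample 2
# observations are the "starred" ones; M is the pooled sample size.
import numpy as np
from scipy import stats

sample1 = np.array([22.0, 29.0, 35.0, 42.0])
sample2 = np.array([18.0, 36.0, 45.0])

pooled = np.concatenate([sample1, sample2])
ranks = stats.rankdata(pooled)                     # ranks 1..M (no ties here)
ranks2 = ranks[len(sample1):]                      # ranks of the second sample: 1, 5, 7

M = len(pooled)
wilcoxon_T = ranks2.sum()                          # 1 + 5 + 7 = 13

# Savage scores: a_i = 1/M + 1/(M-1) + ... + 1/(M-i+1)
harmonic = 1.0 / np.arange(M, 0, -1)               # 1/7, 1/6, ..., 1/1
savage_scores = np.cumsum(harmonic)                # a_1, ..., a_M
savage_T = savage_scores[ranks2.astype(int) - 1].sum()

print(wilcoxon_T, savage_T)
```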
We have used simulation with 10,000 replications to obtain the power
curves for the two tests assuming both samples come from gamma distributions
with index parameter k=.32 as suggested by the first sampling episode at
Tyson's Site. The results are shown in Table 5. Approximate critical points
were obtained from the normal approximations; these critical points could
have been, but were not, adjusted empirically as described in Section 5.
Examination of the first two rows of Table 5, where the simulation error is
±.004, indicates that the normal approximation is quite reasonable for the
Wilcoxon test. While the normal approximation is also satisfactory for the
Savage test, the rejection frequencies appear to depart systematically
from the nominal .05 level depending on the second sample size.
Power curves are plotted in Figure 8 for the Savage, Wilcoxon, and
likelihood ratio tests. The Savage test shows a substantial power gain
over the Wilcoxon test — a gain that would certainly justify the tiny bit
of additional arithmetic involved in using the Savage test instead of the
Wilcoxon test. The Savage test is less powerful than the likelihood ratio
test, particularly for non-local alternatives, but a rank procedure cannot be
expected to match the performance of a parametric procedure under matching
distributional assumptions. A central question — not addressed here — is
the degradation of the power and validity of the likelihood ratio test that
might result from plausible departures from the assumed gamma distributions.

TABLE 5. Power of the Savage and the Wilcoxon tests at the 0.05 significance
level. For each alternative, the first row gives the power of the Savage
test, while the second row gives the power of the Wilcoxon test.

Alternative                  Second Sample Size (N̄)
    ρ        10     20     30     40     50     60     75    100
   .30     .053   .058   .053   .049   .049   .043   .043   .044
           .046   .052   .049   .045   .049   .049   .048   .052
   .35     .086   .094   .097   .088   .090   .085   .090   .088
           .067   .079   .081   .076   .081   .085   .086   .089
   .40     .127   .141   .148   .146   .148   .151   .154   .149
           .090   .111   .118   .115   .126   .128   .131   .135
   .45     .172   .195   .209   .212   .219   .224   .223   .227
           .116   .149   .157   .162   .177   .176   .185   .188
   .50     .215   .250   .274   .283   .296   .298   .302   .314
           .142   .186   .203   .209   .232   .230   .246   .252
   .55     .257   .308   .339   .355   .373   .382   .389   .404
           .170   .219   .245   .261   .281   .289   .304   .318
   .60     .300   .363   .405   .427   .444   .457   .469   .485
           .192   .254   .291   .313   .333   .346   .364   .379
   .65     .336   .415   .467   .498   .511   .528   .548   .566
           .219   .293   .334   .360   .383   .397   .419   .441
   .70     .372   .467   .521   .562   .577   .594   .626   .643
           .245   .325   .376   .407   .436   .453   .476   .500
   .75     .410   .512   .575   .618   .638   .657   .691   .708
           .269   .361   .415   .451   .481   .500   .530   .556
   .80     .443   .554   .624   .668   .692   .714   .741   .763
           .292   .397   .455   .497   .526   .547   .582   .608
   .85     .476   .597   .672   .714   .736   .763   .783   .812
           .315   .426   .490   .535   .567   .591   .627   .656
   .90     .505   .634   .710   .750   .777   .805   .826   .850
           .337   .457   .527   .573   .604   .632   .667   .698
FIGURE 8. Power curves of the likelihood ratio, Savage, and Wilcoxon tests
versus the alternative ρ.

9. LINEAR RANK PROCEDURES
We have just seen that the Savage test is more powerful than the
Wilcoxon test for the Tyson's Site data. How could one have known this
beforehand and, more specifically, how can one identify an "optimal" rank
procedure in light of the available data? The answer is that one needs to
identify a statistical distribution that approximately describes the data,
and then the scores for the optimal rank procedure can be calculated in
terms of this distribution. This sounds suspiciously like the parametric
approach in that we need to model the data. The difference is that the
validity of the rank test is assured in any case and the performance
(power) of the rank test is relatively insensitive to correct
identification of the underlying distribution. As will be seen below, this
insensitivity can be employed to advantage by replacing the distribution
with a nearby distribution that leads to a computationally more convenient
rank procedure.
For the general linear rank test, one needs an increasing set of
"scores,"

    a₁ ≤ a₂ ≤ ... ≤ a_M ,

where M is the pooled sample size. As in the Savage test, subscripts
corresponding to the ranks of the second sample are given stars, and the
test statistic is

    T = sum of starred scores.
The null hypothesis is rejected when T is sufficiently large, and
approximate critical values can be obtained from normal approximations
(see Hajek and Sidak, 1967, or Hettmansperger, 1984, for example).
Furthermore, for (i) a given data distribution (under H₀) and (ii) a given
class of alternatives, there exists a "best" set of scores in the sense of
maximizing the local asymptotic power (Hajek and Sidak, 1967; Serfling,
1980, Chapt. 9; Hettmansperger, 1984. Chapter 1 of Behnen and Neuhaus,
1989, contains a succinct summary.)
For item (i), let f(x) and F(x) denote the pdf and cdf
corresponding to the data distribution. Thus, at Tyson's Site, f(x) and
F(x) are given approximately by a gamma distribution with index parameter
k=0.32 . For item (ii), we consider scale slippage alternatives. This
means that the contamination levels after one year, Y, are related to
preremediation levels, X, through the equation Y=cX where c is a
constant. Of course, this is only an approximation since c will vary
across the site depending on depth, soil type, distance from extraction
well, etc. With the above notation, define the function
    φ(u) = -1 - F⁻¹(u) · f′(F⁻¹(u)) / f(F⁻¹(u)) ,        0 < u < 1 ,

where F⁻¹ denotes the inverse of the cdf F; for the gamma model this is
the inverse of the incomplete gamma function, and the score function
reduces to φ(u) = λ F⁻¹(u) - k.
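As a small illustration of this recipe, the sketch below evaluates such "generalized Savage" scores from a fitted gamma model through its inverse cdf. The reduced score function λF⁻¹(u) - k and the shortcut a_i ≈ φ(i/(M+1)) are assumptions made here for illustration; they are not a transcription of the original study's construction.

```python
# Hedged sketch: scores for a linear rank test derived from a fitted gamma
# model.  The score function lam*F^{-1}(u) - k and the approximation
# a_i ~ phi(i/(M+1)) are illustrative assumptions, not the authors' code.
import numpy as np
from scipy import stats

def generalized_savage_scores(M, k=0.32, lam=1.0):
    u = np.arange(1, M + 1) / (M + 1.0)              # approximate plotting positions
    x = stats.gamma.ppf(u, k, scale=1.0 / lam)       # inverse incomplete gamma function
    return lam * x - k                               # increasing scores a_1 <= ... <= a_M

print(generalized_savage_scores(7))                  # rank tests are unaffected by lam
```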
Jr. and N. Phillip Rosa of EPA Statistical Policy Branch for their support
and encouragement. We are also thankful to our colleagues Nicholas Bolgiano
and Marilyn T. Boswell for their interest and several technical discussions.
REFERENCES
Abramowitz, M., and Stegun, I. A. (1972). Handbook of Mathematical
Functions. Dover.

Ahrens, J. H., and Dieter, U. (1974). Computer methods for sampling from
the beta, Poisson, and gamma distributions. Computing, 12, 233-246.

Behnen, K., and Neuhaus, G. (1989). Rank Tests with Estimated Scores and
Their Application. Teubner, Stuttgart.

Bowman, K. O., and Shenton, L. R. (1977). Maximum Likelihood Estimation
in Small Samples. MacMillan.

Bowman, K. O., and Shenton, L. R. (1981). Moment (√b₁, b₂) Techniques.
Technical Report ORNL/CSD-83, Oak Ridge National Laboratory.

EPA (1989). Methods for Evaluating the Attainment of Cleanup Standards.
Volume 1: Soils and Solid Media. Statistical Policy Branch (PM-223),
U.S. EPA, Washington, DC.

Greenwood, J. A., and Durand, D. (1960). Aids for fitting the gamma
distribution by maximum likelihood. Technometrics, 2, 55-65.

Hajek, J. (1969). A Course in Nonparametric Statistics. Holden-Day, San
Francisco.

Hajek, J., and Sidak, Z. (1967). Theory of Rank Tests. Academic Press,
New York.

Hettmansperger, T. P. (1984). Statistical Inference Based on Ranks.
Wiley, New York.

Hsieh, H. K. (1988). Savage test. In Encyclopedia of Statistical
Sciences, Vol. 8, S. Kotz and N. L. Johnson, eds. Wiley, New York.

Kalbfleisch, J. D., and Prentice, R. L. (1980). The Statistical Analysis
of Failure Time Data. Wiley, New York.

Lehmann, E. L. (1975). Nonparametrics: Statistical Methods Based on
Ranks. Holden-Day, San Francisco.

Lehmann, E. L. (1986). Testing Statistical Hypotheses. 2nd Ed. Wiley,
New York.

Resources (1988). Newsletter published by the Environmental Resources
Management Group. Volume 10, No. 3. (Summer Issue).

Serfling, R. J. (1980). Approximation Theorems of Mathematical
Statistics. Wiley, New York.

Shenton, L. R., and Bowman, K. O. (1977). A bivariate model for the
distribution of √b₁ and b₂. J. Amer. Statist. Assoc., 72, 206-211.

Shiue, W. K., and Bain, L. J. (1983). A two-sample test of equal gamma
distribution parameters with unknown common shape parameter.
Technometrics, 25, 377-381.

Shiue, W. K., Bain, L. J., and Engelhardt, M. (1988). Test of equal
gamma-distribution means with unknown and unequal shape parameters.
Technometrics, 30, 169-174.

Wilcoxon, F., Katti, S., and Wilcox, R. (1970). Critical values and
probability levels for the Wilcoxon rank sum test and the Wilcoxon
signed rank test. In Selected Tables in Mathematical Statistics,
Vol. 1, H. L. Harter and D. B. Owen, eds. Markham, Chicago, pp. 171-259.

Woinsky, M. N. (1972). A composite nonparametric test for a scale
slippage alternative. Ann. Math. Statist., 43, 65-73.
APPENDIX A
ONE-SAMPLE AND TWO-SAMPLE GAMMA MODELS
This Appendix provides the technical background for the main body of
the paper. Section A.l reviews maximum likelihood estimation for the
one-sample gamma model. This is the procedure used to fit the gamma
distribution to the data from the first sampling episode at Tyson's site.
Obtaining the likelihood estimates reduces to solving a single
transcendental equation, which we refer to as the Greenwood-Durand
equation since Greenwood and Durand (1960) gave rational Chebychev
approximations to the solution for a certain range of the sufficient
statistics. In Section A.2, we extend their solution to all values of the
sufficient statistics. The remaining sections of the Appendix discuss
likelihood estimation and likelihood-ratio hypothesis testing for the
two-sample gamma model. In particular, we show that the likelihood
equations for both the full and restricted models reduce to the
Greenwood-Durand equation.
Some approximate but non-LR procedures for the two-sample gamma
problem can be found in Woinsky (1972), Shiue and Bain (1983), and Shiue
et al (1988). Also, a uniformly most powerful unbiased (UMPU) test exists
for this problem (N. Nagaraj, personal communication), as follows directly
from general results on multiparameter exponential families (Lehmann,
1986, pp. 145-150). But small sample implementation of the UMPU test is
very complicated. Since we look upon Tyson's Site and the two-sample
gamma problem as simply an example, we have preferred general approaches,
such as the likelihood ratio, which would be applicable in a wide variety
of distributional settings.
A.1 ONE SAMPLE GAMMA MODEL
Let X₁, X₂, ..., X_N be a random sample from the gamma distribution,
Gamma(k, λ), with density function given by

    f(x; k, λ) = λ^k x^(k-1) exp(-λx) / Γ(k) ,     x > 0 .          (A1)

Here, k > 0 is the index parameter and λ > 0 is the scale (intensity)
parameter. The mean of the distribution is

    μ = k/λ .                                                        (A2)

The sufficient statistics are

    SX  = Σ Xᵢ                                                       (A3)
    LPX = ln(Π Xᵢ) = Σ ln(Xᵢ) .                                      (A4)
The subroutine GA.MSITF calculates these sufficient statistics.
Alternatively, SX can be replaced by the arithmetic mean

    AM = SX/N .                                                      (A5)

The maximum likelihood estimates, k̂ and λ̂, are obtained as the solutions
of the following equations,

    ψ(k) - ln(k) + ln(AM) - LPX/N = 0                                (A6)
    k/λ = AM ,                                                       (A7)

where ψ(·) denotes the digamma function. The procedure is to solve the
first equation for k and then the second equation for λ. Note that the
second equation equates population and sample means.
A.2 GREENWOOD-DURAND EQUATION
If we write t = ln(AM) - LPX/N, then t can be calculated from the
data and equation (A6) becomes

    ψ(k) - ln(k) + t = 0 .                                           (A8)

Greenwood and Durand (1960) have obtained the following rational Chebychev
approximations for the solution of equation (A8):

    k̂ = (a t² + b t + c) / t ,                    0 < t < .577216    (A9)

    k̂ = (d t² + e t + f) / [t (t² + g t + h)] ,   .577216 ≤ t ≤ 17 , (A10)

where

    a = -.0544274        e =  9.05995
    b =  .1648852        f =  8.898919
    c =  .5000876        g = 11.968477
    d =  .9775373        h = 17.79728                                (A11)

The maximum error in (A9) is 0.016 percent while that in (A10) is
0.011 percent (exactly twice as large as the maximum errors claimed by
Greenwood and Durand). These approximations are discussed in Bowman and
Shenton (1977) but, unfortunately, with a typographical error in the
coefficients. We have independently verified that the original
Greenwood-Durand coefficients, as given in (All), are indeed correct.
The range t ≤ 17 corresponds to k ≥ .0513, which covers most
situations of practical data-analytic interest. But for a large-scale
simulation of the type undertaken here, there is a reasonable chance
that the limit t ≤ 17 would occasionally be violated. The approximate
solution has accordingly been extended to

    1/k̂ = T + [ln(T) - 2γ + ζ(2)/T - ζ(3)/T²] / [1 - 1/T + ζ(2)/T²] ,   t > 17 ,   (A12)

where

    T    = t + γ
    γ    = 0.57721 56649    (Euler's constant)
    ζ(2) = 1.64493 40668    (zeta function)
    ζ(3) = 1.20205 69032 .

The approximation is accurate to at least 0.013% and the accuracy increases
as t increases. The subroutine GAMFIT uses these approximations to
solve the Greenwood-Durand equation.
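The following sketch combines approximations (A9), (A10), and (A12) into a small solver of the kind that GAMFIT represents; the function names are ours, and the routine is offered as an illustration rather than a transcription of that subroutine.

```python
# Sketch of a Greenwood-Durand solver: given a sample, compute
# t = ln(AM) - LPX/N and invert psi(k) - ln(k) + t = 0 using (A9), (A10), (A12).
import numpy as np

GAMMA_EULER = 0.5772156649
ZETA2 = 1.6449340668
ZETA3 = 1.2020569032

def greenwood_durand_k(t):
    """Approximate solution k of psi(k) - ln(k) + t = 0 for t > 0."""
    if t < 0.577216:                                   # (A9)
        return (-0.0544274 * t * t + 0.1648852 * t + 0.5000876) / t
    if t <= 17.0:                                      # (A10)
        num = 0.9775373 * t * t + 9.05995 * t + 8.898919
        den = t * (t * t + 11.968477 * t + 17.79728)
        return num / den
    T = t + GAMMA_EULER                                # (A12), extension for t > 17
    num = np.log(T) - 2.0 * GAMMA_EULER + ZETA2 / T - ZETA3 / T ** 2
    den = 1.0 - 1.0 / T + ZETA2 / T ** 2
    return 1.0 / (T + num / den)

def gamma_mle(x):
    """Maximum likelihood estimates (k, lambda) for a one-sample gamma fit."""
    x = np.asarray(x, dtype=float)
    am = x.mean()                                      # AM, equation (A5)
    t = np.log(am) - np.log(x).mean()                  # t = ln(AM) - LPX/N
    k = greenwood_durand_k(t)
    return k, k / am                                   # lambda = k/AM, equation (A7)
```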
The rest of this section is devoted to the derivation of (A12). The
overall strategy is to obtain the crude approximation

    k ≈ 1/(t + γ) = 1/T                                              (A13)

(t large, k small) and use this as the starting value in a one-step Newton
iteration, which yields (A12) after some further approximations. We need
the Taylor expansion of ψ(k) for small k (Abramowitz and Stegun,
1972, formula 6.3.14):

    ψ(k) = -1/k + ψ(1+k)
         = -1/k - γ + ζ(2)k - ζ(3)k² + ... .                         (A14)

Retaining the first term and substituting for ψ(k) in the
Greenwood-Durand equation (A8) gives

    -1/k - ln(k) + t = 0 .

Since 1/k dominates ln(k) for small k, we initially ignore ln(k)
and solve to obtain k ≈ 1/t as a first approximation. Numerical
investigation shows that (A13) is a somewhat improved approximation
(accurate to about 10% when t=17). Next let u = 1/k and

    F(u) = -ψ(1/u) + ln(1/u)                                         (A15)
         = -ψ(k) + ln(k) ,

so that the Greenwood-Durand equation becomes

    F(u) = t ,     u = 1/k .                                         (A16)

Applying a Newton iteration to (A16) with u₀ as starting value gives

    F(u₀) + F′(u₀)(u - u₀) = t

or

    u = u₀ + [t - F(u₀)] / F′(u₀) .                                  (A17)

But, from (A14) and (A15),

    F(u) ≈ -ln(u) + u + γ - ζ(2)/u + ζ(3)/u² + ... .                 (A18)

According to (A13), we take u₀ = T and substitute (A18) into (A17), which
gives the desired result when terms are retained to second order.
A.3 TWO-SAMPLE GAMMA MODEL

Suppose we have independent samples from separate gamma
distributions,

    X₁, X₂, ..., X_N    from Gamma(k, λ)
    Y₁, Y₂, ..., Y_N̄    from Gamma(k, λ̄) ,

whose index parameters are assumed to be equal (akin to the equal-variance
simplification of the Behrens-Fisher problem). Thinking of the Xs as the
"before" sample, the Ys as the "after" sample, and using overbars to
indicate "after" quantities, we use ρ to denote the ratio of means:
    ρ = μ̄/μ = (k/λ̄) / (k/λ) = λ/λ̄ .                                 (A19)

The hypothesis of interest is

    H₀: ρ ≤ ρ₀
    Hₐ: ρ > ρ₀        (α = significance level) ,

where ρ₀ is a specified number (ρ₀ = 0.3 for the situation of Tyson's
Site).

We use the term "full model" to refer to the case in which ρ is
unrestricted (i.e., λ and λ̄ can vary independently), and the term
"restricted model" for the case in which ρ = ρ₀ (i.e., λ = ρ₀ λ̄). The
likelihood ratio test statistic is given by

    LR = -2 ln[ L̂(restricted) / L̂(full) ]
       =  2 [ l̂(restricted) - l̂(full) ] ,                            (A20)

where L̂(restricted) and L̂(full) are the maximum values of the likelihood
function under the restricted and full models, respectively, and where
l = -ln(L) denotes the negative log-likelihood. For large sample sizes
(N and N̄), LR has approximately a chi-square distribution with one degree
of freedom. Conventional likelihood ratio tests are two-sided and reject
H₀ when

    LR > χ²(1, α) .

To take into account the one-sided nature of the test under consideration,
we have suggested rejecting H₀ when both of the following conditions
hold:

    ρ̂ > ρ₀                                                           (A21a)
    LR > χ²(1, 2α) .                                                 (A21b)

Here ρ̂ is the maximum likelihood estimate of ρ = μ̄/μ = λ/λ̄ under the
full model. Notice that the critical level in (A21b) is 2α instead of
α.
A.4 LIKELIHOOD ESTIMATION FOR THE FULL MODEL

The likelihood function for the two-sample gamma model is given by

    L(k, λ, λ̄) = Π(i=1..N) f(Xᵢ; k, λ) · Π(j=1..N̄) f(Yⱼ; k, λ̄) .

After substituting the pdf from equation (A1) and taking negative
logarithms, this becomes

    l(k, λ, λ̄) = -ln[L(k, λ, λ̄)]
               = (N+N̄) ln Γ(k) - (k-1)[LPX + LPX̄]
                 - N k ln(λ) + λ SX
                 - N̄ k ln(λ̄) + λ̄ SX̄ ,                                (A22)

where SX̄ and LPX̄ denote the sufficient statistics of the second sample.
Setting ∂l/∂λ = 0 and ∂l/∂λ̄ = 0 yields the first estimation equations,

    λ = k/AM                                                         (A23a)
    λ̄ = k/AM̄ ,                                                       (A23b)

so that the fitted means equal the sample means. The final estimation
equation is obtained by setting ∂l/∂k = 0 and using (A23) to eliminate
λ and λ̄; it again has the form of the Greenwood-Durand equation (A8),
now with

    t = [N ln(AM) + N̄ ln(AM̄) - LPX - LPX̄] / (N+N̄) .
A.5 LIKELIHOOD ESTIMATION FOR THE RESTRICTED MODEL

For the restricted model, we have λ̄ = λ/ρ₀ and ∂λ̄/∂λ = 1/ρ₀ . The
negative log-likelihood is given by (A22), and putting ∂l/∂λ = 0 yields

    k/λ = AM* ,                                                      (A24)

where

    AM* = [N/(N+N̄)] AM + [N̄/(N+N̄)] (AM̄/ρ₀) .                        (A25)

The first estimation equations are then

    λ = k/AM*                                                        (A26a)
    λ̄ = λ/ρ₀ .                                                       (A26b)

As before, the final estimation equation is obtained by setting ∂l/∂k = 0
and using (A26) to eliminate λ and λ̄ . The result has the form of the
Greenwood-Durand equation,

    ψ(k) - ln(k) + t = 0 ,                                           (A26c)

where

    t = ln(AM*) + [N̄/(N+N̄)] ln(ρ₀) - [LPX + LPX̄] / (N+N̄) .          (A27)
A.6 MAXIMIZED NEGATIVE LOG-LIKELIHOOD

Evaluation of the likelihood ratio test statistic (A20) requires the
maximized negative log-likelihoods under both the full and restricted
models. These are obtained by substituting the appropriate likelihood
estimates for k, λ, λ̄ into (A22). A common simplified form can be
obtained when we notice that, for both models, the fitted parameters
satisfy

    λ SX + λ̄ SX̄ = (N+N̄) k .

See equations (A23a, b) for the full model and equations (A24, A25) for the
restricted model. Substituting into (A22) gives

    l̂ = (N+N̄) ln Γ(k) + (N+N̄) k - (k-1)[LPX + LPX̄]
        - N k ln(λ) - N̄ k ln(λ̄) ,                                    (A28)

(all parameters evaluated at their likelihood estimates), valid for both
the full and restricted models. The subroutine GAM2LF uses (A28) to
evaluate l̂ once the likelihood estimates have been obtained using GAMFIT.

One final comment may be of interest here. Since the likelihood
estimates of the population means under the full model are the sample
means (A23a, b), it follows that

    ρ̂ = AM̄ / AM

and the rejection condition (A21a) is simply

    AM̄ / AM > ρ₀ .

This is a peculiarity of the gamma distribution; in general ρ̂ would not
be the ratio of sample means in other distributional settings. We also
note that LR depends upon the other sufficient statistics, LPX and LPX̄,
as well as the arithmetic means. Thus, the likelihood ratio test proposed
here uses information beyond that contained in the ratio of arithmetic
means.
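To illustrate how equations (A22) through (A28) fit together, the sketch below profiles out the scale parameters for each trial value of k and minimizes the profile negative log-likelihood (A28) with a one-dimensional search. It mirrors, but is not a copy of, the GAMFIT/GAM2LF computations, and the search bounds on k are an arbitrary assumption.

```python
# Profile-likelihood sketch of the two-sample LR statistic: for each trial k,
# the scale parameters are eliminated via (A23) or (A24)-(A26), and (A28) is
# minimized over k.  The bounds on k are an illustrative assumption.
import numpy as np
from scipy import optimize, special

def neg_loglik_profile(k, x, y, rho0=None):
    N, N_bar = len(x), len(y)
    am, am_bar = np.mean(x), np.mean(y)
    lpx, lpx_bar = np.sum(np.log(x)), np.sum(np.log(y))
    if rho0 is None:                                  # full model: (A23a, b)
        lam, lam_bar = k / am, k / am_bar
    else:                                             # restricted model: (A24)-(A26)
        am_star = (N * am + N_bar * am_bar / rho0) / (N + N_bar)
        lam = k / am_star
        lam_bar = lam / rho0
    return ((N + N_bar) * special.gammaln(k) + (N + N_bar) * k
            - (k - 1.0) * (lpx + lpx_bar)
            - N * k * np.log(lam) - N_bar * k * np.log(lam_bar))     # (A28)

def lr_statistic(x, y, rho0=0.3):
    full = optimize.minimize_scalar(neg_loglik_profile, bounds=(1e-6, 50.0),
                                    args=(x, y), method="bounded")
    restr = optimize.minimize_scalar(neg_loglik_profile, bounds=(1e-6, 50.0),
                                     args=(x, y, rho0), method="bounded")
    LR = 2.0 * (restr.fun - full.fun)
    rho_hat = np.mean(y) / np.mean(x)                 # (A21a) reduces to AM_bar/AM
    return LR, rho_hat
```

The returned LR statistic is then compared with χ²(1, 2α), or with the adjusted critical values of Section 5, together with the condition ρ̂ > ρ₀.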
APPENDIX B
COMPLEMENTS AND POSSIBLE FUTURE WORK
This writeup provides some comments and associated references
supplemental to the paper itself. We also point out some possible
directions for future research. Section 1 is concerned with parametric
approaches while Section 2 deals with nonparametric techniques.
B.l PARAMETRIC METHODS
B.1.1 Other General Purpose Parametric Procedures.
Once a parametric model has been identified, there is always the
option of developing a test protocol specifically designed for and
possibly having some optimality property for that particular model. We
have preferred the likelihood ratio (LR) test because of its
applicability in a wide variety of distributional settings. (Of course,
the implementation is mode I-specific.) Besides the LR test, there are
several other general purpose testing protocols:
(i) Wald-type tests (Wald, 1943; Vaeth, 1985);
(ii) efficient score tests (Rao, 1948; Bera and McKenzie, 1986).
See Buse (1982) and Rayner and Best (1989, Chap. 3) for expository
discussion of these tests. Both are asymptotically equivalent to the LR
test, although Rao (1948) conjectures that the efficient score test may
be locally more powerful than either Wald's test or the LR test (also
see Chandra and Joshi, 1983). The validity of Wald's tests depends on
asymptotic normality and this assumption has to be carefully validated
in each case because it depends not only on the sample size and
distributional model but also on the particular parametrization of the
model. On the other hand, the efficient score and LR tests are both
invariant under parameter transformations, and it appears that the
choice between the two tests can often be made on the grounds of
mathematical convenience. For example, score tests require the likelihood
estimates under H₀ but not under Hₐ; offsetting this, the information
matrix has to be calculated for score tests. We typically find that
score tests are more convenient than LR tests when H₀ and Hₐ differ
by many degrees of freedom (i.e., when there are many parameters under
test). Only one parameter is under test for cleanup evaluation, and
score tests do not seem to offer any major convenience over LR tests in
this context. But, it may be of interest to compare the power of the
two tests — particularly in light of Rao's conjecture.
B.1.2 Gamma-Specific Tests.
Several tests have been developed specifically for the two-sample
gamma problem:
(i) uniformly most powerful unbiased (UMPU) test (N. Nagaraj,
personal communication),
(ii) F-test (Shiue and Bain, 1983);
(iii) z-test (Woinsky, 1972).
The existence of the UMPU test is a simple consequence of a general
theorem for multiparameter exponential families (Lehmann, 1986, pp.
145-150). As discussed in the Appendix to the paper we have not pursued
distribution-specific optimal tests since these tests are rarely
available and our focus was on protocols that could be applied for a
wide variety of parametric models. Even so, it is certainly desirable
to determine the UMPU power curve for the two-sample gamma problem, if
only to compare it with the likelihood ratio power curve. The
difficulty is that the small sample conditional critical values for the
UMPU test are intractable so that approximations would have to be
developed and (the hard part) validated.
The F-test of Shiue and Bain (1983) rests on the fact that, for the
two-sample gamma problem, X̄/Ȳ follows an F-distribution whose degrees
of freedom depend on the unknown index parameter k. Shiue and Bain
give adjustments to the F-critical values that are intended to
approximately correct for the estimation of k by maximum likelihood.
We have not investigated the power of their test as compared with say
the LR-power because the test is so specific to the two-sample gamma
problem and because their approximations appear to be somewhat
unsatisfactory for small k.
Woinsky's z-test is based on the asymptotic normality of log(X/Y).
In this respect, the test is essentially of the Wald-type. Although
motivated by the two-sample gamma problem Woinsky points out that the
asymptotic normality is distribution-free and the test can be regarded
as general purpose, at least asymptotically. For this reason, and
because the test is so easy to apply, it would be very desirable to
assess the normal approximation and to compare the test's small sample
power with that of the LR test.
B.1.3 Robustness of Parametric Procedures.
Although parametric procedures such as LR tests often have
excellent power characteristics, this performance is sometimes achieved
at the price of extreme sensitivity to departures from the
distributional assumptions. It would be instructive to assess this
sensitivity of the LR test for some plausible distributional
departures. The sensitivity of the z-test and of the UMPU test
described in the previous section might be assessed at the same time.
Certainly, the UMPU test is expected to be more sensitive than the LR
test, but it would be useful to document the degree of sensitivity.
This could be a strong argument against the use of optimal tests, if the
sensitivity of the UMPU test is particularly extreme.
For Tyson's Site, a mixture model would appear to be a particularly
plausible alternative to the gamma distribution. A mixture is
compatible with the histogram and could be accounted for physically by
two separate regimes — one of high contamination and the other of lower
contamination.
B.1.4 Goodness-of-Fit.
We did not carry out a formal goodness-of-fit test for the gamma
distribution. This is something that should be done, particularly in
view of the potential sensitivity of parametric procedures to
distributional assumptions. The most straightforward test is Pearson's
X². For all but the simplest models, numerical integration will be
required to obtain the cell probabilities. Also, maximum likelihood
estimators should be obtained from the grouped data, requiring
additional numerical integration. Alternatively, if likelihood
estimates are obtained from the ungrouped data, there is a partial
recovery of degrees of freedom and the test statistic can only be
bounded between a chi-square with K-1-p degrees of freedom and a
chi-square with K-1 degrees of freedom, where K is the number of cells and
p is the number of parameters (Chernoff and Lehmann, 1954). But the
need to group the data and the dependence of the test outcome on the
grouping is the most discomfiting feature of X² tests.
Neyman (1937) has developed a class of goodness-of-fit tests that
avoid the need for grouping. These have come to be called Neyman smooth
tests. The idea is to embed the distribution under test into a larger
parametric family in which the additional parameters are coefficients of
an orthogonal system of polynomials. An efficient scores procedure (see
Section B.1.1) is then applied to test the null hypothesis that all the
additional parameters vanish. As originally developed by Neyman, the
test was applicable to simple hypotheses only (i.e., to goodness-of-fit
tests of distributions that do not involve any unknown parameters). The
procedure has been extended, in several ways, to handle composite
hypotheses. Rayner and Best (1989) give an excellent and up-to-date
survey of the entire area. Unfortunately, their examples generally
involve relatively simple distributions, such as normal and exponential.
It would be useful to develop the Neyman smooth test for the gamma
distribution both to document and assess the computational difficulties
that may be encountered with nontrivial distributional families and to
compare the power of the procedure with that of the X² test. Note
that some work in this direction has been carried out by Bargal (1986),
at least for censored samples.
B.1.5 Bartlett Adjustment.

We have used simulation to validate and correct the critical
value in the likelihood ratio (LR) test. There is also available an
analytical correction, known as the Bartlett (1937) adjustment, which
consists of multiplying the LR test statistic by a constant chosen so
that the modified statistic has the same expected value as the
chi-square distribution. Formulae for the correction factor are quite
involved and can be found in Lawley (1959). More recent work includes
Barndorff-Nielsen and Cox (1984), Barndorff-Nielsen and Hall (1988),
Cordeiro (1983), Jensen (1986), McCullagh and Cox (1986), and Ross
(1987).
There is considerable theoretical and empirical evidence that the
Bartlett adjustment is quite effective, at least for continuously
distributed data. (The adjustment may not result in an improved error
rate for problems involving discrete data; see Frydenberg and Jensen,
1989). Unfortunately, the correction factor can be difficult or
impossible to calculate in many cases (Ross, 1987), which is why we have
preferred simulation as a more generally applicable technique for
adjusting critical levels. However, it should be possible to obtain the
Bartlett adjustment for the particular example of the two-sample gamma
distribution; comparison with the empirical adjustments obtained in the
text would be of interest in documenting the effectiveness of the
Bartlett adjustment.
B.1.6 Sensitivity of Power Curves to Estimated Nuisance Parameters.
In general, power curves depend on the unknown values of the
nuisance parameters in the problem and one has to approximate the true
power curve by using estimates for the nuisance parameters. The power
calculations in Sections 6 and 8 of the paper are no exception. They
depend on the index parameter k and we have used the estimate k=.32
obtained from the pre-remediation data. It would be of interest to
determine how the power curves (and also the critical point adjustments)
vary with k, at least for those k within the range of reasonable
estimation error.
B.2 NONPARAMETRIC METHODS
B.2.1 Adequacy of Normal Approximation for Savage Test.
To our knowledge, the only tables of critical values for the Savage
test are those of Hajek (1969), for sample sizes not exceeding 10.
Hajek and Sidak (1967, p. 151) give a recurrence relation that could in
principle be used to extend the tables to larger sample sizes; but it is
unclear how far the tables could be extended in view of the limitations
of storage and execution time. If the Savage test is to be widely used,
it would be helpful to provide some guidance in regard to the adequacy
of the normal approximation for various combinations of the two sample
sizes. Here it should be noted that Cox (1964) has given an
approximation involving the F-distribution and Bickel (1974) has
indicated how Edgeworth expansions may be used to improve upon the
normal approximation for linear rank statistics. Also, see Does (1983,
1984) and Koning and Does (1988).
B.2.2 Power Curve of Generalized Savage Test.
Section 9 of the paper develops a score generating function
involving the inverse of the incomplete gamma function. The
corresponding rank tests might be referred to as generalized Savage
tests. Inverse incomplete gamma functions are sufficiently complicated
that generalized Savage tests are unlikely to be adopted in routine
practice. But it would be useful, on a one-time basis, to obtain the
power curve for the case k=.32. This would establish an upper bound on
how closely linear rank tests can match the performance of LR tests.
Also, if, as expected, there is little difference between the power of
the Savage test and the generalized Savage test with k=.32, then there
would be considerable justification for the routine use of the Savage
test in situations involving short-tailed J-shaped distributions. It
may also be noted that the case k=1/2 is amenable to calculation since
the inverse incomplete gamma function is then expressible in terms of
the inverse standard normal distribution function. Calculation of the
asymptotic relative efficiency, in the sense of Pitman, suggests that
there is little to be gained by using the linear rank test for k=1/2
instead of k=1. See James (1967) and Woinsky (1972).
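Although the inverse incomplete gamma function is awkward for hand computation, it is readily available in scientific libraries, so the one-time power study suggested above is feasible. The sketch below builds approximate generalized Savage scores under the usual approximate-scores construction, assuming the standard scale-alternative score function, which for a gamma(k) null reduces to phi(u) = F_k^{-1}(u) - k (so that k = 1 recovers the ordinary Savage scores); the values of N and k shown are illustrative.

    import numpy as np
    from scipy.special import gammaincinv   # inverse of the regularized lower incomplete gamma

    def generalized_savage_scores(N, k):
        """Approximate scores a(i) = F_k^{-1}(i/(N+1)) - k for i = 1..N."""
        u = np.arange(1, N + 1) / (N + 1.0)
        return gammaincinv(k, u) - k

    print(generalized_savage_scores(10, 0.32))   # k = .32, the estimate used in the text
    print(generalized_savage_scores(10, 1.0))    # close to -log(1 - u) - 1, the Savage form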
B.2.3 Rank Tests for More Complicated Alternatives.
The theory of Section 9 of the paper applies to the very simple case
of scale slippage alternatives. With only pre-remediation information
available at Tyson's Site and without a detailed engineering
characterization of the remediation process, it was necessary to work
with some such simple and approximate class of alternatives. With
additional information, it might be possible to work with more specific
alternatives and, presumably, to develop more sensitive rank tests. For
example, Gilbert and Simpson (1990) have considered mixture-type
alternatives. In particular, when both samples are available, it may be
feasible to use the data to help identify the class of alternatives.
The book by Behnen and Neuhaus (1989) gives a systematic account of such
data-driven methods.
REFERENCES
Bargal, A. I. (1986). Smooth tests of fit for censored gamma samples.
Commun. Statist. - Theor. Meth., 15, 537-549.
Barndorff-Nielsen, O. E., and Cox, D. R. (1984). Bartlett adjustments
to the likelihood ratio statistic and the distribution of the
maximum likelihood estimator. J. R. Statist. Soc., B, 46, 483-495.
Barndorff-Nielsen, O. E., and Hall, P. (1988). On the level-error after
Bartlett's adjustment of the likelihood ratio statistic.
Biometrika, 75, 374-378.
Bartlett, M. S. (1937). Properties of sufficiency and statistical
tests. Proc. Roy. Soc., A, 160, 268-282.
Behnen, K., and Neuhaus, G. (1989). Rank Tests with Estimated Scores
and Their Application. Teubner, Stuttgart.
Bera, A. K., and McKenzie, C. R. (1986). Tests for normality with
stable alternatives. J. Statist. Comput. Simul., 25, 37-52.
Bickel, P. J. (1974). Edgeworth expansions in nonparametric statistics.
Ann. Statist., 2, 1-20.
Buse, A. (1982). The likelihood-ratio, Wald, and Lagrange multiplier
tests: An expository note. Amer. Statist., 36, 153-157.
Chandra, T. K., and Joshi, S. N. (1983). Comparison of the likelihood
ratio, Rao's and Wald's test and a conjecture of C. R. Rao.
Sankhya, A, 45, 226-246.
Chernoff, H., and Lehmann, E. L. (1954). The use of maximum likelihood
estimates in chi-square tests for goodness of fit. Ann. Math. Statist.,
25, 579-586.
Cordeiro, G. M. (1983). Improved likelihood ratio statistics for
generalized linear models. J. Roy. Statist. Soc., B, 45, 404-413.
Cox, D. R. (1964). Some applications of exponential ordered scores.
J. Roy. Statist. Soc., B, 26, 103-110.
Does, R. J. M. M. (1983). An Edgeworth expansion for simple linear
rank statistics under the null hypothesis. Ann. Statist., 11,
607-624.
Does, R. J. M. M. (1984). The asymptotic behavior of simple linear
rank statistics. Statist. Neerlandica, 38, 109-130.
Frydenberg, M., and Jensen, J. L. (1989). Is the 'improved likelihood
ratio statistic' really improved in the discrete case?
Biometrika, 76, 655-661.
Gilbert, R., and Simpson, J. (1990). Environmental monitoring and
statistical sampling. Paper presented at U.S. EPA Workshop on
Superfund Hazardous Waste: Statistical Issues in Characterizing a
Site, Arlington, VA, February.
Hajek, J. (1969). A Course in Nonparametric Statistics. Holden-Day,
San Francisco.
Hajek, J., and Sidak, Z. (1967). Theory of Rank Tests. Academic Press,
New York.
James, B. R. (1967). On Pitman efficiency of some tests of scale for
the gamma distribution. Proc. Fifth Berkeley Symposium, Vol. V.,
pp. 389-393.
Jensen, J. L. (1986). Similar tests and the standardized log likelihood
ratio statistic. Biometrika, 73, 567-572.
Koning, A. J., and Does, R. J. M. M. (1988). Approximating the
percentage points of simple linear rank statistics with
Cornish-Fisher expansions (Algorithm AS 234). Appl. Statist., 37,
278-284.
Lawley, D. N. (1956). A general method for approximating to the
distribution of likelihood ratio criteria. Biometrika, 43, 295-303.
Lehmann, E. L. (1986). Testing Statistical Hypotheses. 2nd Ed. Wiley,
New York.
McCullagh, P., and Cox, D. R. (1986). Invariants and likelihood ratio
statistics. Ann. Statist., 14, 1419-1430.
Neyman, J. (1937). "Smooth" tests for goodness of fit. Skand.
Aktuarietidskr., 20, 150-199.
Rao, C. R. (1948). Tests of significance in multivariate analysis.
Biometrika, 35, 58-79.
Rayner, J. C. W., and Best, D. J. (1989). Smooth Tests of Goodness of
Fit. Oxford, New York.
Ross, W. H. (1987). The expectation of the likelihood ratio criterion.
Int. Statist. Rev., 55, 315-330.
Shiue, W. K., and Bain, L. J. (1983). A two-sample test of equal gamma
distribution parameters with unknown common shape parameter.
Technometrics , 25, 377-381.
Vaeth, M. (1985). On the use of Wald's test in exponential families.
Int. Statist. Rev., 53, 199-214.
Wald, A. (1943). Tests of statistical hypotheses concerning several
parameters when the number of observations is large. Trans. Amer.
Math. Soc., 54, 426-482.
Wilcoxon, F., Katti, S., and Wilcox, R. (1970). Critical values and
probability levels for the Wilcoxon rank sum test and the Wilcoxon
signed rank test. In Selected Tables in Mathematical Statistics,
Vol. I, H. L. Harter and D. B. Owen, eds. Markham, Chicago, pp. 171-259.
Woinsky, M. N. (1972). A composite nonparametric test for a scale
slippage alternative. Ann. Math. Statist., 43, 65-73.
Session 6: Evaluating the Attainment of Cleanup Standards
DISCUSSION
Herbert Lacayo
U. S. Environmental Protection Agency, PM 223
401 M Street
Washington, DC 20460
You said something that interests me very much: the business of the consent
decrees and the ROD, and having some specified statistical language, which at
present is lacking.
It's close to lunchtime and I'll try to keep myself fairly short. I was excited
when I looked at the paper of GP and Chuck because I remember my graduate school
days pretty well because they weren't all that long ago and I recognize this
part—that part—that part—everything except the Fortran programming. But, I've
been given all the parts but never, never have they been put together. This can
be made into a case study manual for agency statisticians. What you might want
to do in the case which always arises where you don't have normality - what do
you do then? Are you stuck? Do you have to do nonparametric work? Or, can you
perhaps capture some of the nice parts of classical statistics? It would take
a lot of effort. There is one fly in the ointment which is, despite the
intensive computer work, even more computer work is required to see what happens
if you add a data point or subtract a data point. You have to be sure that your
results aren't very sensitive to small changes in the inputs. That's the only
fly in the ointment. But, otherwise it's great.
The last comment that I have is that lawyers are very important to EPA as my
colleague, Barnes, was talking about. When the rubber hits the road, it's
usually a lawyer that's driving. Lawyers have to understand what we are doing
because they are charged with the responsibility of getting everyone together.
I was lucky to have the experience of being in one of these meetings with a
responsible party, the EPA people, and it's a big huge array of people. You get
the responsible party and their contractor plus two lawyers. You get the EPA
people involved and their one lawyer, usually. I didn't realize, since we have
so many lawyers around here, that they went into court...they don't. It's the
district attorney or assistant district attorney. So, there is an assistant
district attorney and maybe a collaborator there who are in on the meeting when
you meet formally with a responsible party. Then for the state that is involved,
there are some state people (our counterparts) involved, and in addition to them
there is a state attorney who is just sort of listening in too.
don't understand clearly what we are doing, they are going to consider these guys
are just an added irritation—we don't need you. I can't really blame them.
They don't understand what we're doing.
The paper of GP and Chuck simply says this. To me it is obvious when I read it.
If you care to do interim sampling, this is a fairly nice defensible way of going
about it if you want to use parametric methods. It says no more and no less than
that. The paper will appear in the proceedings and it will be clear that that
is the top line, the middle line, and the bottom line of that.
Session 6: Evaluating the Attainment of Cleanup Standards
DISCUSSION
Barnes Johnson
U.S. Environmental Protection Agency
Office of Solid Waste
401 M Street
Washington, DC 20460
My comments fall into three areas. First, I have some general comments about the
paper that I would like to share with you, and then I want to talk in general
about some other application suggestions that I might throw on the table. My
second category of comments falls into one of the themes that we have had through
the workshop: future research needs and other application suggestions. Finally,
I want to get on my soapbox and talk about this work in the larger policy
context and add some perspective to this session and what it says, because I
think the implications are immense.
With respect to my general comments on the paper, I enjoyed GP and Chuck's paper
and I would encourage all of you to obtain a copy. The paper is highly
readable; its fundamental points are clear. I applaud GP and Chuck in that
respect. I noticed that much of the discussion after the presentation emerged
because the authors are dealing with a real problem that has real constraints.
The most difficult part of any statistical application is trying to make the
methods work under "field" conditions. GP and Chuck were faced with application
constraints, and although there are questions about the averaging across depth and
the form of the distribution they chose to use and so on, there are other things
which pertain to the application which I think GP and Chuck did an excellent job
of carrying through.
Two additional general observations that I would like to make follow on to my
comment that I made yesterday. One is, the work of GP and Chuck in the area of
modeling the gamma distribution, that work basically purchased power. To
determine if the Savage scores were the appropriate nonparametric procedure there
had to be information about the gamma. The parametric power was purchased by the
gamma modeling. When you work in the parametric arena often what you are buying
by being able to impose a model is power. However, it would not pay the
responsible party (RP) to go through this analysis...to pay somebody to do this
modeling...to buy this power; because it is not in the best interest of the RP to
demonstrate that the technology is not working. So, we need people at EPA who
can do what GP and Chuck do if we are going to have this null and alternative
hypothesis set up in this way.
The other thing that I would like to point out, and GP and Chuck make reference
to it in the paper but didn't really discuss it here, is a lesson relevant to
environmental hazardous waste sampling: once you make a decision about the size
of your background sample, you put an upper cap on the amount of power that you
can get. This follows from a general principle of experimental design. Balanced
designs are normally the most powerful. Low fixed initial background sample
sizes will impose a powerful barrier. So, when arguments are made at sites
that only two or three background samples are needed, problems are certain to
arise if the objective is to compare background sampling with site sampling. In
the simplest of terms, you must pay just as much attention to background sampling
as to site sampling if your game is to get as much power as possible.
With these comments about the paper, I want to move on to the application
suggestions that I have. I would like to direct this suggestion to GP, Chuck and
also to Dick Gilbert in the work that he mentioned yesterday. This problem was
one of looking at attainment; given monitoring data and other related engineering
information, one could decide whether the technology is actually working or
not. Rather than having this trigger of 70% in a testing type problem, could we
put this into the arena of an estimation problem? Let's estimate what the
percent removal has been after a period of time. But, of course, this is only
going to apply to kinds of technologies like vacuum extraction or some other soil
remediation technology that takes place over a continuum and can be monitored, so
that is another constraint. By vi
-------
leaky tanks, the largest fraction of them will be due to ingestion of polluted
waters. The principal organic compounds found in leachates are small chlorinated
hydrocarbons, benzene, toluene and xylene. Since World War II, more than 40
billion pounds of small chlorinated hydrocarbons have been distributed, of which
10 billion pounds or more have been incorporated into waste sites. Much of that
has already been leached. Vast sums of money will be spent in removing
contaminants from groundwater. The cost will be determined by the effectiveness
of the technologies employed. Particularly promising is bioremediation. The
costs also will be determined by the answers to the questions: at what level of
contamination is toxicity negligible? How clean is clean?
This work expands the how-clean-is-clean issue into an issue of: OK, we get the
standard, we figure out how clean is clean, and the next question is how do we
measure it and how certain do we want to be about that measurement. It's these
last two questions, how do we measure (how do we sample) and how certain do we
want to be with respect to that decision, that are also equally important to this
whole how-clean-is-clean issue.
So, I think this is highly important work in this area of evaluating the
attainment of cleanup standards. I think it's one of the key measures, the key
tools that the Superfund program is going to need to be able to move away from
sites, to be able to call them clean, and I think it also formally recognizes and
underscores the fact that we aren't going to be one hundred percent confident
that sites are one hundred percent clean.
Session 6:
Evaluating the Attainment of Cleanup Standards
DISCUSSION
Jeri Weiss
U.S. Environmental Protection Agency
Region I HPR-CAN1
JFK Federal Building
Boston, MA 02203-2211
I work in the Waste Management Division's Superfund Support
Section in Region I of the EPA. The support section provides
technical assistance on health risk assessments for the Superfund
program. This presentation will provide a regional perspective
on our attempts to implement some of the approaches identified in
the "Methods for Evaluating the Attainment of Cleanup Standards,
Volume I: Soils and Solid Media." I would first like to qualify
this talk by noting that I am not a statistician and this is a
work in progress and should not be construed to represent
Regional policy.
Until recently, most of Region I's efforts have focused on
developing risk based cleanup levels which are protective of
human health and the environment. These values are then
incorporated into the Record of Decision (ROD) and Consent Order
without any discussion of attainment criteria to be used to
demonstrate compliance. Because we believe we are leaving
ourselves in a vulnerable position, EPA Region I's Risk
Assessment Workgroup has been examining how the attainment of
cleanup standards could be determined after remediation has
occurred.
We began by reviewing the "Methods for Evaluating the Attainment
of Cleanup Standards, Volume I: Soils and Solid Media." Our goal
was to develop a decision rule or an applicable criterion that
could be incorporated into the ROD to determine whether a site
has attained the cleanup standard. Initially, we took the naive
approach that we could choose a single criterion that would be
used for all sites, such as the 95% upper confidence limit of
the mean, or one of the multiple attainment criteria mentioned in
the "Methods for Evaluating Attainment," such as most of the soil
concentrations being below the cleanup standard and the
concentrations above the cleanup standard not being too large.
We set out to test possible decision rules with real data from
sites in our region. Having only a few sites in the region which
have undergone remediation, we chose two sites for case studies.
Both of these sites had undergone remediation and had
confirmatory sampling. Rather than presenting the details of our
analyses, I would like to present some of the lessons we learned
using real data and the resources available in the region.
The first lesson concerned the practical problems of attempting to do
any simple statistics in the region. There was no software
program that was easily accessible that allowed us to do
descriptive statistics. The data for our case studies were only
available in hard copy. The Waste Management Division also lacks
access to statisticians or those with statistical expertise.
With my experience from this conference, it is possible that I
will be able to expand my search for statistical expertise and
software.
The second lesson that our work group learned first hand (which
is apparent to all statisticians) is that environmental data
are not normally distributed, while most of our tests of
attainment assume normality. From this we learned the importance
of viewing the descriptive statistics as well as the geographical
distribution of the data. Simple displays of the distributions of
the data with histograms and box plots provide valuable
information.
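As an illustration of the kind of first look described above, the short sketch below (hypothetical confirmatory-sampling concentrations in mg/kg; any general-purpose package would do) computes simple summaries and draws the histogram and box plot, and the long right tail typical of environmental data is immediately visible.

    import numpy as np
    import matplotlib.pyplot as plt

    conc = np.array([1.1, 0.6, 2.3, 0.9, 14.8, 0.4, 3.7, 1.8, 0.7, 22.5, 1.2, 5.1])

    print("n =", conc.size)
    print("mean =", conc.mean(), " median =", np.median(conc))
    print("std  =", conc.std(ddof=1), " max =", conc.max())

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
    ax1.hist(conc, bins=8)                 # histogram shows the long right tail
    ax1.set_xlabel("concentration (mg/kg)")
    ax2.boxplot(conc)                      # box plot flags the high outlying values
    plt.tight_layout()
    plt.show()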
The third realization is that the choice of an attainment
criterion should be influenced by the health effects of the
contaminant in question. Localized areas of high contamination
should not be permitted for substances with acute effects while
they may be of lesser concern for compounds with chronic effects.
Conversely, measures of central tendency may be suitable criteria
when only chronic effects are of concern. Thus it is not likely
that a single criterion will suffice when dealing with many
pollutants having acute or chronic (or both) effects.
Lastly, we realize the importance of how sampling plans, sampling
design, and sample size strongly influence our ability to draw
conclusions about the data. We should not leave these factors to
chance as is currently done in the RODs. Therefore, our next
step should integrate the sampling design, sample size or power
requirements into the RODs to reduce this element of chance and
vulnerability we felt initially. We welcome suggestions on how
to try to implement cleanup standards and improve the statistical
expertise in the region given our limited level of resources.
COMMENTS BY PARTICIPANTS
On Patil and Taillie
Cynthia J. Kaleri (U. S. Environmental Protection Agency): This is a very
important concern as more and more sites are entering the construction phase of
the Superfund process. However, several factors should be considered in
evaluating the effectiveness of a technology following implementation. If, in
the example of an in-situ soil gas remedy, the purpose of taking discrete depth
samples in soils was to evaluate the interim effectiveness of the remedy,
then individual depth statistics might prove more valuable than average
concentrations taken down to bedrock. This type of work would help the decision
maker first evaluate if the design of the remedy can be modified to improve the
effectiveness of the remedy before evaluating the success/failure of the remedy.
Conclusion: If the data is there, try and use it effectively, based on Data
Quality Objectives... my soapbox for all sessions. A statistician would be of
great value on a Regional level since I believe that he should be involved
throughout the Superfund process...as other experts are...up close and personal
with individual sites and their specific Data Quality Objectives.
Roy L. Smith (U. S. Environmental Protection Agency): The histogram for the
Tyson's data appeared bimodal, suggesting the presence of at least two
statistical populations. Fitting the gamma distribution to these data, as if
they were a single population, may have been inappropriate.
Mindi Snoparsky (U. S. Environmental Protection Agency): Input from a
hydrogeologist would have provided the necessary information needed to pick a
specific stratum as a monitoring point. This would have been preferable to
averaging the values from each well. It is not a good idea for a statistician
to work in a vacuum!
Additionally, if the statistician was provided with a basic explanation of how
the remedy was supposed to work and the problems associated with it, he would
have been able to use his creativity in a more realistic manner.
Jeri Weiss (U. S. Environmental Protection Agency): It appears that the
underlying hypothesis is that the distribution of the data (i.e. gamma) prior to
the remediation will be the same after the remediation has begun. Is this an
appropriate hypothesis? Is it possible that the remediation might change the
distribution of the contamination data after 70% of the contamination has been
removed?
David Schaeffer (University of Illinois): I suggest that you examine
monitoring data EPA has, for example, in STORET. There are millions of
observations on over 100 parameters. Some data goes back for 20 years. If you
can work out statistical methods for this random data, they should have higher
power when sampling programs are designed. If you don't find a data set with a
distribution you like, just try another set.
Around 1980, EPA collected data on Lake Michigan at several depths for an
extended time period and with large spatial coverage. Unfortunately, I don't
have details - perhaps Phil Lindenstruth (STORET) can help.
I am not familiar with EPA's air monitoring database, but this should also be
looked at. Cities such as Chicago collect data daily.
Also, there is apparently a large multilevel air monitoring grid at Dugway -
again, I don't know how much data is available.
There needs to be a progression, perhaps:
1.	work with real data to learn the characteristics of such data;
2.	develop statistical methods for risk, sampling, etc.;
3.	test methods with some available sets, make corrections, and predict, using
new sets;
4.	apply to Superfund sites - do the models fit? If not, what data
characteristics have changed? etc.
Consider using census data to, e.g., simulate composite sampling. For example,
a "composite" can be represented by a family; a "presence" might be a child under
6 years of age or a girl under 6, etc. Various spatial scales are available, and
numerous types of measurements are available.
Session 7: Statistical Issues and Approaches for the
Characterization and Remediation
DISCUSSION
John Warren
Office of Policy Planning and Evaluation
Statistical Policy Branch
U. S. Environmental Protection Agency, PM 223
401 M Street, SW
Washington, DC 20460
One unusual characteristic of the Environmental Protection Agency is the
independence of its constituent offices, and it is therefore axiomatic that the
Office of Policy, Planning, and Evaluation can only give advice to the Office of
Solid Waste and Emergency Response in a fraternal manner; this conference
epitomizes the relationship this Agency needs. To paraphrase, how do we get the
valuable information discussed at conferences and workshops such as this to those
who really need it? There are several ways that are worth discussion.
Probably the most positive way is through sponsoring conferences and workshops
that bring together theory and practice. In this particular instance of
Superfund site characterization, the theoreticians are statisticians developing
new or improved methodologies, and the practitioners are the managers and
decision makers at present operating with minimal guidance in statistical
techniques.
Although we may be agreed that a dedicated conference or workshop is the most
productive method of moving the Agency along the path of good science and
informed decision-making, it is rare that it can be done on a consistent basis
due to problems in funding and timing. Other conferences that have substantial
elements of Superfund related statistics should be encouraged to include
environmetrics and environmental statistics as a regular event.
The Annual Conference on Statistics, sponsored by the Statistical Policy Branch
for Agency personnel, has always had a substantial applied Superfund content.
The seventh conference, March 1991, Richmond, Virginia, will continue to devote
substantial resources to Superfund site characterization. The Annual (summer)
meeting of the American Statistical Association (ASA) has a large Superfund
component within those sessions sponsored by the Statistics and the Environment
Section. Given the popularity of the sessions at the 1990 Anaheim meeting, it
seems future sessions will be expanded to encompass a larger audience. The ASA
Winter 1991 Conference in New Orleans, Louisiana, January 3-5, has the
environment as its theme and will have one session on Superfund site
characterization and several other sessions containing Superfund-related
methodologies.
Holding conferences and workshops, and most importantly, making the results of
the meetings available for circulation, is the least abrasive way to disseminate
valuable information. What about other tried and true methods of bringing
information to those who need it the most? The most common is through the
written word: scientific publications and Agency guidance.
But there is a potential problem. Unless the written material precisely
addresses the problem in hand, there is a tendency to put the volume up onto the
shelf and leave it to collect dust. The line we must walk is extremely thin, for
on one side is the area "not relevant to my problem," and on the other "too
technical for my immediate use."
This gives us something to aim for—the line. It will be very challenging because
in addition to balancing relevance and technicality is the balancing of timing
and understanding. Timing is crucial to a manager as the luxury of pondering
decisions and investigating all alternatives is seldom feasible: decisions must
be made in days, not months. Understanding the technical complexity of a
statistical technique is very important because it is the ease by which we write
guidance and advice that leads to good, firm decisions. In guidance documents
we must strive to make the material so appealing to a decision-maker that the
infamous phrase "I'll give it to the contractors and read it later at my leisure"
is rarely heard.
Having agreed that we must communicate clearly to managers, how do we explain one
of the tenets of statistics: the validity of models? As a part of the scientific
community, we statisticians have a tendency to take the concept of modelling for
granted. One of the first things we must get across to decision-makers is that
models are only an approximation of reality. To imply a model is an exact
representation of a physical process is precisely GP's Type III error—something
that is all too common in situations where decisions must be made on processes
that are difficult to fully comprehend. We owe it to our clients, the decision-
makers, to fully explain what a model implies and, equally as important, the
assumptions necessary to make the model a reasonable approximation to the actual
physical process.
An example of this is contained in Evan Englund's presentation where he discussed
how making different assumptions in constructing and operating kriging algorithms
resulted in different interpretations and, by implication, different decisions.
Decision-makers need to know the sensitivity of the process to perturbations,
even though they may not really want to!
Again, a word of caution. We cannot simply state the key assumptions without
explaining what they mean, for this would lead to confusion in the decision-making
process. As statistical advisors, we must avoid glib scientific phrases and at
all costs eschew Evan Englund's statistician, who announced, "It's mine and I
only give it to those who understand the creed."
The advent of personal computers and canned programs that perform multiple
statistical analyses has simultaneously helped and hindered our role as
statistical consultants. The help that quick, affordable and accurate
statistical routines have given us is well documented, but the hindrance due to
the misinterpretation of packaged programs (GP's Type IV error) is often hidden.
We must be prepared to spend valuable time with the decision-maker to prevent
misapplication of statistical techniques.
Future Direction
We take it for granted that as the problems confronting the decision-makers
become more complex, so must become the solutions we statisticians develop.
Unfortunately many decision-makers disagree! "Give me an all-purpose
statistical test," is what they demand; we must be careful not to accede to this
request. There are too many instances of the standard t-test being used
inappropriately for, as the saying goes, "If you only have a hammer, then every
problem becomes a nail." We must be very cautious when supplying hammers.
To help the decision-maker, we must develop methods that address some of the
situations that call for other than standard methods. For example, translating
risk-based standards into false-positive and false-negative rates is not an easy
problem to resolve and one that will call for tremendous interaction and effort between
statisticians and risk-assessors. Another problem to be resolved is how to help
a decision-maker reach a conclusion when faced with a simultaneous standard.
These standards ask for both the mean and certain order statistics to meet fixed
values with various probabilities without specifying the nature of the
distribution.
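As one illustration of what such a simultaneous standard might involve, the sketch below (not Agency guidance; the data and the cleanup standard Cs are hypothetical) checks a mean criterion through a one-sided 95% upper confidence limit and a 95th-percentile criterion through a nonparametric count of exceedances. With small samples the percentile criterion is typically the hard one to demonstrate, which is part of the guidance problem described above.

    import numpy as np
    from scipy import stats

    conc = np.array([3.1, 0.8, 1.4, 2.2, 0.5, 4.0, 1.1, 0.9, 2.7, 1.6,
                     0.7, 1.9, 3.4, 1.2, 0.6, 2.0, 1.5, 0.9, 2.8, 1.3])
    Cs = 5.0
    n = conc.size

    # (1) Mean criterion: one-sided 95% upper confidence limit on the mean.
    ucl = conc.mean() + stats.t.ppf(0.95, n - 1) * conc.std(ddof=1) / np.sqrt(n)
    mean_ok = ucl < Cs

    # (2) Percentile criterion: with x exceedances of Cs out of n, the p-value for
    #     H0: exceedance proportion >= 0.05 is P(Binomial(n, 0.05) <= x).
    x = int((conc > Cs).sum())
    p_value = stats.binom.cdf(x, n, 0.05)
    perc_ok = p_value < 0.05

    print("UCL =", round(ucl, 2), "mean criterion met:", mean_ok)
    print("exceedances =", x, "percentile criterion met:", perc_ok)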
In the area of sampling, a manager or decision-maker is often forced to operate
under very restrictive budget and resource conditions and often wants to "cut
corners" in data collection. We have to provide guidance on how to incorporate
nonrandom data (found, or encountered data) with data collected under a
probabilistic sampling scheme. We must also be prepared to advise on how to
compare data sets collected with different quality assurance protocols, and be
able to advise on the number of reference samples, quality assurance samples,
field samples, duplicate samples, etc.
Finally, we must establish a two-way dialogue with decision-makers such that they
can clearly articulate their problem, and rely on us to develop an answer they
can understand; as Dick Gilbert said in the opening session, "Communication is
the key to successful application of statistics."
Session 7: Statistical Issues and Approaches for
Characterization and Remediation: A Discussion
Susan Braen Norton
Office of Research and Development (RD-689)
U.S. Environmental Protection Agency
401 M Street, SW
Washington, DC 20460
Samples collected at Superfund sites are used to evaluate two
questions: what is the extent of contamination, and what is the
potential magnitude, frequency, and duration of exposure to
contaminants. This discussion will focus on the exposure question
and will briefly describe some of the issues considered when
designing the collection and analysis of environmental data for
use in exposure assessment.
Exposure assessments conducted at sites usually are combined
with toxicity values to estimate the health risks associated
with contamination. In order to collect the most appropriate
data for exposure assessment, then, it is important to consider
how exposure estimates will eventually be used in the risk
assessment process. The standard operating procedure for
assessing carcinogenic and chronic, noncarcinogenic risks
involves combining average daily dose estimates with toxicity
values (i.e., reference doses for noncarcinogenic effects, and
slope factors for carcinogenic effects). Inherent in basing risk
estimates on the average daily dose are the assumptions that the
toxic effects are not dependent on the specific pattern of
exposure, and that there are no age-specific differences in
sensitivity. Some pharmacokinetic models can consider the pattern
of exposure, and attention is being given to age-specific
risks. However, most routine assessments currently use average
daily doses to estimate risks.
Exposure to contaminants at sites, expressed as dose, is
calculated by multiplying the concentration term by intake rate and
exposure duration, and dividing by body weight. Average daily dose
is calculated by summing exposures occurring during an exposure
period and dividing by the total number of days in the period,
resulting in mg chemical/kg body weight-day. Two common
assumptions used in exposure assessment are that the concentration
of chemical contacted by an individual is independent of the intake
rate, and that the intake rate to body weight ratio is constant
over time. If these assumptions are considered to be
reasonable, then the average daily dose will be proportional to the
arithmetic average concentration contacted over time. In practice,
the assessor approximates the
temporally-averaged concentration by using another series of
assumptions. In a special case, which is particularly useful for
assessing contaminated soil, the assessor assumes that an
individual will randomly contact the contaminated medium within a
particular area. In this case, the concentration averaged over the
randomly contacted area can be used to estimate the average
concentration contacted over time.
When a spatial average is used to estimate a temporal average,
it is important to carefully choose the area included in the
calculation. One possible approach is to first outline the
contaminated area, and then consider activity patterns that may
overlap with or be contained within that area. For instance, if a
recreational or trespasser scenario is being considered, an
individual might randomly contact a fairly large area over time.
On the other hand, if a residential scenario is being assessed, an
individual may randomly move over a smaller area, for
example, the size of a house lot.
There are several statistical methods that can be used to
calculate the spatial average from data collected randomly or on a
grid. In some cases, it may be desirable to use the best estimate
of the average concentration in the exposure assessment. In other
cases, when a conservative estimate of exposure is desired, an
upper confidence limit on the average may be more appropriate.
Using an upper confidence limit will ensure that the true (but
unknown) arithmetic mean will not be underestimated very often. An
additional advantage is that the uncertainty in sampling is
quantitatively considered in the exposure estimates. As the sample
size increases or variation in the samples decreases, the upper
confidence limit on the average decreases.
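The calculation described in the last two paragraphs can be made concrete with a short sketch. Below, the one-sided 95% upper confidence limit on the spatial average is used as the concentration term in an average daily dose; the soil data and all exposure parameters are hypothetical placeholders rather than recommended default values.

    import numpy as np
    from scipy import stats

    soil = np.array([120., 85., 40., 310., 55., 150., 95., 60., 210., 75.])  # mg/kg over the area
    n = soil.size
    mean = soil.mean()
    ucl95 = mean + stats.t.ppf(0.95, n - 1) * soil.std(ddof=1) / np.sqrt(n)  # one-sided 95% UCL

    intake = 100e-6            # soil ingestion rate, kg/day (hypothetical)
    ef, ed = 350, 9            # exposure frequency (days/yr) and duration (yr), hypothetical
    bw, at = 70.0, 9 * 365.0   # body weight (kg) and averaging time (days), hypothetical

    add = ucl95 * intake * ef * ed / (bw * at)   # average daily dose, mg/kg-day
    print("mean =", round(mean, 1), "UCL95 =", round(ucl95, 1), "ADD =", add)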
In addition to assessing exposures at sites, these issues can
help frame questions posed during the remediation process.
According to the above reasoning, soil target concentrations
developed using risk assessment would be associated with an
averaging area. Then, the efficacy of remedial activities can be
evaluated by determining whether the average concentration across
the area exceeds the target concentration.
In conclusion, assessing exposure at sites may require a
different data collection and analysis strategy than that used to
determine the extent of contamination. Long term risks are
routinely calculated based on estimates of average daily dose,
which is proportional to the arithmetic average concentration
contacted over time. When spatial averages can be used to estimate
the temporal average, data collection and analysis should consider
areas over which contact can be assumed to randomly occur. By
considering how data will be used to estimate exposure and risk, we
can increase the utility of data collected at sites, develop the
most appropriate statistical techniques for summarizing the data,
and improve exposure assessments conducted at sites.
Session 7: Statistical Issues and Approaches for Site
Characterization and Remediation: A Discussion
Wayne L. Myers
Codirector, Office for Remote Sensing of Earth Resources
Penn State Univ.
University Park, PA 16802
My recent completion of two years as expatriate technology advisor to
USAID/India makes it difficult for me to discuss technical detail in
isolation from technological context. My concern for the Superfund context
is that statistical technology apparently operates in a mode that might be
characterized as being somewhere between the country doctor and paramedic.
The "country doctor" connotation is indicative on the one hand of delivery
in handbook/manual mode that is fast becoming archaic in the age of
knowledge-based computer software. It is indicative on the other hand of a
search for one-size-fits-all simplicity, which is set in contrast to most
of the modern technological arena where the goal is matching complexity
against special circumstances in a process of customizing rather than
simplifying technology. The "paramedic" connotation is indicative of a
propensity for setting statistical protocol to be applied in isolation from
the extensive technological infrastructure that is developing around
spatial information management such as geographic information systems.
There appears to be tremendous scope for technological integration to
serve the needs of site characterization for guiding remediation.
Knowledge engineering and spatial data technologies offer prime prospects.
There is certainly no lack of social/environmental consequence to justify
such initiatives, and the programmatic funding base is of major magnitude.
The Statistical Policy Branch faces an overabundance of opportunity for
exercising leadership in marshalling information technologies to augment
or ensure the effectiveness of Superfund and set the stage for efficiency
in follow-on programs.
STATISTICAL ISSUES AND APPROACHES FOR THE CHARACTERIZATION AND
REMEDIATION: A DISCUSSION
Research Needs: The Study of Composite-Sample Forming Methods and
Retesting Procedures Using Actual Samples From Hazardous Waste
Sites.
Michael R. Murray
The Pennsylvania State University
Soil and Environmental Chemistry Laboratory
104 Research Building A
University Park, PA 16802
The Problem: It is apparent from the papers presented at
this workshop that a major area of future research will involve
the study of composite-sample forming methods and retesting
procedures as related to hazardous waste sites. The testing of
these procedures has, for the most part, been examined by
utilizing simulated data sets and existing data bases. For the
latter, only the sample data have been available for study and
not the actual samples.
Several questions that simulated data and existing data
bases cannot answer with respect to future research goals are:
1.	What are the mixing and measurement errors associat-
ed with the various compositing schemes?
2.	What are the cost and time constraints associated
with the various compositing schemes?
3.	How adaptable are these compositing schemes within
the setting of today's environmental testing labora-
tories?
To address the above questions, it is necessary to have
samples from hazardous waste sites and to actually composite
and test these samples in the laboratory. Unfortunately, existing
samples are generally not available for research purposes. At the
Soil and Environmental Chemistry Laboratory we have begun to
establish a soil bank in which soil samples from hazardous waste
sites are stored and are made available for in house research
projects. Such a soil bank should prove invaluable for composite
sample research. Obviously such a bank is only useful for those
pollutants which maintain their chemical integrity over long
periods of time, such as heavy metals.
Two sets of soil samples which we presently have within our
soil bank are described.
Site A consists of an existing agricultural field (5.6
acres) which has become contaminated with the heavy metal cadmium
(Cd) due to sewage sludge spreading during the 1960's. As part of
a geostatistical research project, in which Michael R. Murray is
the principal investigator, 458 soil samples and 50 plant tissue
samples have been collected from this site. The Cd concentrations
in the soil samples range from 0.65 to 150 mg/kg. The coeffi-
cient of variation for the study site is 105%. Spatial analysis
of the existing data set indicates that a strong spatial trend in
Cd concentration exists at the site. Furthermore, several local-
ized hotspots of Cd have been found.
Site B, like Site A, involves heavy metal contamination of
soils due to land spreading of industrial sewage sludge. The
state regulatory agency has required implementation of a detailed
soil sampling and remediation program at this site. A total of
18.4 acres, comprising three separate fields, were required to
be sampled. As directed by the state regulatory agency, each
field was divided into 100 by 100 foot grid cells. A total of 80
grid cells were established. Three random locations were then
chosen within each cell. A soil core was taken at each sample
location to a depth of 18 inches. This soil core was then divided
into three separate samples of 0-6, 6-12, and 12-18 inch depths. Thus,
a total of 9 samples (3 samples per depth) were taken from each
cell. In all, a total of 720 soil samples were taken from the
site. The mean grid cell concentrations of Cd ranged from 0.70 to
42.6 mg/kg. The coefficient of variation is 55.6%. Furthermore,
the spatial distribution of Cd appears to be relatively homogene-
ous as compared to the Cd spatial distribution for Site A.
While these two sites are similar in the type of pollution,
they differ in that Site A has a strong spatial structure compo-
nent associated with it, whereas Site B tends to be relatively
homogeneous. Because the actual soil samples are available,
laboratory studies examining composite-sample forming methods and
retesting procedures are possible. Furthermore, studying methods
of forming composite samples which incorporate the spatial struc-
ture of the data could be performed. It is very likely that
different compositing approaches would be used for each site. The
use of such soil banks for studying the applicability of compos-
ite-sample forming methods and retesting procedures for hazardous
waste sites is strongly encouraged.
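One simple retesting rule that such a soil bank could be used to evaluate is sketched below. It assumes perfect mixing and no measurement error: composites of g individual samples are analyzed, and a composite is broken out and its constituents analyzed individually whenever its value exceeds Cs/g, the level at which a single constituent at the standard Cs could still be diluted below the trigger by the other members. The standard, composite size, and simulated Cd values are hypothetical; real samples would show how mixing and measurement error erode these idealized savings.

    import numpy as np

    rng = np.random.default_rng(7)
    Cs, g = 10.0, 4                                   # cleanup standard and composite size
    soil = rng.gamma(0.5, 4.0, size=200)              # hypothetical individual Cd values (mg/kg)

    analyses = 0
    flagged = []
    for i in range(0, soil.size, g):
        group = soil[i:i + g]
        composite = group.mean()                      # perfect-mixing assumption
        analyses += 1
        if composite > Cs / g:                        # retest trigger
            analyses += group.size
            flagged.extend(np.where(group > Cs)[0] + i)

    print("individual samples:", soil.size)
    print("analyses with compositing:", analyses)
    print("samples above the standard found:", len(flagged))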
Session 7: Statistical Issues and Approaches for the
Characterization and Remediation: A Discussion
Nicholas Bolgiano
Center for Statistical Ecology and Environmental Statistics
Department of Statistics
Penn State University
University Park, PA 16802
Predicting contaminant concentration response surfaces has generally been
performed after assuming that a stationary small-scale stochastic process
approximates the contamination process. However, the complexity of contamination
processes and the need for accurate, precise, and cost-effective sampling and
remediation indicates a need for further sophistication in contamination models.
The modeling process is often one of adding features of determinism when
additional complexity is desired (Lehmann, 1990), and this approach is advocated
here.
The needs for further sophistication in spatial analysis of hazardous waste data
include (i) characterization of spatial processes, (ii) analyzing relationships
among spatial variables, (iii) accurately assessing the power of response surface
predictions, and (iv) determining the desired sampling locations. These needs
became apparent from our retrospective analysis of data from two Superfund sites,
the Dallas Lead Site and the Palmerton Site (Bolgiano et al., 1990). Of interest
to the site characterization of both sites was knowledge of the pattern of heavy
metal fallout from smelters. Our analysis indicated that a small-scale
stochastic process did not appear to exist at the scale of sampling that was
employed at either site. Instead, the contamination processes appeared to be
composed of a large-scale, regional process associated with the investigated
smelters, and local contamination processes related to industries other than the
smelters and to motor vehicles. As a result, the employed sampling scale may
have been finer than the scale of sampling necessary to characterize the regional
process but coarser than the scale necessary to delineate the local processes.
The highest observed lead concentrations at the Dallas Lead Site were associated
with local industry and not the lead smelters. Prediction of the extent of local
hotspot contamination would likely be overestimated if the autocorrelation
structure of the large-scale process were employed for interpolation (Gensheimer
et al., 1986). Perhaps the best solution to predicting the extent of local
hotspot contamination is realized by second-stage sampling near those locations.
The determination of second-stage sampling locations thus depends upon the
identification of hotspot locations from first-stage data. Further, if one has
the capability of predicting the pattern of contamination prior to first-stage
sampling, as was performed at the Palmerton Site from knowledge of wind patterns
and topography, then first-stage sampling could be designed to measure the
important anticipated components of the contamination process.
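One simple way to add the kind of deterministic, large-scale component argued for here is to fit a smooth regional trend and screen the residuals for local departures. The sketch below (simulated data, not the Dallas or Palmerton analyses) fits a quadratic trend surface by least squares and flags locations with unusually large residuals as candidates for local sources and for second-stage sampling.

    import numpy as np

    rng = np.random.default_rng(3)
    n = 150
    xy = rng.uniform(0, 10, size=(n, 2))                                   # sampling locations
    regional = 50 * np.exp(-0.3 * np.hypot(xy[:, 0] - 2, xy[:, 1] - 8))    # smelter-like plume
    local = np.where(rng.random(n) < 0.05, rng.uniform(100, 300, n), 0.0)  # sparse local sources
    z = regional + local + rng.normal(0, 3, n)                             # observed concentrations

    # Fit a quadratic trend surface by least squares: the large-scale, regional component.
    X = np.column_stack([np.ones(n), xy, xy**2, xy[:, 0] * xy[:, 1]])
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta

    # Flag locations with unusually large residuals: candidates for local processes
    # (other industry, roadways) and for second-stage sampling.
    hot = np.where(resid > resid.mean() + 3 * resid.std(ddof=1))[0]
    print("flagged locations:", hot)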
The validity of the kriging variance may be questionable if the process is
nonstationary, though kriging is often advocated because it provides a variance
estimate. Confidence intervals on the predicted response surface are often
desired, but it remains uncertain if a purely stochastic model can provide this.
Questions have been posed for both the Dallas Lead Site and the Palmerton
Site that require characterization of spatial processes beyond modeling the
autocorrelation structure. Consideration of human health remediation at the
Dallas Lead Site required knowledge of how smelters and motor vehicles
contributed to soil lead contamination (Carra, 1984). At the Palmerton Site,
the spatial variability of lead also appeared to be partially attributable to
motor vehicles (Starks et al., 1987). Much of the pattern of heavy metal
contamination at these sites appeared to be related to spatial features, such as
the alignment of the contamination contours at the Palmerton Site with the ridge
and valley topography, and the lead level decrease in the Trinity River floodplain
of the Dallas Lead Site DMC area. Stochastic models of contamination ignore the
relationship of contamination data to other attainable spatial data. Therefore,
spatial statistics technology that allows decomposition of data into the
subprocesses of interest and into stochastic components is needed to address
substantive site characterization questions. Questions about the relationship
of multivariate spatial variables also arise in ecological site investigations.
Here, the goal is often to link chemical and biological field data with
laboratory toxicity results to determine the spatial extent of hazardous waste
site effects upon the biota (Kapustka et al., 1990). This can be quantitatively
achieved by overlaying predicted maps of chemical and biological responses, but
quantitatively-based decisions necessitate the analysis of multivariate data.
Thus, a challenge to environmental statisticians lies in assessing relationships
among multivariate spatial data, as well as appropriately modeling univariate
chemical data.
References
Bolgiano, N. C., Patil, G. P., and Taillie, C. (1990). Spatial statistics,
composite sampling, and related issues in site characterization with two
examples. See this Proceedings.
Carra, J. S. (1984). Lead levels in blood of children around smelter sites in
Dallas. In Environmental Sampling for Hazardous Wastes, G. E. Schweitzer and J.
A. Santolucito, eds. American Chemical Society, Washington, DC. pp. 53-66.
Gensheimer, G. J., Tucker, W. A., and Denahan, S. A. (1986). Cost-effective soil
sampling strategies to determine amount of soils requiring remediation. In
Proceedings of the National Conference on Hazardous Wastes and Hazardous
Materials, March 4-6, Atlanta, GA. pp. 76-79.
Kapustka, L., Shirazi, M. A., and Linder, G. (1990). Ecological assessment of
hazardous waste sites. See this Proceedings.
Lehmann, E. L. (1990) Model specification: The views of Fisher and Neyman, and
later developments. Statistical Science, 5, 160-168.
Starks, T. H., Sparks, A. R., and Brown, K. W. (1987). Geostatistical analysis
of Palmerton soil survey data. Environmental Monitoring and Assessment, 9,
239-261.
STATISTICAL ISSUES & APPROACHES FOR SITE CHARACTERIZATION
AND REMEDIATION: AN OVERVIEW
DISCUSSION
John Zirschky, Ph.D., P.E.
Office of Sen. James Jeffords
530 Dirksen Senate Office Bldg.
Washington, D.C. 20510
Introduction
The existence of this conference is proof that the application of
statistics to hazardous waste issues has made great progress. In 1980,
while working with the Region VII Field Investigation Team (FIT), it
seemed much of the early sampling that was conducted could be described
as haphazard at worst, judgmental at best. Publications such as TranStat
helped improve the types of sampling that were conducted; however, use of
statistics was still not commonplace. Even now, statistics is not
commonly used in the investigation of hazardous waste sites. Sampling
points are picked not for scientific reasons, but for convenience or based
upon uneducated guesses as to where the contamination would lie.
There are several reasons, in my opinion, why statistics is not more
commonly used: 1) lack of understanding about statistics, 2) the
relatively large number of samples required to use a statistical approach,
and 3) engineering bias against statistics. Possible solutions to these
impediments are described below.
Education
The best means to overcome the lack of understanding of statistics
is to educate the environmental community on statistics. Greater
dissemination of knowledge is needed. Such education will not be easy,
as many hazardous waste professionals are somewhat prejudiced against
statistics. I believe education would best be accomplished by first
training federal and state regulatory personnel in statistical concepts.
The consulting community will not use statistics to evaluate hazardous
waste sites or other environmental concerns until regulators require it,
for two reasons. First, statistical techniques typically require more
samples than are normally collected using "judgement" techniques.
Consultants who propose statistical approaches to site characterization
will almost always propose more samples than consultants who propose a hit
and miss approach. Since cost increases with the number of samples and
cost is a major factor in clients' decisions to award projects, it is
not in the consultant's short-term interest to propose a more costly
approach than his competitor. Once regulators become more aware of
statistical sampling design, they may begin to require a statistical
approach to site investigations. When all consultants are required to use
statistics, then consultants with statistical expertise will be better
able to compete with their less knowledgeable competitors.
Lack of familiarity with statistics is the second factor discouraging the use
of statistics. Neither the regulators nor the regulated community are
knowledgeable, generally, on statistics. A training workshop for
regulators should be developed and routinely conducted. Many agencies
have contact persons and coordinators for regulatory programs; similar
individuals should be designated for statistics. The act of appointing
such an individual will also help show how serious the agency is about
statistics.
Greater training is also needed as part of EPA's technology transfer
program. Several years ago, I suggested to EPA that they include sessions
on practical applications of statistics to hazardous waste issues as part
of one of the nationwide technology transfer seminars. This suggestion,
in hindsight, was not sufficient. To properly present the material, a
dedicated technology transfer workshop seems more appropriate, with an
accompanying publication issued. This course alone would not qualify one
as a statistician, but would instead introduce one to the importance of
good site characterization. The agency could go further by requiring the
use of certified statisticians, much like the construction grants program
required the use of certified value engineers on large projects, to
ensure that agency funds were well spent.
The final area to encourage is academic training. Courses on
environmental statistics are extremely limited. The agency should
encourage environmental science and engineering programs to: 1) offer
dedicated environmental statistics courses, or 2) require coursework in
statistics offered through other departments. Research funding would
encourage graduate students to pursue statistics, thereby providing both
trained professionals and academic instructors upon graduation.
Research Needs
The use of statistics will often require that a greater number of
samples be collected than might otherwise be collected. At many sites,
the cost of analyses exceeds the cost of excavation and disposal of the
contaminated material. For example, to excavate, transport, and dispose
of a cubic yard of heavy metal contaminated soil in the southeast United
States costs on the order of $250-300. An EP Toxicity metals scan can
cost approximately $200, excluding the sampling costs. For every sample
analyzed, slightly less than a cubic yard of soil could have been
removed/transported/disposed. As analytical costs increase, more and more
soil could be removed for each sample taken. Unless the sampling plan is
carefully devised, it may be cheaper to assume contamination and excavate
without sampling. Research needs to be conducted on optimizing total site
remediation costs including sampling, e.g., when is it more cost-effective
to remove material believed to be contaminated first and then do
statistical sampling. When is it best (economically) to composite, and
over how large an area should compositing be done? More research emphasis
is needed on the economics of sampling. If high costs can be overcome,
PRPs will be more willing to use statistical approaches.
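The break-even arithmetic behind this point is easy to set out. The sketch below uses the rough unit costs quoted above ($250-300 per cubic yard removed, about $200 per metals scan); the number of decision units and the assumed fraction actually contaminated are hypothetical.

    cost_removal = 275.0      # $ per cubic yard removed/transported/disposed (midpoint of the range)
    cost_analysis = 200.0     # $ per sample analyzed (excluding sampling labor)
    cells = 400               # decision units of roughly one cubic yard each
    p_contaminated = 0.30     # assumed fraction of cells that truly need removal

    remove_everything = cells * cost_removal
    sample_then_remove = cells * cost_analysis + p_contaminated * cells * cost_removal

    print("remove without sampling: $", remove_everything)
    print("sample every cell, remove only contaminated: $", sample_then_remove)
    # With these numbers, sampling only pays when the contaminated fraction is below
    # roughly 1 - cost_analysis / cost_removal, i.e. about 27%; compositing or
    # sequential designs shift this break-even point.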
A second research area is the application of statistics to risk
assessment, and vice-versa. At radioactive waste disposal sites, cleanup
levels can be based both upon the concentration and volume of
contamination. Currently, there appears to be a lack of understanding as
to what a certain level of contamination really means. For example, one
pound of dioxin could be distributed in 1 cubic yard of soil or 200 cubic
yards of soil. While the mass of contaminants is the same, the threat
posed by each distribution of contaminants is different. The greater the
area covered, the greater the risk of public exposure. Risk assessment
could be very helpful in determining such things as the size and shape of
a hot spot of concern and the degree of sampling warranted.
Risk assessment combined with statistics is also needed for
selecting disposal options. Regulatory agencies have required removal of
six inches of soil (the minimum that can be effectively excavated) because
the top inch or so showed lead levels in excess of cleanup standards.
High lead levels are not desirable in surface soils to which public
exposure may occur, and these soils should be removed. The practical
limits of excavation equipment, however, result in nonintentional dilution
of contamination. As a result, the excavated soil contains a much lower
concentration of contaminants than does the thin layer of contaminated
soil, and may contain concentrations which, if in-situ, could safely be
left in-situ. Under these situations, cleanup standards should differ from
disposal standards. The excavated soil, if sampled after excavation, may
not appear hazardous. If the contaminant is not easily transported (e.g.,
through groundwater) and as long as deliberate attempts at dilution are
not made, the final form of the soil as it would be disposed should be
used to determine its fate, not the in-place soil concentrations.
Different statistical techniques may be required to determine the average
concentration in situ vs. in truck. Research on such techniques should
be conducted and disseminated.
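As an editorial sketch of the dilution being described, the fragment below
averages a thin contaminated layer into the minimum practical cut depth. The
depths and the in-place lead concentration are hypothetical, and the soil below
the contaminated inch is assumed clean.

    # Sketch, assuming uniform bulk density and clean soil below the layer.
    def excavated_average(in_place_conc, contaminated_depth_in, cut_depth_in):
        """Average concentration of the excavated lift when only the top
        layer is contaminated."""
        return in_place_conc * contaminated_depth_in / cut_depth_in

    surface = 3000.0   # mg/kg lead in the top inch (assumed)
    print(excavated_average(surface, 1.0, 6.0))   # 500 mg/kg "in truck"
    # The soil characterized in the truck looks far less hazardous than the
    # thin layer did in place, which is why cleanup standards and disposal
    # standards may need to differ.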
Engineering Bias
I am not sure why there is a bias by some consultants against the
use of statistics, but such a bias does exist. Perhaps these individuals
either had a bad statistics teacher or no statistics instruction at all.
Such bias against statistics may be hard to overcome. Similarly, some
statisticians appear to be prejudiced against real-world applications and
do not appreciate that complex statistical procedures which may confuse
the public may not be applicable to remediation projects. Education seems
the best means of fighting both prejudices.
There is also an economic bias against statistics. Few firms have
on-staff or ready access to an environmental statistician. Thus, if
statistics are required, some firms will not be able to offer this
service. Consulting firms tend to sell what they can do, and to downplay
those services they do not offer. A firm regulatory commitment to
statistics will force recalcitrant firms to include statistics in the
RI/FS process.
Secondly, many professionals have a legitimate concern about the cost
effectiveness of statistical sampling. Practical decisions can be made
with fewer than the statistically required number of samples. The
current public bias is towards Not In My Backyard. Thus, a great many
more samples than statistically required may be needed to justify leaving
a waste in-place; whereas, few samples may be needed to plan a removal
action. Regulatory and public policy issues should also be integrated
with statistical survey design.
-247-

-------
Statistical Evaluation of Cleanup Levels
Superfund Program Needs
Jennifer Haley, Chemical Engineer
Remedial Operations and Guidance Branch
Bill Hanson, Chief
Remedial Operations and Guidance Branch
Hazardous Site Control Division
Office of Emergency and Remedial Response
U.S. EPA
BACKGROUND
Cleanup levels for soils at Superfund sites are used for
three different situations:
1.	Concentrations above which soil must be excavated,
2.	Concentrations to which soil must be treated,
3.	Concentrations over which a cap must be placed.
Volumes or areas to be addressed are estimated based on remedial
investigation data and confirmed during the remedial action based
on additional sampling and analysis. Currently, cleanup
concentrations are commonly specified as maximum values and
excavation, treatment, or capping will continue until no samples
indicate concentrations exceeding the specified level. In some
cases, confidence levels are indicated but statistical analysis is
generally limited. No standard policy has been recommended by the
Agency for statistically confirming the attainment of cleanup
levels, though the various options have been described (U.S. EPA,
1989).
PROGRAM NEEDS
In order to provide guidance on how to specify cleanup levels
statistically in decision documents and establish the
corresponding sampling strategy in work plans, it is important to
understand the implications of various methods in terms of sample
size and to evaluate which methods reflect appropriate confidence
levels under different potential exposure scenarios. The following
steps are suggested:
1. Define a limited number of hypothetical case situations
to be used as models. These should cover the basic site
types for which different statistical methods would be
warranted (e.g., acute versus chronic threat,
containment versus cleanup to unrestricted use).
-248-

-------
2.	Identify the statistical methods which could be used to
determine that the response action has met its
objectives and the corresponding statement indicating
what showing is made (e.g., 90% of the site will achieve
a 10 ppm cleanup with 95% confidence, or the average
concentration will not exceed 15 ppm with 99%
confidence). A brief sample-size sketch for one such
statement follows this list.
3.	Lay out, for the hypothetical case studies, the showing
that would be required under each of the statistical
method options identified. Present the pros and cons of
the various approaches in terms of sampling
requirements, possibility of exceedance of the cleanup
level, simplicity of approach, and other pertinent
considerations.
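As a hedged illustration of how a statement such as the one in item 2
translates into a sample size, the sketch below uses the standard
nonparametric argument: if every one of n independent samples falls below the
cleanup level, the confidence that at least a proportion p of the site meets
that level is 1 - p^n. This is only one of the candidate methods, not an
Agency-prescribed rule, and the numbers are illustrative.

    # Minimal sketch of a nonparametric coverage calculation.
    import math

    def samples_for_coverage(p, confidence):
        """Smallest n such that n samples, all below the cleanup level,
        demonstrate with the stated confidence that at least a proportion
        p of the site meets the level."""
        return math.ceil(math.log(1.0 - confidence) / math.log(p))

    print(samples_for_coverage(0.90, 0.95))  # about 29 samples for "90% of site, 95% confidence"
    print(samples_for_coverage(0.95, 0.95))  # about 59 samples for a 95% coverage target

Tabulating such sample sizes for each candidate statement is one concrete way
to present the pros and cons called for in item 3.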
The information developed in a study of this nature would
provide policy-makers the basic information needed to develop
recommendations on statistical approaches, provide guidance on the
application of the methods using hypothetical scenarios, and
indicate the level of confirmation data required for various
cleanup scenarios to be used in the remedy selection process.
U.S. EPA, Methods for Evaluating the Attainment of Cleanup
Standards, Volume 1, Soils and Solid Media, 1989.
-249-

-------
Session 7: Statistical Issues and Approaches for the
Characterization and Remediation: A Discussion
Royal Nadeau
U. S. Environmental Protection Agency
Environmental Response Branch
2890 Woodbridge Avenue
Edison, NJ 08837-3679
I feel a little bit like Jennifer. I don't really have a whole lot of earth
shaking comments to make. Just to give you a little perspective of where I sit
in the general scheme of things, I work with the Environmental Response Team
which is also within the Superfund program but our function is primarily to
provide technical support in the form of studies of one kind or another ranging
all the way from extent-of-contamination studies to conducting treatability studies for
determining the appropriate technology and making recommendations. Our
constituents are comprised of the OSC and RPM community within the regions.
There are about 90 OSCs throughout EPA and, give or take depending on which day you
take the census, about 360 in the RPM community. So you can see that we have a fairly large
constituency that we follow. Those are people we deal with directly. They are
the ones that turn us on. One thing I think is important to relay to you in
terms of constituents and that is that we all have constituents that we work for.
I think some of you in academia may feel you are distant from your constituents
but in reality you are not. As long as you have something to do with public
agencies, in particular, your ultimate constituency is the public itself. And
there is where the rub comes in so to speak or the thrust if you want to look at
it from a positive standpoint in terms of who do we really want to get to. Who
do we really want to satisfy? And, who are our customers? If we have to deal
with hamburgers or gourmet foods then we definitely have to think of the public.
So, it is along those lines that if there is one area that needs to be addressed
from wherever you are in this room, it is in the area of communication. And, for
the last two days, I think that is very evident. I think by and large we have
had some very good communication going on here not only in the presenters
themselves but also in the reactions from the audience. In that light, as I was
telling GP earlier, I feel that this has been one of the best workshops that I
have attended, because of the way things have been structured here; Pepi and GP
are to be commended for setting it up, and SRA as well for facilitating it.
In that light, I would like to tell you about an effort that the ERT is engaged
in. That is a workshop, actually a prototype, that we are planning on having the
last two days of March in Edison. Any of you who have an interest and feel as
though you could make it, we would like to have you participate. It is kind of
open-ended. It is in the area of data depiction and interpretation or
misinterpretation. Whatever. It is a workshop that originates out of some
feelings that I have developed internally. I don't have a lot of solid examples
although when I have queried some of my colleagues they have told me not to worry
saying they had lots of examples of places to talk about where data has been used
improperly or abused or whatever. So, that's the thing that we are going to be
-250-

-------
dwelling upon. Our target audience happens to be the OSCs and RPMs and also the
technical support groups that are out there within the Superfund programs in the
regions. A lot of the things that have been said here will feed into this and
I am very glad to have had the opportunity to be here so I can take that back to
our workshop. We are purposefully not emphasizing QAQC. The presumption will
be that the data will already be stamped useful or useable when we start to work
with it with the possible exception of the design of the sample, realizing that
when you screw up in that area it will affect your interpretive powers
throughout. However, in terms of the goodness of the data we are going to assume
it is ok to use and go on from there.
In that respect, I really don't have much else to say. I think I can answer your
question though about why perhaps statisticians have not been asked to come out
to a lot of sites. It simply is this. There is a tremendous concern and perhaps
from a lot of peoples' minds too much but nevertheless, with the promulgation of
the recent rules regarding workers safety and the necessity for getting trained
and having the number of hours for sufficient training and that may be the reason
why a lot of folks have not been sought after or asked to make a site visit. I
think Larry can vouch for that. That may be the reason. If you feel you have
something to contribute and can convince the appropriate officials in the region
that it would be a good idea that you made a site visit, there are ways and means
that you can get safety trained so you would be able to access a site and do your
thing.
-251-

-------
Session 7: Statistical Issues and Approaches for the
Characterization and Remediation: A Discussion
G. P. Patil
Center for Statistical Ecology and Environmental Statistics
Department of Statistics
Penn State University
University Park, PA 16802
Fortunately, we don't have to start from scratch. The issue of what constitutes
unreasonable degradation came up in the Ocean Dumping Act. In response to that,
Joel O'Connor, who was in the Ocean Assessment Division of NOAA, initiated a
program and project on how to identify a control site, clean site, or background site, and
background standards based on field considerations, so that the field data will
have the desired coverage of Type I error and Type II error. And, Joel and NOAA
approached us at Penn State to be their statistical arm and literally several
scores of data sets, potential data sets, were identified. Scores of fisheries
scientists and marine biologists who were still responsible for the data sets and
managers were identified. We sifted through those data sets and had lots of
dialogues and discussions in written form and drafts. And, we came up with some
guidelines for developing what have become known as indices for coastal and
estuarine degradation. Some approaches were spelled out as to how one would go
about identifying or trying to identify the control site in space or in time.
There was also the issue of what level to choose for alpha. We thought one
in ten was reasonable because perhaps a manager or decision maker might go
in for two terms consisting of ten years or eight years, and the Type II error was
chosen to be one in three so that at least the manager is not caught napping in
two successive years! Of course, these were the managerial and administrative
aspects of the problem. The biological, chemical, and environmental variability
implicit in these data sets also pointed in the same direction. Dick
Gilbert also made this point on the background standards and Larry is now raising
this as an important issue of the Superfund problem. Perhaps, Joel O'Connor
could be approached now. He is in EPA, Region II in New York doing ocean policy
coordination.
That exercise showed us that we cannot think of generating a crystal ball which
every decision maker wanted to have. We began to say, 'How about a crystal
cube?' And, everyone was very happy. So, it has come to be known as crystal
cube for coastal and estuarine degradation. So much so that, at the research fair
exhibition at Penn State (Penn State has 23 campuses), the Vice President for
Research and Dean of the Graduate School who is also a member of the Science
Advisory Board to EPA, chose this project for display and literally made a
crystal cube for those ten different indices with green, and yellow, and red, and
so on and so forth. It would be nice for everyone involved here to be familiar
with it in case background standard is something that one is looking for.
To sum up, communication also has turned out to be a major concern on the minds
of all of us. What is communication? It is more than one-way interaction. It
cannot occur in this case unless everyone is at least bilingual or tries to be
-252-

-------
bilingual. And everyone tries to play a dual role of being both a
professor and a student at the same time. It can be done. I am sure that we have
shown in these two days that it can be done and enjoyably so. If we can continue
with this spirit, also in an administrative manner, that would work. That is
where the concept of triad comes. Who are the three? The manager, the scientist
and engineer, and the statistician need to meet at one time and in one place for
one purpose rather than choose to meet in pairs leaving the third one out
whosoever that third one is. So, if we can, in the Superfund area, adopt this
triad approach, that would help. The key there is a data set, the substantive
issue, and the evidence; everyone involved is still responsible for that issue, for
that evidence, and for its use. I hope this works and helps to meet some of the declared
goals we had in the beginning of our workshop. The goals have been somewhat
broadened rather than narrowed. One goal was to develop and put out a
proceedings of statistical research issues and needs for substantive answers.
I am sure all of us have done our homework even while here - done our discussion
sheets - and surely we'll carry some discussion sheets home and also send them
back. At the same time, we will have in the mail another comment sheet,
evaluation sheet, of this workshop, a forward-looking evaluation of this
workshop, so that later on that response will be of some use.
-253-

-------
COMMENTS BY PARTICIPANTS
Mindi Snoparsky (U. S. Environmental Protection Agency): The regions could
use assistance regarding the use of statistics for the attainment of cleanup
standards for pump and treat remedies.
Other Concerns: 1) How can statistical issues be written into a ROD?
2) A statistical understanding of when "steady state" conditions may have
been met is needed (sometimes this may occur after the designated cleanup
standard in the ROD). A statistician should work closely with a pump and treat
expert (i.e., Joe Keeley) on this issue.
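One hedged way to probe that steady-state question, offered here as an
editorial sketch rather than Agency guidance, is to regress recent well
concentrations on time and ask whether the slope is distinguishable from zero.
The readings below are invented, and a flat slope is consistent with, but not
proof of, an asymptote.

    # Sketch of a simple trend check on recent pump-and-treat data.
    from scipy import stats

    months = [1, 2, 3, 4, 5, 6, 7, 8]
    conc_ppb = [42, 35, 31, 29, 28, 28, 28, 27]   # hypothetical well readings

    result = stats.linregress(months[-5:], conc_ppb[-5:])  # recent tail only
    print("slope = %.2f ppb/month, p = %.3f" % (result.slope, result.pvalue))
    # A near-zero, non-significant slope over the recent record suggests the
    # system may be leveling off; whether it levels off above or below the
    # ROD standard is exactly where the statistician and the pump-and-treat
    # expert should confer.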
Llewellyn Williams (U.S. Environmental Protection Agency): Recommend reaching
consensus and putting forward some of the critical issues for implementation by
Program offices. Need statistical sampling guidance and experimental design
guidance integrated into regulatory programs, even if it is not optimal.
Haggling over which tool to use should not result in the recommendation of no tool. We
need to more effectively market the statistical approaches and ideas upon which
we agree.
-254-

-------
PENN STATE CENTER FOR STATISTICAL ECOLOGY AND ENVIRONMENTAL STATISTICS
Technical Reports and Preprints
86-1001 STATISTICAL ISSUES IN COMBINING ECOLOGICAL AND ENVIRONMENTAL
STUDIES WITH EXAMPLES IN MARINE FISHERIES RESEARCH AND MANAGEMENT
by G. P. Patil, G. J. Babu, M. T. Boswell, K. Chatterjee,
E. Linder and C. Taillie
86-1101 ESTIMATION OF RELATIVE FISHING POWER OF DIFFERENT VESSELS
by G. J. Babu, M. Pennington and G. P. Patil
86-1102 FIELD BASED COASTAL AND ESTUARINE STATISTICAL INDICES OF
MARINE DEGRADATION
by M. T. Boswell and G. P. Patil
86-1103 TIME SERIES REGRESSION METHODS FOR THE EVALUATION OF THE
CAUSES OF FLUCTUATION IN FISHERY STOCK SIZES
by M. T. Boswell, E. Linder, J. K. Ord, G. P. Patil and
C. Taillie
86-1104 EFFECTS OF TOXIC POLLUTANTS ON AQUATIC RESOURCES USING
STATISTICAL MODELS AND TECHNIQUES TO EXTRAPOLATE ACUTE
AND CHRONIC EFFECTS BENCHMARKS
by Ernst Linder, G. P. Patil, Glenn W. Suter II and C. Taillie
86-1201 LOGNORMAL DISTRIBUTIONS AND THEIR APPLICATIONS IN ECOLOGY
by Brian Dennis and G. P. Patil
86-1202 POWER SERIES DISTRIBUTIONS AND THEIR CONJUGATES IN
STOCHASTIC MODELING AND BAYESIAN INFERENCE
by R. S. Abdul-Razak and G. P. Patil
86-1203 A GENERAL TECHNIQUE FOR GENERATING PSEUDO MULTIVARIATE
RANDOM VARIABLES: A MARKOV PROCESS APPROACH
by Subhash R. Lele and M. T. Boswell
86-1204 A STUDY OF THE RELATIONSHIP BETWEEN DIVERSITY INDICES OF
BENTHIC COMMUNITIES AND HEAVY METAL CONCENTRATIONS OF
NORTHWEST ATLANTIC SEDIMENTS
by N. C. Bolgiano, G. P. Patil, and R. S. Reid
86-1205	APPLICATION OF EVENT TREE RISK ANALYSIS TO FISHERIES MANAGEMENT
by Ernst Linder, G. P. Patil, and Douglas Vaughan
87-0401	WEIGHTED DISTRIBUTIONS
by G. P. Patil, C. R. Rao, and M. Zelen
87-0402 BIVARIATE WEIGHTED DISTRIBUTIONS AND RELATED APPLICATIONS
by G. P. Patil, C. R. Rao and M. V. Ratnaparkhi
87-0501 SELECTION OF ENDPOINTS FOR A CRYSTAL CUBE, AND DEVELOPMENT OF
INDICES FOR COASTAL AND ESTUARINE DEGRADATION FOR USE IN
DECISIONMAKING
by M. T. Boswell and G. P. Patil

-------
PENN STATE CENTER FOR STATISTICAL ECOLOGY AND ENVIRONMENTAL STATISTICS
Technical Reports and Preprints (continued)
87-0502 A PERSPECTIVE OF COMPOSITE SAMPLING
by M. T. Boswell and G. P. Patil
87-0503 DATA-BASED SAMPLING AND MODEL-BASED ESTIMATION FOR
ENVIRONMENTAL RESOURCES
by G. P. Patil, G. J. Babu, R. C. Hennemuth, W. L. Myers,
M. B. Rajarshi, and C. Taillie
87-0504 ROLE AND USE OF COMPOSITE SAMPLING AND
CAPTURE-RECAPTURE SAMPLING IN ECOLOGICAL STUDIES
by M. T. Boswell, K. P. Burnham, and G. P. Patil
87-0505	ON TRANSECT SAMPLING TO ASSESS WILDLIFE POPULATIONS AND
MARINE RESOURCES
by F. L. Ramsey, C. E. Gates, G. P. Patil, and C. Taillie
88-0301	STATISTICAL ANALYSIS OF RECRUITMENT DATA FOR EIGHTEEN
MARINE FISH STOCKS
by C. Taillie, G. P. Patil, and R. C. Hennemuth
88-0302 MODELING AND ANALYSIS OF RECRUITMENT DISTRIBUTIONS
by C. Taillie, G. P. Patil, and R. C. Hennemuth
88-0303 RECRUITMENT DISTRIBUTIONS AND INFERENCES ABOUT LONG-TERM
YIELDS
by G. P. Patil, C. Taillie, and R. C. Hennemuth
88-0304 KERNEL METHODS FOR SMOOTHING RECRUITMENT DATA: COMPARISON
OF CONSTANT AND VARIABLE BANDWIDTHS
by C. Taillie, G. P. Patil and R. C. Hennemuth
88-0305 A SIMULATION MODEL FOR FISH RECRUITMENT AND FISH STOCK SIZE:
AN IMPLEMENTATION FOR MICRO COMPUTERS
by M. T. Boswell and Juan Palmer
88-0306 STATISTICAL ANALYSIS OF CATCH-PER-TOW DISTRIBUTIONS
by M. T. Boswell and G. P. Patil
88-1201	STATISTICAL ECOLOGY, ENCOUNTERED DATA, AND META ANALYSIS:
A FEW PERSPECTIVES OF STATISTICAL ECOLOGY
by G. P. Patil
89-0101	PROBING ENCOUNTERED DATA, META ANALYSIS AND WEIGHTED
DISTRIBUTION METHODS
by G. P. Patil and C. Taillie

-------
PENN STATE CENTER FOR STATISTICAL ECOLOGY AND ENVIRONMENTAL STATISTICS
Technical Reports and Preprints (continued)
89-0501 PERFORMANCE OF THE LARGEST ORDER STATISTICS RELATIVE TO
THE SAMPLE MEAN FOR THE PURPOSE OF ESTIMATING A POPULATION MEAN
by G. P. Patil and C. Taillie
89-0601 EVALUATION OF THE KRIGING MODEL FOR ABUNDANCE ESTIMATION OF
MARINE ORGANISMS
by N. C. Bolgiano, M. T. Boswell, G. P. Patil, and C. Taillie
89-0602 ASSESSING SCALES OF SPATIAL VARIABILITY IN CHESAPEAKE
BAY TRAWL DATA
by N. C. Bolgiano, M. T. Boswell, G. P. Patil, and C. Taillie
89-1001 ANALYSIS OF WHITE PERCH ABUNDANCE TRENDS IN THE CHOPTANK
AND YORK RIVERS
by N. C. Bolgiano, M. T. Boswell, and G. P. Patil
90-0601 SADDLEPOINT APPROXIMATION FOR THE UMPU TEST IN THE
TWO-SAMPLE GAMMA PROBLEM
by C. Taillie and G. P. Patil
90-0901 EXPECTED NUMBER OF TESTS FOR VARIOUS RETESTING SCHEMES
WITH COMPOSITE SAMPLING IN THE PRESENCE/ABSENCE CASE
by M. T. Boswell and G. P. Patil
90-0902 SPATIAL STATISTICS, COMPOSITE SAMPLING, AND RELATED
ISSUES IN SITE CHARACTERIZATION WITH TWO EXAMPLES
by N. C. Bolgiano, G. P. Patil, and C. Taillie
90-0903 ASSESSING SCALES OF SPATIAL VARIABILITY IN CHESAPEAKE BAY
TRAWL DATA
by N. C. Bolgiano, G. P. Patil, and C. Taillie
90-0904 SMALL-SAMPLE BEHAVIOR OF RAO'S EFFICIENT SCORES TEST IN
THE TWO SAMPLE GAMMA PROBLEM
by C. Taillie, R. P. Waterman, and G. P. Patil
90-0905 METHODS FOR COMPUTER GENERATION OF UNIFORM AND
NON-UNIFORM RANDOM VARIABLES
by M. T. Boswell, S. Gore, and G. P. Patil

-------
PENN STATE CENTER FOR STATISTICAL ECOLOGY AND ENVIRONMENTAL STATISTICS
Technical Reports and Preprints (continued)
90-1001 COMPOSITE SAMPLE DESIGNS FOR CHARACTERIZING CONTINUOUS
SAMPLE MEASURES RELATIVE TO A CRITERION
by M. T. Boswell, N. C. Bolgiano, and G. P. Patil
90-1002 A STATISTICAL APPROACH TO THE EVALUATION OF THE
ATTAINMENT OF INTERIM CLEANUP STANDARDS
by G. P. Patil and C. Taillie
90-1003 PERFORMANCE OF THE LIKELIHOOD RATIO TEST IN THE
TWO-SAMPLE GAMMA PROBLEM AND ITS APPLICATION TO CLEANUP
EVALUATION AT HAZARDOUS WASTE SITES
by G. P. Patil and C. Taillie
90-1101 A BAYESIAN APPROACH TO COMPOSITE SAMPLING FOR HAZARDOUS
WASTE SITE CHARACTERIZATION
by M. T. Boswell, R. E. Macchiavelli, and G. P. Patil
90-1102 THE ART OF COMPUTER GENERATION OF RANDOM NUMBERS
by M. T. Boswell, S. Gore, and G. P. Patil
SOURCES OF BIAS IN HARVEST SURVEYS FOR MARINE FISHERIES
by C. F. Bonzek, W. L. Myers, B. W. Parolari and
G. P. Patil
ENCOUNTERED DATA ANALYSIS AND INTERPRETATION IN ECOLOGICAL
AND ENVIRONMENTAL WORK: OPENING REMARKS
by R. C. Hennemuth, G. P. Patil and S. P. Ross
CAN WE DESIGN ENCOUNTERS?
by R. C. Hennemuth, G. P. Patil and C. Taillie
RISK ANALYSIS IN THE GEORGES BANK HADDOCK FISHERY:
A PRAGMATIC EXAMPLE OF DEALING WITH UNCERTAINTY
by B. E. Brown and G. P. Patil
THE CRYSTAL CUBE FOR COASTAL AND ESTUARINE DEGRADATION
by H. L. Pugh, M. T. Boswell and G. P. Patil
SELECTION OF ENDPOINTS FOR A CRYSTAL CUBE, AND DEVELOPMENT OF
INDICES FOR COASTAL AND ESTUARINE DEGRADATION FOR USE IN
DECISIONMAKING
by M. T. Boswell and G. P. Patil

-------
The Pennsylvania State University
Department of Statistics
303 Pond Laboratory
University Park, PA 16802 USA
-------