United States
Environmental Protection
Agency
Environmental Monitoring Systems
Laboratory
Las Vegas NV 89114
Research and Development
EPA-600/S4-83-020 Aug. 1983
Project Summary
Preparation of Soil Sampling
Protocol: Techniques and
Strategies
Benjamin J. Mason
This report sets out a system for
developing soil sampling protocols. The
body of the report discusses the factors
that influence the selection of a par-
ticular sampling design and the use of
a particular sampling method. Statistical
designs are discussed along with
the appropriate analysis of the data.
Three appendices are included. One
consists of a discussion of the steps
that must be taken to arrive at the
desired protocol. The remaining appendices
present two examples of protocols:
one for a shallow spill situation
and the second for a deep contamina-
tion plume. A technique called kriging
is presented as an approach to the
analysis of data collected during a soil
sampling program. This technique
allows the researcher to develop maps
of the pollution levels and to assign a
statistical precision to the data at each
point on the map. The data presented
on these two maps can be evaluated to
identify areas where additional samples
are needed to reach a desired level of
precision for the area covered by the
maps.
This Project Summary was developed
by EPA's Environmental Monitoring
Systems Laboratory, Las Vegas, NV, to
announce key findings of the research
project that is fully documented in a
separate report of the same title (see
Project Report ordering information at
back).
Introduction
Considerable effort has been expended
in developing protocols for monitoring
airborne and waterborne pollutants. The
complexities of sampling the soil system
hinder the development of appropriate
field procedures.
This report provides a means for de-
veloping a protocol to satisfy the soil
sampling needs of the U.S. Environmental
Protection Agency (EPA) and other agencies
with similar needs. The methods outlined
are adequate tools for producing reliable
estimates of the spatial distribution of
soilborne pollutants. The investigator
should already be somewhat familiar with
the general properties of soils and the
behavior of pollutants in the soil environment.
A soil chemist and a statistician
should be available from the outset to
consult about the selection processes used
in developing the protocol. The investigator
must be able to modify the protocol to
meet unusual conditions that frequently
arise in the field and were not specifically
covered during the design or planning
phase of the program.
EPA scientists need to follow rigorous
chain-of-custody and quality assurance
procedures in conducting their investiga-
tions. A brief discussion of those require-
ments is included in this report as is a
treatment of the application of a relatively
new geostatistical technique (kriging) for
estimating the distribution of pollutants in
soils.
The Soil System
Because factors such as clay and organic
matter content, texture, permeability, pH,
and cation exchange capacity influence
the rate and route of migration, these
parameters and their variability must be
considered in the sampling design. One
statistical measure of variability is the
coefficient of variation (CV). Coefficients
of variation for soil parameters have been
reported ranging from as low as one to two
percent to as high as 850 percent with
the variation slightly higher for physical
than for chemical properties.
$$CV = \frac{s}{\bar{y}} \times 100$$
where CV = coefficient of variation in percent,
s = standard deviation of the sample, and
$\bar{y}$ = mean of the sample.
The inherent variation found in data
collected from any preliminary or previous
soil sampling is of particular value in
designing a sampling plan for the same
area.
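As a brief illustration of the CV formula above, the sketch below computes the coefficient of variation from a set of preliminary results. The function name and the concentration values are hypothetical and are not taken from the report.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return CV in percent: sample standard deviation / sample mean * 100."""
    if len(values) < 2:
        raise ValueError("at least two values are needed for a standard deviation")
    y_bar = mean(values)
    if y_bar == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / y_bar * 100.0


# Hypothetical pilot-study concentrations (ppm), for illustration only.
pilot_ppm = [12.0, 15.0, 9.0, 20.0, 14.0]
print(f"CV = {coefficient_of_variation(pilot_ppm):.1f} percent")
```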
Initiating the Soil Sampling
Study
The soil sampling protocol should begin
with a clear statement of the objectives of
the study followed by a series of steps or
decision elements leading to the proper
sampling design. The goals of the study
must be elucidated and must be agreed
upon by all parties involved prior to design-
ing and implementing the field sampling
program. The objective statement should
also include data reliability desired and the
resources committed to the study.
The scientist must establish the con-
fidence level desired and the allowable
margin of error to be met by the results.
Laboratory systems have been developed
to the point where considerable confidence
can be placed in the results. The sources
of uncontrolled variation are too great,
however, for a field study to meet the
precision found in a laboratory. The only
alternative is to select a level of confidence
that is acceptable and attainable within the
limits of the resources available for the
study.
Types of Soil Pollution
Situations
Major types of sampling situations that
the environmental scientist is likely to
encounter include large or localized areas
where pollutants are in the surface
layers, have moved down into the soil profile, or
are contained within well-defined plumes.
Factors such as length of time over
which the contamination has occurred,
physical and chemical properties of the
pollutant(s), type of soil, and past use of the
area must all be considered in determining
the appropriate study design.
Review of Background Data
All available documents dealing with the
study area, including pertinent newspaper
accounts, should be collected and eval-
uated. In general, the information of value
should deal with the historical use of the
area, current and old drainage patterns,
groundwater flow and use, and associated
environmental and health problems.
Geological characteristics are important
not only for determining routes of pollutant
migration but also for stratifying an area by
homogeneous soil types. Parent materials
and bedrock often play an important part
in determining how the pollutants will
react in the soil. Information on the nature
of the bedrock, the groundwater elevations,
the direction of groundwater flow and the
sources of recharge to the aquifer should
be acquired before finalizing the monitoring
plan.
Statistical Designs
Four basic statistical sampling designs
can be used in soil studies: simple
random, stratified random, systematic and
judgmental sampling.
Simple random sampling is the basis for
all probability sampling techniques used
in soil sampling and serves as a reference
point from which modifications to increase
the efficiency of sampling are evaluated.
Simple random sampling in itself may not
give the desired precision because of the
large statistical variations encountered in
soil sampling. Therefore, one of the other
designs may be more useful. Where little
information exists about the area to be
studied or the pollution distribution, the
simple random sampling design is the
only design other than the systematic grid
that can be used effectively.
The procedures for determining the
number of samples required to meet a
predetermined precision in a simple random
sampling design are the basis for allocating
samples to the strata in the stratified random
design. They can also be used to determine
the number of sample points needed in the
systematic sample design.
If an estimate of the variance can be
obtained from either a preliminary experi-
ment, a pilot study, or from the literature,
the number of samples required to obtain
a given precision with a specific confidence
level can be obtained from the following
equation:
$$n = \frac{t_{\alpha}^{2} s^{2}}{D^{2}}$$
where $D$ is the precision given in the specifications
of the study; $s^{2}$ is the sample
variance; and $t_{\alpha}$ is the two-tailed t-value
obtained from the standard statistical
tables at the $\alpha$ level of significance and
$(n - 1)$ degrees of freedom. D is usually
expressed as ± a specified number of
concentration units (e.g., ± 5.00 ppm).
The equation can also be written in terms
of the coefficient of variation (CV) as:
$$n = \frac{(CV)^{2} t_{\alpha}^{2}}{p^{2}}$$
where CV is the coefficient of variation; $p$
is the allowable margin of error expressed
as a percentage of the mean ($D/\bar{y} \times 100$); and
$\bar{y}$ is the mean of the samples.
The margin of error is needed in deter-
mining the number of samples required to
meet the precision specified. This is often
expressed as the percentage error that the
scientist is willing to accept or it may be a
difference that he hopes to detect via the
study. The margin of error chosen is
combined with the confidence level to
derive an estimate of the number of samples
required. The smaller the margin of error,
the larger the number of samples required.
As the t-value is dependent upon the
number of degrees of freedom, it is neces-
sary to use an iterative approach to
calculate the number of samples required.
Curves can be prepared to plot the number
of samples against the coefficient of varia-
tion, thus avoiding iterations.
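The iteration described above can be sketched as follows. This is a minimal illustration, not the procedure given in the full report: it searches for the smallest n that satisfies the sample-size equation with the t-value taken at (n - 1) degrees of freedom. To stay self-contained, the t-value is computed numerically (Simpson's rule plus bisection) rather than read from tables. All function names are hypothetical, and the numerical accuracy is adequate for illustration only.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=400):
    """CDF for x >= 0, using Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def two_tailed_t(alpha, df):
    """t-value leaving alpha/2 in each tail, found by bisection."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:   # widen the bracket until it contains t
        hi *= 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def required_samples(s, d, alpha=0.05, n_max=10000):
    """Smallest n >= 2 with n >= t^2 * s^2 / D^2, t taken at n - 1 df."""
    for n in range(2, n_max + 1):
        t = two_tailed_t(alpha, n - 1)
        if n >= (t * s / d) ** 2:
            return n
    raise ValueError("precision not attainable within n_max samples")


def required_samples_cv(cv_percent, p_percent, alpha=0.05):
    """Same calculation in relative terms: n = CV^2 * t^2 / p^2."""
    return required_samples(cv_percent, p_percent, alpha)


# Hypothetical inputs: s = 10 ppm, D = 5 ppm, 95 percent confidence.
print(required_samples(10.0, 5.0, alpha=0.05))
```

Because the t-value shrinks as n grows, the first n that meets the inequality is the answer, so the search cannot oscillate the way a naive substitute-and-repeat iteration sometimes does.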
The total cost of soil studies often follows
a linear form of equation which sums the
overhead, sampling and analytical costs.
That equation is used along with the
equations designed to provide the number
of samples required to ultimately arrive at
a number that will satisfy the budget and
yield the desired precision.
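The summary does not give the cost equation itself. A minimal sketch of one linear form consistent with the description (overhead plus per-sample collection and analysis costs) is shown below. The parameter names and dollar figures are assumptions made for illustration.

```python
def total_cost(n, overhead, collection_cost, analysis_cost):
    """Assumed linear cost model: fixed overhead plus a per-sample cost."""
    return overhead + n * (collection_cost + analysis_cost)


def affordable_samples(budget, overhead, collection_cost, analysis_cost):
    """Largest whole number of samples that fits within the budget."""
    if budget < overhead:
        return 0
    return int((budget - overhead) // (collection_cost + analysis_cost))


# Hypothetical figures: $5,000 overhead, $40 to collect and $160 to analyze
# each sample, with a total budget of $8,000.
print(total_cost(18, 5000, 40, 160))
print(affordable_samples(8000, 5000, 40, 160))
```

If the affordable number falls short of the number required for the stated precision, either the budget or the precision specification has to be revisited.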
Once the number of samples has been
determined, their locations can be planned.
A map of the study area is overlaid with an
appropriately scaled grid. The starting
point of the grid should be randomly
selected rather than located for convenience.
This can be accomplished by selecting
four random numbers from a random
number table. The first two numbers
locate a specific grid square on the overlay.
The second two identify a point within that
grid square. This point is fixed on the map
and the entire grid shifted so that the lower
right corner of the original grid square lies
on the point chosen. This technique
eliminates some of the questions often
raised about bias in grid sampling. A
second alternative is to select a pair of
coordinates at random, which becomes
the starting point for the grid.
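A randomly started grid of this kind can be sketched as follows. The example draws a random offset within one grid cell and builds the grid from that point. It uses a pseudorandom generator in place of a random number table, and the function name, rectangular study area, and dimensions are assumptions made for illustration.

```python
import math
import random


def systematic_grid(x_min, y_min, x_max, y_max, spacing, rng=None):
    """Return grid nodes inside the rectangle, starting from a random origin."""
    rng = rng or random.Random()
    # Random start: shift the grid origin by a random offset within one cell.
    x0 = x_min + rng.uniform(0, spacing)
    y0 = y_min + rng.uniform(0, spacing)
    # Number of nodes that fit between the random origin and the far edge.
    nx = math.floor((x_max - x0) / spacing) + 1
    ny = math.floor((y_max - y0) / spacing) + 1
    return [(x0 + i * spacing, y0 + j * spacing)
            for j in range(ny) for i in range(nx)]


# Hypothetical 100 m by 50 m study area sampled on a 10 m grid.
# A fixed seed is used only so the example is reproducible.
nodes = systematic_grid(0, 0, 100, 50, 10, random.Random(1))
print(len(nodes), nodes[0])
```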
Prior knowledge of the sampling area
and information obtained from the background
data can be combined with information
on pollutant behavior to reduce
the number of samples necessary to attain
a specified precision. The statistical technique
used to produce this savings is
called stratification. Basically, it operates
on the fact that environmental factors play
a major role in leaching and concentrating
pollutants in certain locations. Stratified
random sampling should lead to increased
precision if the strata are selected in such a
manner that the units within each stratum
are more homogeneous than the total population.
Each stratum is handled as a
separate simple random sampling effort.
At least two samples must be taken
from each stratum in order to obtain an
estimate of the sampling error. The num-
ber of sampling units is usually allocated
according to a proportion based on the
land area covered by each stratum, i.e., if
the area of soil in one stratum is 25
percent of the total study area, then 25
percent of the samples would be taken
from that stratum. Proportional allocation
is used in soil sampling work primarily
because the variance within a general area
tends to be constant over a number of soil
types. A pilot study would confirm if this
is, in fact, the case. If the variances are
materially different, the basis of allocation
must be changed or stratification may be
inappropriate.
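Proportional allocation with the two-sample minimum can be sketched as below. The stratum names and areas are hypothetical. Note that rounding and the minimum can make the allocated total differ slightly from the number requested.

```python
def proportional_allocation(areas, n_total, minimum=2):
    """Allocate samples to strata in proportion to area, at least `minimum` each."""
    total_area = sum(areas.values())
    return {name: max(minimum, round(n_total * area / total_area))
            for name, area in areas.items()}


# Hypothetical strata with areas in hectares.
strata = {"upland loam": 25, "floodplain silt": 50, "terrace sand": 25}
print(proportional_allocation(strata, 40))
```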
The systematic sampling plan is an
attempt to provide better coverage of the
soil study area than could be provided with
the simple random sample plan, and may
result in better precision and less bias. The
method is easy to apply and, therefore, has
been frequently used. Samples are col-
lected in a regular pattern (usually a grid or
line transects) over the area under investi-
gation. The starting point is located by
some random process; then all other
samples are collected at regular intervals
in one or more directions. The orientation
of the grid lines should also be randomly
selected except when a pollution plume is
the subject of the investigation. In that case, the
grid should be oriented so that the lines
are parallel to the general trace of the plume.
The spacing on the grid is particularly
important if regionalized variable theory is
used to design the study. This theory is
the basis of a sampling program termed
"kriging," which has the advantage of
developing estimates of concentrations
over geographic areas and also providing a
measure of the point-to-point confidence
limits. The theory is based upon the
spacing of data points along the grid lines.
The samples must be close enough to provide
a measure of the continuity of the
location-to-location variation within a soil
study area. If, however, a measure of the
mean and variance of the population is the
focus of the grid sampling array, the samples
must be placed outside of the "range" of
the variance of each point. This allows the
collection of samples that are not influenced
by regionalized variables. If a significant
percentage of the variation occurs within
the first few meters of a point, the range
beyond which kriging is not effective
probably lies at a distance of ten meters or
greater. This lies within the range within
which the mean and variance of the popula-
tion are the only parameters that can be
determined.
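The summary does not show how the "range" is estimated. In standard geostatistical practice it is read from an empirical semivariogram, which is half the mean squared difference between pairs of samples grouped by separation distance; the range is the distance at which the curve levels off. The sketch below computes that classical estimator. It is offered as general background rather than as the report's own procedure, and the transect data and function name are hypothetical.

```python
import math
from collections import defaultdict


def empirical_semivariogram(points, values, lag_width):
    """Return [(mean_distance, semivariance, pair_count), ...] per lag bin."""
    sq_sum = defaultdict(float)
    dist_sum = defaultdict(float)
    count = defaultdict(int)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            h = math.dist(points[i], points[j])
            k = int(h // lag_width)          # index of the lag bin
            sq_sum[k] += (values[i] - values[j]) ** 2
            dist_sum[k] += h
            count[k] += 1
    return [(dist_sum[k] / count[k], sq_sum[k] / (2 * count[k]), count[k])
            for k in sorted(count)]


# Hypothetical transect: four samples 10 m apart, concentrations in ppm.
locations = [(0, 0), (10, 0), (20, 0), (30, 0)]
ppm = [1.0, 2.0, 4.0, 7.0]
for lag, gamma, pairs in empirical_semivariogram(locations, ppm, 10):
    print(f"lag {lag:5.1f} m  semivariance {gamma:6.2f}  pairs {pairs}")
```

A real survey would need many more pairs per lag than this toy transect before the shape of the curve could be trusted.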
The systematic sampling plan is ideal
when a map is the final product. It
provides for a uniform coverage of the area
and allows the scientist to have points to
use in generating a plot of isoconcentra-
tions.
Two factors may limit the use of this
design. First, the estimation of the samp-
ling error is difficult to obtain from the
sample itself unless replicate sampling is
used at a number of sites. The variance
cannot be calculated unless a mean suc-
cessive difference test is used to evaluate
the data. Secondly, the presence of trends
and periodicity in the data creates problems
when the direction of the grid aligns with
the pattern in the data. Soil sampling
efforts seldom encounter a cyclic pattern
or periodicity to a degree that creates a
problem. Trends, however, are common in
soil pollution work and frequently are the
whole purpose for the sampling effort.
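The summary does not define the mean successive difference test. One common reading is von Neumann's mean square successive difference, in which half the mean squared difference between consecutive samples along a line estimates the variance with less inflation from a trend. The sketch below follows that reading; the function names and data are hypothetical.

```python
from statistics import variance


def successive_difference_variance(values):
    """Variance estimate from consecutive differences (half the mean square)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two ordered values are required")
    d2 = sum((b - a) ** 2 for a, b in zip(values, values[1:])) / (n - 1)
    return d2 / 2


def von_neumann_ratio(values):
    """Mean square successive difference divided by the sample variance.

    Values near 2 are expected for unordered data; values well below 2
    suggest a trend along the sampling line.
    """
    return 2 * successive_difference_variance(values) / variance(values)


# Hypothetical concentrations (ppm) at consecutive grid nodes along one line.
line_ppm = [1.0, 2.0, 4.0, 7.0]
print(successive_difference_variance(line_ppm), variance(line_ppm))
print(von_neumann_ratio(line_ppm))
```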
The final statistical sampling design
considered is termed judgmental samp-
ling. It is typically used in concert with one
of the other methods in order to cover
areas of unusual pollution levels or where
effects have been observed. The major
problem with this approach is that it is
subject to bias and, therefore, faulty con-
clusions. If judgmental sampling is used,
duplicate or triplicate samples should be
taken to increase the level of precision.
When historical data are not available for
use in planning a study, the use of a
phased approach is required. The first
phase of the study might incorporate a
simple random study design with a 68
percent confidence level, the results of
which would be used to design a more
definitive study with a 95 percent confi-
dence level. A stratified random design or
a systematic sampling grid approach could
be used to obtain data with a higher
confidence level. The grid design
allow the researcher to analyze the data
using kriging and thus find where addition-
al samples are needed to further refine the
sampling design so that the entire area is
covered at the 95 percent confidence
level.
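To give a rough sense of why the first phase can be much cheaper, the sketch below compares approximate sample counts at the 68 and 95 percent confidence levels. It substitutes the normal deviate for the t-value, a large-sample approximation that understates n when few samples are involved. The CV of 30 percent and margin of error of 10 percent are hypothetical.

```python
import math
from statistics import NormalDist


def approx_samples(cv_percent, p_percent, confidence):
    """Approximate n = (CV * z / p)^2 using the two-sided normal deviate z."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((cv_percent * z / p_percent) ** 2)


# Hypothetical planning values: CV = 30 percent, margin of error = 10 percent.
for level in (0.68, 0.95):
    print(f"{level:.0%} confidence: about {approx_samples(30, 10, level)} samples")
```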
Control sites are used quite often in
major soil studies, especially if the study is
designed to determine the extent and
presence of local pollution. Sites for
controls must be as representative as
possible of the study area and subject to
the same types of pollution sources except for
the specific pollutant under investigation.
In most cases it is desirable to spend as
much time searching out data on the
control as on the study area.
Sample Collection
There are two layers within the soil
column of primary importance. The surface
layer (0-15 cm) includes recently deposited
pollutants. Pollutants deposited as a result
of liquid spills or long-term seepage of
water soluble materials may be found at
depths ranging up to several meters.
Plumes emanating from hazardous waste
dumps or leaking storage tanks may be
found at considerable depths. The methods
of sampling these layers are only slightly
different, however. Samples can either be
collected with some form of core sampling
or auger device or through the use of
excavations or trenches. In the latter case,
samples are cut from the soil mass with
spades or short punches.
Several surface soil studies have made
use of a punch or thin walled steel tube
(15 to 20 cm long) to extract short cores
from the soil. The soil punch method is
fast and can be adapted to a number of
analytical schemes provided precautions
are taken to avoid cross-contamination
during shipping and in the laboratory.
Perhaps the most undesirable sample
collection device for the surface layer is
the shovel or scoop. This device is often
used in agriculture, but where samples are
being taken for chemical pollutants, the
inconsistencies are too great.
Percolation or precipitation will move
surface pollutants into the lower soil horizons,
where the use of devices that will
extract a longer core is required. Soil
probes collect a 30 to 45 cm column of
relatively undisturbed soil, while augers
collect a "disturbed" sample in approxi-
mately the same or slightly longer column.
The detail often desired in research studies
or in cases where pollutants are suspected
cannot be met effectively through the use
of augers. In these cases some form of
core sampling or trenching must be used.
For measuring the vertical distribution of a
pollutant, augers and probes are not
recommended as they tend to contaminate
the lower samples with materials from
above.
Trenching is used to carefully remove
sections of soil when a detailed examination
of pollutant migration patterns and detailed
soil structure is required. It should be
used only in those cases where detailed
information is desired because of the
relatively high costs associated with ex-
cavating. Typically a trench, approximately
1 meter wide, is dug to a depth of about 30
cm below the desired sampling depth. The
samples are taken from the sides of the pit
using a soil punch or a trowel. The
maximum effective depth for this method
is about 2 meters unless done in some
stepwise fashion.
Sampling for underground plumes is
perhaps the most difficult of all soil samp-
ling requirements. Often it is conducted
along with groundwater and hydrological
sampling. The equipment required usually
consists of large, vehicle-mounted power
augers and coring devices, although some
portable tripod-mounted coring units are
available.
Frequently, several soil samples collected
from a given location are composited to
minimize analytical costs. The key to any
statistical sampling plan is the use of the
variation within the sample set to test
hypotheses about the population and to
determine the precision or reliability of the
data. The composite sample provides an
excellent estimate of the mean but does
not give any information about the variation
within the sampling area. A compromise is
possible, however, by collecting and an-
alyzing duplicates or triplicates at a per-
centage of the locations. A similar re-
quirement will be imposed by the quality
control program.
A vital component of the protocol is an
adequate definition of the records required
during the study. Good records are essential
should litigation be a potential end point
for the study. Each result may be questioned
in an attempt to either discredit or verify
the data presented.
Data Analysis
Finally, the data analyst must keep in
mind the purpose(s) for which the samples
were collected. These purposes can usually
be characterized as: An estimate of the
mean level of pollutant in a geographic
area; a determination as to whether the
pollution measured is above some standard
or is higher than the ambient levels found
in the control areas; or to quantitatively
document the areal extent and depth of
the pollution and the confidence with
which it is known.
Protocol Development
A scheme for preparing a soil sampling
protocol via a decision tree process is
included as an appendix to the report. It is
not possible to address in a single protocol
all of the variables that may be encountered
in a field program. The procedure outlined
permits, therefore, the development of a
protocol specifically tailored to the re-
quirements of a study. An example protocol
for a described scenario is presented.
Benjamin J. Mason is with ETHURA, McLean, VA 22101.
Robert D. Schonbrod is the EPA Project Officer (see below).
The complete report, entitled "Preparation of Soil Sampling Protocol: Techniques
and Strategies," (Order No. PB 83-206 979; Cost: $13.00, subject to change)
will be available only from:
National Technical Information Service
5285 Port Royal Road
Springfield, VA 22161
Telephone: 703-487-4650
The EPA Project Officer can be contacted at:
Environmental Monitoring Systems Laboratory
U.S. Environmental Protection Agency
P.O. Box 15027
Las Vegas, NV 89114
U.S. GOVERNMENT PRINTING OFFICE: 1983-659-017/7157
United States
Environmental Protection
Agency
Center for Environmental Research
Information
Cincinnati OH 45268