United States
 Environmental Protection
 Agency
Environmental Monitoring Systems
Laboratory
Las Vegas NV 89114
Research and Development
EPA-600/S4-83-020  Aug. 1983
Project  Summary
Preparation  of  Soil  Sampling
Protocol:   Techniques  and
Strategies
Benjamin J. Mason
  This report sets out a system for
developing soil sampling protocols. The
body of the report discusses the factors
that influence the selection of a par-
ticular sampling design and the use of
a particular sampling method. Statistical
designs are discussed along with
the appropriate analysis of the data.
Three appendices are included. One
consists of a discussion of the steps
that must be taken to arrive at the
desired protocol. The remaining appendices
present two examples of protocols:
one for a shallow spill situation
and the second for a deep contamina-
tion plume. A technique called kriging
is presented  as an  approach to the
analysis of data collected during a soil
sampling  program.  This technique
allows the researcher to develop maps
of the pollution levels and to assign a
statistical precision to the data at each
point on the map.  The data presented
on these two maps can be evaluated to
identify areas where additional samples
are needed to reach a desired level of
precision for the area covered by the
maps.
  This Project Summary was developed
by EPA's Environmental Monitoring
Systems Laboratory, Las Vegas, NV, to
announce key findings of the research
project that is fully documented in a
separate report of the same title (see
Project Report ordering information at
back).

Introduction
  Considerable effort has been expended
in  developing  protocols  for monitoring
airborne and waterborne pollutants. The
complexities of sampling the soil system
hinder the development of appropriate
field procedures.
  This report provides a means for de-
veloping a protocol to satisfy the soil
sampling needs of the U.S. Environmental
Protection Agency (EPA) and other agencies
with similar needs. The methods outlined
are adequate tools for producing reliable
estimates of the spatial distribution of
soilborne pollutants.  The investigator
should already be somewhat familiar with
the general properties of soils and the
behavior of pollutants in the soil environment.
A soil chemist and a statistician
should be available from the outset to
consult about the selection processes used
in developing the protocol. The investigator
must be able to modify the protocol to
meet unusual conditions that frequently
arise in the field and were not specifically
covered during the design  or planning
phase of the program.
  EPA scientists need to follow rigorous
chain-of-custody and quality assurance
procedures in conducting their investiga-
tions. A brief discussion of those require-
ments is included in this report as is a
treatment of the application of a relatively
new geostatistical technique (kriging) for
estimating the distribution of pollutants in
soils.

The Soil System
  Because factors such as clay and organic
matter content, texture, permeability, pH,
and cation exchange capacity influence
the rate and route  of migration,  these
parameters and their variability must be
considered in the sampling design. One
statistical  measure of variability  is the
coefficient of variation (CV).  Coefficients
of variation for soil parameters have been
reported ranging from as low as one to two
percent to as high as 850 percent with
the variation slightly higher for physical
than for chemical properties.

$$CV = \frac{s}{\bar{y}} \times 100$$

where CV = coefficient of variation in percent,
s = standard deviation of the sample, and
$\bar{y}$ = mean of the sample.
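
The CV calculation can be illustrated with a short sketch. The concentration values below are hypothetical and are not taken from the report; the function name is also an illustrative choice.

```python
import statistics


def coefficient_of_variation(values):
    """Return the CV in percent: (sample standard deviation / mean) * 100."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the sample mean is zero")
    # statistics.stdev uses the (n - 1) denominator, i.e. the sample value s
    return statistics.stdev(values) / mean * 100.0


# Hypothetical pollutant concentrations (ppm) from a preliminary survey
pilot_ppm = [12.0, 15.0, 9.0, 20.0, 14.0]
print(round(coefficient_of_variation(pilot_ppm), 1))
```

A CV computed this way from pilot data is the quantity that later feeds the sample-size calculation.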

  The  inherent variation found  in data
collected from any preliminary or previous
soil  sampling  is  of particular value in
designing a sampling plan for the same
area.

Initiating the Soil Sampling
Study
  The soil sampling protocol should begin
with a clear statement of the objectives of
the study followed by a series of steps or
decision elements leading to  the proper
sampling design. The goals of the study
must be elucidated and must be agreed
upon by all parties involved prior to design-
ing and implementing the field sampling
program. The objective statement should
also include data reliability desired and the
resources committed to the study.
  The  scientist must establish the con-
fidence level desired and the allowable
margin of error to  be met by the results.
Laboratory systems have been developed
to the point where considerable confidence
can be placed in the results. The sources
of uncontrolled variation are too great,
however, for a field study to meet the
precision found in a laboratory. The only
alternative is to select a level of confidence
that is acceptable and attainable within the
limits of the resources available for the
study.


Types of Soil  Pollution
Situations
  Major types of sampling situations that
the environmental scientist is likely to
encounter include large or localized areas
where pollutants are either in the surface
layers, moved down into the soil profile, or
contained within well-defined  plumes.
  Factors  such as  length of time over
which  the contamination has occurred,
physical and chemical  properties of the
pollutant(s), type of soil, and past use of the
area must all be considered in determining
the appropriate study design.

Review  of Background Data
  All available documents dealing with the
study area, including pertinent newspaper
accounts,  should  be collected and eval-
uated. In general, the information of value
should deal with the historical use of the
area, current and old drainage patterns,
groundwater flow and use, and associated
environmental and health problems.
  Geological characteristics are important
not only for determining routes of pollutant
migration but also for stratifying an area by
homogeneous soil types.  Parent materials
and bedrock often play an  important part
in determining  how  the pollutants will
react in the soil.  Information on the nature
of the bedrock, the groundwater elevations,
the direction of groundwater flow and the
sources of recharge to the aquifer should
be acquired before finalizing the monitoring
plan.

Statistical Designs
  Four basic statistical sampling designs
can be used in soil studies: simple
random, stratified random, systematic and
judgmental sampling.
  Simple random sampling is the basis for
all probability sampling techniques used
in soil sampling and serves as a reference
point from which modifications to increase
the efficiency of sampling are evaluated.
Simple random  sampling in itself may not
give the desired precision because of the
large statistical  variations encountered in
soil sampling. Therefore, one of the other
designs may be more useful. Where little
information exists about the area to be
studied or the pollution distribution, the
simple random sampling  design  is the
only design other than the systematic grid
that can be used effectively.
  The  procedures for determining the
number of samples required  to meet a
predetermined precision in a simple random
sampling design are the basis for allocating
samples to strata in the stratified random
design. They can also be used to determine
the number of sample points needed in the
systematic sample design.
  If  an estimate of the variance can be
obtained from either a preliminary experi-
ment, a pilot study, or from the literature,
the number of samples required to obtain
a given precision with a specific confidence
level can be obtained from the following
equation:
$$n = \frac{t_\alpha^2 s^2}{D^2}$$

where D is the precision given in the specifications
of the study, $s^2$ is the sample
variance, and $t_\alpha$ is the two-tailed t-value
obtained from the standard statistical
tables at the $\alpha$ level of significance and (n -
1) degrees of freedom. D is usually
expressed as ± a specified number of
concentration units (e.g., ± 5.00 ppm).
The equation can also be written in terms
of the coefficient of variation (CV) as:

$$n = \frac{(CV)^2 t_\alpha^2}{p^2}$$

where CV is the coefficient of variation, p
is the allowable margin of error expressed
as a percentage ($D/\bar{y}$), and $\bar{y}$ is the mean of
the samples.
  The margin of error is needed in deter-
mining the number of samples required to
meet the precision specified. This is often
expressed as the percentage error that the
scientist is willing to accept or it may be a
difference that he hopes to detect via the
study.   The margin of error chosen  is
combined with the confidence  level to
derive an estimate of the number of samples
required. The smaller the margin of error,
the larger the number of samples required.
  As the t-value is  dependent upon the
number of degrees of freedom, it is neces-
sary  to  use an  iterative  approach to
calculate the number of samples required.
Curves can be prepared to plot the number
of samples against the coefficient of varia-
tion,  thus avoiding iterations.
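
The iteration can be sketched as a search for the smallest n that satisfies the CV form of the equation, with t looked up at (n - 1) degrees of freedom. The sketch below assumes a 95 percent confidence level and uses two-tailed t-values from standard tables; for degrees of freedom that are not tabulated it falls back to the next lower tabulated entry, which is a conservative simplification and not a procedure taken from the report. Function names are illustrative.

```python
# Two-tailed t-values at the 0.05 significance level (95 percent confidence)
# for 1 to 30 degrees of freedom, as listed in standard statistical tables.
_T_1_TO_30 = [
    12.706, 4.303, 3.182, 2.776, 2.571, 2.447, 2.365, 2.306, 2.262, 2.228,
    2.201, 2.179, 2.160, 2.145, 2.131, 2.120, 2.110, 2.101, 2.093, 2.086,
    2.080, 2.074, 2.069, 2.064, 2.060, 2.056, 2.052, 2.048, 2.045, 2.042,
]
T_95 = dict(enumerate(_T_1_TO_30, start=1))
T_95.update({40: 2.021, 60: 2.000, 120: 1.980})


def t_value(df):
    """Return t for the largest tabulated df not exceeding `df`.

    Assumption: rounding df down slightly overstates t, so the resulting
    sample size errs on the high side.
    """
    if df < 1:
        raise ValueError("at least one degree of freedom is required")
    return T_95[max(k for k in T_95 if k <= df)]


def samples_required(cv_percent, margin_percent, max_n=100000):
    """Smallest n with n >= (CV^2 * t^2) / p^2, t taken at n - 1 df."""
    ratio = (cv_percent / margin_percent) ** 2
    for n in range(2, max_n + 1):
        t = t_value(n - 1)
        if n >= t * t * ratio:
            return n
    raise ValueError("no sample size up to max_n meets the requested precision")


# Hypothetical case: CV of 30 percent, allowable margin of error 10 percent
print(samples_required(30, 10))
```

Tabulating this function over a range of CV values gives the same information as the curves mentioned above.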
  The total cost of soil studies often follows
a linear form of equation which sums the
overhead, sampling and analytical costs.
That equation is used along  with the
equations designed to provide the number
of samples  required to ultimately arrive at
a number that will satisfy the budget and
yield the desired precision.
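
A minimal sketch of such a cost check follows. The summary does not give the cost equation itself, so the form used here (fixed overhead plus a per-sample collection cost and a per-sample analytical cost) is an assumption, as are the dollar figures and function names.

```python
def total_cost(n, overhead, collection_cost, analysis_cost):
    """Assumed linear cost model: overhead + n * (collection + analysis)."""
    return overhead + n * (collection_cost + analysis_cost)


def affordable_samples(budget, overhead, collection_cost, analysis_cost):
    """Largest number of samples whose total cost stays within the budget."""
    remaining = budget - overhead
    if remaining < 0:
        return 0
    return int(remaining // (collection_cost + analysis_cost))


# Hypothetical figures: compare the statistically required n with the budget
n_required = 38
n_affordable = affordable_samples(10000, 2500, 50, 100)
if n_required <= n_affordable:
    print("precision target fits the budget:", total_cost(n_required, 2500, 50, 100))
else:
    print("relax the precision or raise the budget; affordable n =", n_affordable)
```

When the required number exceeds the affordable number, either the margin of error or the confidence level has to be relaxed and the sample-size calculation repeated.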
  Once the number of samples has been
determined, their locations can be planned.
A map of the study area is overlaid with an
appropriately scaled grid. The starting
point of the grid  should be  randomly
selected rather than located for convenience.
This  can be accomplished by selecting
four  random numbers from a  random
number  table.   The first  two numbers
locate a specific grid square on the overlay.
The second two identify a point within that
grid square. This point is fixed on the map
and the entire grid shifted so that the lower
right corner of the original grid square lies
on the point chosen. This technique
eliminates some of the questions often
raised about bias in grid sampling. A
second alternative is to select a pair of
coordinates at random, which becomes
the starting point for the grid.
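
The second alternative can be sketched as follows, assuming a rectangular study area with its origin at one corner. The random number generator stands in for the random number table, and the dimensions, spacing, and function name are hypothetical.

```python
import random


def systematic_grid(width, height, spacing, rng=None):
    """Return grid points with a randomly chosen starting point.

    The start is drawn uniformly within the first grid cell, so every
    location in the area has the same chance of falling on a grid node.
    """
    rng = rng or random.Random()
    x0 = rng.uniform(0, spacing)
    y0 = rng.uniform(0, spacing)
    points = []
    j = 0
    while y0 + j * spacing <= height:
        i = 0
        while x0 + i * spacing <= width:
            points.append((x0 + i * spacing, y0 + j * spacing))
            i += 1
        j += 1
    return points


# Hypothetical 100 m by 50 m site sampled on a 10 m grid
for x, y in systematic_grid(100, 50, 10, random.Random(1))[:3]:
    print(round(x, 1), round(y, 1))
```

Passing a seeded generator makes the layout reproducible, which is useful when the sampling plan has to be documented.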
  Prior knowledge of the sampling area
and information obtained from the background
data can be combined with information
on pollutant behavior to reduce
the number of samples necessary to attain
a specified precision. The statistical technique
used to produce this savings is
called stratification. Basically, it operates
on the fact that environmental factors play
a major role in leaching and concentrating
pollutants in certain locations. Stratified
random sampling should lead to increased
precision if the strata are selected in such a
manner that the units within each stratum
are more homogeneous than the total population.
Each stratum is handled as a
separate simple random sampling effort.
  At least  two  samples must be taken
from each stratum in order to obtain an
estimate of the sampling error. The num-
ber of sampling units is  usually allocated
according to  a  proportion based on the
land area covered by each stratum, i.e.,  if
the area of soil  in  one stratum is  25
percent of the total study area, then 25
percent of the samples would be taken
from that stratum.  Proportional allocation
is used in soil  sampling work primarily
because the variance within a general area
tends to be constant over a number of soil
types.  A pilot study would confirm if this
is, in fact, the case.  If the variances are
materially different, the basis of allocation
must be changed or stratification may be
inappropriate.
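
Proportional allocation with the two-sample minimum might be sketched as below. The stratum names and areas are hypothetical, and the use of a largest-remainder rule to distribute leftover samples is an illustrative choice, not something specified in the report.

```python
def proportional_allocation(stratum_areas, total_samples, minimum=2):
    """Allocate samples to strata in proportion to area.

    Leftover samples from rounding go to the strata with the largest
    fractional shares. Each stratum then receives at least `minimum`
    samples, so the final total can exceed `total_samples`.
    """
    total_area = sum(stratum_areas.values())
    raw = {name: total_samples * area / total_area
           for name, area in stratum_areas.items()}
    alloc = {name: int(share) for name, share in raw.items()}
    leftover = total_samples - sum(alloc.values())
    by_remainder = sorted(raw, key=lambda name: raw[name] - alloc[name],
                          reverse=True)
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return {name: max(count, minimum) for name, count in alloc.items()}


# Hypothetical strata: areas in hectares, 20 samples to distribute
print(proportional_allocation({"clay": 25, "loam": 50, "sand": 25}, 20))
```

A very small stratum is raised to two samples so that a sampling error can still be estimated for it.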
  The  systematic  sampling  plan is an
attempt to provide better coverage of the
soil study area than could be provided with
the simple random sample plan, and may
result in better precision and less bias. The
method is easy to apply and, therefore, has
been frequently  used.  Samples  are col-
lected in a regular pattern (usually a grid or
line transects) over the area under investi-
gation.  The starting point is located by
some  random  process; then all other
samples are collected at regular intervals
in one or more directions. The orientation
of the grid lines should also be randomly
selected except when a pollution plume is
the subject of the investigation.  The orienta-
tion of the grid should be such that the lines
are parallel to the general trace of the plume.
  The  spacing on the grid is particularly
important if regionalized variable theory is
used to design the study.  This theory is
the basis of a sampling  program termed
"kriging,"  which  has the advantage of
developing estimates of concentrations
over geographic areas and also providing a
measure of the  point-to-point confidence
limits.   The theory  is  based upon the
spacing of data points along the grid lines.
The samples must be close enough to provide
a measure of the continuity of the
location-to-location variation within a soil
study area. If, however,  a measure of the
mean and variance of the population is the
focus of the grid sampling array, the samples
must be placed  outside of the "range" of
the variance of each point. This allows the
collection of samples that are not influenced
by regionalized variables.  If a significant
percentage of the variation occurs within
the first few meters of a point, the range
beyond which  kriging  is  not  effective
probably lies at a distance of ten meters or
greater. This  lies within the range within
which the mean and variance of the popula-
tion are the only parameters that can be
determined.
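
In geostatistical practice the range is usually judged from an empirical semivariogram, which is not described in this summary. A minimal sketch for an evenly spaced transect follows, assuming the lag is counted in grid steps; the function name and the concentration values are hypothetical.

```python
def semivariogram(values, max_lag):
    """Empirical semivariance for lags 1..max_lag on an evenly spaced line.

    gamma(h) = sum of squared differences between points h steps apart,
    divided by twice the number of such pairs.
    """
    result = {}
    for lag in range(1, max_lag + 1):
        pairs = [(values[i], values[i + lag])
                 for i in range(len(values) - lag)]
        if not pairs:
            break
        result[lag] = sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))
    return result


# Hypothetical concentrations (ppm) at equal spacing along one grid line
transect = [4.0, 5.0, 7.0, 6.0, 9.0, 12.0, 11.0, 10.0]
for lag, gamma in semivariogram(transect, 3).items():
    print(lag, round(gamma, 2))
```

The lag at which the semivariance stops increasing gives a rough indication of the range; samples spaced farther apart than that can be treated as independent when estimating the mean and variance.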
  The  systematic sampling plan is ideal
when a map is the final product. It
provides for a uniform coverage of the area
and allows the scientist to have points to
use in generating a plot of isoconcentra-
tions.
  Two factors may limit the use of this
design. First, the estimation of the samp-
ling error is  difficult to obtain from the
sample itself  unless replicate sampling is
used at a  number of sites.  The variance
cannot be calculated unless a mean suc-
cessive difference test is used to evaluate
the data. Secondly, the presence of trends
and periodicity in the data creates problems
when the direction of the grid aligns with
the pattern in the  data.   Soil sampling
efforts seldom encounter a cyclic pattern
or periodicity to a degree that creates a
problem. Trends, however, are common in
soil pollution  work and frequently are the
whole purpose for the sampling effort.
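
One common form of a successive difference calculation is sketched below. The summary does not give the procedure, so this should be read as an illustration under that assumption: the mean square successive difference is compared with the ordinary sample variance, and a ratio well below 2 suggests a trend along the sampling line. The data are hypothetical.

```python
import statistics


def mean_square_successive_difference(values):
    """Sum of squared differences between neighbors, divided by (n - 1)."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    return sum(d * d for d in diffs) / (len(values) - 1)


def successive_difference_ratio(values):
    """Ratio of the mean square successive difference to the sample variance.

    For values with no trend the ratio is expected to be near 2; a strong
    trend pulls it toward 0.
    """
    return mean_square_successive_difference(values) / statistics.variance(values)


# Hypothetical concentrations along a grid line that crosses a plume edge
line = [2.0, 3.0, 5.0, 8.0, 12.0, 17.0]
print(round(successive_difference_ratio(line), 2))
```

Half of the mean square successive difference can serve as a variance estimate that is less inflated by a smooth trend than the ordinary sample variance.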
  The  final statistical sampling design
considered is termed judgmental samp-
ling. It is typically used in concert with one
of the other  methods in order to cover
areas of unusual pollution levels or where
effects have  been observed.  The  major
problem with this approach is  that it is
subject to bias and,  therefore, faulty con-
clusions. If judgmental sampling is used,
duplicate or triplicate samples should be
taken to increase the level of precision.
  When historical data are not available for
use in planning a  study, the use of a
phased approach is required.   The first
phase  of the study might incorporate a
simple random study design with a 68
percent confidence level,  the results  of
which  would be used to design a more
definitive study with a 95 percent confi-
dence level. A stratified random design or
a systematic sampling grid approach could
be  used to  obtain data  with  a higher
confidence limit. The grid design would
allow the  researcher to analyze the data
using kriging and thus find where addition-
al samples are needed to further refine the
sampling design so that the entire area is
covered at the 95  percent  confidence
level.
  Control  sites are  used  quite often  in
major soil studies, especially if the study is
designed  to  determine the extent and
presence  of  local  pollution.   Sites  for
controls  must  be as  representative as
possible of the study area and subject to
the same types of pollution sources except for
the specific pollutant under investigation.
In most cases it is desirable to spend as
much  time searching out data on the
control as on the study area.

Sample Collection
  There  are  two layers  within  the soil
column of primary importance. The surface
layer (0-15 cm) includes recently deposited
pollutants.  Pollutants deposited as a result
of liquid  spills  or long-term  seepage of
water soluble materials may be found at
depths ranging  up  to several  meters.
Plumes emanating from hazardous waste
dumps or leaking storage tanks may be
found at considerable depths. The methods
of sampling these layers are only slightly
different, however. Samples can either be
collected with some form of core sampling
or auger device  or  through  the use of
excavations or trenches. In the latter case,
samples are cut from the soil mass with
spades or short punches.
  Several surface soil studies have made
use of a punch or thin-walled steel tube
(15 to 20 cm long) to extract short cores
from the soil.  The soil punch method is
fast and can  be adapted to a number of
analytical schemes provided precautions
are taken to avoid  cross-contamination
during shipping and  in the laboratory.
  Perhaps  the  most undesirable sample
collection device  for the surface layer is
the shovel or scoop.  This device is often
used in agriculture, but where samples are
being taken for chemical pollutants, the
inconsistencies are too great.
  Percolation or  precipitation will  move
surface pollutants into the lower soil horizons,
where the use of devices that will
extract a longer core is required.  Soil
probes collect a 30 to 45 cm column of
relatively undisturbed soil,  while augers
collect a  "disturbed" sample in approxi-
mately the same or slightly longer column.
The detail often desired in research studies
or in cases where pollutants are suspected
cannot be met effectively through the use
of augers.  In these  cases some form of
core sampling or trenching must be used.
For measuring the vertical distribution of a
pollutant, augers and probes are not
recommended as they tend to contaminate
the lower samples  with  materials  from
above.
  Trenching is  used to carefully remove
sections of soil when a detailed examination
of pollutant migration patterns and detailed
soil structure is required.  It should be
used only in those cases where detailed
information is desired because of the
  relatively high costs associated with ex-
  cavating. Typically a trench, approximately
  1 meter wide, is dug to a depth of about 30
  cm below the desired sampling depth. The
  samples are taken from the sides of the pit
  using a soil punch or  a trowel.   The
  maximum effective depth for this method
  is about 2 meters unless done in some
  stepwise fashion.
    Sampling for  underground plumes is
  perhaps the most difficult of all soil samp-
  ling requirements.  Often it is conducted
  along with groundwater and hydrological
  sampling. The equipment required usually
  consists of large, vehicle-mounted power
  augers and coring devices, although some
  portable tripod-mounted coring units are
  available.
    Frequently, several soil samples collected
  from a given location are composited to
  minimize analytical costs. The key to any
  statistical sampling plan  is the use of the
  variation within  the sample set to test
  hypotheses about the population and to
  determine the precision or reliability of the
  data. The composite sample provides an
  excellent estimate of the mean but does
  not give any information about the variation
  within the sampling area. A compromise is
  possible, however, by collecting and an-
  alyzing  duplicates or triplicates at a per-
  centage of the locations.  A similar re-
  quirement will be imposed by the quality
  control  program.
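
One way to use such replicates is to pool the within-location variance, as sketched below. This is an illustration rather than a procedure given in the report; it measures short-range and analytical variation at the replicated locations only, and the values and function name are hypothetical.

```python
def pooled_replicate_variance(replicate_sets):
    """Pool the variance of replicate results across locations.

    Each inner list holds the duplicate or triplicate results from one
    location. Locations with a single result carry no information about
    variation and are skipped.
    """
    sum_squares = 0.0
    degrees_of_freedom = 0
    for reps in replicate_sets:
        if len(reps) < 2:
            continue
        mean = sum(reps) / len(reps)
        sum_squares += sum((x - mean) ** 2 for x in reps)
        degrees_of_freedom += len(reps) - 1
    if degrees_of_freedom == 0:
        raise ValueError("no location has two or more replicates")
    return sum_squares / degrees_of_freedom


# Hypothetical results (ppm): two duplicate pairs and one triplicate
print(pooled_replicate_variance([[10, 12], [20, 20], [5, 7, 9]]))
```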
    A vital component of the protocol is an
  adequate definition of the records required
  during the study. Good records are essential
  should litigation  be a potential end point
  for the study. Each result may be questioned
  in an attempt to either discredit or verify
  the data presented.

Data Analysis
  Finally, the data analyst must keep in
mind the purpose(s) for which the samples
were collected. These purposes can usually
be characterized as: an estimate of the
mean level of pollutant in a geographic
area; a determination as to whether the
pollution measured is above some standard
or is higher than the ambient levels found
in the control areas; or quantitative
documentation of the areal extent and depth of
the pollution and the confidence with
which it is known.

Protocol Development
  A scheme for preparing a soil sampling
protocol via a decision tree process is
included as an appendix to the report. It is
not possible to address in a single protocol
all of the variables that may be encountered
in a field program. The procedure outlined
therefore permits the development of a
protocol specifically tailored to the requirements
of a study. An example protocol
for a described scenario is presented.
          Benjamin J. Mason is with ETHURA, McLean, VA 22101.
          Robert D. Schonbrod is the EPA Project Officer (see below).
          The complete report, entitled "Preparation of Soil Sampling Protocol: Techniques
            and Strategies," (Order No. PB 83-206 979; Cost: $13.00, subject to change)
          will  be available only from:
                  National Technical Information Service
                  5285 Port Royal Road
                  Springfield, VA 22161
                  Telephone: 703-487-4650
          The EPA Project Officer can be contacted at:
                  Environmental Monitoring Systems Laboratory
                  U.S. Environmental Protection Agency
                  P.O. Box 15027
                  Las  Vegas, NV 89114