National Soil Monitoring Program


    RESEARCH     TRIANGLE    INSTITUTE
                                       1864/14/03 - Oil
                               National Soil Monitoring Program
                                              by
                                         Roy Whitmore
                                       Martin Rosenzweig
                                          John Hines
                                  Research Triangle Institute
                                    Research Triangle Park,
                                     North Carolina  27709
                                    Contract No.  68-01-5848
                                 Task Manager:   William Smith
                                  Project Officer:   Ann Carey
                                 Design and Development  Branch
                                 Exposure  Evaluation  Division
                           Office of Pesticides  and Toxic  Substances
                                Environmental  Protection Agency
                                    Washington,  D.C.   20460
                                                                       March 1981
                                                                       Draft of Final
RESEARCH  TRIANGLE  PARK,  NORTH   CAROLINA  27709

-------
                     Disclaimer

This document is a preliminary draft.   It has not been
released formally by the Office of Testing and Evaluation,
Office of Pesticides and Toxic Substances, U.S. Environmental
Protection Agency, and should not at this stage be construed
to represent Agency policy.   It is being circulated for
comments on its technical merit and policy implications.

-------
                        TABLE OF CONTENTS

                                                             Page

      NATIONAL SOIL MONITORING PROGRAM	      1

 1.1  General Description of the Program	      1
 1.2  The Rural Soils Network Survey Design 	      1

      1.2.1  General Considerations 	      1
      1.2.2  The Probability Sample Design	      1
      1.2.3  Limitations as a Monitoring Network	     18
      1.2.4  Uses in Regulatory Action	     19
      1.2.5  User Needs and Historical Uses of the Data .  .     19

 1.3  Alternate Survey Designs for the RSN	     20

      1.3.1  Design Option One	     20
      1.3.2  Design Option Two	     22
      1.3.3  Design Option Three	     24

 1.4  Present Network Operations	     34

 1.5  Alternate Operational Designs for the RSN	     35

 1.6  Recommended Modifications 	     36

 1.7  Statistical Findings and Charts for the RSN	     37

      1.7.1  Introduction	     37
      1.7.2  Sampling weights 	     37
      1.7.3  Stratification	     42
      1.7.4  Analysis	     44

 1.8  Capabilities for Performing Special Studies 	     53

 1.9  Toxic Substances Other Than Pesticides in Soils  ...     53

1.10  Implementation Plan for a New Survey Design
      of the Rural Soils Network	     54

EVALUATION OF CHEMICAL ANALYSIS	     90

2.1  Objective	     90
2.2  Discussion	     90

      2.2.1  Analytical Methodology 	     91
      2.2.2  QC/QA	     97
      2.2.3  Accuracy and Precision	     97
      2.2.4  Minimum Detectable Levels	     99

2.3  Fate of Pesticides in Soils	    102

2.4  Recommendations	    103

-------
                            TABLE OF CONTENTS

                                                                 Page

REFERENCES	   105

APPENDIX A:  Questionnaire on Chemical Analysis of Soil ....   A-1

APPENDIX B:  National Soil Monitoring Program -
             Pesticide Analysis Report Form     B-1

APPENDIX C:  Analytical Methodology for Organochlorine and
             Organophosphorous Pesticides and Trifluralin .  .  .   C-1

APPENDIX D:  Sampling Weights for the Rural Soils Network (RSN)   D-1

APPENDIX E:  Construction of an Analysis Data File     E-1

-------
                             LIST OF TABLES

Table                                                            Page

 1.1      Sampling Rates (%) Which Provide Standard Relative
          Precision of County Level Estimates for 10 Size-
          classes and 3 Sizes of Unit	    5
 1.2      Dichotomization of the Land Use Code	    8

 1.3.3.1  Construction of the Cost Model	   30

 1.3.3.2  Cluster Effect for Selected Values of p and n2. .  .  .   32

 1.3.3.3  Minimum Cost Allocation Subject to the Constraint .  .   33

 1.7.1    Fiscal Years of Data Collection for the Rural Soils
          Network	   40

 1.7.2    RSN Sites in Counties Having Both Irrigated and
          Remainder Strata, but only 160-acre PSU's 	   47

 1.7.3    Compounds with No Detectable Levels in Cropland Soils   48

 1.7.4    Compounds with No Detectable Levels in Noncropland
          Soils	   49

 1.7.5    Statistics for Compounds with Few Detectable Levels
          in Cropland Soils for Round One	   52

 1.7.6    Statistics for Compounds with Few Detectable Levels
          in Noncropland Soils for Round One	   53

 1.7.7    Statistics for Compounds with Detectable Levels in
          Noncropland Soils for Round One 	   54

 1.7.8    Statistics for Compounds with Detectable Levels in
          Cropland Soils by Census Division for Round One ...   55

 1.7.9    Statistics for Compounds with Detectable Levels in
          Cropland Soils by Cropping Region for Round One ...   72

 2.1      Pesticides and Toxic Compounds Analyzed Under NSMP.  .   93

 2.2      Procedures for the GC Analysis of Pesticides for
          the NSMP	   95

 2.3      Average Recoveries for Some Organochlorine Pesticides
          from Soil	   99

 2.4      Precision for Some Organochlorine Pesticides in Soil.  101

 2.5      Detection Limits of Pesticides in Soils 	  102

-------
                             LIST OF FIGURES

Figures                                                          Page

 1.1      Typical Stratification of a Township	      3

 2        Sample Points on a 160-acre Sample Area 	      7

 2.1      Capillary GC/ECD Chromatogram of Arochlor 1242 and
          Arochlor 1260	     57

-------
                            EXECUTIVE SUMMARY
1.   Introduction

     The purpose of  the review of the National Soils Monitoring Program
(NSMP) is to:

     a)   Describe the network,
     b)   Assess its current effectiveness,
     c)   Provide design options.

The NSMP has two components, the Urban Soils Network (USN) and the Rural
Soils Network (RSN).  Its purpose has been to monitor pesticide residues
in soils in the conterminous United States.

     The USN will be reviewed in a later report.

     This report considers  the RSN review which  represents  a major and
time-consuming  effort.   It embraces  the assembly and  review of design
documents, the correspondence files and memoranda relating to operational
activities, and the computer data files including editing and correcting
data  entries  where necessary.   It also  includes  analyses of  the  data
using  the sampling  weights developed during  the establishment  of the
structure of the survey design.

     This report contains a brief but complete description of the statis-
tical design of the RSN and of its parent, the Conservation Needs
Inventory (CNI).  It is therefore a valuable asset in understanding,
analyzing or modifying the soil monitoring efforts of the federal
government.

1.1  General Description of the Program

     The National Soil Monitoring Program consists of two networks:   (1)
the  Urban Soils  Network and  (2)  the Rural  Soils Network.   The Rural
Soils Network is a probability subsample of the 1967 Conservation Needs
Inventory sample.  The  area sampled by the Rural Soils Network includes
all of the  conterminous United States except for areas considered to be
urban in  character.   These  urban areas are monitored by the Urban Soils
Network,  which consists of a stratified sample.

1.2.1  General Considerations

     The fact that the Rural Soils Network (RSN) is a probability sample
makes possible  valid statistical inferences to  the  population sampled,
namely all  rural  soils of the  conterminous United States.   Moreover,
inferences are possible for all reasonably large geographic areas within
the United States,  for example cropping regions and larger States.  Some
State exclusions must be noted in analyzing the data.

     The  operational  design of the RSN  makes  possible  some interesting
statistical  analyses.   Because  soil  and crop  specimens are  obtained
simultaneously  at  harvest  time  from matched  sites,   the  relationship
between pesticide  levels  in soils and harvested  crops  can be analyzed.

-------
Also, since  some  sites were sampled at  a  four year interval, trends in
pesticide residue levels can be investigated.

1.2.2  The Sampling Design

     The Rural Soils Network (RSN) is a probability sample of 10-acre
sites from the population of all rural land areas in the conterminous
United States.  Each 10-acre site is located by a probability subsample
of the data points of the 1967 Conservation Needs Inventory (CNI).  The
CNI, in turn, is a probability sample of all rural land areas in the
conterminous United States.

     The CNI is a  stratified random  sample  of primary  sampling units
(PSU's) from each county of the  conterminous  United States,  except for
those counties strictly metropolitan in  character.  The standard size of
the PSU's was 160 acres, although 40-acre, 100-acre, and 640-acre PSU's
were not uncommon.  The standard sampling rate was two percent; however,
this rate was increased or decreased in order either to provide estimates
of nearly equal precision for all counties or to oversample areas of
special  interest.   The  sampling rates  varied within strata  from less
than one percent to approximately thirty-two percent.

     In the  CNI,  data were collected  for  each of a series of points at
every CNI sample site.  The land use data collected for each CNI sampling
point were used to classify the point as either a cropland point or a
noncropland point.  The sampling design of the RSN specified that 0.025
percent of the cropland and 0.0025 percent of the noncropland of the
rural conterminous United States would be sampled.  A subsample of the
CNI  cropland sampling points  was selected and  used to  locate the RSN
cropland sample sites.  The RSN noncropland sample sites were located by
a subsample of the CNI noncropland sampling points.

     The operational  design of the Rural  Soils Network (RSN) specifies
that each cropland  site be randomly designated as a first-year, second-
year, third-year, or  fourth-year cropland site, such that one-fourth of
the  cropland sites  in  each  State  will be  sampled each  fiscal year.
Noncropland sites were handled in the same manner.  Specimens were to be
collected at  each site no less than once  every four years and not more
than once per year.   Soil specimens were  obtained  by compositing fifty
soil cores,  2-inches  in diameter by 3-inches in depth.  Cropland speci-
mens were to be obtained immediately before or at harvest time.
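
The rotation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the (site, State) input layout, and the use of a random starting offset are hypothetical and are not taken from the RSN design documents. The sketch would be applied separately to the cropland sites and to the noncropland sites.

```python
import random
from collections import defaultdict

def assign_rotation_years(sites, n_years=4, seed=0):
    """Randomly assign each site to a collection year (1..n_years).

    sites: iterable of (site_id, state) pairs for ONE land class
           (cropland or noncropland); this layout is an assumption.
    Within each State the sites are shuffled and dealt out in turn,
    so each year receives as close to 1/n_years of the sites as possible.
    """
    rng = random.Random(seed)
    by_state = defaultdict(list)
    for site_id, state in sites:
        by_state[state].append(site_id)

    assignment = {}
    for state, ids in by_state.items():
        rng.shuffle(ids)                 # random order within the State
        offset = rng.randrange(n_years)  # random start, so leftover sites
                                         # do not always fall in year one
        for position, site_id in enumerate(ids):
            assignment[site_id] = (position + offset) % n_years + 1
    return assignment
```

With eight sites in one State and four in another, each fiscal year receives two sites from the first State and one from the second.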

1.3  Alternate Survey Designs

1.3.1  Design Option One

     A minimal change alternative would be to subsample the current RSN.
This option mainly addresses the problem of the cost of the RSN although
the need for national and regional estimates is also considered.  Any
need to eliminate reliance upon the 1967 CNI is not addressed.

     This option does offer the advantage that it can be quickly and
easily implemented, possibly while other alternatives are under
development.

-------
     Replicate subsamples are recommended if this option is to be imple-
mented, even if it is only on a temporary basis.  For example, if 50
percent of the RSN sites are to be surveyed, five subsamples that each
comprise a 10 percent subsample can be used.  At least five replicate
subsamples should be selected.  The use of replicate subsamples makes it
possible to estimate sample variances easily by using the theory of
replicate subsamples.
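
As a minimal sketch of the replicate approach, the Python below draws five 10 percent subsamples and estimates a standard error from the spread of the replicate estimates. It assumes the replicates can be treated as independent and ignores the RSN stratification and sampling weights; the function names are hypothetical.

```python
import random
import statistics

def assign_replicates(site_ids, n_replicates=5, fraction_each=0.10, seed=0):
    """Split a random part of the sites into disjoint replicate subsamples."""
    rng = random.Random(seed)
    shuffled = list(site_ids)
    rng.shuffle(shuffled)
    size = int(len(shuffled) * fraction_each)   # sites per replicate
    return [shuffled[k * size:(k + 1) * size] for k in range(n_replicates)]

def replicate_estimate(replicate_values):
    """Combine one estimate per replicate into an overall estimate.

    Returns (mean of the replicate estimates, its standard error),
    where the variance of the mean is taken as s^2 / k for k replicates.
    """
    k = len(replicate_values)
    overall = sum(replicate_values) / k
    variance_of_mean = statistics.variance(replicate_values) / k
    return overall, variance_of_mean ** 0.5
```

For example, replicate proportions of 0.10, 0.12, 0.08, 0.11 and 0.09 give an overall estimate of 0.10 with a standard error of about 0.007.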

     It  may also  be useful to select  the  subsamples  at different rates
within domains of interest.  Identification of strata of special interest
within the domains just considered can be used to increase the possibil-
ity of finding toxic substance residues.

1.3.2  Design Option Two

     A  design analogous  to the design that  produced the  present  RSN
sample can be based upon the 1982 National Resources Inventory (NRI).
Use of the 1982 NRI will provide up-to-date land use information.  A
subsampling procedure  to obtain  adequate  precision at minimum cost is
proposed.   This  can  be accomplished  by  identifying  areas  where  toxic
residues are likely to be found and giving these areas a greater probabil-
ity of being selected for the RSN sample.

     It  is suggested that counties be used as primary sampling units for
the second phase sample.  The data from the present RSN indicate that
counties are generally heterogeneous with respect to toxic residues.
Thus, it would  be advantageous  to select relatively few counties with a
larger number of sample sites.   The use of counties as PSU's will reduce
travel costs associated with data collection.  More importantly, smaller
areas like counties can be effectively stratified into areas where toxic
residues are likely to be found.

     The  RSN sample sites  are   to  be  located  at  NRI  sample  points.
Sample  counties  are selected from  the counties  in the  NRI sample, so
that counties where  toxic substance residues are likely  have a greater
chance  of selection.  Thus, it  is suggested that  counties  be selected
with probability proportional to size (PPS), where the size measure is a
measure of the likelihood of finding toxic residues.
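
One common way to draw a PPS sample is systematic selection along the cumulative size measure. The sketch below is an illustration only, not the procedure proposed for the RSN; the function name and the size measures used in any call are hypothetical.

```python
import random

def pps_systematic(sizes, n, seed=0):
    """Select n units with probability proportional to size.

    sizes: dict mapping unit (e.g. county) -> positive size measure,
           here standing in for a likelihood-of-residue score.
    A random start is taken in the first sampling interval and every
    interval-th point along the cumulative size is selected.  A unit
    whose size exceeds the interval can be selected more than once.
    """
    rng = random.Random(seed)
    total = sum(sizes.values())
    interval = total / n
    start = rng.uniform(0, interval)
    targets = [start + k * interval for k in range(n)]

    selected = []
    cumulative = 0.0
    t = 0  # index of the next target to be matched
    for unit, size in sizes.items():
        cumulative += size
        while t < n and targets[t] < cumulative:
            selected.append(unit)
            t += 1
    return selected
```

When no size measure exceeds the sampling interval, a unit's chance of selection is n times its share of the total size measure, which is the property the design relies on.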

     Efficient  sampling within  the  selected  counties can  result  from
careful  stratification.   The NRI sampling points  within a  county  are
first stratified  into  cropland points and noncropland points, to insure
adequate representation of each of these land types and because agricul-
tural chemical  residues  are  more likely to be found in cropland.  Local
land use  characteristics  can be  used to further stratify both the crop-
land points and the noncropland points.

1.3.3  Design Option Three

     Review of  the  data indicates large numbers of zero valued observa-
tions,  and relatively few positive observations.   This analytic challenge
has been  discussed  elsewhere [See Lucas et al, Recommendations  for  the
National  Surface  Water Monitoring  Program for Pesticides.   Report  No.

-------
RTI/1864/01-02I]. The conclusion of that analysis was that the appro-
priate measures of "level" are:

(1) The proportion of positive detections, i.e., the relative
frequency of last stage sampling units positive for the
substance(s) under investigation, and

(2) The proportion of sampling units containing concentrations of
substance above some specified level.  This level may signal
the existence of an undesirable situation.

The proposed design is a two-stage area probability sample with
stratification of the sampling units at each level. The first stage or
primary sampling units (PSU's) are counties.  Geographic stratification
is provided by the four Census Regions.  Allocation of PSU's to these
regions is in proportion to the land area eligible for the study. Using
additional variables to allocate the sample is unlikely to be useful at
this level due to the variety of land use within each Census Region.
The eligible land area is currently defined by the membership require-
ments of the RSN and USN. It may be advantageous from administrative as
well as fiscal and statistical grounds to combine the activities of the
soil networks, and consider SMSA counties as a stratum within the survey.
This point requires further review; however, initial investigation
suggests savings are likely.
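
The allocation of PSU's to Census Regions in proportion to eligible land area can be sketched as follows. The rounding rule (largest remainder) and the function name are assumptions made for illustration, and any land areas supplied to it in an example are hypothetical rather than program figures.

```python
def proportional_allocation(areas, n_total):
    """Allocate n_total PSU's to regions in proportion to eligible area.

    areas: dict mapping region -> eligible land area (any consistent unit).
    Each region first receives the whole-number part of its exact share;
    remaining PSU's go to the regions with the largest fractional parts.
    """
    total = sum(areas.values())
    exact = {region: n_total * area / total for region, area in areas.items()}
    alloc = {region: int(share) for region, share in exact.items()}

    leftover = n_total - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda r: exact[r] - alloc[r], reverse=True)
    for region in by_remainder[:leftover]:
        alloc[region] += 1
    return alloc
```

With hypothetical areas of 100, 450, 500 and 950 units for the four regions, 40 PSU's would be allocated as 2, 9, 10 and 19.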

With the extension of monitoring responsibility from pesticides to
toxic substances in general, some revision of the approach is indicated.
The following stratification variables are therefore proposed for the
PSU's in addition to the geographic stratification above:

(1) Land area,
(2) Population density,
(3) Agricultural activity, and
(4) Industrial activity.

Second stage sampling units (SSU's) are 10-acre plots. These are pro-
posed as the final stage units or analysis units on the assumption that
they are sufficiently homogeneous that the effects of subsampling are
negligible. This is a verifiable proposition. The problem with SSU's
this small is the ability to locate them in the field. The lack of
identifiable boundaries renders exactly locating them most difficult.
To ease this difficulty, enumeration districts (ED's) are proposed as
readily identified segments. The problem is reduced to locating the SSU
within the ED, or any suitable subsegment chosen to facilitate the task.

SSU's will be allocated equally to PSU's. A detailed field-use
protocol will locate the specimen for collection, leaving the minimum of
discretion for the field personnel in the selection of these sites. The
protocol will specify a grid locating multiple specimen collection
sites. The soil collected in a given plot would be composited, unless
the homogeneity of the 10-acre plot is under investigation.

-------
1.4  Present Network Operations

1.5  Alternate Operational Design

     The  operational  design of  the  Rural Soils Network  (RSN)  was well
conceived   for   monitoring  agricultural  pesticides   and  herbicides.
However, much pesticide  and herbicide residue may often  be  leached out
of, or vaporized from, the cropland soil by harvest time.

1.6  Recommended Modifications

1.7  Statistical Findings

     Several types of analyses are of interest for the RSN data, notably:

     (1)  Estimation  of  base levels  for residues of  toxic  substances,
     (2)  Estimation  of  changes  in  mean  levels  of  toxic  substance
          residues  from  the  first  round  to the  second  round  of data
          collection, and
     (3)  Estimation  of   relationships   between  soil  and crop  residue
          levels.

The  reason  for analyzing  the  RSN data  in this  study  was  to  obtain a
measure of  the  degree of precision that could be  obtained for analysis
of residue data based upon the present data.   It was decided that estima-
tion of  base levels  of  residues would  be sufficient.   In  particular,
estimation of levels  was undertaken for the first round soil data only.

     It was  found  that the data values for most compounds were predomi-
nantly zero.  The predominance of zero values in the residue data results
in J-shaped  distributions for  the  amount of residue detected  for most
compounds.   This  type  of  data  presents  some  rather  unique  analysis
problems.  For  example,  the weighted mean of  the raw data values has
little meaning  if  most values  are zero and a few are very large.  Thus,
some type of  data transformation  is  generally required  in order  to
obtain a  meaningful  analysis  [See Lucas, et al,  Recommendations for the
National  Surface  Water Monitoring Program for  Pesticides.   Report No.
RTI/1864/01-02I].  Ideally, each compound  should be considered individ-
ually  to  determine an appropriate  transformation, if  any.   Ubiquitous
compounds like arsenic may not require transformation.

     For  analyses  on the  proportion scale,  all  data  values above the
minimum  detectable level  (MDL)  were  replaced  by  the  value one.   The
weighted mean on  this scale is a weighted estimate of the proportion of
the sampled land area with a residue level in excess of the MDL.  Since
this scale was felt to be generally the most appropriate for analysis of
the  residue  data,  the  standard error  and  the  design effect  for the
estimated proportion were also computed.
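
The proportion-scale estimate described above can be sketched as follows. The function name and argument layout are hypothetical; values at or below the MDL are assumed to be scored zero, and the weights stand for the RSN sampling weights.

```python
def weighted_proportion_above(values, weights, mdl):
    """Weighted estimate of the proportion of land area above the MDL.

    values:  residue level recorded at each sample site
    weights: sampling weight of each site
    mdl:     minimum detectable level for the compound
    Each value above the MDL is replaced by one and every other value
    by zero; the weighted mean of these indicators is returned.
    """
    indicators = [1.0 if v > mdl else 0.0 for v in values]
    weighted_sum = sum(w * i for w, i in zip(weights, indicators))
    return weighted_sum / sum(weights)
```

For instance, five sites with weights 100, 200, 100, 400 and 200, of which the third and fifth exceed the MDL, give an estimated proportion of 0.3.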

     Estimation of standard errors and design effects required that some
strata  be combined.  Since it was not possible to account  for all dimen-
sions of the CNI stratification, the standard errors computed are undoubt-
edly conservative  estimates.   This   results  in  similarly  conservative
interval  estimates  of the  proportion of sampled areas where  levels  of
the compound exceed the minimum detectable level (MDL).


-------
     The design  effect  is  the ratio of the  sample  standard error to an
estimate of what the standard error would have  been if a simple random
sample of the same size had been used, i.e.,

     Design Effect  =  Estimated S.E. (for the design used)  /
                       Estimated S.E. (simple random sample)


Alternatively, the design  effect can be thought of  as  the ratio of the
actual sample size to the  sample size that  would  be required to obtain
an  estimate  with  the same  standard error  based  upon a  simple random
sample.   Generally   stratification  decreases  the  design  effect,  while
clustering increases it.  Thus, since the CNI stratification can be used
and there is no clustering of sample sites in the RSN sample, design
effects less than one would be expected.  This would indicate that the design
produced  smaller standard  errors  than  would a  simple  random sample of
the same size.  Many of the design effects shown in Tables 1.7.7 through
1.7.9 are indeed less than one.  However, some design effects are substan-
tially greater than one.  It is not clear therefore that the CNI strati-
fication was particularly  advantageous  for estimation of proportions of
detections for toxic substance residues.
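
A minimal sketch of the design effect calculation for an estimated proportion is given below. It assumes the simple random sample standard error is the usual binomial form without a finite population correction; the function names are hypothetical.

```python
import math

def srs_standard_error(p, n):
    """Standard error of a proportion p under simple random sampling of size n."""
    return math.sqrt(p * (1.0 - p) / n)

def design_effect(se_design, p, n):
    """Ratio of the design-based S.E. to the simple random sample S.E."""
    return se_design / srs_standard_error(p, n)

def equivalent_srs_size(se_design, p):
    """Simple random sample size that would give the same standard error."""
    return p * (1.0 - p) / se_design ** 2
```

With p = 0.2, n = 400 and a design-based standard error of 0.016, the ratio is 0.8 and the equivalent simple random sample size is 625. Note that when the design effect is computed as a ratio of standard errors, the equivalent sample size is n divided by the square of that ratio; the sample-size interpretation holds exactly when the ratio is formed from variances.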
1.8  Capabilities for Special Studies

1.9  Toxic Substances Other than Pesticides in Soils

1.10  Implementation Plan for a New Survey Design of the Rural Soils
      Network

2.0  Evaluation of Chemical Analysis

     Information  on  the quality  of the pesticide data compiled by the
NSMP is not  currently available to users of the program's computer data
file.  Some measure of this quality is necessary for meaningful statisti-
cal evaluation of the data and practical interpretation of the results.
To this purpose,  a  limited review of the current analytical methodology
was  conducted  and  information  compiled  on the accuracy (recoveries),
precision  (coefficient  of  variation) and  minimum detectable  levels of
each of the pesticides monitored under the program where such information
was available.
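
The two quality measures named above can be computed from replicate analyses of fortified controls, as in the sketch below. The function names are hypothetical, and the sample standard deviation is assumed for the coefficient of variation.

```python
import statistics

def percent_recovery(measured, fortified):
    """Accuracy: amount recovered as a percentage of the amount added."""
    return 100.0 * measured / fortified

def mean_percent_recovery(measured_values, fortified):
    """Average recovery over replicate fortified controls."""
    return statistics.mean(percent_recovery(m, fortified) for m in measured_values)

def coefficient_of_variation(measured_values):
    """Precision: sample standard deviation as a percentage of the mean."""
    return 100.0 * statistics.stdev(measured_values) / statistics.mean(measured_values)
```

Three replicate results of 0.09, 0.10 and 0.11 for a control fortified at 0.10 (in the same units) give a mean recovery of about 100 percent and a coefficient of variation of about 10 percent.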

     Over  thirty toxic  substances  have been  monitored under  the  NSMP
including  several chemical  classes:   1)  organochlorine  pesticides;  2)
PCBs*; 3) trifluralin; 4) organophosphorous pesticides; and 5) heavy
metals.  All analyses (~ 450 soil specimens/year) are carried out at the
Toxicant Analysis  Center,  Bay  St.  Louis,  Mississippi.   However,  heavy
metals have not been analyzed in soil since 197[final digit illegible].

     Nearly  all  procedures  applied to the  analysis of  pesticides  and
PCBs in  soil specimens  used an  initial  extraction followed by column
chromatography clean-up.   Final quantitation  of pesticides  was carried
out using external standard techniques with gas chromatography (GC).   In
general,  confirmation  of detected pesticides  was performed  by changing
the selectivity of the GC column or detector.  Each set of specimens was
* Polychlorinated biphenyls

-------
accompanied by a blank and one or more controls (fortified blanks) to
check  contamination  and pesticide  recoveries  during the  extraction,
clean-up and GC analysis procedures.

     Levels of  heavy  metals  in  soil  specimens  were  determined  using
atomic absorption spectroscopy (AA).  Flame AA was used for lead, cadmium
and arsenic and the cold vapor technique for mercury.  No information
was available on the current accuracy, precision and limits of detection.

     Relatively little information  was  readily available on the current
accuracy,  precision and  MDLs** for  pesticides and  PCBs in  soil.   Of
particular interest are individual values for accuracy and precision for
each pesticide in  each of the specimen matrices (crops, water and sedi-
ment).  An average of each of these values derived from replicate analy-
sis over a period of time would also provide an indication of the method
stability  for a particular  pesticide  in  a  specific  matrix.   Recovery
data for each pesticide were judged a reasonable indication of method
accuracy  since  analytical   results  not  corrected  for  recoveries  and
losses during the  analysis  can represent a  significant contribution  to
error in the reported result where such recoveries are low.

     Relatively  little  recovery  and precision  data were available  at
levels near the  pesticide MDLs.   It is particularly important that such
data be provided to users of the computer data files since they represent
the "worst" case in terms of data quality.

     Limited  review of  analytical  methodology used in the NSMP and  an
attempt to compile  data  for  the average accuracy, precision  and MDL in
soil  for  each toxic  substance monitored under  this program  provide  a
basis for the following recommendations:

1.   Accuracy (that is, recoveries) and precision data must be generated
     for  all  pesticides  monitored in  the  NSMP.   The  data   should  be
     generated at   two  different  levels  (e.g.,  at  the  MDL  and at  ten
     times the MDL).  The results  for controls analyzed with each set of
     specimens would be  the best means  of  providing  this  information
     since it is  necessary  that  control data  be  made  accessible  to
     computer data  file  users  in  any event.   Controls  must be run with
     each  set of specimens  and  should consist of a  blank (unfortified
     soil free from the  analytes  of interest) and two fortified blanks
     (one fortified at the MDL and another at ten times the MDL).  The
     analytical results for the controls should be reported on a separate
     form  (especially  designed for control data) and  encoded  such that
     there is a one-to-one association with the particular set of speci-
     mens  with  which  they   were  analyzed.    The  encoding should  allow
     later computer retrieval of control data for any particular specimen
     set or group of sets (for example, geographic area, over a specified
     period of time,  or  for a particular pesticide).   The availability
     of this  information in  a  retrievable form to data file  users would
     provide  the  means   for assessing  data  reliability now  lacking.
     Further,  any  duplicate  specimen  analyses  must be  reported in  the
     computer data  file  as  they  provide the  best  means of  assessing
** Minimum detection levels


-------
     method precision on  a  continuous  basis.   Duplicate results must be
     specifically encoded such that they are retrieved as a group (e.g.,
     all duplicates for a particular matrix and pesticide over a speci-
     fied period of time)  as well as with the initial analytical results
     for the specimen.  The need to make routine control data available
     to program data file users cannot be overemphasized.  This does not
     preclude the  use of specialized  controls  (e.g.,  SPRMS);  however,
     these results  should also be included in the computer file encoded
     to allow facile retrieval both as a group and with their particular
     specimen set.

2.   The  pesticides included  on  the  routine  monitoring list  must be
     reviewed on a  regular  basis and appropriate deletions or additions
     made.  Specifically,  the need for routine analysis of organophosphor-
     ous pesticides in soil  should be reviewed as this class of compounds
     is known to be unstable and has seldom been reported in either soil
     or  sediment.   Once  the  baseline  has  been  established  for  such
     compounds, three choices are possible:  1) cease to analyze for the
     compound(s)  except  under  special circumstances   (e.g.,  after  a
     chemical  spill or when contamination is  suspected from  a recent
     application);   2)  analyze  for the  compound(s)  on a  more  frequent
     basis; and  3) concentrate  efforts on the  analysis of degradation
     products of known toxicity where these exist.  Decisions concerning
     the analysis  of  toxic  substances  under the NSMP should be based on
     information generated in other agency data files (e.g., USDA, USGS,
     etc.) as well as data generated within EPA.

3.   Soil specimens should  be  characterized as to the percent carbon or
     percent inorganic residue.  This information must be included on
     the report form (along  with moisture content) as part of the speci-
     men characterization (source).  Significant trends may otherwise be
     missed with respect  to the soil type and  its effect on toxic sub-
     stance accumulation,  degradation and transport.

4.   Control specimens  (in  the  matrix  of interest)  should be included
     with any specimens either stored for extended periods or shipped to
     another  site   for  analysis.   This is  particularly  important  for
     toxic compounds  which  are known to be unstable;  i.e., organophos-
     phorous pesticides.  The results of  these  "storage controls" must
     also be included in the computer data file with appropriate encoding
     for specific retrieval.

5.   Analytical methodology  should be updated to include state-of-the-art
     capillary  GC   techniques.   This would provide  a higher  degree of
     confidence in  the resulting data  through  increased resolution and
     sensitivity.   The use of higher resolution analytical techniques is
     a move toward the quantitation of PCBs (and technical chlordane) as
     their individual  isomers.  This approach is  far  more useful than
     the present method of attempting to identify patterns and averaging
     components, since the toxicity and biodegradation of the individual
     isomers are not identical.

6.   The  pesticide recoveries  should  be  monitored  for  each  specimen
     analyzed by initial  fortification  of  the specimen with appropriate

-------
     compound(s).  Subsequent analysis of the compound level should
     enable comparison of data between specimens with increased confi-
     dence that anomalous results will be detected.  The use of internal
     standard quantitation techniques would normalize recoveries between
     specimens and should be considered.

7.   Detailed information on all analytical procedures under the NSMP
     should be documented in one source.  The procedures must then be
     maintained current with ongoing improvements and modifications made
     by the analytical laboratories.  Such updating requires both flex-
     ibility and regular review by program management.
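
The encoding of control data called for in the first recommendation could take a form like the sketch below, in which every control result carries the identifier of the specimen set it was analyzed with. The record layout, field names and retrieval functions are hypothetical and do not describe the existing NSMP data file.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ControlResult:
    set_id: str             # specimen set the control was analyzed with
    compound: str           # pesticide or other analyte
    matrix: str             # e.g. "soil"
    year: int               # year of analysis
    fortified_level: float  # 0.0 for the unfortified blank
    measured_level: float   # level reported for the control

def controls_for_set(records, set_id):
    """All control results analyzed with one specimen set."""
    return [r for r in records if r.set_id == set_id]

def controls_for_compound(records, compound, first_year, last_year):
    """Control results for one compound over a specified period."""
    return [r for r in records
            if r.compound == compound and first_year <= r.year <= last_year]
```

Keeping the set identifier on each record gives the one-to-one association with specimen sets, while the other fields allow retrieval by compound, matrix or period.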

-------
                  1.  NATIONAL SOILS MONITORING PROGRAM
1.1       General Description of the Program

     The National Soils Monitoring Program consists of two networks:  1)
Urban Soils Network and 2) Rural Soils Network.  The Rural Soils Network
(RSN) is a two phase probability sample.  The first phase sample was the
1967 Conservation  Needs Inventory  (CNI)  sample.  The  RSN sample  is  a
probability subsample  from  the  ultimate sampling units of the 1967 CNI.
The  area sampled  by  the RSN includes all  of the  conterminous  United
States  except for  areas considered  to be  urban in  character.   These
urban areas are  monitored by the Urban Soils Network, which consists of
a sample of the urban areas.

1.2       The Rural Soils Network (RSN) Survey Design

1.2.1     General Considerations

     The fact that the Rural Soils Network (RSN) is a probability sample
makes possible  valid statistical inferences to  the  population sampled,
namely  all  rural  soils of  the  conterminous United  States.   Moreover,
inferences  are  available  for  all  reasonably  large  geographic  areas
within the United  States,  e.g.,  cropping regions and the larger States.
However, the  decision  not to collect data in  some States restricts the
population for which inferences are valid.

     The operational design of the RSN makes  possible  some interesting
statistical analyses.   Since soil and crop  samples  are obtained simul-
taneously at  harvest time,  the relationship between pesticide levels in
soils and harvested crops can be analyzed.   Also, since each sample site
is  sampled  at four-year  intervals,  trends  in pesticide  residue levels
can be investigated.

1.2.2     The Probability Sample Design

     The Rural  Soils Network  (RSN)  is a probability  sample  of 10-acre
sites from  the population  of all rural land areas  in  the conterminous
United States.  Each 10-acre site is located by a point determined by a
probability subsample  of  the data points of the 1967 Conservation Needs
Inventory (CNI) which  is,  in itself, a probability  sample of all rural
land areas in the  conterminous  United States.   Among the lands included
in  the  CNI  are  the following:  (a) privately owned  land,  both personal
and corporate; (b)  land owned by State and  local  governments; (c) land
owned by the  federal government;  and (d) Indian  land.   Among the areas
excluded are:  Ponds and  lakes  of more than two acres,  all streams, and
urban or built-up areas.

1.2.2.1   The CNI survey

     The 1967 CNI did not, however, map (that is, collect data for) federal
noncroplands.    This portion of  the  CNI  was indefinitely  postponed,
although all federally  owned rural land areas did receive their share of
CNI  primary   sampling  units.  Federally owned cropland  operated  under
lease or permit was, however, mapped by the 1967 CNI.


-------
     Urban  or built-up  areas excluded  from  the  CNI  have a  specific
definition and not  all  areas inside city and village limits are consid-
ered  urban or  built-up, whereas  some areas  outside city  and village
limits are.  In particular,  urban or built-up areas are defined as areas
of 10 acres  or more,  consisting of  residential  sites,  industrial sites
(except  strip mines,  borrow and  gravel  pits),  railroads,  roadways,
cemeteries,  airports,  golf  courses, shooting  ranges,  institutional and
public administration sites,  and "similar kinds of areas."1  The exclu-
sion  of  urban  or  built-up  areas (of  10 acres or  more)  from  the CNI
resulted in the exclusion of some counties that were strictly metropolitan
in character.

     The CNI  sample sites were selected by the Statistical Laboratories
at Cornell University and Iowa State University.  The sampling sites for
thirteen  States   in the northeastern  United  States  were  selected  at
Cornell.    All other  sampling sites  were  selected  at  Iowa State.   A
deeply stratified sampling design was  used for  the  CNI.   Counties were
treated  as  strata  within all States.   Little  more  is  known  about the
procedure  used at  Cornell,   except that  the standard  sampling  rate was
about 2  percent  and the standard  size  of a primary  sampling unit  (PSU)
was 100 acres.  The stratification used at Iowa State sometimes involved
large  scale  geographic  stratification  between  the  State and  county
levels,  e.g.,  a  sandhills stratum was designated  in Nebraska,  and  in
many States irrigated areas  were treated as a stratum.

     The sampling  procedure   followed  at  Iowa State  can best  be under-
stood by first considering  the procedure most  commonly  employed in the
States of  the  western  United States that are divided into townships.   A
township is  a 6  mile by 6 mile  square of  land  (see  figure 1.1).  Each
regular  township contains  36 sections.   This  township  consists  of  6
rows, each containing 6 sections.  Three geographical strata were formed
from  this  township:  1) the  first stratum was the northern 2  rows;  2)
the  second stratum was the  middle 2 rows; and  3)  the  third stratum
consisted  of  the 2 southernmost rows.  Each stratum  then contained  48
quarter-sections  (160-acre  square  PSU's),  from  which  a  predetermined
number of  PSU's  were  randomly selected.  The standard sampling rate for
the 1967 CNI was the selection of one PSU from each stratum of 48 PSU's.
Thus,  the  standard sampling rate was  approximately 2 percent (1/48).
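
     The township procedure just described can be sketched in a few lines
of Python.  This is only an illustration under stated assumptions:  the
(row, column, quarter) labels, the function names, and the use of Python's
random module are hypothetical and are not taken from the CNI documentation.

```python
import random

QUARTERS = ("NE", "NW", "SE", "SW")  # hypothetical quarter-section labels


def township_strata():
    """Return three strata of 48 quarter-section PSUs each.

    Rows are numbered 0 (north) to 5 (south).  Stratum 0 is the northern
    two rows, stratum 1 the middle two rows, stratum 2 the southern two.
    """
    strata = []
    for s in range(3):
        psus = [(row, col, q)
                for row in (2 * s, 2 * s + 1)
                for col in range(6)
                for q in QUARTERS]
        strata.append(psus)
    return strata


def select_psus(strata, per_stratum=1, rng=None):
    """Randomly select `per_stratum` PSUs from each stratum."""
    rng = rng or random.Random()
    return [rng.sample(psus, per_stratum) for psus in strata]


strata = township_strata()
sample = select_psus(strata, per_stratum=1, rng=random.Random(1967))
rate = 1 / len(strata[0])  # one PSU out of 48, about 2 percent
```

Passing a seeded generator, as above, makes the draw repeatable; the seed
value itself has no significance.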

     Estimates of nearly equal precision were desired for all counties.
The sampling procedure just described was believed to provide sufficient
precision for a county with 384 to 767 square miles of inventory area
(size-class 5 in Table 1.1).  Thus, a sampling rate of less than 2% was
used in some of the larger counties, and more than 2% in some of the
smaller counties.  The sampling rate was also generally increased in
irrigated strata and other areas of special interest.

     In order  to  increase the sampling rate from 2% to 4%, two quarter-
sections  were  selected  from  each stratum,  rather than  one.  However,  a
1
 Basic Statistics  —  National Inventory of Soil  and  Water Conservation
Needs, 1967.

-------
                Figure 1.1  Typical Stratification of a Township

     [Diagram not reproduced:  a township 6 mi on a side, divided into
     sections of one sq. mi. (640 acres) each and grouped into Stratum 1,
     Stratum 2, and Stratum 3.  The sampled location is a quarter-section,
     i.e., 0.25 sq. mi. or 160 acres.]

           (Source:  personal communication from Iowa State University,
                     Statistical Laboratory).

-------
decrease in the sampling rate from 2% to 1% was accomplished by changing
the stratum size from 12 sections to 24 sections with one quarter-section
being selected from each of the 24-section strata.   Thus, a decrease in
the sampling rate  from  2% was accompanied by an increase in the stratum
size.
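
     The rate changes described above follow from simple arithmetic on the
number of quarter-sections per stratum.  A minimal sketch, in which the
function and variable names are illustrative assumptions:

```python
PSUS_PER_SECTION = 4  # a section holds four 160-acre quarter-sections


def sampling_rate(psus_selected, sections_per_stratum):
    """Fraction of quarter-section PSUs selected in one stratum."""
    return psus_selected / (PSUS_PER_SECTION * sections_per_stratum)


standard = sampling_rate(1, 12)  # 1 of 48, the nominal 2% rate
doubled = sampling_rate(2, 12)   # 2 of 48, the nominal 4% rate
halved = sampling_rate(1, 24)    # 1 of 96, the nominal 1% rate
```

The nominal rates of 1%, 2%, and 4% are therefore approximations of 1/96,
1/48, and 2/48.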

     It was also desirable at times to change the size of the CNI sampl-
ing site from the usual 160 acres.  In some large counties in the western
United States with large tracts of relatively homogeneous soil type and
usage, CNI  sample  sites  consisted of one section or 640 acres.  In some
highly developed agricultural areas of special interest, sites consisting
of 40 acres, a sixteenth-section, were sometimes used because of consider-
able heterogeneity between fields.

     The above  considerations  led to the establishment of Table 1.1 for
the determination  of a  standard sampling rate based  upon the inventory
acreage  of a  county and  the size  of sampling unit  to be  used.   The
standard sampling  rates  shown in Table 1.1 were determined so that the
relative precision of county level estimates would be constant, i.e. not
dependent upon  either county or sampling unit size.  This table was not
strictly adhered to, however.
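
     The pattern in Table 1.1 is regular:  the rate halves each time the
county size-class increases by one, and doubles each time the sampling
unit is quadrupled in size.  The sketch below encodes that pattern.  The
function names are hypothetical, and the treatment of fractional square
mileages that fall between class limits is an assumption.

```python
from fractions import Fraction

# Rate (%) for size-class 1, by PSU size in acres, as read from Table 1.1.
BASE_RATE = {40: Fraction(16), 160: Fraction(32), 640: Fraction(64)}


def size_class(square_miles):
    """County size-class (1 to 10) from inventory area in square miles."""
    cls = 1
    lower = 48  # lower limit of size-class 2
    while cls < 10 and square_miles >= lower:
        cls += 1
        lower *= 2
    return cls


def standard_rate(cls, unit_acres):
    """Standard sampling rate (%) for a size-class and PSU size."""
    return BASE_RATE[unit_acres] / 2 ** (cls - 1)


rate_500_sq_mi = standard_rate(size_class(500), 160)  # size-class 5
```

Using exact fractions keeps entries such as 1/32 in the same form as the
table.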

     The sampling procedure just described was used for all State samples
designed at Iowa State.   Township and section boundaries were artificially
imposed  upon  counties that  were not  already  surveyed into  such divi-
sions.  Whenever possible,  township and section boundaries were made to
follow lines of longitude and latitude in the same manner as in section-
ized States.

     Many  counties are  not  regular in shape  so that  there were often
partial townships, strata,  and sections around their borders.  Sections
around such borders were included in the sampling frame only if at least
part of the section was in the county being sampled.  Such sections were
then grouped into  strata for sample selection.  The strata were usually
composed of twelve  sections  each,  just  as  twelve  sections  form  one
stratum  in  the standard  sampling scheme depicted  in Figure  1.1.   Any
sampling units  that  fell outside the  county of  interest  as a result of
this procedure were subsequently ignored.

     For each  sampling  location, i.e. PSU, determined  by the procedure
just described,  the CNI  collected  data at  each of a  series of points
within that PSU.   In order to determine the positions of these sampling
points, an  aerial  photograph of the sampling  location  was  obtained.   A
spinner or template consisting of a grid of small holes was then centered
over the photograph and  spun.   A  deterministic procedure was  used  to
choose a hole  for  the location of the spinner in a fashion that allowed
some variety in  the  choice of the  spinner  location without introducing
personal bias.2   When the  template came  to rest,  the  location of each
hole was marked on  the  photograph.   The  first point in  the upper left
2
 The procedure for  selecting  the spinner hole is  described  in Appendix
#2 of the National Handbook for Updating the Conservation Needs Inventory
(U.S.D.A., Washington, D.C., August 1966).


-------
      Table 1.1.  Sampling Rates (%) Which Provide Standard Relative
                  Precision of County Level Estimates for 10 Size-classes
                  and 3 Sizes of Unit*

                                              Size of unit (PSU)
  County
  size-class     (Square miles)        40 acres    160 acres    640 acres

      1               47 and less         16           32           64
      2               48 -     95          8           16           32
      3               96 -    191          4            8           16
      4              192 -    383          2            4            8
      5              384 -    767          1            2            4
      6              768 -  1,535         1/2           1            2
      7            1,536 -  3,071         1/4          1/2           1
      8            3,072 -  6,143         1/8          1/4          1/2
      9            6,144 - 12,287         1/16         1/8          1/4
     10           12,288 and over         1/32         1/16         1/8

*Source:  Taylor, Howard L.  Statistical Sampling for Soil Mapping Surveys,
June 1962, courtesy of the Iowa State University Statistical Laboratory.

-------
corner of  the sample site was  point number one.  The points  were  then
numbered consecutively along a line proceeding from left  to right and/or
up.  The consecutive  numbering  of the sampling points then continued in
the same manner with the line of points just below the first line.  This
procedure continued until all points in the sample area had been numbered
as illustrated in Figure 2. These points constitute an aligned two-dimen-
sional systematic  sample within each selected PSU.3   Such an alignment
of  points  in  a strictly  North-South  and East-West  manner  should  be
avoided because  of the  tendency to develop land  use  in  such a pattern;
the spinning of the template alleviates this.

     Various  sampling  templates  were  prepared  so  that  template  and
aerial photograph  scales  could  be matched to obtain a constant sampling
density.   It was most convenient to assign the sampling  points in local
USDA  offices,  since local Soil Conservation Service  offices generally
had  the  needed aerial  photographs  in their  files.   However,  the local
USDA personnel  did not  always follow the sampling protocol specified by
the design.  For instance,  it appears that the  templates were not spun
for  Nevada,  and  the template  was  often not  properly  matched  to  the
photograph scale in New Mexico.

     Exhibit  1   is  a photocopy  of  an  aerial  photograph of  a specific
160-acre CNI sample site with 34 consecutively numbered sampling points.
The  point  density of the  template  used for this  site was the standard
point  density  intended for  all sites, except  for the  640-acre sites.
The point density of the templates used for 640-acre sites was one-fourth
that of the other sites, since 640-acre sites were used only in homogen-
eous  land  areas.   Thus,  160-acre  and  640-acre sites usually received
from 34 to 39 sampling points and 40-acre sites usually received between
9 and  11 points.

     Exhibit 2 is a photocopy of the data collection form used to record
the data for  the 34 sampling points shown in Exhibit 1.   The data items
that  were  used  in  determining  the Rural  Soils  Network  (RSN) subsample
were the Field  Mapping  Symbols  and the Land  Use Codes.   In particular,
this  information was used to classify each sampling point as either a
cropland point  or  a noncropland point as shown in Table  1.2.  It should
be noted that sampling points that inadvertently fell into areas outside
the  target population,  i.e.,  urban  areas,  water  areas,  and  federal
noncropland, were classified as noncropland points.
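
     The classification rule of Table 1.2 amounts to a lookup on the two
recorded data items.  The sketch below is illustrative only.  Several
codes are garbled in the available copy of the table, so the readings L11,
L31, and L51 are assumptions inferred from the pattern of the legible
codes, and the function name is hypothetical.

```python
# Cropland land use codes (nonirrigated and irrigated) from Table 1.2.
# L11, L31 and L51 are assumed readings of garbled entries.
CROPLAND_CODES = {
    "L10", "L11", "L20", "L21", "L30", "L31", "L40",
    "L50", "L51", "L60", "L61", "L90", "L91",
}

# Field mapping symbols for areas outside the target population.
OUTSIDE_SYMBOLS = {"UB", "FED", "W1", "W2", "W3"}


def classify_point(land_use_code, field_mapping_symbol=None):
    """Classify one CNI sampling point as 'cropland' or 'noncropland'."""
    if field_mapping_symbol in OUTSIDE_SYMBOLS:
        return "noncropland"
    if land_use_code in CROPLAND_CODES:
        return "cropland"
    return "noncropland"


codes = ["L10", "P10", "L61", "F20"]
n_cropland = sum(classify_point(c) == "cropland" for c in codes)
```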

     The counts  of cropland  and noncropland points were accumulated as
shown  in Exhibit 3 for the purpose  of  selecting the  RSN subsample from
the  CNI  sample.   The data for  the  CNI  sites  shown in Exhibits  1 and 2
appear on the fourth line in Exhibit 3.  In particular, Exhibits 1 and 2
are  for  State  16, Kansas;  County  66, Nemaha;  site  number  5-2-2R.   A
total of 34 points were sampled at this site and 19 of these points were
designated as cropland  points.   Thus, the proportion of cropland points
at  this  site  was 19/34 = 0.55882.  However, the sampling rate in Nemaha
County was 2.257%;  i.e., the ratio of sampled acreage to  total inventory
acreage  in  Nemaha  County  was about 0.02257.   Thus,  in  order to adjust
the  cropland  proportion to  a standard 2%  sampling rate, the "cropland
ratio" was computed as
3
 See, for example, Cochran, W. G. [1977, pg 228].  Sampling Techniques.
Wiley, New York.

-------
              Figure 2:  Sample Points on a 160-acre Sample Area

     [Diagram not reproduced.  A "+" marks the center point.  The point
     numbers legible in this copy are 1, 7, 13, 19, 25, 31 and
     1, 2, 4, 7, 11, 16, 21, 27, 32, 36, 38.]

Note:  The numbers above are the point numbers for the first points on each line.

Source:  Appendix #2 of the National Handbook for Updating the Conservation Needs
         Inventory (U.S.D.A., Washington, D.C., August 1966).

-------
             Table 1.2.  Dichotomization of the Land Use Code

                           CROPLAND CATEGORIES

     Land Use Codes
Nonirrigated   Irrigated

     L10          L11        Corn and sorghums
     L20          L21        All other row crops
     L30          L31        Close grown field crops
     L40                     Cultivated summer fallow
     L50          L51        Rotation hay and pasture
     L60          L61        Hayland
     L90          L91        Orchards, vineyards, and bush fruits

                         NONCROPLAND CATEGORIES

     Land Use Codes
Nonirrigated   Irrigated

     L70                     Conservation use only
     L80                     Temporarily idle cropland
     L00                     Open land formerly used for crops
     P10          P11        Pasture
     P20                     Range
     F10                     Commercial forest
     F20                     Noncommercial forest
     H10                     Other land in farms
     H20                     Other land not in farms

     Field Mapping Symbol

          UB                 Urban or built-up area
          FED                Federal noncropland
          W1                 Water area of more than 40 acres
          W2                 Water area of 40 acres or less
          W3                 Intermittent water area
 Source:  Memorandum entitled "Soil Monitoring Program—Sampling Design"
from Leo G.K. Iverson to USDA PPC Inspectors.

-------
                  Exhibit 1:  Aerial Photograph of a CNI Sample Location

     [Photograph not reproduced.  The photograph is labeled "CONS. NEEDS
     SAMPLE" and carries the site number 5-2-2R.]

Source:  Sampling files maintained by the EPA Field Studies Branch, Washington, D.C.

-------
                  Exhibit 2:  CNI Data Collection Sheet

     [Form SCS-263, Soil Conservation Service.  The completed form is not
     legible in this copy.]

Source:  Sampling files maintained by the EPA Field Studies Branch, Washington, D.C.

-------
                         Exhibit 3:
Accumulation Used in Selecting the Pesticide Residue Network Subsample

     [Computer listing not legible in this copy.  Each line of the listing
     appears to give, for one CNI sample site, the State and county codes,
     the site number, the total number of sampling points, the number of
     cropland points, the cropland and noncropland ratios, and the running
     cropland and noncropland accumulations.]

Source:  Sampling files maintained by the EPA Field Studies Branch,
Washington, D.C.

-------
               $$\frac{19}{34} \times \frac{2}{2.257} = 0.49519$$
This cropland ratio  of 0.49519 was then added to the cropland accumula-
tion, which was the sum of the cropland ratios for all previously listed
sites in the State.
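
     The cropland ratio and its accumulation can be reproduced with a few
lines of Python.  This is a sketch for checking the arithmetic, not the
program actually used; the function names are hypothetical, and the
starting accumulation of 1988.99295 is the value quoted in this report for
the preceding site (5-2-1R).

```python
def cropland_ratio(cropland_points, total_points, county_rate_pct,
                   standard_rate_pct=2.0):
    """Cropland proportion at a site, adjusted to the standard 2% rate."""
    proportion = cropland_points / total_points
    return proportion * (standard_rate_pct / county_rate_pct)


def accumulate(ratios, start=0.0):
    """Running totals of the ratios, in listing order."""
    totals = []
    running = start
    for ratio in ratios:
        running += ratio
        totals.append(running)
    return totals


# Site 5-2-2R, Nemaha County, Kansas: 19 of 34 points, 2.257% county rate.
ratio = round(cropland_ratio(19, 34, 2.257), 5)
accumulation = accumulate([ratio], start=1988.99295)[0]
```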

     The procedure used to obtain the noncropland accumulation was
identical to that just described for the cropland accumulation.  However,
it was considered desirable to include federal noncroplands in the
RSN noncropland sample.  Although federal noncroplands had not been
mapped by the CNI, the CNI sampling procedure did assign PSU's in federal
noncropland areas.  That is, the CNI sample sites were selected without
regard to federal land status, but whenever a CNI sample site fell entirely
in a federal noncropland area, no CNI sampling points were assigned to
the  site.   In  order  to  obtain coverage  of these  federal noncropland
sites by the  RSN noncropland sample, the sampling staff obtained a list
of the  CNI sample  sites  in  federal  noncropland areas  for each State.
The sampling  staff  then inserted a "dummy" CNI record into the listings
of the  type  shown in Exhibit 3  for  each CNI site that fell entirely in
federal noncropland.  Each dummy record showed zero cropland points, and
a  total  number of points appropriate  for the size  of  the sample  site,
e.g., 36 points for a 160-acre PSU.

     The  grand  total  of  the  cropland  accumulation  from  Kansas  was
3426.67927, and  for  noncropland it was 3131.41689.   The  total of  these
accumulations, 6558.09616, was employed for estimation of the proportion
of  cropland  and noncropland acreage  in Kansas.   In particular,  the
estimate of the proportion of cropland acreage in Kansas was

               $$\frac{3426.67927}{6558.09616} = 52.25112878\%$$
This procedure provides  a  direct estimate of the proportion of cropland
acreage in  the State.   This estimated proportion of cropland was multi-
plied by an estimate of the total land area in Kansas, namely 52,510,720
acres, to yield an estimated cropland acreage in Kansas of

     (.5225112878) (52,510,720)    =    27,437,444 acres.
This same procedure was used for all States.
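
     The Kansas estimate can be checked as follows.  The sketch uses only
the figures quoted above; the function name is a hypothetical choice.

```python
def cropland_estimate(cropland_total, noncropland_total, land_area_acres):
    """Return (estimated cropland proportion, estimated cropland acreage)."""
    proportion = cropland_total / (cropland_total + noncropland_total)
    return proportion, proportion * land_area_acres


# Grand totals of the Kansas accumulations and the Kansas land area.
proportion, acres = cropland_estimate(3426.67927, 3131.41689, 52_510_720)
percent = 100 * proportion
```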

1.2.2.2  The RSN Survey

     The Rural Soils  Network (RSN) selected two subsamples from the CNI
sample  sites,  a  cropland  sample and a noncropland  sample.   The sample
design  of  the  RSN specified  that the  subsamples would  contain  0.025
percent  of  the cropland  acreage and 0.0025 percent  of  the noncropland
acreage in each State.  Thus the cropland sample in Kansas was to consist
of

-------
     (0.00025) (27,437,444)   =    6,859.36 acres.

Each RSN sample site was to be a 10-acre plot with an equal number of
plots sampled in each of four years.  Thus, the number of cropland
sample sites to be selected in Kansas in each of four years of sampling
was

     $$n = \frac{6{,}859.36 \text{ acres}}{(10 \text{ acres/site})\,(4 \text{ years})} \approx 171 \text{ sites/year}.$$
The number  of  10-acre noncropland sites to be sampled in each State was
determined  in  exactly the same manner.  Each RSN site was to be sampled
a second  time  four years after the  initial  sampling to determine rates
of change in pesticide residues.   Implementation of this design for all
States resulted in the sample sizes shown in Table 1.3.  This sample was
expected  to yield  reasonably precise estimates for cropping regions and
some of the larger States.
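
     The sample size calculation can be sketched as below.  The function
name is hypothetical, and rounding to the nearest whole site is an
assumption; the report states only the result.  The noncropland acreage
used in the second call is likewise an assumption, taken as the remainder
of the 52,510,720 acres of Kansas land area.

```python
ACRES_PER_SITE = 10
YEARS = 4


def sites_per_year(component_acres, sampling_pct):
    """Number of 10-acre sites per year for one component of one State."""
    sampled_acres = component_acres * sampling_pct / 100.0
    return round(sampled_acres / (ACRES_PER_SITE * YEARS))


kansas_cropland = sites_per_year(27_437_444, 0.025)
kansas_noncropland = sites_per_year(52_510_720 - 27_437_444, 0.0025)
```

Under these assumptions the four-year totals are 684 cropland sites and
64 noncropland sites, which agree with the Kansas entries in Table 1.3.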

     Having determined  the  number of RSN cropland sites  to be selected
in a  State,  a  systematic subsample of  CNI  cropland  points was selected
from  the  cropland  accumulation for the State.  Each CNI  cropland point
selected was used  to locate a 10-acre  RSN  cropland  sample site.  It is
easiest to explain this procedure by example.  The total of the cropland
accumulation for Kansas  was 3426.67927, and  171  cropland  sites were to
be surveyed in each of 4 years.  Thus, the starting point for the sample
in Kansas was a random number between zero and

          $$\frac{3426.67927}{(4)(171)} = 5.00976$$


The random  number  chosen was 0.27889, which determined the selection of
the  first RSN  cropland  site.  All  RSN  cropland sites in  Kansas  then
resulted from a sequence number of the form

          0.27889 + k (5.00976) for k = 0,1,2,..., [(4)(171)-1 = 683]

The RSN  cropland  site in Kansas that  was  considered previously in this
discussion resulted from the sequence number

          0.27889 + (397) (5.00976)     =    1989.15361,

as seen on the first line of Exhibit 4.
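
     The systematic selection can be sketched as follows.  The function
names are hypothetical, and the rule that a sequence number selects the
first site whose running accumulation reaches it is an interpretation
consistent with the Kansas example, not a quoted specification.  Only a
two-line excerpt of the accumulation listing is used here.

```python
from bisect import bisect_left


def sequence_numbers(start, interval, n_sites):
    """Sequence numbers start + k * interval for k = 0, ..., n_sites - 1."""
    return [start + k * interval for k in range(n_sites)]


def site_for(sequence_number, sites, accumulations):
    """First site whose running accumulation reaches the sequence number."""
    return sites[bisect_left(accumulations, sequence_number)]


# Kansas: random start 0.27889, interval 5.00976, 4 * 171 sites.
numbers = sequence_numbers(0.27889, 5.00976, 4 * 171)

# Excerpt of the cropland accumulation listing (sites 5-2-1R and 5-2-2R).
sites = ["5-2-1R", "5-2-2R"]
accumulations = [1988.99295, 1989.48814]
chosen = site_for(numbers[397], sites, accumulations)
```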

     The  sequence  number 1989.15361  not  only determined  that CNI  site
5-2-2R  of Nemaha  County,  Kansas  was  to be  included in  the cropland
sample  of the RSN;  it also  specified that a particular  point at  this
site was  to be used to locate the  10-acre  RSN site.  Hence, one of the
19 cropland points at this site was  determined by interpolation.  From
Exhibit  3,  the  following  cropland  accumulations   were  obtained  for
interpolation:

     State     County    CNI Site       Cropland Accumulation
      16        66       5-2-1R              1988.99295
      16        66       5-2-2R              1989.48814

-------
           Table 1.3:   Design Sample Sizes for the Rural Soils Network*

Census Division                          Component
  State                    Cropland     Noncropland       Total

New England                     80            96            176
  Maine                         32            48             80
  New Hampshire                  8            12             20
  Vermont                       20            12             32
  Massachusetts                  8            12             20
  Rhode Island                   4             4              8
  Connecticut                    8             8             16
Middle Atlantic                320           128            448
  New York                     152            60            212
  New Jersey                    20            12             32
  Pennsylvania                 148            56            204
East-North Central            1648           224           1872
  Ohio                         276            36            312
  Indiana                      312            28            340
  Illinois                     568            32            600
  Michigan                     220            68            288
  Wisconsin                    272            60            332
Pacific                        600           456           1056
  Washington                   180            92            272
  Oregon                       152           140            292
  California                   268           224            492
West-North Central            3596           456           4052
  Minnesota                    488            80            568
  Iowa                         608            28            636
  Missouri                     328            76            404
  N. Dakota                    636            48            684
  S. Dakota                    424            80            504
  Nebraska                     428            80            508
  Kansas                       684            64            748

*Source:
Wiersma, G.B., Sand, P.F., and Cox, E.L. (1971).  A Sampling Design to
Determine Pesticide Residue Levels in Soils of the Conterminous United
States.  Pesticides Monitoring Journal 5(1), pp. 63-66.

-------
       Table 1.3:   Design Sample Sizes for the Rural Soils Network
                                 (continued)

Census Division                          Component
  State                    Cropland     Noncropland       Total

South Atlantic                 556           376            932
  Delaware                      12             4             16
  Maryland                      52            12             64
  Virginia                      84            56            140
  W. Virginia                   24            36             60
  N. Carolina                  124            68            192
  S. Carolina                   68            40            108
  Georgia                      120            80            200
  Florida                       72            80            152
East-South Central             452           244            696
  Kentucky                     124            52            176
  Tennessee                    112            56            168
  Alabama                       92            72            164
  Mississippi                  124            64            188
West-South Central            1300           552           1852
  Arkansas                     188            64            252
  Louisiana                    108            60            168
  Oklahoma                     260            84            344
  Texas                        744           344           1088
Mountain                       916          1280           2196
  Montana                      340           200            540
  Idaho                        132           120            252
  Wyoming                       68           148            216
  Colorado                     240           140            380
  New Mexico                    40           192            232
  Arizona                       36           176            212
  Utah                          48           128            176
  Nevada                        12           176            188

Grand Total                   9468          3812          13280

-------
Exhibit 4: KFRN Cropland Sampling

-------
The interpolation proceeded as follows:

$$\frac{1989.15361 - 1988.99295}{1989.48814 - 1988.99295}\,(19) = 6.16 \rightarrow 7.$$

The interpolation figure was rounded up since an integer from 1 to 19
was required. In this case, the seventh cropland point at CNI site
5-2-2R was to be used to locate the RSN cropland site, as is also speci-
fied in Exhibit 4.
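
The interpolation above can be written as a short function. This is a sketch with a hypothetical function name; the guard that returns at least 1 covers a boundary case the report does not discuss and is an assumption.

```python
import math


def locate_point(sequence_number, prev_accumulation, accumulation, n_points):
    """Index (1 to n_points) of the cropland point picked by interpolation."""
    span = accumulation - prev_accumulation
    fraction = (sequence_number - prev_accumulation) / span
    # Round up, since an integer from 1 to n_points is required.
    return max(1, math.ceil(fraction * n_points))


# CNI site 5-2-2R: 19 cropland points, accumulations taken from Exhibit 3.
point = locate_point(1989.15361, 1988.99295, 1989.48814, 19)
```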

Once a defining point for a RSN site had been selected, an adjacent
second cropland (or noncropland) point was required in order to completely
determine the location of the 10-acre RSN site. If X is used to denote
the defining cropland point selected from the CNI sample, an adjacent
cropland point was to be determined by considering the other CNI sample
points in the order indicated below:

                 4
              3  X  1
                 2

If an acceptable second cropland point could not be located as indicated,
then the next cropland point in the listing was taken as a first point
and the routine repeated.4 This procedure was implemented in the USDA
offices prior to field work, and some discretion was allowed. The
intention was clearly that an RSN cropland site should not be placed at
an isolated cropland point.
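
The search for an adjacent point can be sketched as below. The representation is an assumption made for illustration: sample points are identified by hypothetical (row, column) grid positions with rows increasing downward, and "acceptable" is reduced to "is a cropland point", whereas in practice some discretion was allowed.

```python
# (row, col) offsets in the order 1, 2, 3, 4 of the diagram:
# right of X, below X, left of X, above X.
NEIGHBOR_OFFSETS = [(0, 1), (1, 0), (0, -1), (-1, 0)]


def find_site_points(listing, is_cropland, start):
    """Return (defining point, adjacent point), or None if no pair exists.

    listing: cropland points as (row, col) tuples, in listing order.
    is_cropland: dict mapping each (row, col) sample point to True/False.
    start: index in `listing` of the point chosen by interpolation.
    """
    for row, col in listing[start:]:
        for d_row, d_col in NEIGHBOR_OFFSETS:
            neighbor = (row + d_row, col + d_col)
            if is_cropland.get(neighbor, False):
                return (row, col), neighbor
    return None


# Small made-up grid: the first cropland point is isolated, so the
# routine moves on to the next cropland point in the listing.
is_cropland = {(0, 0): True, (0, 1): False, (1, 0): False,
               (1, 1): True, (1, 2): True}
listing = [(0, 0), (1, 1), (1, 2)]
pair = find_site_points(listing, is_cropland, 0)
```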

After two points had been selected, a designation was made on an
aerial photograph or other map of a 10-acre site with these points
centrally located. Attention was given to making the boundaries conform
with natural physical features as much as possible. Implementation of
this procedure can be illustrated by Exhibits 1 and 2. The design
specified that the seventh cropland point was to be used to locate the
RSN site. From Exhibit 2, it can be seen that the seventh cropland
point is the eighth CNI sample point. In Exhibit 1, it can be seen that
the depicted RSN site was, indeed, centered about the eighth and ninth
CNI sample points, both cropland points.

The field person was permitted to adjust the boundaries of the
designated 10-acre RSN site and was expected to prepare records so that
the site could be readily relocated for subsequent sampling at 4-year
intervals. The final sample location was to be not less than 8 acres.
If the designated site should prove to be totally unacceptable,5 the
field person was permitted the following alternatives in order of
preference:

1) Try to find 10 acres within the CNI site that are acceptable.

2) Try to find 10 acres within one-fourth mile of the CNI site
that are acceptable.
Memorandum entitled "Soil Monitoring Program — sampling design" from
Leo G.K. Iverson to USDA PPC Inspectors.
5
The authors were not able to find an explicit definition of "totally
unacceptable."
-------
3) Try to locate two smaller sites within the CNI site that equal
10 or nearly 10 acres. Sample as if they were a single site.

4) Request the USDA staff at Hyattsville to re-select the CNI
site.6
Substitute CNI sites were selected in a number of cases. The
substitutes were chosen from within the same county as the original
site. An effort was made to choose a substitute CNI site with approxi-
mately the same proportion of cropland points as the original CNI site.
However, since a random sequence number was not used to determine the
substitute site, it was necessary to randomly designate a point within
the substitute CNI site to locate the 10-acre RSN site. It is not clear
that this randomization was always performed.

There are several reasons why substitute CNI sites were sometimes
required. Re-selections were performed by the USDA staff at Hyattsville
before the sample went to the field when the selected CNI site was
already in use by the USDA. For example, the selected CNI sites were
occasionally found to be in use by

a) the Soil Conservation Service for their crop estimates,
b) the Economic Research Service for their Pesticide Use Survey,
c) the June Enumerative Survey of the Statistical Reporting
Service.

Re-selections were sometimes necessary after the sample went to the
field because the land owner refused to cooperate. Some re-selection
was necessary because of a change of land use status. Unfortunately,
substitute sites are not designated as such on the computer records.
This is especially problematic if a substitute was selected in the
second round of data collection. First round and second round data
cannot be compared directly for a site if a substitute has been used.

1.2.3 Limitations as a Monitoring Network

The Rural Soils Network (RSN) design specified that 0.025 percent
of the cropland acreage and 0.0025 percent of the noncropland acreage
were to be sampled in each State. This criterion resulted in sample
sizes that vary considerably from one State to another. Rhode Island
received the fewest sampling units, four each of cropland and noncrop-
land. Texas received the most, 744 cropland sites and 344 noncropland
sites. Thus, reliable estimates of average pesticide levels are not
available for some geographic areas. This is a minor limitation because
estimates are not generally required for small geographic areas. The
deletion of some States when the design was implemented restricts the
population to which inferences are valid, however.
6
Shepherd, D.R. PPC Division Memorandum 804.3 concerning "Guidelines
for collecting samples for the National Soil Monitoring Program—1969."
-18-
-------
More significantly, the following factors must be noted:

(1) The current design was found to be too expensive to operate.

(2) The network as it stands was not designed to monitor
non-pesticide toxic materials; hence it may be inadequate, particularly
for non-agricultural areas and localized contaminants.

(3) The stratification is now 15 years out of date, which means
losses in efficiency.

(4) The two-phase design makes it difficult to estimate precision.

1.2.4 Uses in Regulatory Action

The Rural Soils Network (RSN) could be used to identify pesticides
and other widely dispersed toxic substances for which regulatory action
is desirable. Each sample site of the RSN was to be sampled every four
years. Thus, significant increases in average levels of specific
substances could potentially be discovered. Moreover, since residue
levels were determined for both soils and crops, the relationship between
soil and crop residue levels could be used to identify potentially
dangerous levels of soil residue. For example, if a pesticide level in
corn that is dangerous for humans has been identified, the relationship
between soil and corn concentration of that pesticide could be used to
determine a corresponding dangerous level of the pesticide in soil.

The RSN could also be used to monitor the effects of regulation of
specific toxic substances. Because each RSN site is sampled every four
years, the network could monitor the effect of the regulation on levels
of the toxic substances in soils and crops.

The RSN may be of limited use, however, in identifying specific
violators of regulatory action. This limitation results from the very
design of the RSN: its sites are selected by a random process at a
given sampling rate, and the locations of specific sites are kept
confidential to protect the farm operator. Specific localities of
interest may not enter the RSN sample, but the design framework could
serve as the basis for special studies in suspected "hot spots."

1.2.5 User Needs and Historical Uses of the Data

The historical objectives of the Rural Soils Network (RSN) were as
follows:7

(1) Determine levels of pesticides and other pollutants in the
agricultural environment.

(2) Observe trends in pollutant levels through time.

(3) Determine the degree to which crops are contaminated.

(4) Determine the levels of various pollutants in agricultural
waters.
7
Shepherd, D. R. PPC Division Memorandum 804.3 concerning "Guidelines
for collecting samples for the National Soil Monitoring Program—1969."
-19-
-------
(5) Determine the concentration of certain pollutants at various
depths in the soil profile.

(6) Review program findings with recommendation of appropriate
actions in mind.

The six objectives listed above comprise the major historical user needs
for the RSN data. The regulatory uses considered in section 1.2.4 are
included in objective (6) above.

The implementation of the RSN allows only partial fulfillment of
the six objectives listed above. It appears that objective (5) has been
abandoned since soil data has been collected only for the top three
inches of soil. Objective (4) has only been partially addressed by
sampling pond water and sediment during a single fiscal year. Most
States have follow-up data with which to address objective (2) for only
one-fourth of the cropland sites and none of the noncropland sites.

1.3 Alternate Survey Designs for the RSN

The Rural Soils Network (RSN) is a probability sample of the rural
areas of the conterminous United States. A probability sample is essen-
tial as an objective basis for making inferences. The RSN is, however,
a subsample of the 1967 Conservation Needs Inventory (CNI). It relies
upon the CNI to identify the cropland and noncropland strata, as in
double sampling schemes. As the 1967 CNI became outdated, sites were
found in the field to no longer belong to the intended stratum, cropland
or noncropland. It has been the practice for the field personnel of the
RSN to use substitute sites in these cases. The use of substitute sites
tends to destroy the probabilistic nature of the sample and is not
generally recommended, however.8 Resumption of RSN data collection is
likely to result in many sites being misclassified.

Thus, sampling considerations alone suggest that a new RSN sample
is needed. In addition, a new sampling design should address the problem
of monitoring toxic substances other than agricultural chemicals and
should attempt to reduce the cost of the monitoring network. The expense
of the RSN led to purposive deletion of entire States in the past, which
restricts the population to which valid inferences can be made. Various
alternative designs will now be considered.

1.3.1 Design Option One

A minimal change alternative would be to subsample the current RSN
on a probability basis. This option mainly addresses the problem of the
cost of the RSN; however, it does also address the need for regional and
national estimates. Any need to eliminate reliance upon the 1967 CNI is
not addressed.

This option does have some advantages, however. Its main advantage
is that it can be implemented quickly and easily, possibly while other
alternatives are under development. Another advantage is that direct
comparison could be made to the data collected from 1968 to 1975.
8
See, e.g., page 386 of Kish, Leslie (1965). Survey Sampling. Wiley.
-20-
-------
Careful treatment of the sites found to no longer belong to the
intended stratum would be necessary. There are at least three ways that
these sites could be handled. One possibility would be to drop these
sample sites entirely. There would be a loss in precision for estimates,
and the sampling weights would have to be adjusted to reduce the bias
that would result from deletion of these sites. Alternatively, substitute
sites could be selected, as has been done historically with this sample.
However, the use of substitute sites introduces bias that cannot be
measured or adjusted. Finally, sites can be retained as selected. This
keeps the initial weight correct and provides unbiased estimates at the
cost of a decrease in precision. The computerized data records would
need to indicate the resolution of each of these cases: whether the
site was dropped, replaced by a substitute, or retained in its
original stratum. If as many as 10 percent of the sample sites require
either deletion or substitution, this design option may not be reasonably
efficient.
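
The weight adjustment mentioned for the first alternative (dropping
sites) can be sketched as a weighting-class adjustment within strata.
This is one common approach, assumed here for illustration; the report
does not specify the form of the adjustment, and the field names
`stratum`, `weight`, and `retained` are hypothetical.

```python
from collections import defaultdict


def adjust_weights(sites):
    """Rescale weights of retained sites so each stratum keeps its weight total.

    `sites` is a list of dicts with hypothetical keys:
    'stratum' (label), 'weight' (sampling weight), 'retained' (bool).
    A stratum in which every site was dropped contributes nothing to the
    output; that case would need separate treatment (e.g. collapsing strata).
    """
    total = defaultdict(float)  # weight of all selected sites, by stratum
    kept = defaultdict(float)   # weight of retained sites, by stratum
    for site in sites:
        total[site["stratum"]] += site["weight"]
        if site["retained"]:
            kept[site["stratum"]] += site["weight"]

    adjusted = []
    for site in sites:
        if not site["retained"]:
            continue
        factor = total[site["stratum"]] / kept[site["stratum"]]
        adjusted.append({**site, "weight": site["weight"] * factor})
    return adjusted
```

The adjustment reduces bias only to the extent that dropped and
retained sites within a stratum are similar; it does not recover the
precision lost by deleting sites.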

Data analysis problems would be aggravated by subsampling the
present RSN. The deep stratification of the 1967 CNI results in strati-
fication benefits for sample variances for the RSN. However, the sparse-
ness of the RSN sample in comparison to the CNI sample makes recovery of
the stratification effects difficult (See section 1.7). The major
problem is that many counties have no more than one RSN site. The
magnitude of this problem would necessarily increase with a subsample of
the current RSN.

Thus, replicate subsamples are recommended if this option is to be
implemented, even if it is only on a temporary basis. For example, if
50 percent of the RSN sites are to be surveyed, five subsamples that
each comprise a 10 percent subsample could be used. At least five
replicate subsamples should be selected. A defensible procedure for
selecting the replicate subsamples would be to first order the RSN sites
by States and CNI strata within States, and then to select independent
systematic subsamples. This procedure would insure representation
of all States and as much CNI stratification as possible in each of the
replicate subsamples (or, technically, 'pseudo-replicate' subsamples).

The use of replicate subsamples would make it possible to estimate
easily sample variances by using the theory of replicate subsamples.9
The results of interest would initially be tabulated separately for each
independent subsample. The variance of these results treated as indepen-
dent measurements provides a simple, unbiased estimate. The resulting
variance estimate captures all design effects, although stratification
effects and design effects are not separately estimable. This is not of
major consequence for the present RSN sample, since only one stage of
sampling is employed within CNI strata.
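
A minimal sketch of the replicate-subsample procedure follows. It
assumes each site is a record with hypothetical fields `state`,
`stratum`, and `value`, draws each systematic subsample with its own
random start, and estimates the variance of the overall mean from the
spread of the replicate means.

```python
import random
from statistics import mean


def systematic_subsample(ordered_sites, interval, rng):
    """Random start in [0, interval), then every interval-th site."""
    start = rng.randrange(interval)
    return ordered_sites[start::interval]


def replicate_variance(estimates):
    """Variance of the mean of k replicate estimates treated as independent."""
    k = len(estimates)
    if k < 2:
        raise ValueError("at least two replicates are needed")
    overall = mean(estimates)
    return sum((e - overall) ** 2 for e in estimates) / (k * (k - 1))


def replicate_estimate(sites, n_replicates=5, interval=10, seed=0):
    """Return (overall mean, estimated variance) from replicate subsamples.

    Assumes the list holds at least `interval` sites so that no
    subsample is empty.
    """
    rng = random.Random(seed)
    # Order by State, then by CNI stratum within State, before sampling.
    ordered = sorted(sites, key=lambda s: (s["state"], s["stratum"]))
    estimates = [
        mean(s["value"] for s in systematic_subsample(ordered, interval, rng))
        for _ in range(n_replicates)
    ]
    return mean(estimates), replicate_variance(estimates)
```

With five replicates at an interval of 10, each replicate is a 10
percent subsample, matching the example in the text. Because the
starts are drawn independently, two replicates can occasionally
coincide.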

It might also be useful to select the subsamples at different rates
within domains of interest. The present RSN sample has widely different
sample sizes within the Census Divisions, and within the cropping regions.
If cropping regions comprise the major domains of interest, they could
be subsampled at differential rates so that each received about the same
9
See, e.g., page 19 of Cochran, W.G., Mosteller, F., and Tukey, J.W.
[1975]. Principles of Sampling. Journal of the American Statistical
Association, 70: 13-35.
-21-
-------
number of RSN sites. Alternatively, Census Divisions could be subsampled
at differential rates, which might considerably reduce the sample size
in some of the larger States, like Texas.
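
The differential rates described above can be sketched as follows. The
domain names and site counts in the usage note are hypothetical and
are not RSN figures.

```python
def equalizing_rates(domain_counts, target_per_domain):
    """Subsampling rate per domain so each yields about the same number of sites.

    The rate is capped at 1.0: a domain with fewer sites than the target
    is taken in full. Domains with no sites are skipped.
    """
    return {
        domain: min(1.0, target_per_domain / count)
        for domain, count in domain_counts.items()
        if count > 0
    }
```

For example, with hypothetical counts of 200, 50, and 20 sites and a
target of 40 sites per domain, the rates are 0.2, 0.8, and 1.0. The
sampling weight of each retained site would then be multiplied by the
reciprocal of its domain's rate.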

Finally, identification of strata of special interest within the
domains just considered could be used to increase the possibility of
finding toxic substance residues. For example, the noncropland RSN
sites could be stratified into industrial and nonindustrial areas.
Sites in nonindustrial areas could then be sampled at a lower rate than
sites in industrial areas. Stratification according to whether or not
toxic residues have previously been found at the site may be useful
also. Widely different sampling rates would not be used for these
strata, however, because they would form a far from homogeneous group.

1.3.2 Design Option Two

The present RSN sample is a subsample of the 1967 Conservation
Needs Inventory (CNI). A design analogous to the design that produced
the present RSN sample could be based upon the 1982 National Resources
Inventory (NRI). Use of the 1982 NRI would provide up-to-date land use
information. The NRI was designed by the Statistical Laboratory at Iowa
State University, and is currently being conducted by the Soil Conserva-
tion Service. The design of the NRI is similar to that of the CNI,
except that the standard sampling practice is to collect land use data
for exactly three random sampling points within each primary sampling
unit (PSU) of the NRI.10 Also, the NRI is based upon a more dense
sample than was the CNI. Consequently, data collection for the NRI is
spread over three years, 1980 to 1982.

The procedure used to select the RSN subsample from the CNI sampling
points resulted in a sample that was essentially self-weighting within
States where only one size of PSU was used (See Appendix D). Equal
weighting was an important consideration before the development of
computer software for the analysis of unequal probability samples. The
unweighted analysis of data from sample sites selected with unequal
probabilities can well lead to spurious conclusions.
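
The effect of ignoring unequal selection probabilities can be
illustrated with a small sketch. The estimator below is the usual
weighted (ratio) mean, with each weight equal to the reciprocal of the
site's selection probability. The data in the usage note are
hypothetical.

```python
def weighted_mean(values, selection_probs):
    """Mean with each observation weighted by 1 / its selection probability."""
    weights = [1.0 / p for p in selection_probs]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def unweighted_mean(values):
    """Simple average that ignores the selection probabilities."""
    return sum(values) / len(values)
```

Suppose two sites with residue value 10 were selected with probability
0.5 and two sites with value 0 were selected with probability 0.1. The
unweighted mean is 5, while the weighted mean is 40/24, or about 1.67,
because the oversampled high-residue sites each represent fewer units
of the population.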

Since software is now available for the analysis of unequal probabil-
ity samples, an improved subsampling procedure can be devised. The goal
of the subsampling procedure is to obtain adequate precision at minimum
cost. This can be accomplished by identifying areas where toxic residues
are likely to be found and giving these areas a higher probability of
selection. It is, of course, important that all areas have a positive
probability of being in the sample so that statistical inferences will
be valid for the entire population.

It is suggested that counties be used as primary sampling units for
the second phase sample. The data from the present RSN suggests that
counties are generally rather heterogeneous with respect to toxic
residues. Thus, it would be advantageous to select relatively few
counties with a relatively large number of sample sites, say 5 to 10,
within each sample county. The use of counties as PSU's will reduce
10
The NRI sampling design also includes pilot studies of alternative
sampling designs in California, Louisiana, and Maine. In Louisiana (and
in 40-acre PSU's), there is only one random sampling point within a PSU.
-22-
-------
travel costs associated with data collection. More importantly, however,
smaller areas like counties can be stratified more effectively into
areas where toxic residues are likely to be found.

The RSN sample sites are to be located at NRI sample points. Thus,
sample counties are selected from the counties occurring in the NRI
sample, and in such a way that counties where toxic substance residues
are likely to occur have a greater chance of selection. It is therefore
suggested that
counties be selected with probability proportional to size (PPS), where
the size measure is a measure of the likelihood for finding toxic
residues. Selection of PSU's with PPS sampling is a common technique
with resulting variances of estimates reduced to the extent that the
size measure is correlated with items of interest. Variables that can
be used to construct county size measures include:

(1) Proportion of county acreage in cropland.
(2) Proportion of county acreage in heavy industry.
(3) Intensity of agricultural activity.
(4) Degree of industrialization.
(5) Predominant crops.
(6) Predominant industries.
(7) Predominant soil types.
(8) Climate.

Counties should be selected with PPS sampling within Census Divisions,
cropping regions, or some other domains to insure adequate representation
of the major domains of interest.

After sample counties have been selected, the NRI sampling points
can be used to locate RSN sample sites. The procedure used for the
current RSN cannot be used, however, since most PSU's of the NRI have
exactly three sampling points and some have only one sampling point.
Thus, it is no longer feasible to center an RSN cropland site about two
cropland sampling points. Instead, if a cropland point is selected for
the location of a cropland sample site, it is suggested that the site be
a square 10-acre site centered at the selected RSN cropland point. If
such a site is not all cropland, percent cropland will be noted and
specimens taken and kept separately for each stratum.

Efficient sampling within the selected counties could result from
careful stratification within the sample counties. The NRI sampling
points within a county could first be stratified into cropland points
and noncropland points, to insure adequate representation of each of
these land types and because agricultural chemical residues are more
likely to be found in cropland. Local land use characteristics similar
to those suggested for constructing county size measures could be used
to further stratify both the cropland points and the noncropland points.
Finally, greater selection probabilities would be used in strata where
toxic substance residues are more likely to be found. Moreover, at
least one cropland site and one noncropland site should be selected from
each sample county that contains at least one NRI cropland and one
noncropland sample point.
-23-
-------
1.3.3 Design Option Three

1.3.3.1 Background

The target population for the National Soil Monitoring Program
(NSMP) was the land in the conterminous United States, divided between
the Rural Soils Network (RSN) and the Urban Soils Network (USN). Descrip-
tions of these networks are given elsewhere.11 Both networks were
interested in "levels," i.e., the absolute amount of pesticide in the
soil, and "trends," the change in this amount with time.

Review of the data indicates large numbers of zero valued observa-
tions, and relatively few positive observations. This analytical
challenge has been discussed elsewhere [See Lucas et al, Recommendations
for the National Surface Water Monitoring Program for Pesticides.
Report No. RTI/1864/01-02I]. The conclusion of that analysis was that
the appropriate measures of "level" are:

(1) The proportion of positive detections, that is, the relative fre-
quency of last stage sampling units positive for the substance(s) under
investigation, and

(2) The proportion of sampling units containing concentrations of sub-
stance above some specified level. This level may signal the existence
of an undesirable situation.

(3) The geometric mean of the positive values, which is a useful
concomitant to the data, identifying situations where, for example, the
proportion of positive sampling units remains constant, but the level of
concentration of toxic substance increases or decreases.

(4) Related to (3), measures based on a truncated, or censored, lognor-
mal model may prove useful.12
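
Measures (1) through (3) can be computed as in the following sketch.
It is unweighted for simplicity, so it ignores the sampling weights a
real analysis would use. The function and argument names are
hypothetical.

```python
import math


def level_measures(concentrations, action_level):
    """Return (proportion positive, proportion above action_level, geometric mean).

    The geometric mean is taken over the positive values only and is
    None when there are no positive values.
    """
    n = len(concentrations)
    positives = [c for c in concentrations if c > 0]
    prop_positive = len(positives) / n
    prop_above = sum(1 for c in concentrations if c > action_level) / n
    if positives:
        geo_mean = math.exp(sum(math.log(c) for c in positives) / len(positives))
    else:
        geo_mean = None
    return prop_positive, prop_above, geo_mean
```

For the hypothetical observations 0, 0, 0, 1.0, and 4.0 with an action
level of 2.0, the proportion positive is 0.4, the proportion above the
action level is 0.2, and the geometric mean of the positive values is
2.0.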

In the following sections, a two-stage design is proposed, and each
stage of sampling is described in some detail. Simple cost and variance
models are included as a means of investigating the effect and expense of
various alternative sample allocations.

1.3.3.2 Overview of the Proposed Sample Design

The proposed design is a two-stage area probability sample with
stratification of the sampling units at each level. The first stage or
primary sampling units (PSU's) are counties. The 3141 counties in the
United States in aggregate constitute the total land area of the country.
Geographic stratification is provided by the four Census Regions.
Allocation of PSU's to these regions is in proportion to the land area
eligible for the study.
11
National Soils Monitoring Program: Preliminary Report. January,
1980. Research Triangle Institute. EPA Contract No. 68-01-5848.

12
Owen and DeRouen. Estimation of the mean for lognormal data containing
zeros and left-censored values, with application to the measurement of
worker exposure to air contaminants. Biometrics 36:707 (1980).
-24-
-------
The question of land area eligibility is currently defined by the
membership requirements of the RSN and the USN. It may be advantageous
from administrative as well as fiscal and statistical grounds to combine
the activities of the soil networks, and consider SMSA counties as a
stratum within the survey. This point requires further review beyond
the scope of this study and is not addressed here. Initial investigation does
suggest that savings may reasonably be anticipated. Further discussion
is limited to tasks assigned to the RSN.

With the extension of monitoring responsibility from pesticides to
toxic substances in general, some revision of the approach seems
indicated. The following stratification variables are therefore proposed
in addition to Census Regions for the PSU's:

(1) Land area,
(2) Population density,
(3) Agricultural activity, and
(4) Industrial activity.

Second stage sampling units (SSU's) are 10-acre plots. These are
proposed as the final stage units or analysis units on the assumption
that they are sufficiently homogeneous that the effects of subsampling
are negligible. This is a verifiable proposition. The problem with
SSU's this small is the ability to locate them in the field. The require-
ment for exactly locating plots is exacerbated by the absence of identi-
fiable boundaries, rendering the task most difficult. To ease this
difficulty, Census enumeration districts (ED's) are proposed as readily
identifiable segments. The problem is reduced to locating the SSU
within the ED, or any suitable subsegment adopted to facilitate matters.

SSU's will be allocated equally to PSU's. A detailed field protocol
will locate the points for specimen collection, leaving the minimum of
discretion for the field personnel in the selection of these sites. The
protocol would specify a grid locating multiple specimen collection
sites. The soil collected in a given plot would be composited, unless
the homogeneity of the 10-acre plot is under investigation.

Temporal effect is not considered. It is assumed for establishing
budget only that one collection per site per year will be made. However,
it does not seem reasonable that all toxic substances persist in soils
at stable levels throughout the year. This may be satisfactory for
heavy metals, particularly at poorly drained sites, but most pesticides
dissipate through leaching, transpiration and degradation following
application, and volatiles in all likelihood leave the soil almost
immediately. Thus, special studies of this phenomenon are recommended
above the monitoring effort.

1.3.3.2.1 The First Stage Sample

The first stage sampling units are counties, which are often used
as sampling units in national surveys. They are easily identified and
are political units of sufficient size that a great deal of information
is available about them. Indeed, in order to enhance the efficiency of
the proposed design, it is recommended that extensive collection of
-25-
-------
information be undertaken for each county in the U.S. This information
should include:

1) Total land area
2) Cropland and non-cropland acreages, or their estimates
3) Soil maps, characteristics - pH, organic content, etc.
4) Drainage areas and water ways
5) Weather, climatic and meteorologic data
6) Location and size of urban areas
7) Cropping patterns, major crop(s)
8) Location and types of industrial activities, including
storage sites
9) Location of dump sites
Moreover, the Master Area Frame maintained by the USDA should be consulted
for design information, as well as States with mandatory pesticide
reporting laws.

The size measure for the Census Region is its eligible land area.
Other measures correlated with toxic substance use do not appear feasible
at this level in view of the variability in land use. The present
proposal uses the definitions of the RSN to determine eligibility. The
number of counties (PSU's) allocated to each Census Region is in propor-
tion to its size, with at least one PSU selected from each region. The
allocation of PSU's to further strata is carried on in this fashion with
the limitation that there must be at least one PSU in each stratum.
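
The allocation rule can be sketched as below. The rounding method is
an assumption made for illustration (floor of the proportional share,
a minimum of one, then adjustment toward the largest or smallest
remainder); the report states only that allocation is proportional to
size with at least one PSU per region or stratum.

```python
import math


def allocate_psus(region_sizes, n_psus):
    """Allocate n_psus to regions in proportion to size, at least one each."""
    if n_psus < len(region_sizes):
        raise ValueError("need at least one PSU per region")
    total = sum(region_sizes.values())
    quota = {r: n_psus * size / total for r, size in region_sizes.items()}
    alloc = {r: max(1, math.floor(q)) for r, q in quota.items()}

    # Hand out any remaining PSU's to the regions furthest below quota.
    while sum(alloc.values()) < n_psus:
        short = max(alloc, key=lambda r: quota[r] - alloc[r])
        alloc[short] += 1
    # If the minimum-of-one rule overshot, trim regions furthest above quota.
    while sum(alloc.values()) > n_psus:
        candidates = [r for r in alloc if alloc[r] > 1]
        over = min(candidates, key=lambda r: quota[r] - alloc[r])
        alloc[over] -= 1
    return alloc
```

With hypothetical eligible areas of 5, 30, 35, and 30 units and 57
PSU's, the proportional shares are 2.85, 17.1, 19.95, and 17.1, and
the sketch allocates 3, 17, 20, and 17 PSU's.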

PSU's in each stratum will be selected with probability proportional
to size (PPS) and with replacement. As before, the size measure is the
land area eligible for the RSN.
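
A minimal sketch of PPS selection with replacement follows, using the
standard-library weighted draw. County names and sizes are
hypothetical inputs; the size measure would be the eligible land area.

```python
import random


def pps_with_replacement(sizes, n_draws, rng):
    """Draw n_draws counties with probability proportional to size.

    `sizes` maps county name to its size measure. Because draws are made
    with replacement, a county can be selected more than once.
    """
    names = list(sizes)
    weights = [sizes[name] for name in names]
    return rng.choices(names, weights=weights, k=n_draws)


def draw_probabilities(sizes):
    """Single-draw selection probability of each county."""
    total = sum(sizes.values())
    return {name: size / total for name, size in sizes.items()}
```

The single-draw probabilities are what a with-replacement estimator
would use to weight each selected county; a county with a size measure
of zero can never be selected, so every eligible county needs a
positive size.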

It is anticipated that the investment in the collection of the
county level information will provide substantial gains in precision
through effective stratification. The purpose of this stratification
will be to locate regions of approximately equal risk of exposure to
toxic substances, hence permit the effective location of sample sites.

Two points can now be made:

(1) The most effective variables for stratification will change
for different classes of toxic substances, and may change from
substance to substance, and

(2) It is not possible to anticipate which substances will be of
major interest in the future.

This leads to the conclusions:

(a) Information may be profitably collected for every county in
the United States, and

(b) Any proposed design should be as flexible as possible.
-26-
-------
Point (a) supports point (b) above by simplifying the process of making
design changes if and when they become necessary. Additionally, the
selection of stratification variables which appear to be both general
and effective offers the possibility of achieving a flexible and effi-
cient design over the near future.

The approach is to propose the selection of PSU's according to the
general stratification scheme which is found most effective at the time
of the adoption of the design. These PSU's would then establish the
monitoring network (RSN). The selection of the SSU's within the given
PSU's according to the procedure below would then determine the specific
soil specimen sites. However, it is proposed that the stratification
variables within the PSU's, and hence the soil specimen sites, be allowed
to change in response to changing interest in toxic substances. It is
intended by this technique to maximize the probability of positive
results to monitoring efforts.

1.3.3.2.2 The Second Stage Sample

The secondary sampling units (SSU's) will be 10-acre plots. Equal
numbers will be selected within each PSU. It is possible that there
will be more strata within PSU's than sampling units. This suggests
that stratified random sampling will not apply. There are a number of
related methods which can be used in this situation.

One procedure is to use a composite index combining several strati-
fication variables. In effect, two or more strata are combined and a
'weight' is assigned to each observation in the new stratum based on the
relative sizes of the original strata. Observations are then selected
from the new strata by the usual probability methods.

A second procedure is to consider the effect of combinations of the
strata and assure at least one observation from important combinations
is selected. This can be accomplished by employing the lay-out of an
experimental design as if the strata were treatment levels. The Latin
square is used in this fashion. For example, consider the following
case with two stratification variables each at 3 "levels."
Table A. A Latin Square Selection Scheme

Geographic Location
Type 1 Type 2 Type 3

Type 1 x
Soil Types Type 2 x
Type 3 x

x = selected plot.
Here with a sample of 3 plots we have observations from each type of
location (possibly classified by potential exposure) and of each soil
-27-
-------
type. This can be done by choosing a "cell" (Soil Type x Location
Type Combination) at random, then eliminating the remaining cells in the
same row and column from further consideration. A second selection is
made at random from the cells in the remaining rows and columns. The
row and column containing the second selection are then eliminated and
the next random choice is made. This procedure is continued until all
the rows and columns are eliminated.
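
The elimination procedure just described can be sketched as follows.
Rows and columns are indexed from 0, and the function name is
hypothetical.

```python
import random


def latin_square_selection(n_levels, rng):
    """Select one cell per row and per column of an n_levels x n_levels table.

    Picking a row and a column independently at random from those that
    remain is the same as picking one of the remaining cells at random.
    """
    rows = list(range(n_levels))
    cols = list(range(n_levels))
    selected = []
    while rows and cols:
        r = rng.choice(rows)
        c = rng.choice(cols)
        selected.append((r, c))
        rows.remove(r)  # eliminate the chosen row
        cols.remove(c)  # eliminate the chosen column
    return selected
```

Calling `latin_square_selection(3, random.Random())` returns three
(soil type, location type) cells in which every soil type and every
location type appears exactly once.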

A third procedure generalizes the approach above and is called
"controlled selection". The typical use of this procedure is to visualize
the sample in a tabular array as:
Table B.  Example of Controlled Selection

                          Geographic Location
                    Site 1   Site 2   Site 3   Site 4   Total

            Type 1                                       n1.
Soil Type   Type 2                                       n2.
            Type 3                                       n3.

            Total    n.1      n.2      n.3      n.4       n

Here the total number of plots assigned to a PSU, say, is n. The
constraint, or "control", imposed is that the margins of the table, the
row and column totals (or proportions if preferred), be satisfied. So,
Site 1 must appear n.1 times and Soil Type 2 must appear n2. times, and
so on. Any arrangement of the sample among the table cells which satis-
fies these constraints is acceptable. And, at least conceptually, every
such arrangement, or a specified subset, is written down and a probabil-
ity assigned to it. Then one of these arrangements is selected by
chance according to the assigned probability.
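
For a small table, the conceptual procedure can be sketched by
enumerating every arrangement that satisfies the margins and drawing
one. Equal probabilities are assumed by default purely for
illustration; in practice the probabilities would be assigned to suit
the design. All names are hypothetical.

```python
import random


def _rows(total, caps):
    """Yield every list x with sum(x) == total and 0 <= x[i] <= caps[i]."""
    if not caps:
        if total == 0:
            yield []
        return
    for first in range(min(total, caps[0]) + 1):
        for rest in _rows(total - first, caps[1:]):
            yield [first] + rest


def tables_with_margins(row_totals, col_totals):
    """Yield every table of non-negative counts matching both margins."""
    if sum(row_totals) != sum(col_totals):
        raise ValueError("row and column totals must agree")

    def fill(rows_left, cols_left):
        if not rows_left:
            yield []  # totals agree, so no column capacity remains
            return
        for row in _rows(rows_left[0], cols_left):
            remaining = [c - x for c, x in zip(cols_left, row)]
            for rest in fill(rows_left[1:], remaining):
                yield [row] + rest

    yield from fill(list(row_totals), list(col_totals))


def controlled_selection(row_totals, col_totals, rng, probabilities=None):
    """Pick one admissible arrangement, by default with equal probability."""
    arrangements = list(tables_with_margins(row_totals, col_totals))
    return rng.choices(arrangements, weights=probabilities, k=1)[0]
```

Full enumeration is practical only for small tables, which is why the
text notes that a specified subset of arrangements may be used
instead.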

The complication introduced by this method is the loss of the
ability to obtain simply an estimate of precision. The level of control
requires either that replication be used to obtain a variance estimate
or that some approximation be used.

The methodology adopted for the design will depend on the actual
stratification variables and the constraints on selection which seem
most effective. An important statistical consideration is that the
procedure used should provide an unbiased estimate of the PSU parameter
of interest (total, mean or proportion). In addition, a measure of
precision of the estimate should be capable of reasonable approximation.

1.3.3.3 Size and Allocation of the Sample

Sample size is determined by the level of precision needed to
answer the question or questions which are the reason for undertaking a
survey. The allocation of the sample is dependent upon locating sources
of variation entering the survey and the cost of controlling them. Of
-28-
-------
course, these two considerations are interdependent and cannot be solved
separately. In order to examine this quantitatively, models approximating
cost and variability are constructed. These models are not an attempt
at an exact description of budget or variability; they are intended only
to indicate values that depend upon circumstances which may change,
while still permitting more rational decision-making.

1.3.3.3.1 A Cost Model

The total cost of a survey depends upon both fixed and variable
costs. Fixed costs are overhead costs which are essentially independent
of the sample size - materials, rental of quarters, preparatory work,
staff salaries, and so on. Variable costs are unit costs - specimen
collection, travel, shipping, etc. For our two stage sample, we assume
a simple linear cost model,

    C = C_0 + C_1 n_1 + C_2 n_1 n_2 ,

where

    C is the total cost of the survey

    C_0 is the fixed cost

    C_1 is the variable cost for county-level data

    C_2 is the variable cost for plots

    n_1 is the number of counties in the sample

    n_2 is the average number of plots per county.

The development of the costs is shown in Table 1.3.3.1. These costs are
estimated from related efforts and are only approximate. Different
methods in contracting and operating the survey will significantly alter
these costs. For example, cooperative agreements with the Department of
Agriculture or other interested agencies may produce substantially
different field costs. Also laboratory costs are included for "organo-
pesticides" and heavy metals. However, different budgeting may appro-
priately exclude part or all of these costs.

Under the assumptions given we find

    C_0 = $367,800

    C_1 = $3,280

    C_2 = $926.

Since the overhead cost includes the collection of preparatory data,
maps, etc., on all 3141 counties in the United States, this cost is not
included. It may be preferable to:
-29-
-------
Table 1.3.3.1  Construction of the Cost Model

1. Selection of counties - first stage units

   Item                             C_0 - Overhead Costs   C_1 - per County Costs

   Construct Frame*                       $300,000
   Stratify Frame                            1,000
   Develop Size Measures                       300
   Select Sample Counties                    5,000
   Develop Computerized Data Sets            2,500
   Administration                            3,000                  100

2. Selection of plots - second stage units

   Item                                C_0        C_1      C_2 - per Plot Costs

   Construct Frame                    3,500      1,000         50
   Stratify Frame                     4,000        150
   Form Segments                        500         20
   Select Sample Plots                3,000         10          1

3. Field Work and Analysis

   Collection of specimens           12,000      2,000        300
   Laboratory Analysis**                                       560
   Data handling                      1,000                     5
   Stat. Analysis, Reporting         20,000                    10
   General Administration            10,000

   Total:                  C_0 = $367,800   C_1 = $3,280   C_2 = $926

*  Includes preparing materials on 3141 counties.
** Uses RTI costs; does not include analysis of toxic substances beyond
   pesticides and heavy metals.
-30-
-------
(1) Do only a subset of the counties, or
(2) Spread this cost over several years.

Ignoring this factor is equivalent to using the cost equation

    C - C_0 = C_1 n_1 + C_2 n_1 n_2 ,

which clearly does not affect the relative allocation of the sample.
Using the first equation, the estimated cost of a survey of 57 counties
with an average of 18.73 plots per county is

C = $367,800 + $3,280 (57) + $926 (57) (18.73)

= $1,543,367.
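
The cost calculation can be reproduced with a short sketch of the
linear cost model, using the unit costs given above. The function name
is hypothetical.

```python
# Unit costs from the text: fixed cost, per-county cost, per-plot cost.
C0, C1, C2 = 367_800, 3_280, 926


def survey_cost(n_counties, plots_per_county, c0=C0, c1=C1, c2=C2):
    """Total cost C = c0 + c1 * n1 + c2 * n1 * n2 of the two-stage survey."""
    return c0 + c1 * n_counties + c2 * n_counties * plots_per_county


def variable_cost(n_counties, plots_per_county, c1=C1, c2=C2):
    """Cost excluding overhead, C - c0 = c1 * n1 + c2 * n1 * n2."""
    return survey_cost(n_counties, plots_per_county, c0=0, c1=c1, c2=c2)
```

Here `survey_cost(57, 18.73)` rounds to 1,543,367, in agreement with
the figure above.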

1.3.3.3.2 Sample Size Calculations

A minimum acceptable precision must be specified to insure the
adequacy of the survey results. A statement such as "I must know the
amount within 10 percent," or "The error in the proportion reported must
not exceed 20 percent," determines the sample size for a particular
survey design, provided the heterogeneity of the population under
investigation is known.

For the purpose of discussion, the parameter of interest is taken
to be the proportion, p, of land (specifically of 10-acre plots)
containing detectable levels of toxic substance. The variance model for
the estimator \hat{p} of p is

    Var(\hat{p}) = \frac{p (1 - p)}{n_1 n_2} \{1 + \rho (n_2 - 1)\} ,

where \rho is the correlation among plots within a county,

      n_1 is the number of counties,

      n_2 is the average number of plots per county.

The term in braces is called the "cluster effect", and it is convenient
to write

    d_c = 1 + \rho (n_2 - 1) .

This model ignores stratification and unequal weighting for simplicity.
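
As a minimal sketch, the cluster effect, d_c = 1 + rho (n2 - 1) with rho
the within-county correlation, can be tabulated for a grid of
correlations and cluster sizes. The grid values below are those that
appear in Table 1.3.3.2; the function name `cluster_effect` is
illustrative.

```python
def cluster_effect(rho, n2):
    """Cluster effect d_c = 1 + rho * (n2 - 1) for within-county correlation rho."""
    return 1.0 + rho * (n2 - 1)


rhos = [0.01, 0.06, 0.125, 0.169, 0.231, 0.298, 0.430]
plot_counts = [5, 10, 15, 20, 25]

# Header row, then one row of d_c values per correlation.
print("rho    " + "".join(f"{n2:>7d}" for n2 in plot_counts))
for rho in rhos:
    row = "".join(f"{cluster_effect(rho, n2):7.2f}" for n2 in plot_counts)
    print(f"{rho:<7.3f}{row}")
```

The effect grows linearly in n2, so even a modest correlation inflates
the variance substantially once many plots are taken per county.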

The sample allocation problem is to choose the number of counties, n1,
and the number of plots, n2, within counties. For a given budget (which
fixes the total sample size), are we wiser to include many counties with
few plots per county, or fewer counties with more plots per county? The
solution is to balance considerations of cost and variability; that is,
-31-
-------
Table 1.3.3.2  Cluster Effect for Selected Values of ρ and n2

Intracluster        Average Number of Plots per County, n2
Correlation ρ         5        10        15        20        25

   0.01             1.04      1.09      1.14      1.19      1.24
   0.06             1.24      1.54      1.84      2.14      2.44
   0.125            1.50      2.13      2.75      3.38      4.00
   0.169            1.68      2.52      3.37      4.21      5.06
   0.231            1.92      3.08      4.23      5.39      6.54
   0.298            2.19      3.68      5.17      6.66      8.15
   0.430            2.72      4.87      7.02      9.17     11.32

Pesticides: Endrin, Chlordane, Aldrin, Dieldrin, P,P'-DDE

Cluster Effect d_c = 1 + ρ(n2 - 1)
-32-
-------
Table 1.3.3.3  Minimum Cost Allocation Subject to the Constraint:
               $\mathrm{c.v.} = \sqrt{V(\hat{p})}/p < 0.10$

       Average  Cluster
       Cluster  Effect      p = 0.0001           p = 0.001            p = 0.01             p = 0.10
  ρ    Size n2    dc      n1*  Est. Cost**     n1*  Est. Cost**     n1*  Est. Cost**     n1*  Est. Cost**

 .01    18.73    1.18    3141  $64,779,986    3141  $64,779,986     622  $12,828,060      57   $1,175,928
 .06     7.45    1.39    3141   31,970,880    3141   31,970,880    1843   18,759,020     168    1,710,392
 .125    4.98    1.50    3141   24,786,972    3141   24,786,972    3141   24,786,972     271    2,138,980
 .169    4.17    1.54    3141   22,431,228    3141   22,431,228    3141   22,431,228     332    2,370,544
 .214    3.61    1.56    3141   20,802,394    3141   20,802,394    3141   20,802,394     389    2,576,024
 .298    2.89    1.56    3141   18,707,782    3141   18,707,782    3141   18,707,782     487    2,900,242
 .430    2.17    1.50    3141   16,614,096    3141   16,614,096    3141   16,614,096     623    3,295,392

The entries in the table were calculated from the formulas:

$$n_2 = \sqrt{\frac{C_1 (1-\rho)}{C_2\,\rho}}, \qquad
  d_c = 1 + \rho(n_2 - 1), \qquad
  n_1 = \frac{(1-p)\,d_c}{p\,n_2\,(\mathrm{c.v.})^2}$$

*n1, the number of counties in the sample, cannot exceed the total number in the United States.

**Estimated Cost does not include the fixed portion, C0, in the cost equation (see accompanying text)

Cost = C0 + C1 n1 + C2 n1 n2

and n2 = average number of plots per county
C1 = cost for first stage units = $3280
C2 = cost per second stage units = $926
p = proportion of land area containing detectable levels of toxic substance.
-------
Table 1.3.3.3 (continued)  Minimum Cost Allocation Subject to the Constraint:
               $\mathrm{c.v.} = \sqrt{V(\hat{p})}/p < 0.15$

       Average  Cluster
       Cluster  Effect      p = 0.0001           p = 0.001            p = 0.01             p = 0.10
  ρ    Size n2    dc      n1*  Est. Cost**     n1*  Est. Cost**     n1*  Est. Cost**     n1*  Est. Cost**

 .01    18.73    1.18    3141  $64,779,921    2797  $56,685,272     277  $ 5,712,842      25   $  515,599
 .06     7.45    1.39    3141   31,971,296    3141   31,971,296     820    8,346,534      74      753,223
 .125    4.98    1.50    3141   24,787,138    3141   24,787,138    1325   10,456,311     120      946,977
 .169    4.17    1.54    3141   22,431,200    3141   22,431,200    1624   11,597,666     147    1,049,788
 .214    3.61    1.56    3141   20,802,403    3141   20,802,403    1901   12,590,056     172    1,139,181
 .298    2.89    1.56    3141   18,708,235    3141   18,708,235    2375   14,145,832     215    1,280,570
 .430    2.17    1.50    3141   16,614,068    3141   16,614,068    3041   16,085,126     276    1,459,879
-------
Table 1.3.3.3 (continued)  Minimum Cost Allocation Subject to the Constraint:
               $\mathrm{c.v.} = \sqrt{V(\hat{p})}/p < 0.20$

       Average  Cluster
       Cluster  Effect      p = 0.0001           p = 0.001            p = 0.01             p = 0.10
  ρ    Size n2    dc      n1*  Est. Cost**     n1*  Est. Cost**     n1*  Est. Cost**     n1*  Est. Cost**

 .01    18.73    1.18    3141  $64,779,921    1573  $32,441,520     155  $ 3,196,716      14   $  288,735
 .06     7.45    1.39    3141   31,971,296    3141   31,971,296     461    4,692,350      41      417,326
 .125    4.98    1.50    3141   24,787,138    3141   24,787,138     745    5,879,152      67      528,729
 .169    4.17    1.54    3141   22,431,200    3141   22,431,200     914    6,527,257      83      592,737
 .214    3.61    1.56    3141   20,802,403    3141   20,802,403    1067    7,079,837      --      642,417
 .298    2.89    1.56    3141   18,708,235    3141   18,708,235    1335    7,951,446     121      720,692
 .430    2.17    1.50    3141   16,614,068    3141   16,614,068    1710    9,004,908     155      819,860
-------
the budget goes further if we sample the less expensive units; however,
precision is improved if more of our observations come from the most
variable units (since, in the extreme case, if the units all have
identically the same value, one observation is sufficient to tell us
everything about these units).

Using the cost and variance equations above, we find that the values of
n1 and n2 which optimize precision for a fixed cost are

$$n_2 = \sqrt{\frac{C_1 (1-\rho)}{C_2\,\rho}}$$

and

$$n_1 = \frac{(1-p)\,d_c}{p\,n_2\,(\mathrm{c.v.})^2}.$$

$(\mathrm{c.v.})^2$ is the square of the coefficient of variation, or the
relative variance. It is the level of precision specified as necessary
for this survey, and the coefficient of variation is given by the
equation

$$\mathrm{c.v.} = \frac{\sqrt{\mathrm{Var}(\hat{p})}}{p}.$$

The optimal allocation and the associated cost are given for a range
of values of ρ, most of which represent national average values for some
of the common pesticides reported in the RSN. These values of ρ are
indicated in Table 1.3.3.2 along with the effect of cluster size on d_c,
the cluster effect, and the names of the pesticides involved. Table
1.3.3.3 displays the minimum cost allocation and the estimated cost
corresponding to these values of ρ, the correlation of the pesticide
concentrations within counties. Values of the coefficient of variation
(c.v.) on the order of 10 percent are commonly accepted.
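
The allocation formulas can be sketched in a few lines. The sketch
assumes n2 = sqrt(C1 (1 - rho) / (C2 rho)), d_c = 1 + rho (n2 - 1), and
n1 = (1 - p) d_c / (p n2 c.v.^2), with rho the within-county correlation
and p the proportion of plots with detectable residue. Rounding n1 up to
a whole county and capping it at 3141 counties are choices made for this
illustration; the report does not state its rounding rule, so results
may differ slightly from the tabulated entries. The function name
`allocate` is hypothetical.

```python
import math

C1, C2 = 3280.0, 926.0    # per-county and per-plot costs quoted in the text
TOTAL_COUNTIES = 3141     # n1 cannot exceed the number of U.S. counties


def allocate(rho, p, cv):
    """Return (n2, d_c, n1, variable cost) for correlation rho, proportion p, target c.v."""
    n2 = math.sqrt(C1 * (1.0 - rho) / (C2 * rho))
    dc = 1.0 + rho * (n2 - 1.0)
    n1_needed = (1.0 - p) * dc / (p * n2 * cv ** 2)
    # Assumption of this sketch: round up, then cap at the county total.
    n1 = min(math.ceil(n1_needed), TOTAL_COUNTIES)
    cost = C1 * n1 + C2 * n1 * n2   # variable cost only; C0 is excluded
    return n2, dc, n1, cost


n2, dc, n1, cost = allocate(rho=0.01, p=0.10, cv=0.10)
print(f"n2={n2:.2f}  dc={dc:.2f}  n1={n1}  cost=${cost:,.0f}")
```

When p is very small the required n1 exceeds the number of counties, and
the cap takes effect; this is why many cells of Table 1.3.3.3 show 3141.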

1.4 Present RSN Operations

The operational design of the Rural Soils Network (RSN) specified
that each site would be randomly designated as a first-year, second-year,
third-year, or fourth-year sample site, so that sample specimens would
be obtained for one-fourth of the sites in each State during each fiscal
year. Specimens were to be obtained at each site no less than once
every four years and not more than once per year. Soil specimens were
obtained by compositing fifty soil cores, each 2 inches in diameter by 3
inches in depth. The procedure for collecting and compositing these
cores and for collecting crop specimens is described in detail in the
PPC Division Memorandum 804.3, which is dated April, 1969, and is
entitled "Guidelines for Collecting Sample for the National Soil
Monitoring Program —1969." This memorandum specifies that soil and
crop specimens are to be obtained simultaneously at or shortly before
harvest time for the cropland sample. It also specified that water and
sediment specimens should be collected from the nearest pond within one
mile of each RSN site, four times at equal intervals during each
sampling year.

-36-
-------
The above operational design appears to have been implemented,
except that specimens from ponds have been collected in only one fiscal
year, 1973. Moreover, data collection ceased with fiscal year 1975, and
very little second round data for assessing trends is available.

1.5 Alternate Operational Design for the RSN

The operational design of the Rural Soils Network (RSN) was well
conceived for monitoring agricultural pesticides and herbicides in rural
soils, harvested crops, and rural ponds. Some modifications appear,
however, to be warranted at this time.

The operational design of the RSN specified that soil and crop
specimens be obtained simultaneously at or shortly before harvest time.
This data was to be used to monitor levels of compounds in soils and
crops, as well as establish relationships between soil and crop residues.
Crop specimens should be obtained at or shortly before harvest, since it
is the harvested crop that will be consumed. However, harvest time may
be less than ideal for obtaining soil specimens. Much pesticide and
herbicide residue may often be leached out of or vaporized from the
cropland soil by harvest time. This could explain in some measure the
preponderance of less than detectable residue levels in the cropland
soil data collected thus far (See Section 1.7).
Thus, it may be preferable to obtain cropland soil specimens early
in the growing season. It would then be necessary to carefully specify
where the soil cores were selected, e.g. on a map of the sample site, so
that crop specimens could be obtained near harvest time at practically
identical locations.

Noncropland soil specimens could be obtained whenever convenient
during the sampling year, since there appears to be no major national
relationship between annual seasons and toxic substance residues in
noncropland soils. Random points in time are preferable, but may not be
logistically feasible. However, the purposive selection of a single point
in time opens up the opportunity for introducing serious bias. Whatever
protocol is adopted, it is important that the protocol be applied
uniformly across the nation so that the population being sampled is as
well-defined as possible. Sampling some areas when levels of toxic
substances are suspected to be high, but not doing so in other areas,
would lead to difficulties when making other than local inferences.

Changes in the definition of an RSN sample site that would make its
boundaries more readily identifiable would be useful, so that the
selected sample site could be accurately identified and the identical
site could be revisited periodically to establish trends in residue
levels. If the selected site is not precisely defined, the value of the
sampling design is lessened, and analyses of trends based upon paired
differences may lead to spurious results.

The use of a sample site larger than 10 acres may make it easier to
identify site boundaries. However, compositing of the specimens
collected at a site is only justifiable if the site is homogeneous with
respect to data items. Thus, a fairly small sample site is required if
the specimens are to be composited. The alternative would be to report
multiple specimens individually.

-37-
-------
The use of less than fifty soil cores at a sample site could reduce
the expense of collecting specimens and should be considered. The use
of a large number of cores is advisable, however, if the cores are to be
composited. This ensures that the composite is representative of the
site by reducing the influence of individual cores. If multiple speci-
mens were to be reported separately within a sample site, fewer cores
might be sufficient. An experimental study could be designed to investi-
gate optimal size of sample site and optimal number of soil cores.

Elimination of pond water and pond sediment specimens is probably
necessary to keep the cost of the RSN data collection reasonable. The
operational design specified that pond specimens were to be obtained
four times at equal intervals during each fiscal year for RSN sample
sites with a pond within one mile. This procedure is commendable since
the pesticide level in pond specimens would probably vary greatly,
depending upon the turbidity of the water, the water level and the
season. The four equally spaced samples would allow compensation for
this variability. Unfortunately, this sampling protocol would probably
require a field crew devoted entirely to sampling pond water. Two
reasonably spaced collections of pond specimens for each sample site in
some sampling years may be worth considering. The pond specimens could
be collected early and late in the growing season, possibly simultaneous-
ly with the collection of soil specimens and crop specimens, respectively.

Finally, it is important that tests for all toxic substances for
which inferences are desired be performed on all sample specimens. This
may have been the intention in the past, but the data in Section 1.7
show clearly that some classes of compounds were more regularly tested
than others. All compounds for which statistical inferences are desired
should be tested in all sample specimens. This requirement may place a
practical limit on the number of classes of compounds that can be
monitored.

1.6 Recommended Modifications

Since the most cost-effective strategy for modifying the RSN depends
to some extent upon information which is not available, the following
are simply indications of a way to enhance program efficiency. Design
Option 1 seems to have little to recommend it. Its importance lies in
its connection with the historical series reflecting the operation of
the RSN from FY 1968 to FY 1973. However, given the inactivity of the
RSN in the intervening years, there is reason to believe the network
would require substantial up-dating which in itself adversely affects
the relationship between the RSN and the historical series. Moreover,
it may be possible to safeguard the series by appropriately managing the
transition to a new network.

Design Option 2 may be the most feasible economically. If a coopera-
tive agreement can be reached with the officials responsible for the
operation of the National Resources Inventory (NRI), then the field
costs may be kept down. Since the NRI is intended to produce national
estimates of various kinds, it is likely to do so for toxic substances
in an adequate fashion, and to provide a subsample satisfactory for
monitoring purposes.

-38-
-------
Design Option 3 represents a monitoring effort geared toward toxic
substances specifically. It is expected to perform well in providing
the desired data. Should an advantageous cooperative agreement with
USDA or others not be obtainable, then this would seem to be the option
of choice. And, in fact, it is not impossible that conditions may
dictate that a combination of Design Options 2 and 3 be adopted. An
economical national estimate may be provided by the NRI network, and may
be profitably supplemented by local or special studies based on Design
Option 3.

1.7 Statistical Findings and Charts for the RSN

1.7.1 Introduction

Data collection for the Rural Soils Network (RSN) occurred between
fiscal year 1968 and fiscal year 1975. The design specified that one-
fourth of all sites in each State would be sampled in each year.
However, the first year of sampling was regarded as a large scale pilot
study and only six States were sampled. The RSN was never fully imple-
mented; the yearly data collection effort is summarized in Table 1.7.1.
This table indicates, for example, that the random one-fourth of the
cropland sites in Maine that were designated to be first-year cropland
sites were sampled in fiscal years 1968 and 1973. It is apparent from
Table 1.7.1 that only one-fourth of the noncropland sites have been
sampled in most States. Also, most States have a follow-up sample at
approximately a four year interval for only one-fourth of the cropland
sites. Finally, it is apparent that very little data have been collected
for the Mountain Census Division of the United States, possibly because
of the expense of collecting data in this region.

In preparation for data analysis, the EPA computer records for the
RSN were checked for logical inconsistencies. Twenty-three were found.
The methods of identifying and resolving these inconsistencies are
discussed in Appendix E. Appendix E also describes the creation of a
data set with a structure that more readily lends itself to data analysis
than do the EPA data files.

1.7.2 Sampling weights

Proper analysis of the RSN data must account for the characteristics
of the sampling design by the use of sampling weights. Sampling weights
are adjustments attached to each observation of a data set which usually
reflect the probability of selection of the observation. In the case of
simple random sampling, the use of weights is quite straightforward. If
one individual in a 1000 is randomly selected, i.e., the probability of
selection is 1/1000, then each individual "represents" 1000 others and
his income, say, is multiplied by 1000 to estimate the total income of
1000 individuals. In more complex survey designs, the same approach
applies although the details become more complicated.
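
The simple random sampling case can be illustrated with a minimal
sketch. The incomes below are hypothetical values chosen for
illustration, and the function name `expansion_estimate` is not from the
report; the only thing taken from the text is that the weight is the
reciprocal of the selection probability.

```python
def expansion_estimate(values, selection_probability):
    """Estimate a population total by weighting each observation by 1/probability."""
    weight = 1.0 / selection_probability
    return weight, sum(weight * v for v in values)


# Hypothetical incomes of three sampled individuals, each selected
# with probability 1/1000 (illustrative numbers only).
sample_incomes = [12_000, 8_500, 15_250]
weight, estimated_total = expansion_estimate(sample_incomes, 1 / 1000)
print(weight, estimated_total)
```

Each sampled individual stands in for 1000 members of the population,
so the weighted sum estimates the total income of the 3000 individuals
the sample represents.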

The weights for the Rural Soils Network (RSN) depend on two phases
of sampling: (1) The selection of the sampling points for the 1967
Conservation Needs Inventory (CNI), and (2) the subsample of the 1967
CNI points selected to locate the RSN sample plots. Therefore, the

-39-
-------
Table 1.7.1: Fiscal Years of Data Collection for the Rural Soils Network

Census Division
State
New England
Maine
New Hampshire
Vermont
Massachusetts
Rhode Island
Connecticut
Middle Atlantic
New York
New Jersey
Pennsylvania
East-North Central
Ohio
Indiana
Illinois
Michigan
Wisconsin
Pacific
Washington
Oregon
California
West-North Central
Minnesota
Iowa
Missouri
N. Dakota
S. Dakota
Nebraska
Kansas

68*
69
69
69
69
69

69
69
69

69
69
69
69
69

68*
72
69

70
69
69
69
69
68*
75*

Year in
2

69
70
70
70
70
70

70
70
70

70
70
70
70
70

69
73
70

70
70

70
69

Cropland
Round 1
3

70
72
72
72
72
72

72
72
72

72
72
72
72
72

72
74
72

72
72

72
70

Samples

Noncropland Samples
Round 2 Year in Round 1
4

72
73
73
73
73
73

73
73*
73

73
73
73
73
73

73
73

73
72

73
74
74
74
74*
74

74
74
74

74
74
74
74
74

74
74

74
73

21234

74 68* 69 70 72*
72*
72*
72*
72*
72*

72*
72*
72*

72*
72*
72*
72*
72*

68* 69

74 68* 69

(continued)
-40-
-------
Table 1.7.1: Fiscal Years of Data Collection for the Rural Soils Network
(continued)

Census Division
State
South Atlantic
Delaware
Maryland
Virginia
W. Virginia
N. Carolina
S. Carolina
Georgia
Florida
East-South Central
Kentucky
Tennessee
Alabama
Mississippi
West-South Central
Arkansas
Louisiana
Oklahoma
Texas
Mountain
Montana
Idaho
Wyoming
Colorado
New Mexico
Arizona
Utah
Nevada

69
69
68*
69
69
69
68*
69

69
69
69
69

69
69
69
75*

75*
68*
69
69
69
69
69
69

Year in
2

70
70
69
70
70
70
69
70

70
70
70
70

70
70
70

Cropland
Round 1
3

72
72
70
72
72
72
70
72

72
72
72
72

72
72
72

Samples Noncropland Samples

73
73
72
73
73
73
72
73

73
73
73
73

73
73
73

Round 2 Year in Round 1
1 21234

74 72*
74 69 70 72*
73 74 68* 69 70 72*
74 69 72*
74 72*
74
73 74 68* 69
74

74 72*
74
74
74

74
74
74

74 68* 69

*  These data are not on the computer files supplied by EPA.
** Source: Personal communications with and computer files supplied by EPA
   Field Studies Branch, Washington, D.C.
-41-
-------
selection probabilities that accompany the sampling units in each phase
are discussed in turn.

1.7.2.1 Sample Selection for the CNI

The CNI is a highly stratified area probability sample, and its
sampling weights are rather easily determined. Since stratification
requires that units be selected in each stratum (subdivision of the
population), there is no choosing among strata. If States are strata,
we must draw a sample in every State. If we stratify by county, we
sample in every county, and if townships and parts of townships are also
strata then we must sample in every such stratum. So there is no selec-
tion probability to calculate for strata since each stratum has a 100
percent chance of being selected. Within strata, primary sampling units
(PSU's), usually 1 or 2, were selected purely by chance, i.e., at random
with equal probabilities.

As discussed in Section 1.2.2, all counties of the conterminous
United States that were not entirely urban, were divided into townships
and sections, or pseudo-townships and pseudo-sections. The standard
sampling procedure used strata composed of 12-section blocks (1/3 of a
township), and one quarter-section (the PSU) was drawn at random.
Hence, the probability of selection was 1/48, a sampling rate of approxi-
mately 2 percent.

Within each PSU, sample "points" were selected by use of a perfor-
ated template, which was spun to locate sampling points in an unbiased
manner. The perforations formed a grid pattern which was marked on an
aerial photograph of the PSU. The CNI sample collected data at each of
these sampling points. Among the information collected was land use
data, which was used by the RSN to classify each point as either cropland
or noncropland. Due to differences in PSU sizes and shapes and the spin
of the sampling template, the number of cropland points, the number of
noncropland points, and their total vary in an unpredictable, or
random, manner. These three quantities are then random variables that
can be used in standard statistical procedures. The RSN used these
random variables for estimation of proportions of cropland and noncrop-
land acreage in each of the States of the conterminous United States.

If we use the notation

U(i,j,k) = the total number of PSU's in stratum k of
county j in State i,

then the probability of selecting PSU $\ell$ when u(i,j,k) PSU's are
selected at random from stratum k is

$$\frac{u(i,j,k)}{U(i,j,k)}.$$
-42-
-------
It is shown in Appendix D that the selection of sampling points within
the PSU can essentially be ignored. The resulting sampling weight for
each of the $n(i,j,k,\ell)$ sampling points in PSU $\ell$ is then (see
footnote 13)

$$W(i,j,k,\ell,m) = \frac{U(i,j,k)}{u(i,j,k)} \quad \text{for } m = 1, 2, \ldots, n(i,j,k,\ell).$$
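
A minimal sketch of this first-phase weight follows. It assumes the
weight is the reciprocal of the PSU selection probability, i.e. U/u for
a stratum in which u of U PSU's are drawn at random, and applies the
factor of 4 that footnote 13 gives for 640-acre PSU's. The function name
`cni_sampling_weight` and the stratum counts in the example are
hypothetical.

```python
def cni_sampling_weight(total_psus, selected_psus, is_640_acre_psu=False):
    """First-phase weight U/u for a stratum; multiplied by 4 for 640-acre PSU's."""
    if selected_psus <= 0 or selected_psus > total_psus:
        raise ValueError("selected_psus must be between 1 and total_psus")
    weight = total_psus / selected_psus
    # Footnote 13: 640-acre PSU's were sampled at one-fourth the usual rate.
    return 4 * weight if is_640_acre_psu else weight


# Standard case described in the text: one quarter-section drawn from a
# 12-section block, i.e. 1 of 48 PSU's.
print(cni_sampling_weight(48, 1))
# Same illustrative counts with the 640-acre adjustment applied.
print(cni_sampling_weight(48, 1, is_640_acre_psu=True))
```

Every sampling point in a selected PSU carries the same weight under
this assumption, because within-PSU point selection is ignored.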
1.7.2.2 Sample Selection for the RSN

The RSN is based upon a subsample of the CNI sampling points. It
is intended to provide valid estimates for cropping regions and some of
the larger States, rather than the county level estimates available from
the CNI. The RSN is based upon systematic subsamples, one for cropland
points and another for noncropland points, selected from the sampling
points of the CNI within each State. Each sampling point selected for
an RSN sample is used to locate a 10-acre sample plot.

The RSN cropland sample is based upon a systematic subsample of the
CNI sampling points that have been classified as cropland points as
detailed in Section 1.2.2. This procedure results in a sample in which
the PSU's of the CNI occur essentially with probability proportional to
size (PPS), where "size" is measured by the proportion of cropland
points within the PSU. Thus, PSU's containing a higher proportion of
cropland points are more likely to be selected into the RSN cropland
sample.

The following notation is useful for expressing the RSN sampling
weights:

$v_1(i,j,k,\ell)$ = number of sample cropland points in PSU $\ell$.

$v(i,j,k,\ell)$ = total number of sample points in PSU $\ell$.

$r_1(i,j,k,\ell)$ = the cropland ratio for PSU $\ell$ (adjusted as
detailed in Appendix D).

$N_1(i) = \sum r_1(i,j,k,\ell)$ = sum of the cropland ratio over
all units in State i.

$n_1(i)$ = number of RSN cropland sample sites in State i.

The probability that a PSU of the CNI will be selected into the RSN
sample is then essentially proportional to

$$r_1(i,j,k,\ell)\cdot\frac{n_1(i)}{N_1(i)}.$$

13
Since 640-acre PSU's were sampled at one-fourth the rate of all other
sizes of PSU's, the appropriate weight for these sites is
$4W(i,j,k,\ell,m)$.

-43-
-------
It is well known (footnote 14) that drawing equal sized samples within
PSU's selected with probability proportional to size results in a
self-weighting sample, i.e., all ultimate sampling units have the same
sampling weight. Essentially the same phenomenon occurs with the RSN
samples. Most PSU's of the CNI that are selected into the RSN sample
receive exactly one RSN sample plot. Thus, under the fairly broad
assumptions detailed in Appendix D, the sampling weights for the RSN
cropland sample plots are given by (see footnote 15)

$$W_1(i,j,k,\ell,m) = v(i,j,k,\ell)\cdot\frac{N_1(i)}{n_1(i)} \quad \text{for } m = 1, 2, \ldots, n(i,j,k,\ell).$$

Since the total number of points, $v(i,j,k,\ell)$, within a PSU is
essentially constant for most States, the sample is essentially
self-weighting for most States.
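
The self-weighting property can be seen in a small sketch. It assumes
the cropland weight takes the form v N1 / n1, where v is the number of
sample points in the PSU, N1 is the State sum of cropland ratios, and n1
is the number of RSN cropland sites in the State. The function name
`rsn_cropland_weight` and all numbers in the example are hypothetical.

```python
def rsn_cropland_weight(points_in_psu, state_ratio_sum, state_sample_sites):
    """Approximate RSN cropland weight: v * N1 / n1."""
    return points_in_psu * state_ratio_sum / state_sample_sites


# Hypothetical State: cropland ratios sum to 250 and 50 RSN sites are drawn.
N1, n1 = 250.0, 50

# Two PSU's with the same number of sample points receive the same weight,
# whatever their individual cropland ratios.
w_a = rsn_cropland_weight(10, N1, n1)
w_b = rsn_cropland_weight(10, N1, n1)
# A PSU with a different number of sample points breaks the equality.
w_c = rsn_cropland_weight(12, N1, n1)
print(w_a, w_b, w_c)
```

The cropland ratio of the PSU cancels out of the weight, which is why
the sample is self-weighting wherever v is constant.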

Details of the derivation of the sampling weights and implementation
of approximate sampling weights are found in Appendix D. The approximate
sampling weights were calculated and included in the data set constructed
for analysis purposes, which is discussed in Appendix E.

1.7.3 Stratification

The two phase sampling design of the RSN necessarily introduces
complexities into the data analysis. The first phase sample, the 1967
Conservation Needs Inventory (CNI), was a deeply stratified design. The
second phase sample was the systematic selection of ultimate sampling
units from the CNI to locate RSN sample sites. Exact variance formulas
for estimates based upon the RSN would be very difficult to derive, and
would include components of variance from both phases of the design. As
is common practice in this situation, approximate variance formulas were
used that capture most of the design effects and provide conservative
estimates of variance. The major design effects to be accounted for in
the RSN design are the stratification effects derived from the CNI
sampling design.

The RSN sampling design was described in detail in Section 1.2.2.
The dimensions of the stratification in this design are reviewed in
Exhibit 1.7.1. The first dimension of stratification in the CNI, and
hence the RSN, consists of the 48 States of the conterminous United
States. Within some States, large scale geographic strata were defined.
For example, the sandhills of Nebraska were treated as a stratum. The
irrigated agricultural areas of many States were treated as strata.
Desert areas were treated as strata in many States.

The designation of large scale geographic strata within States was
usually accompanied by the use of different sizes of PSU's in the CNI
14
See, for example, Kendall, M.G. and Stuart, A. [1968, pg 195]. The
Advanced Theory of Statistics, Vol. 3. Hafner, New York.
15
Or $4W_1(i,j,k,\ell,m)$ for 640-acre PSU's. (Recall footnote 13).
-44-
-------
Exhibit 1.7.1: Dimensions of the RSN Sample Design*

I. Phase One Sample - 1967 Conservation Needs Inventory (CNI).

A. Dimensions of deep stratification.

1. States of the 48 conterminous United States.
2. Large scale geographic strata, e.g., sandhills, irrigated
areas, etc.
3. Counties that are not entirely urban (crossed with the
large scale geographic strata to form smaller sub-county
strata).
4. Townships or pseudo-townships within counties or sub-county
strata.
5. Strata generally composed of 48 PSU's each within townships
or pseudo-townships.

B. Phase One Sample Selection

1. Usually one PSU was selected from each ultimate stratum.
2. A template was used to assign a randomly aligned two-
dimensional sample of SSU's within each sampled PSU (the
number of SSU's assigned was usually proportional to
PSU size).

II. Phase Two Sample - Rural Soils Network (RSN) subsamples

A. Systematic subsamples of the ultimate sampling units, SSU's,
from the first phase sample were used to locate the 10-acre
RSN sample sites.

* Source: Documents from and personal communications with both the
EPA Field Studies Branch at Washington, D.C. and the Statistical Laboratory
at Iowa State University.
-45-
-------
sample. The irrigated strata were generally very heterogeneous and were
of special interest. Thus, 40-acre PSU's were usually used in these
strata. It appears that all 40-acre PSU's were assigned to irrigated
strata. In addition, the CNI sometimes employed 160-acre PSU's in the
irrigated strata. For analysis of the RSN data, a stratum was defined
within each State which consisted of all sites in 40-acre PSU's, as well
as all sites in 160-acre PSU's which fell within an irrigated stratum of
the CNI. Sites within 40 acre PSU's are given by Tables D-4 and D-5 in
Appendix D. The sites in 160-acre PSU's used in irrigated strata are
shown in Table 1.7.2.

The sandhills stratum in Nebraska was a homogeneous stratum, and
640-acre PSU's were used throughout. Geographically homogeneous strata,
such as desert lands, were also defined in the States of New Mexico,
South Dakota, Utah, and Wyoming. Apparently, 640-acre PSU's were used
exclusively within these strata as well. Moreover, a geographically
homogeneous stratum was also defined in Maine. Both 200-acre and 400-
acre PSU's were used in this stratum for Maine. Thus, for analysis of
the RSN data, a stratum was defined within each State which consisted of
all sites in the 200, 400, or 640 acre PSU's. The sites within these
oversized PSU's are given by Tables D-4 and D-5 of Appendix D.

All RSN sites of a State that were not classified as being in
either of the two large scale geographic strata just defined were
considered to be in the "remainder" stratum of that State. For States
that contained PSU's of only one size and no irrigated stratum, all
sites were considered to be in the "remainder" stratum, which was then
identical to the State stratum itself. All States in Table D-2 and D-3
of Appendix D fell into this category, except for Oregon and Idaho (See
Table 1.7.2).

1.7.4 Analysis

Several types of analyses are of interest for the RSN data, notably:

(1) Estimation of base levels for residues of toxic substances,
(2) Estimation of changes in mean levels of toxic substance residues
from the first round to the second round of data collection, and
(3) Estimation of relationships between soil and crop residue
levels.

The reason for analyzing the RSN data in this study was to obtain a
measure of precision of residue data based upon the present data collec-
tion effort. It was decided that estimation of base levels of residues
would be sufficient. In particular, estimation of levels was undertaken
for first round soil data only.

It was found that the data values for most compounds were predomi-
nantly zeros. In fact, Tables 1.7.3 and 1.7.4 list numerous compounds
for which no detectable levels were found in the cropland and noncropland
soils, respectively.
-46-
-------
Table 1.7.2:
RSN Sites in Counties Having Both Irrigated
and Remainder Strata, but only 160-acre PSU's
State Name
(State Code)
Arizona (04)
New Mexico (35)
Oregon (41)
Idaho (16)
County Name
(County Code)
Apache (001)
Cochise (003)
Curry (009)
Hidalgo (023)
Roosevelt (041)
Torrance (057)
Crook (013)
Grant (023)
Lane (037)
Malheur (045)
Ada (001)
Adams (003)
Irrigated Stratum.
Site Numbers
1
3
5
8
10
78,150
81,154
16,17,90,91,162, 163
20-22,94,96,166,167,169
1,64
127
Remainder Stratum,.
Site Numbers
10-13
14,15
2
9
4
8
95,168
97
Bannock (005)
Bear Lake (007)
Bingham (011)
Blaine (013)
Bonneville (019)
Butte (023)
Caribou (029)
Cassia (031)
Clark (033)
Custer (037)
Elmore (039)
Franklin (041)
Fremont (043)
Gem (045)
Kootenai (055)
Lemhi (059)
Lincoln (063)
Madison (065)
Oneida (071)
Owyhee (073)
Payette (075)
Power (077)
Teton (081)
Twin Falls (083)
Valley (085)
Washington (087)
4,5,193

69,102,133,195,196
134
199
13,75,76,105,202
77
79,80
143
147
211
24
213
28,153
120,154
217
93,156
31
32,95,158,220,221
96
159
2,65,190
3,98,128,191
67,68,99,130
100,131
7,8,70,132
103
11,12,74,104,137,200
138,139,201
14
107-109,203
15,78,110
16,141,204
17,111,142,205

84
117,118

25,87,150
90,216
91,121,122

29,30,92,155,218
94,157
33
125,126
222
Only sites that were surveyed by the RSN have been classified. Classification of
all sites in these counties would require considerably more effort.
* Source: CNI site numbers corresponding to the RSN site numbers were obtained from
the EPA Field Studies Branch, Washington, D.C. The stratum classification for each
of these CNI sites was obtained from the Statistical Laboratory of Iowa State
University.
-47-
-------
Table 1.7.3: Compounds with No Detectable Levels in Cropland Soils*

Compound                   Sample Size

Alachlor                      6071
Photodieldrin                 6071
Benzene Heptachloride         6071
Mirex                         6071
Prolan                        2846
Bulan                         2846
Gamma Chlordane                 37
Folex                         2341

* Source: Computer files supplied by EPA Field Studies Branch,
  Washington, D.C.
-48-
-------
Table 1.7.4: Compounds with No Detectable Levels in Noncropland Soils*
Compound
Sample Size
Alachlor 238
DCPA 238
o,p-'TDE 238
Photodieldrin 238
Endosulfan I 238
Endosulfan II 238
Endrin 238
Endrin Aldehyde 238
Endrin Ketone 238
Heptachlor 238
Isodrin 238
Lindane 238
Benzene Heptachloride 238
Methoxychlor 238
PCNB 238
Propachlor 238
Ronnel 238
Trifluralin 238
Mirex 238
Ovex 238
PCB 238
Compound
Sample Size
Bulan
Gamma Chlordane
Carbophenothion
op
Diazinon
°P Ethion
Folex
o-P Malathion
o? Methyl Parathion
op Ethyl Parathion
oPPhorate
2,4-D
Atrazine
2
0
2
2
2
2
2
2
2
2
2
1
9
Source: Computer files supplied by EPA Field Studies Branch,
Washington, D.C.

Rarely tested class of chemicals.
-49-
-------
It is also evident from these and subsequent tables in this section
that some classes of compounds were tested for more regularly than
others, which raises questions about what generalizations can be made
from these data. It would be of interest to know what criteria were used
to determine whether or not a test would be performed.

Moreover, the exclusion of some States from the sample restricts the
population to which inferences are valid. It can be seen from Table
1.7.1 that nearly complete data exist for some Census Divisions, while
there are very few data for others.

The predominance of zero values in the residue data results in
J-shaped distributions for the amount of residue detected for most
compounds. This type of data presents some analysis problems. For
example, the weighted mean of the raw data values has little meaning if
most values are zero and a few are large. Thus, some type of data
transformation is generally required in order to obtain a meaningful
analysis [See Lucas et al., Recommendations for the National Surface
Water Monitoring Program for Pesticides. Report No. RTI/1864/01-02I].
Ideally, each compound should be considered individually to determine an
appropriate transformation, if any. A ubiquitous compound like arsenic
may not require a transformation. The first round soil data were
analyzed on three scales: (1) the raw data, (2) a logarithmic
scale, and (3) a proportion scale. The raw data values exceeding the
minimum detectable level (MDL) were also analyzed as a separate data
set. The results are shown in Tables 1.7.5 through 1.7.9.

Extensive analyses were not considered appropriate for compounds
for which there were few detections, i.e., observations in excess of the
minimum detectable level (MDL). The analyses for these compounds are
presented in Tables 1.7.5 and 1.7.6 for cropland and noncropland soils,
respectively. Each of these tables contains the following information
for the compounds represented:

(1) The sample size, i.e., the number of sites for which the
presence of the compound was tested,
(2) The number of data values exceeding the minimum detectable
level,
(3) The largest amount of the compound detected at any one site
in parts per million (ppm), and
(4) The weighted average of the detections in ppm,

    \bar{x}_+ = \frac{\sum_i w_i x_i}{\sum_i w_i} ,

where the sampling weights are represented by w_i and the
detections (amounts exceeding the MDL) are denoted by x_i.
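
The four statistics listed above can be illustrated with a short sketch. The function name, the residue values, and the sampling weights below are hypothetical and are not taken from the RSN files; the sketch only assumes that a detection is a value strictly greater than the MDL.

```python
def summarize_detections(values, weights, mdl):
    """Return sample size, number of detections, maximum, and the weighted
    average of the detections (values strictly above the MDL)."""
    detections = [(x, w) for x, w in zip(values, weights) if x > mdl]
    weight_sum = sum(w for _, w in detections)
    weighted_avg = (
        sum(w * x for x, w in detections) / weight_sum if detections else None
    )
    return {
        "n": len(values),
        "n_detected": len(detections),
        "max": max(values) if values else None,
        "weighted_avg_detected": weighted_avg,
    }


# Hypothetical residues (ppm) and sampling weights for five sites.
residues = [0.0, 0.0, 30.0, 0.0, 10.0]
weights = [1.0, 2.0, 3.0, 1.0, 1.0]
summary = summarize_detections(residues, weights, mdl=0.0)
```

Only the two sites with nonzero residues contribute to the weighted average, so sites with no detection do not pull the statistic toward zero.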
For the analyses on the logarithmic scale, each data value, say x,
was transformed to log_e (x+1). This transformation is often found to
be useful for stabilizing the variances of data that consist of positive
-50-
-------
integers covering a wide range (see footnote 16). The presence of many
zero values for most of the compounds makes this transformation of
questionable value for such compounds. For presentation of the findings
on this scale in Tables 1.7.7 through 1.7.9, the results have been
transformed back to the original scale. In particular, if y represents
the weighted mean of the log-transformed data, the value reported is
given by

    \bar{x}_g = \operatorname{antilog}_e (y) - 1 ,

which is analogous to the geometric mean. The geometric mean itself
is not used because it is identically zero whenever any of the data
values is zero.
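
The back-transformation can be sketched as follows. This is a minimal illustration assuming natural logarithms, as the table footnotes indicate; the function name and the example data are hypothetical.

```python
import math


def back_transformed_log_mean(values, weights):
    """Weighted mean of log_e(x + 1), transformed back to the original scale."""
    total_weight = sum(weights)
    y = sum(w * math.log(x + 1.0) for x, w in zip(values, weights)) / total_weight
    return math.exp(y) - 1.0


# Hypothetical J-shaped sample: three zeros and one large residue, equal weights.
residues = [0.0, 0.0, 0.0, 99.0]
weights = [1.0, 1.0, 1.0, 1.0]
log_scale_mean = back_transformed_log_mean(residues, weights)
raw_mean = sum(w * x for x, w in zip(residues, weights)) / sum(weights)
```

For this sample the raw mean is 24.75, while the back-transformed value is 100^(1/4) - 1, or about 2.16, which shows how strongly the zeros pull the log-scale statistic toward zero.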

For analyses on the proportion scale, all data values above the
minimum detectable level (MDL) were replaced by the value one and all
other values by zero (so that their sum is the number of positive
values). The weighted mean on this scale is a weighted estimate of the
proportion of the sampled land area with a residue level in excess of
the MDL. Since this scale is felt to be the most appropriate for
analysis of the residue data, the standard error and the design effect
for the estimated proportion are also presented in Tables 1.7.7 through
1.7.9. The statistical approach used for computation of the standard
errors and design effects was a first-order Taylor series approximation
as implemented in computer software developed by RTI for analysis of
nested probability samples [See SESUDAAN: Standard Errors Program for
Computing of Standardized Rates from Sample Survey Data. Report No.
RTI/1789/00-01F].
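
A first-order Taylor series (linearization) standard error for a weighted proportion can be sketched as below. This is not the SESUDAAN program; it is a simplified illustration that assumes each soil record is a sampling unit drawn with replacement within its stratum, and the function name and record layout are hypothetical.

```python
import math
from collections import defaultdict


def weighted_proportion_with_se(records):
    """records: iterable of (stratum, weight, exceeds_mdl) tuples.

    Returns the weighted proportion above the MDL and its linearized
    standard error under a stratified, with-replacement approximation.
    """
    records = list(records)
    total_weight = sum(w for _, w, _ in records)
    p = sum(w for _, w, z in records if z) / total_weight

    # Linearized value for each record: u_i = w_i * (z_i - p) / sum(w).
    by_stratum = defaultdict(list)
    for stratum, w, z in records:
        by_stratum[stratum].append(w * ((1.0 if z else 0.0) - p) / total_weight)

    variance = 0.0
    for stratum, u in by_stratum.items():
        n_h = len(u)
        if n_h < 2:
            # A stratum with one record contributes no variance estimate.
            raise ValueError(f"stratum {stratum!r} has a single record")
        mean_u = sum(u) / n_h
        variance += n_h / (n_h - 1) * sum((ui - mean_u) ** 2 for ui in u)
    return p, math.sqrt(variance)


# Hypothetical single-stratum sample of four equally weighted sites.
sample = [("A", 1.0, True), ("A", 1.0, False), ("A", 1.0, False), ("A", 1.0, False)]
p_hat, se_hat = weighted_proportion_with_se(sample)
```

The sketch raises an error for a stratum with a single record because no within-stratum variance can be computed from one observation.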

Estimation of standard errors and design effects required that some
of the strata defined in Section 1.7.3 be combined. In particular,
strata that received only one sampling unit had to be combined with
other strata to produce valid estimates of sampling variances. In order
to determine where this was necessary, the RSN records were first sorted
by States, by large scale geographic strata within States, and finally
by counties within large scale geographic strata (See Exhibit 1.7.1).
When a stratum defined by these three levels of sorting (i.e., an
individual county portion of a large scale geographic stratum) contained
only a single Round One soil record, this record was placed into a
"residual county" stratum created within the large scale geographic
stratum. Recall that the States having no large scale geographic
stratification can be thought of as a single large scale geographic stratum.
Finally, whenever a "residual county" stratum within a large scale
geographic stratum consisted of only a single Round One soil record, the
stratum identification of the record in this "residual county" stratum
was changed to that of an arbitrary county within the same large scale
geographic stratum. The goal of this strategy was to achieve the maximum
possible benefits from the CNI stratification for estimation of standard
errors and design effects.
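
The collapsing rule described above can be sketched as follows. The record layout, the stratum labels, and the choice of the alphabetically first multi-record county as the "arbitrary county" are assumptions made for illustration only.

```python
from collections import Counter, defaultdict

RESIDUAL = "residual county"


def collapse_strata(records):
    """records: list of (state, geo_stratum, county) tuples, one per soil record.

    Returns a list of stratum identifiers in the same order as the input.
    """
    county_counts = Counter(records)

    # Step 1: single-record county strata go to the residual county stratum.
    strata = [
        (state, geo, RESIDUAL if county_counts[(state, geo, county)] == 1 else county)
        for state, geo, county in records
    ]

    # Step 2: find the counties that kept two or more records in each
    # large scale geographic stratum.
    stratum_counts = Counter(strata)
    multi_record = defaultdict(list)
    for (state, geo, county) in sorted(stratum_counts):
        if county != RESIDUAL:
            multi_record[(state, geo)].append(county)

    # Step 3: a residual stratum with one record is merged into a county
    # stratum of the same geographic stratum (here, the first one found).
    result = []
    for state, geo, county in strata:
        lone_residual = county == RESIDUAL and stratum_counts[(state, geo, RESIDUAL)] == 1
        if lone_residual and multi_record[(state, geo)]:
            county = multi_record[(state, geo)][0]
        result.append((state, geo, county))
    return result


# Hypothetical records: (state, large scale geographic stratum, county).
records = [
    ("IA", "G1", "Adair"), ("IA", "G1", "Adair"),
    ("IA", "G1", "Boone"), ("IA", "G1", "Clay"),
    ("IA", "G2", "Polk"), ("IA", "G2", "Polk"), ("IA", "G2", "Story"),
]
collapsed = collapse_strata(records)
```

In this example the two single-record counties in G1 form a residual stratum of two records, while the lone single-record county in G2 is merged into the Polk stratum.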

Since it was not possible to account for all dimensions of the CNI
stratification (See Exhibit 1.7.1), the standard errors computed are
16/ See page 157 of Steel, R.G.D. and Torrie, J.H. [1960]. Principles
and Procedures of Statistics. McGraw-Hill, New York.
-51-
-------
Table 1.7.5: Statistics for Compounds with Few Detectable Levels
in Cropland Soils for Round One*

Compound              n 1/   n+ 2/   Max 3/   x+ 4/
DCPA                  6071   3       1190     632.92
Dicofol               6071   16      2150     370.40
Endosulfan I          6071   7       240      95.83
Endosulfan II         6071   15      1240     172.10
Endosulfan Sulfate    6071   18      2070     343.85
Endrin Aldehyde       6071   1       30       30.00
Endrin Ketone         6071   10      380      98.19
Lindane               6071   21      350      51.92
Methoxychlor          6071   1       280      280.00
PCNB                  6071   4       2610     1103.87
Propachlor            6071   5       100      80.27
Ronnel                6071   1       190      190.00
Ovex                  6071   1       1130     1130.00
PCB                   6071   2       1490     1130.98
Carbophenothion       2341   1       230      230.00
DEF                   2341   9       670      272.63
Diazinon              2341   9       170      82.01
Ethion                2341   3       240      107.95
Malathion             2341   5       360      163.26
Methyl Parathion      2341   1       10       10.00
Ethyl Parathion       2341   18      3010     296.05
Phorate               2341   10      400      76.16
2,4-D                 188    3       30       17.26

1/ Sample size.
2/ Number of occurrences above the MDL.
3/ Maximum amount detected (ppm).
4/ Weighted average of the data values in excess of the MDL (ppm).

*Source: Computer files supplied by EPA Field Studies Branch,
Washington, D.C.
-52-
-------
Table 1.7.6: Statistics for Compounds with Few Detectable
Levels in Noncropland Soils for Round One*

Compound              n 1/   n+ 2/   Max 3/   x+ 4/
Aldrin                238    1       20       20.00
Chlordane             238    5       500      200.34
o,p'-DDE              238    2       30       24.57
o,p'-DDT              238    8       50       20.43
o,p'-TDE              238    7       180      45.47
Dicofol               238    2       290      138.00
Dieldrin              238    10      90       29.00
Endosulfan Sulfate    238    1       80       80.00
Heptachlor Epoxide    238    2       10       10.00
Toxaphene             238    1       520      520.00

1/ Sample size.
2/ Number of occurrences above the MDL.
3/ Maximum amount detected (ppm).
4/ Weighted average of the data values in excess of the MDL (ppm).

*Source: Computer files supplied by EPA Field Studies Branch,
Washington, D.C.
-53-
-------
Table 1.7.7: Statistics for Compounds with Detectable Levels in Noncropland Soils for Round One

Compound    n 1/  Max 2/   x+ 3/      x 4/       xg 5/      P(>MDL) 6/  S.D. 7/  DEFF 8/
p,p'-DDE    238   310      37.02      3.51       0.35       0.09        0.02     0.92
p,p'-DDT    238   230      54.12      3.49       0.26       0.06        0.02     1.11
Arsenic     233   54,170   3,957.92   3,772.27   1,618.71   0.95        0.02     1.32
Atrazine
(continued)

1/ Sample size.
2/ Maximum amount detected (ppm).
3/ Weighted average of the data values in excess of the MDL (ppm).
4/ Weighted average of the amount detected (ppm).
5/ Antilog_e (weighted average of log_e (amount + 1)) - 1; analogous to the geometric mean (ppm).
6/ Weighted proportion of cases with data values in excess of the MDL.
7/ Standard deviation of the estimated proportion.
8/ Design effect for the estimated proportion.

Source: Computer files supplied by the EPA Field Studies Branch, Washington, D.C.
-------
Table 1.7.8: Statistics for Compounds with Detectable Levels in Cropland Soils by Census Division for Round One

Aldrin

Census Division       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  13,280   219.65    23.06     .54       0.11        0.00     0.79
New England           72    280      280.00    4.02      .08       0.01        0.01     1.05
Middle Atlantic       296   150      90.89     .60       .03       0.01        0.00     1.01
East-North Central    1595  13,280   277.06    61.89     1.59      0.22        0.01     0.79
Pacific               505   170      54.91     .99       .07       0.02        0.01     1.03
West-North Central    1943  4,250    166.47    17.56     .54       0.11        0.01     0.79
South Atlantic        482   570      123.23    4.10      .14       0.03        0.01     0.89
East-South Central    429   420      110.00    2.76      .10       0.03        0.01     1.10
West-South Central    546   60       20.72     .68       .10       0.03        0.01     0.96
Mountain              203   20       20.00     .10       .01       0.00        0.00     0.99
(continued)
-------
Table 1.7.8: Statistics for Compounds with Detectable Levels in Cropland Soils by Census Division for Round One
(continued)
Chlordane

Census Division       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  13,340   645.24    56.74     .63       0.09        0.00     0.93
New England           72    2,200    693.19    43.30     .45       0.06        0.03     1.07
Middle Atlantic       296   3,190    596.78    26.26     .28       0.04        0.01     0.93
East-North Central    1595  6,980    809.89    120.76    1.48      0.15        0.01     0.87
Pacific               505   2,460    527.37    14.71     .15       0.03        0.01     0.97
West-North Central    1943  8,040    489.02    40.96     .55       0.08        0.01     0.86
South Atlantic        482   13,340   655.08    65.30     .68       0.10        0.01     0.89
East-South Central    429   7,890    753.98    35.67     .30       0.05        0.01     0.93
West-South Central    546   260      116.26    1.26      .05       0.01        0.00     1.19
Mountain              203   480      164.88    11.32     .38       0.07        0.03     2.32
(continued)
-------
Table 1.7.8: Statistics for Compounds with Detectable Levels in Cropland Soils by Census Division for Round One
(continued)
o,p'-DDE

Census Division       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  510      45.90     .98       .07       0.02        0.00     0.83
New England           72    30       30.00     .12       .01       0.00        0.00     0.29
Middle Atlantic       296   100      40.86     1.35      .12       0.03        0.01     1.11
East-North Central    1595  510      109.60    .62       .02       0.01        0.00     1.00
Pacific               505   380      51.53     5.02      .40       0.10        0.01     0.86
West-North Central    1943  90       27.37     .06       .01       0.00        0.00     0.86
South Atlantic        482   140      29.23     2.00      .23       0.07        0.01     0.98
East-South Central    429   80       32.41     1.66      .19       0.05        0.01     0.93
West-South Central    546   250      67.24     1.42      .08       0.02        0.01     1.02
Mountain              203   70       35.00     0.16      .02       0.00        0.00     0.12
(continued)
-------
Table 1.7.8: Statistics for Compounds with Detectable Levels in Cropland Soils by Census Division for Round One
(continued)
p,p'-DDE

Census Division       P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             0.20        0.00     0.56
New England           0.34        0.05     0.72
Middle Atlantic       0.28        0.03     0.99
East-North Central    0.08        0.01     0.90
Pacific               0.46        0.02     0.57
West-North Central    0.06        0.00     0.84
South Atlantic        0.59        0.02     0.79
East-South Central    0.51        0.02     0.55
West-South Central    0.27        0.02     0.64
Mountain              0.19        0.02     0.66
(continued)
-------
Table 1.7.8: Statistics for Compounds with Detectable Levels in Cropland Soils by Census Division for Round One
(continued)
p,p'-DDT

Census Division       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  245,180  1044.70   187.17    1.51      0.18        0.00     0.56
New England           72    4,650    850.87    253.33    4.81      0.30        0.04     0.56
Middle Atlantic       296   245,180  5890.52   1527.29   2.74      0.26        0.03     0.99
East-North Central    1595  35,920   1610.18   97.31     .34       0.06        0.01     0.93
Pacific               505   19,750   783.42    297.64    6.24      0.38        0.02     0.62
West-North Central    1943  1,420    127.08    7.80      .30       0.06        0.00     0.83
South Atlantic        482   20,260   582.11    318.45    17.36     0.55        0.02     0.73
East-South Central    429   16,070   967.43    478.18    14.64     0.49        0.02     0.52
West-South Central    546   15,860   1002.76   252.82    2.90      0.25        0.01     0.56
Mountain              203   3,230    226.55    38.88     1.05      0.17        0.02     0.76
(continued)
-------
Table 1.7.8: Statistics for Compounds with Detectable Levels in Cropland Soils by Census Division for Round One
(continued)
o,p'-DDT

Census Division       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  32,750   307.91    35.94     .67       0.12        0.00     0.58
New England           72    860      169.43    38.13     1.80      0.23        0.03     0.45
Middle Atlantic       296   32,750   1552.14   262.43    1.14      0.17        0.02     0.93
East-North Central    1595  8,210    797.32    20.47     .14       0.03        0.00     0.87
Pacific               505   4,510    205.01    56.99     2.35      0.28        0.02     0.74
West-North Central    1943  410      46.63     1.31      .09       0.03        0.00     1.00
South Atlantic        482   4,180    171.33    67.89     4.64      0.40        0.02     0.76
East-South Central    429   1,790    233.35    83.90     4.31      0.36        0.02     0.55
West-South Central    546   5,620    374.76    63.94     1.23      0.17        0.01     0.49
Mountain              203   290      66.11     5.84      .38       0.09        0.02     0.69
(continued)
-------
Table 1.7.8: Statistics for Compounds with Detectable Levels in Cropland Soils by Census Division for Round One
(continued)
p,p'-TDE

Census Division       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  38,460   349.24    31.78     .46       0.09        0.00     0.73
New England           72    8,200    616.34    156.40    2.35      0.25        0.04     0.70
Middle Atlantic       296   38,460   1978.77   255.99    .82       0.13        0.02     1.00
East-North Central    1595  31,430   859.24    25.67     .14       0.03        0.00     0.95
Pacific               505   20,130   357.52    68.26     1.27      0.19        0.02     0.79
West-North Central    1943  500      32.42     .63       .06       0.02        0.00     1.09
South Atlantic        482   7,470    177.33    63.33     3.57      0.36        0.02     0.85
East-South Central    429   1,250    135.30    31.50     1.64      0.23        0.02     0.94
West-South Central    546   1,670    159.58    21.10     .72       0.13        0.01     0.68
Mountain              203   150      38.19     1.86      .17       0.05        0.01     0.77
(continued)
-------
Table 1.7.8: Statistics for Compounds with Detectable Levels in Cropland Soils by Census Division for Round One'
(continued)
o,p'-TDE

Census Division       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  16,790   387.39    5.71      .06       0.01        0.00     0.86
New England           72    50       50.00     .72       .06       0.01        0.01     1.06
Middle Atlantic       296   16,790   2156.69   91.14     .27       0.04        0.01     0.87
East-North Central    1595  1,300    206.48    1.01      .02       0.00        0.00     0.98
Pacific               505   4,520    252.80    13.76     .27       0.05        0.01     0.93
West-North Central    1943  100      100.00    .04       0.00      0.00        0.00     0.81
South Atlantic        482   1,350    124.03    9.00      .36       0.07        0.01     1.03
East-South Central    429   490      138.89    2.56      .08       0.02        0.01     1.00
West-South Central    546   210      150.00    .20       .01       0.00        0.00     0.37
Mountain              203   10       10.00     .14       .03       0.01        0.00     0.32
(continued)
-------
Table 1.7.8: Statistics for Compounds with Detectable Levels in Cropland Soils by Census Division for Round One
(continued)
Dieldrin

Census Division       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  9,830    150.35    41.14     2.22      0.27        0.01     0.84
New England           72    4,640    1,087.94  123.64    .79       0.11        0.04     1.06
Middle Atlantic       296   9,830    284.49    60.22     1.29      0.21        0.02     0.92
East-North Central    1595  6,180    196.21    72.36     4.58      0.37        0.01     0.71
Pacific               505   2,150    126.37    20.69     .93       0.16        0.02     0.92
West-North Central    1943  1,620    113.45    32.92     2.35      0.29        0.01     0.84
South Atlantic        482   1,850    175.57    43.34     1.77      0.25        0.02     0.83
East-South Central    429   650      61.72     13.05     1.08      0.21        0.02     1.04
West-South Central    546   270      70.73     9.42      .68       0.13        0.01     0.66
Mountain              203   610      61.70     11.69     .95       0.19        0.03     1.55
(continued)
-------
Table 1.7.8: Statistics for Compounds with Detectable Levels in Cropland Soils by Census Division for Round One
(continued)
Endrin

Census Division       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  2,130    142.72    1.73      .05       0.01        0.00     0.81
New England           72    150      150.00    2.17      .07       0.01        0.01     1.06
Middle Atlantic       296   560      313.43    3.32      .06       0.01        0.00     0.21
East-North Central    1595  20       14.89     .02       0.00      0.00        0.00     0.98
Pacific               505   160      49.22     1.54      .12       0.03        0.01     0.86
West-North Central    1943  80       26.53     .15       .02       0.01        0.00     0.79
South Atlantic        482   2,130    347.11    12.28     .17       0.04        0.01     1.00
East-South Central    429   640      141.47    4.07      .13       0.03        0.01     0.73
West-South Central    546   480      101.57    2.21      .09       0.02        0.01     0.92
Mountain              203   220      33.43     .57       .05       0.02        0.01     0.90
(continued)
-------
Table 1.7.8: Statistics for Compounds with Detectable Levels in Cropland Soils by Census Division for Round One
(continued)
Heptachlor

Census Division       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  1,710    101.01    4.78      .20       0.05        0.00     0.81
New England           72    40       25.00     .72       .09       0.03        0.02     1.07
Middle Atlantic       296   10       10.00     .04       .01       0.00        0.00     1.17
East-North Central    1595  1,370    102.76    12.23     .57       0.12        0.01     0.84
Pacific               505   20       20.00     .04       .01       0.00        0.00     0.98
West-North Central    1943  1,710    109.97    3.99      .14       0.04        0.00     0.74
South Atlantic        482   340      93.18     1.56      .06       0.02        0.01     1.02
East-South Central    429   70       18.30     .34       .05       0.02        0.01     1.03
West-South Central    546   10       10.00     .02       .01       0.00        0.00     1.30
Mountain              203   260      140.00    .34       .01       0.00        0.00     0.21
(continued)
-------
Table 1.7.8: Statistics for Compounds with Detectable Levels in Cropland Soils by Census Division for Round One
(continued)
Heptachlor Epoxide

Census Division       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  1,080    54.59     4.24      .31       0.08        0.00     0.92
New England           72    60       32.64     1.91      .22       0.06        0.03     1.12
Middle Atlantic       296   60       24.75     .83       .11       0.03        0.01     0.86
East-North Central    1595  1,080    69.56     9.21      .65       0.13        0.01     0.84
Pacific               505   70       18.46     .51       .08       0.03        0.01     1.00
West-North Central    1943  330      43.16     3.55      .31       0.08        0.01     0.85
South Atlantic        482   180      41.02     2.97      .27       0.07        0.01     0.94
East-South Central    429   720      96.30     2.70      .11       0.03        0.01     1.03
West-South Central    546   10       10.00     .02       .01       0.00        0.00     1.30
Mountain              203   50       37.65     1.61      .16       0.04        0.02     2.42
(continued)
-------
Table 1.7.8: Statistics for Compounds with Detectable Levels in Cropland Soils by Census Division for Round One
(continued)
Isodrin

Census Division       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  180      21.68     .16       .02       0.01        0.00     0.96
New England           72    0        0.0       0.0       0.0       0.0         0.0      1.00
Middle Atlantic       296   0        0.0       0.0       0.0       0.0         0.0      1.00
East-North Central    1595  180      23.17     .51       .06       0.02        0.00     0.99
Pacific               505   0        0.0       0.0       0.0       0.0         0.0      1.00
West-North Central    1943  50       18.99     .08       .01       0.00        0.00     0.82
South Atlantic        482   0        0         0         0         0.0         0.0      1.00
East-South Central    429   10       10.00     .05       .01       0.01        0.00     1.08
West-South Central    546   0        0.0       0.0       0.0       0.0         0.0      1.00
Mountain              203   0        0.0       0.0       0.0       0.0         0.0      1.00
(continued)
-------
Table 1.7.8: Statistics for Compounds with Detectable Levels in Cropland Soils by Census Division for Round One
(continued)
Toxaphene

Census Division       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  36,330   3,562.56  129.98    .32       0.04        0.00     0.64
New England           72    0        0.0       0.0       0.0       0.0         0.0      1.00
Middle Atlantic       296   0        0.0       0.0       0.0       0.0         0.0      1.00
East-North Central    1595  0        0.0       0.0       0.0       0.0         0.0      1.00
Pacific               505   8,300    2,225.71  208.16    .99       0.09        0.01     0.76
West-North Central    1943  5,970    3,031.10  5.08      .01       0.00        0.00     0.81
South Atlantic        482   18,100   3,012.79  423.65    1.89      0.14        0.01     0.84
East-South Central    429   21,000   3,460.30  629.80    2.97      0.18        0.02     0.71
West-South Central    546   36,330   7,271.25  519.17    .80       0.07        0.01     0.66
Mountain              203   4,960    3,398.33  47.19     .12       0.01        0.01     0.51
(continued)
-------
Table 1.7.8: Statistics for Compounds with Detectable Levels in Cropland Soils by Census Division for Round One
(continued)
Trifluralin

Census Division       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  1,860    99.33     3.20      .14       0.03        0.00     0.80
New England           72    0        0.0       0.0       0.0       0.0         0.0      1.00
Middle Atlantic       296   140      92.95     1.12      .05       0.01        0.01     0.93
East-North Central    1595  600      90.40     2.11      .11       0.02        0.00     0.99
Pacific               505   1,290    159.72    4.05      .11       0.03        0.01     0.97
West-North Central    1943  680      94.74     2.42      .12       0.03        0.00     0.69
South Atlantic        482   1,860    122.55    6.67      .23       0.05        0.01     0.85
East-South Central    429   270      76.00     7.45      .48       0.10        0.01     0.73
West-South Central    546   370      118.86    4.57      .19       0.04        0.01     0.81
Mountain              203   240      97.07     1.90      .08       0.02        0.01     0.91
(continued)
-------
Table 1.7.8: Statistics for Compounds with Detectable Levels in Cropland Soils by Census Division for Round One
(continued)
Arsenic

Census Division       n 1/  Max 2/   x+ 3/      x 4/       xg 5/      P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             4690  180,420  5,869.29   5,665.15   2,863.07   0.97        0.00     1.44
New England           59    69,100   10,649.32  10,462.80  4,913.77   0.98        0.02     1.05
Middle Atlantic       222   180,420  9,211.71   9,034.18   5,270.13   0.98        0.01     1.11
East-North Central    1191  99,400   6,618.49   6,448.51   3,427.92   0.97        0.00     1.01
Pacific               311   61,810   4,490.05   4,404.59   2,642.87   0.98        0.01     0.96
West-North Central    1598  107,450  5,948.02   5,667.13   2,778.43   0.95        0.01     1.62
South Atlantic        402   25,600   3,251.96   3,080.14   1,260.43   0.95        0.01     0.96
East-South Central    326   34,480   7,286.42   7,180.89   4,768.52   0.99        0.01     0.94
West-South Central    410   33,500   4,138.06   4,072.43   2,391.27   0.98        0.01     1.31
Mountain              171   15,820   3,555.91   3,430.49   1,957.63   0.96        0.01     0.84
(continued)
-------
Table 1.7.8: Statistics for Compounds with Detectable Levels in Cropland Soils by Census Division for Round One

(continued)

Atrazine

Census Division       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             523   16,730   231.40    115.34    8.30      0.50        0.02     1.16
New England           0     -        -         -         -         -           -        -
Middle Atlantic       0     -        -         -         -         -           -        -
East-North Central    235   1,380    137.22    70.21     8.12      0.51        0.03     0.99
Pacific               0     -        -         -         -         -           -        -
West-North Central    288   16,730   303.75    148.45    8.40      0.49        0.03     1.27
South Atlantic        0     -        -         -         -         -           -        -
East-South Central    0     -        -         -         -         -           -        -
West-South Central    0     -        -         -         -         -           -        -
Mountain              0     -        -         -         -         -           -        -
(continued)

1/ Sample size.
2/ Maximum amount detected (ppm).
3/ Weighted average of the data values in excess of the MDL (ppm).
4/ Weighted average of the amount detected (ppm).
5/ Antilog_e (weighted average of log_e (amount + 1)) - 1; analogous to the geometric mean (ppm).
6/ Weighted proportion of cases with data values in excess of the MDL.
7/ Standard deviation of the estimated proportion.
8/ Design effect for the estimated proportion.

Source: Computer files supplied by the EPA Field Studies Branch, Washington, D.C.
-------
Table 1.7.9: Statistics for Compounds with Detectable Levels in Cropland Soils by Cropping Region for Round One
Aldrin

Cropping Region       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  13,280   219.65    23.06     .54       0.11        0.00     0.79
Corn                  1386  4,250    192.83    42.83     1.53      0.22        0.01     0.89
Wheat & Small Grains  1056  220      55.25     0.63      .04       0.01        0.00     0.77
Cotton                221   0        0.0       0.0       0.0       0.0         0.0      1.00
Soybeans              1271  13,280   290.73    61.75     1.39      0.21        0.01     0.96
General Farming       699   1,220    167.63    10.75     .31       0.06        0.01     0.87
Hay                   609   280      78.28     1.38      .07       0.02        0.00     0.76
Vegetables            557   350      80.69     2.23      .11       0.03        0.01     1.02
Fruit or Nut Orchard  253   470      172.23    3.42      .09       0.02        0.01     0.59
(continued)
-------
Table 1.7.9: Statistics for Compounds with Detectable Levels in Cropland Soils by Cropping Region for Round One
(continued)

Chlordane

Cropping Region       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  13,340   645.24    56.74     .63       0.09        0.00     0.93
Corn                  1386  8,040    652.22    113.85    1.69      0.17        0.01     0.94
Wheat & Small Grains  1056  660      206.17    1.47      .04       0.01        0.00     0.81
Cotton                221   620      264.00    6.30      .14       0.02        0.01     1.06
Soybeans              1271  5,620    736.21    97.13     1.18      0.13        0.01     0.98
General Farming       699   1,190    321.79    15.29     .27       0.05        0.01     0.97
Hay                   609   7,890    620.00    25.55     .23       0.04        0.01     1.59
Vegetables            557   13,340   764.19    61.76     .55       0.08        0.01     1.26
Fruit or Nut Orchard  253   2,720    474.67    51.77     .77       0.11        0.02     0.86
(continued)
-------
Table 1.7.9: Statistics for Compounds with Detectable Levels in Cropland Soils by Cropping Region for Round One
(continued)
o,p'-DDE

Cropping Region       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  510      45.90     .98       .07       0.02        0.00     0.83
Corn                  1386  90       28.65     .26       .03       0.01        0.00     0.99
Wheat & Small Grains  1056  380      121.31    .35       .01       0.00        0.00     0.70
Cotton                221   250      41.70     6.27      .70       0.15        0.02     1.01
Soybeans              1271  200      31.97     .52       .05       0.02        0.00     0.87
General Farming       699   10       10.00     .02       .01       0.00        0.00     0.90
Hay                   609   30       20.00     .09       .01       0.00        0.00     0.94
Vegetables            557   140      39.37     1.77      .16       0.05        0.01     0.95
Fruit or Nut Orchard  253   510      63.95     9.57      .70       0.15        0.02     0.97
(continued)
-------
Table 1.7.9: Statistics for Compounds with Detectable Levels in Cropland Soils by Cropping Region for Round One
(continued)
p,p'-DDE

Cropping Region       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  54,980   303.39    59.68     1.34      0.20        0.00     0.56
Corn                  1386  550      68.21     7.21      .48       0.11        0.01     0.93
Wheat & Small Grains  1056  2,270    127.47    9.62      .32       0.08        0.01     0.61
Cotton                221   6,210    344.08    272.61    52.52     0.79        0.03     0.87
Soybeans              1271  4,760    226.36    53.45     1.83      0.24        0.01     0.56
General Farming       699   4,550    154.69    13.84     .39       0.09        0.01     0.95
Hay                   609   8,090    272.45    28.48     .49       0.10        0.01     0.80
Vegetables            557   6,820    222.87    107.16    6.92      0.48        0.02     0.72
Fruit or Nut Orchard  253   54,980   974.93    611.57    21.20     0.63        0.03     0.88
(continued)
-------
Table 1.7.9: Statistics for Compounds with Detectable Levels in Cropland Soils by Cropping Region for Round One
(continued)
p,p'-DDT

Cropping Region       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  245,180  1,044.70  187.17    1.51      0.18        0.00     0.56
Corn                  1386  3,080    179.52    19.55     .60       0.11        0.01     0.90
Wheat & Small Grains  1056  5,160    218.00    13.67     .31       0.06        0.01     0.66
Cotton                221   15,860   1,144.63  890.55    98.48     0.78        0.03     0.81
Soybeans              1271  16,070   793.83    174.66    2.25      0.22        0.01     0.53
General Farming       699   23,700   707.81    45.21     .32       0.06        0.01     0.93
Hay                   609   38,550   847.73    74.62     .48       0.09        0.01     0.75
Vegetables            557   69,300   1,048.12  440.77    8.21      0.42        0.02     0.73
Fruit or Nut Orchard  253   245,180  3,131.82  1,753.51  20.54     0.56        0.03     0.89
(continued)
-------
Table 1.7.9: Statistics for Compounds with Detectable Levels in Cropland Soils by Cropping Region for Round One
(continued)
o,p'-DDT

Cropping Region       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  32,750   307.91    35.94     .67       0.12        0.00     0.58
Corn                  1386  470      71.41     3.12      .17       0.04        0.01     1.06
Wheat & Small Grains  1056  620      69.77     2.76      .15       0.04        0.00     0.67
Cotton                221   5,620    328.24    203.82    20.33     0.62        0.03     0.80
Soybeans              1271  3,320    212.36    32.55     .97       0.15        0.01     0.46
General Farming       699   3,790    225.16    7.62      .14       0.03        0.01     0.99
Hay                   609   14,050   519.05    24.09     .21       0.05        0.01     0.77
Vegetables            557   11,700   279.46    82.86     2.67      0.30        0.02     0.84
Fruit or Nut Orchard  253   32,750   738.55    292.74    5.62      0.40        0.03     0.89
(continued)
-------
Table 1.7.9: Statistics for Compounds with Detectable Levels in Cropland Soils by Cropping Region for Round One

(continued)
p,p'-TDE

Cropping Region       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  38,460   349.24    31.78     .46       0.09        0.00     0.73
Corn                  1386  1,230    88.79     4.48      .20       0.05        0.01     1.02
Wheat & Small Grains  1056  370      62.22     1.53      .09       0.02        0.00     0.64
Cotton                221   1,670    172.24    75.23     5.82      0.44        0.03     1.00
Soybeans              1271  1,250    123.35    13.46     .55       0.11        0.01     0.75
General Farming       699   2,070    195.38    4.15      .08       0.02        0.01     0.94
Hay                   609   8,200    368.31    15.09     .17       0.04        0.01     0.81
Vegetables            557   31,430   494.21    125.08    1.91      0.25        0.02     0.86
Fruit or Nut Orchard  253   38,460   1,155.54  329.80    2.86      0.29        0.03     1.03
(continued)
-------
Table 1.7.9: Statistics for Compounds with Detectable Levels in Cropland Soils by Cropping Region for Round One
(continued)
o,p'-TDE

Cropping Region       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  16,790   387.39    5.71      .06       0.01        0.00     0.86
Corn                  1386  340      112.20    .70       .03       0.01        0.00     0.97
Wheat & Small Grains  1056  150      46.58     .23       .02       0.01        0.00     0.46
Cotton                221   490      161.17    7.10      .22       0.04        0.01     1.11
Soybeans              1271  210      49.52     .41       .03       0.01        0.00     1.08
General Farming       699   100      67.24     .24       .01       0.00        0.00     0.86
Hay                   609   230      100.38    .69       .03       0.00        0.01     0.87
Vegetables            557   4,870    237.14    13.88     .28       0.06        0.01     0.90
Fruit or Nut Orchard  253   16,790   1,265.99  104.07    .52       0.08        0.02     1.04
(continued)
-------
Table 1.7.9: Statistics for Compounds with Detectable Levels in Cropland Soils by Cropping Region for Round One
(continued)

Dieldrin

Cropping Region       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  9,830    150.35    41.14     2.22      0.27        0.01     0.84
Corn                  1386  1,620    149.79    74.68     8.30      0.50        0.01     0.90
Wheat & Small Grains  1056  610      51.14     4.43      .34       0.09        0.01     1.20
Cotton                221   1,280    86.50     12.04     .60       0.14        0.02     1.09
Soybeans              1271  6,180    165.48    64.05     4.58      0.39        0.01     0.88
General Farming       699   710      109.40    19.19     1.03      0.18        0.01     0.91
Hay                   609   4,640    128.11    14.94     .55       0.12        0.02     1.33
Vegetables            557   1,850    132.74    36.92     2.03      0.28        0.02     0.93
Fruit or Nut Orchard  253   9,830    442.65    99.08     1.72      0.22        0.03     0.98
(continued)
-------
Table 1.7.9: Statistics for Compounds with Detectable Levels in Cropland Soils by Cropping Region for Round One

(continued)
Endrin

Cropping Region       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  2,130    142.72    1.73      .05       0.01        0.00     0.81
Corn                  1386  80       37.11     .15       .01       0.00        0.00     0.97
Wheat & Small Grains  1056  80       30.76     .34       .04       0.01        0.00     0.74
Cotton                221   420      111.21    7.66      .32       0.07        0.02     0.87
Soybeans              1271  640      93.08     .88       .03       0.01        0.00     0.87
General Farming       699   480      234.28    .72       .01       0.00        0.00     1.08
Hay                   609   100      58.40     .16       .01       0.00        0.00     0.83
Vegetables            557   1,000    187.72    6.91      .17       0.04        0.01     0.77
Fruit or Nut Orchard  253   2,130    483.73    13.92     .15       0.03        0.01     1.05
(continued)
-------
Table 1.7.9: Statistics for Compounds with Detectable Levels in Cropland Soils by Cropping Region for Round One
(continued)
Heptachlor

Cropping Region       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  1,710    101.01    4.78      .20       0.05        0.00     0.81
Corn                  1386  1,710    112.48    12.21     .51       0.11        0.01     0.84
Wheat & Small Grains  1056  10       10.00     .02       0.00      0.00        0.00     0.81
Cotton                221   10       10.00     .10       .02       0.01        0.01     1.07
Soybeans              1271  940      102.72    9.57      .42       0.09        0.01     0.95
General Farming       699   290      47.69     .99       .07       0.02        0.01     1.04
Hay                   609   260      56.48     .34       .02       0.01        0.00     0.76
Vegetables            557   30       16.74     .18       .03       0.01        0.00     0.92
Fruit or Nut Orchard  253   190      75.22     1.23      .07       0.02        0.01     1.06
(continued)
-------
Table 1.7.9: Statistics for Compounds with Detectable Levels in Cropland Soils by Cropping Region for Round One
(continued)
Heptachlor Epoxide

Cropping Region       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  1,080    54.59     4.24      .31       0.08        0.00     0.92
Corn                  1386  350      54.75     9.28      .82       0.17        0.01     0.93
Wheat & Small Grains  1056  70       24.69     .17       .02       0.01        0.00     0.82
Cotton                221   40       21.50     .62       .09       0.03        0.01     1.10
Soybeans              1271  1,080    64.00     7.56      .54       0.12        0.01     0.98
General Farming       699   200      38.18     1.59      .15       0.04        0.01     0.96
Hay                   609   720      68.05     2.17      .12       0.03        0.01     1.32
Vegetables            557   120      30.32     1.37      .15       0.05        0.01     1.45
Fruit or Nut Orchard  253   180      44.25     2.88      .23       0.07        0.02     1.06
(continued)
-------
Table 1.7.9: Statistics for Compounds with Detectable Levels in Cropland Soils by Cropping Region for Round One
(continued)
Isodrin

Cropping Region       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  180      21.68     .16       .02       0.01        0.00     0.96
Corn                  1386  180      21.46     .47       .06       0.02        0.00     0.96
Wheat & Small Grains  1056  0        0         0         0         0.0         0.0      1.00
Cotton                221   0        0         0         0         0.0         0.0      1.00
Soybeans              1271  90       24.23     .27       .03       0.01        0.00     1.10
General Farming       699   20       14.98     .04       .01       0.00        0.00     1.04
Hay                   609   0        0         0         0         0.0         0.0      1.00
Vegetables            557   10       10.00     .02       0         0.00        0.00     1.13
Fruit or Nut Orchard  253   0        0         0         0         0.0         0.0      1.00
(continued)
-------
Table 1.7.9: Statistics for Compounds with Detectable Levels in Cropland Soils by Cropping Region for Round One
(continued)
Toxaphene

Cropping Region       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  36,330   3,562.56  129.98    .32       0.04        0.00     0.64
Corn                  1386  8,800    2,761.23  18.74     .05       0.01        0.00     0.77
Wheat & Small Grains  1056  1,600    810.18    1.78      .01       0.00        0.00     0.78
Cotton                221   36,330   4,190.03  1,394.85  11.81     0.33        0.03     0.88
Soybeans              1271  21,000   3,932.77  261.63    .67       0.07        0.01     0.65
General Farming       699   2,080    2,080.00  2.99      .01       0.00        0.00     1.00
Hay                   609   11,030   5,174.00  22.66     .04       0.00        0.00     0.89
Vegetables            557   12,000   2,734.31  216.55    .82       0.08        0.01     0.91
Fruit or Nut Orchard  253   8,300    2,601.99  267.90    1.16      0.10        0.02     0.89
(continued)
-------
Table 1.7.9: Statistics for Compounds with Detectable Levels in Cropland Soils by Cropping Region for Round One

(continued)
Trifluralin

Cropping Region       n 1/  Max 2/   x+ 3/     x 4/      xg 5/     P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             6071  1,860    99.33     3.20      .14       0.03        0.00     0.80
Corn                  1386  600      88.20     2.79      .14       0.03        0.00     0.93
Wheat & Small Grains  1056  290      126.32    .36       .01       0.00        0.00     0.76
Cotton                221   310      70.07     11.12     .86       0.16        0.02     0.82
Soybeans              1271  680      87.60     6.41      .35       0.07        0.01     0.88
General Farming       699   310      104.94    .87       .03       0.01        0.00     0.99
Hay                   609   10       10.00     .01       0.00      0.00        0.00     0.86
Vegetables            557   1,860    160.00    7.40      .22       0.05        0.01     0.94
Fruit or Nut Orchard  253   1,290    328.65    5.33      .06       0.02        0.01     1.02
(continued)
-------
Table 1.7.9: Statistics for Compounds with Detectable Levels in Cropland Soils by Cropping Region for Round One
(continued)
Arsenic

Cropping Region       n 1/  Max 2/   x+ 3/      x 4/       xg 5/      P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN             4690  180,420  5,869.29   5,665.15   2,863.07   0.97        0.00     1.44
Corn                  1109  31,980   5,653.68   5,467.48   3,101.61   0.97        0.01     1.38
Wheat & Small Grains  826   37,530   5,292.43   5,091.67   2,616.57   0.96        0.01     0.81
Cotton                177   38,900   5,932.09   5,827.79   3,497.19   0.98        0.01     1.05
Soybeans              962   107,450  6,723.70   6,521.48   3,462.38   0.97        0.01     1.01
General Farming       516   64,940   6,376.99   6,234.96   3,360.02   0.98        0.01     1.05
Hay                   453   51,300   5,602.92   5,275.52   2,058.05   0.94        0.02     2.84
Vegetables            448   69,100   4,997.11   4,851.67   2,367.47   0.97        0.01     1.01
Fruit or Nut Orchard  182   180,420  8,009.47   7,654.21   2,415.32   0.96        0.02     1.04
(continued)
-------
Table 1.7.9: Statistics for Compounds with Detectable Levels in Cropland Soils by Cropping Region for Round One
(continued)

Cropping Region         n 1/   Max 2/  Mean>MDL 3/    Mean 4/  Geo. mean 5/  P(>MDL) 6/  S.D. 7/  DEFF 8/
Total RSN                523   16,730       231.40     115.34          8.30        0.50     0.02
Corn                     271    1,550       185.62      93.25          9.18        0.50     0.03     1.18
Wheat & Small Grains      26      120        94.17      17.73          1.32        0.19     0.12     2.46
Cotton                     0        -            -          -             -           -        -        -
Soybeans                 102   16,730       537.13     284.52         10.59        0.53     0.05     0.98
General Farming           89    1,380       113.52      62.55          8.68        0.55     0.05     1.04
Hay                       17      100        43.80      15.00          2.46        0.34     0.12     1.11
Vegetables                16      340       110.04      82.47         20.33        0.75     0.10     0.87
Fruit or Nut Orchard       1       40        40.00      40.00         39.85        1.00     0.0      1.00

1/ Sample size.

2/ Maximum amount detected (ppm).

3/ Weighted average of the data values in excess of the MDL (ppm).

4/ Weighted average of the amount detected (ppm).

5/ Antilog_e (weighted average of log_e (amount + 1)) - 1; analogous to the geometric mean (ppm).

6/ Weighted proportion of cases with data values in excess of the MDL.

7/ Standard deviation of the estimated proportion.

8/ Design effect for the estimated proportion.

Source: Computer files supplied by the EPA Field Studies Branch, Washington, D.C.
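
The summary statistics defined in the table footnotes can be illustrated with a short sketch. It is not the program used to produce Table 1.7.9. It assumes that non-detects are entered as zero and that each site carries a sampling weight; the function name `weighted_summary` and its arguments are hypothetical.

```python
import math


def weighted_summary(amounts, weights, mdl):
    """Summarize one compound in one cropping region.

    amounts: reported levels, with non-detects entered as 0 (assumption).
    weights: sampling weights, one per site (hypothetical input).
    mdl:     minimum detectable level, in the same units as amounts.
    """
    pairs = list(zip(amounts, weights))
    total_w = sum(weights)

    # Weighted average of the amount detected (all sites).
    mean_all = sum(w * x for x, w in pairs) / total_w

    # Weighted average of the values in excess of the MDL.
    above = [(x, w) for x, w in pairs if x > mdl]
    w_above = sum(w for _, w in above)
    mean_above = sum(w * x for x, w in above) / w_above if above else None

    # Antilog of the weighted average of ln(amount + 1), minus 1.
    mean_log = sum(w * math.log(x + 1.0) for x, w in pairs) / total_w
    geo_like = math.exp(mean_log) - 1.0

    # Weighted proportion of cases in excess of the MDL.
    p_above = w_above / total_w

    return {
        "n": len(amounts),
        "max": max(amounts),
        "mean_above_mdl": mean_above,
        "mean": mean_all,
        "geo_like_mean": geo_like,
        "p_above_mdl": p_above,
    }
```

Adding 1 before taking logarithms keeps zero values usable, which is why the statistic is only analogous to a geometric mean.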
-------
undoubtedly conservative estimates. Thus, the interval of values within
two standard errors of the estimated proportions will provide a conserva-
tive 95 percent confidence interval estimate of the proportion of sampled
area where levels of the compound exceed the minimum detectable level
(MDL).

The design effect is the ratio of the sample standard error to an
estimate of what the standard error would have been if a simple random
sample of the same size had been used, i.e.,

           Estimated S.E. (for the design used)
    DEFF = -------------------------------------
           Estimated S.E. (simple random sample)

Alternatively, the design effect can be thought of as the ratio of the
actual sample size to the sample size that would be required to obtain
an estimate with the same standard error based upon a simple random
sample. Generally, stratification decreases the design effect, while
clustering increases it. Thus, since the CNI stratification can be used
and there is no clustering of sample sites in the RSN sample, design
effects less than one would be expected. This would indicate that the
design produced smaller standard errors than would a simple random
sample of the same size. Many of the design effects shown in Tables
1.7.7 through 1.7.9 are indeed less than one. However, some design
effects are substantially greater than one. It is, hence, not clear that
the CNI stratification was particularly advantageous for estimation of
proportions of detections for toxic substance residues.
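
A minimal sketch of these calculations follows. It assumes the simple random sample standard error of a proportion is sqrt(p(1 - p)/n), which is not stated in the text. Survey texts often define the design effect as a ratio of variances rather than of standard errors, so the sketch returns both; the function names are illustrative only.

```python
import math


def srs_standard_error(p, n):
    """Standard error of a proportion under simple random sampling (assumed form)."""
    return math.sqrt(p * (1.0 - p) / n)


def design_effect(se_design, p, n):
    """Return (ratio of standard errors, ratio of variances)."""
    ratio = se_design / srs_standard_error(p, n)
    return ratio, ratio ** 2


def conservative_interval(p, se):
    """Interval of values within two standard errors, clipped to [0, 1]."""
    return max(0.0, p - 2.0 * se), min(1.0, p + 2.0 * se)


# Example with the rounded toxaphene values for the Cotton region:
# p = 0.33, S.D. = 0.03, n = 221. Rounding of the inputs limits the
# precision of the result.
se_ratio, var_ratio = design_effect(0.03, 0.33, 221)
low, high = conservative_interval(0.33, 0.03)
```

Because the tabulated proportions and standard deviations are rounded to two decimals, a design effect recomputed this way will only approximate the tabulated DEFF.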

1.8 Capabilities for Performing Special Studies

If it were possible to completely fulfill the design of the Rural
Soils Network (RSN), it would serve as an excellent vehicle for perform-
ing special studies. With one-fourth of all sites in each State being
sampled in each year, baseline levels of pesticide residue would soon be
established for all moderate size geographic areas. Data needed for
special studies of specific pesticides or specific areas would then be
readily available.

1.9 Toxic Substances Other Than Pesticides in Soils

The NSMP currently monitors three classes of pesticides in soil:
organochlorine pesticides and trifluralin, organophosphorus pesticides,
and heavy metals. Each of these classes is analyzed using methodology
specifically designed to provide optimum selectivity and sensitivity for
that class to the exclusion of others.

Expanding the capability of the soil networks in monitoring for a
wide range of toxic substances will require the development of analytical
methodology to deal with the special characteristics of these substances
as well as those of the matrix. A wide range of new techniques (e.g.,
high performance liquid chromatography, mass spectroscopy, electrochemis-
try and capillary gas chromatography) may need to be incorporated to
accomplish this purpose. However, the design and application of effec-
tively administered QC/QA programs must be concurrent with the develop-
ment of appropriate analytical methodology.

-89-
-------
Much of the necessary methodology is already available in the open
literature or in EPA and contracting laboratories. Some may be directly
applicable to the perceived needs of the NSMP, and others will require
some degree of modification to account for differences in either the
analyte or matrix. All aspects of the methodology must be evaluated
(i.e., sample collection and storage, analyte isolations and instrumental
analysis) and the method appropriately validated in order for the NSMP
to meet the needs of those who are using the analytical data.

The working definition of "toxic substance" at present must include
virtually any substance manufactured in or imported into the United
States. Great care must be exercised in decisions regarding the choice
of substances to be monitored by NSMP. The complexity and cost of the
required methodology increase directly with the number of substances and
matrices to be analyzed. Thus, misjudgement can quickly lead to unneces-
sary or nonproductive expenditures of time and funds.

There are two major methodological approaches to the concurrent
analysis of a number of different substances. The first approach is the
development of a "survey" method in which specimen components are
separated only to the extent necessary to ensure the compatibility of
each component with the analytical technique. The resulting subset of
specimens are all for the analysis of such specimens; however, the
overall number of specimens requiring analysis is minimized. Two such
"survey" methods (Master Scheme for the Analysis of Organic Compounds in
Water and A Comprehensive Method for the Analysis of Volatile Organics
on Solids, Sediments and Sludges) are currently being developed under
EPA. Development of a truly all-inclusive "survey" method may be neither
possible nor practical, as the present methods are limited to analysis
of organic compounds which are or can be made sufficiently volatile to
pass through a capillary gas chromatograph/mass spectrometer.

An alternate approach is the development of analytical methods
optimized for a specific substance or class of substances. Each method
necessarily excludes all substances except those of similar chemical and
physical characteristics. Monitoring of a large number of different
substances would therefore require the use of a number of specific
analytical methods.

Neither approach is without its disadvantages and these must be
weighed against the goals of the monitoring network. A basic philosophy
must be established regarding these goals and the methodology
approach which will best serve them over the long term.

1.10 Implementation Plan for a New Survey Design of the Rural Soils
Network

A specific implementation cannot be recommended at this time, since
a specific design option has not been recommended. One observation
which can be made is that the transition period should cover one cycle
in the old design, 4 years. Since it is not likely to be feasible to
investigate the entire RSN, nor indeed is it necessary, a subset of the
old sites may be used. An advantageous scheme may be to link old and
new sites on the basis of geographic proximity, and compare their
observations over the transition period.

-90-
-------
2. EVALUATION OF CHEMICAL ANALYSIS
2.1 Objective

The objective of this section is to conduct a limited review of the
current analytical methodology used in the National Soils Monitoring
Program (NSMP) in order to assess the quality and reliability of the
data with respect to meaningful statistical evaluation and statistical
survey design.

2.2 Discussion

Data compiled by NSMP is generated by the use of complex multi-
residue analytical methodology, and the quality of such data is deter-
mined primarily by the limitations of the methodology. These limita-
tions are normally defined in terms of the precision, accuracy and
minimum detectable level (MDL) of the analytical method for each speci-
fic analyte and specimen matrix. A knowledge of these limitations is
especially important to potential users of the analytical data since
reported substance levels are merely estimates of the "actual" levels in
the matrix. As estimates, individual values in the NSMP data file in
fact represent ranges (of values) in which the "actual" substance levels
are reasonably (i.e., with some high probability) expected to fall. The
size of the range can be adequately described by the accuracy and preci-
sion of the analytical method; therefore, a knowledge of these parameters
is required for meaningful evaluation of the data. In addition, the MDL
for the method defines (or should define) the lowest level that can be
estimated with reasonable confidence, no analytical method being capable
of absolute detection down to zero concentrations. This limit must be
considered in evaluating the practical versus the statistical signifi-
cance of trace levels and zeros reported in the data file (Hartwell
et al. 1979).

An extensive manual of recommended analytical methodology has been
published by EPA-RTP (USEPA 1977) for use in routine multiresidue
pesticide analysis. The complexity of the sample matrices and pesticide
types routinely analyzed in practice requires that the methodology
consist of a basic analytical procedure with a large number of modifica-
tions and ancillary techniques in order to cope with problems imposed by
widely divergent pesticide levels and interferences. Each modification
or technique produces a specific effect on the accuracy, precision and
MDLs of the overall analytical method, and, hence, must be validated for
each pesticide analyzed by the method. A detailed knowledge of the
analytical procedure is therefore required in order to properly assess
the quality of the data generated by the procedure.

An extensive set of recommended QC/QA procedures has been published
by EPA-RTP (USEPA 1979) in an effort to control the quality of data
produced by analysts and laboratories using the multiresidue pesticide
method. Laboratories adhering to these recommendations will necessarily
generate (through controls, blanks, and SPRMs*) much of the information
needed to assess accuracy, precision and MDLs for data reported to the
NSMP. Control and SPRM data are not, however, compiled or summarized in
a single document (i.e., issued semiannually or annually), or entered
into the computer data file. Thus, for all practical purposes, the data
are lost to the potential data file users. Reporting of all control
data in the computer file, along with results for soil specimens, would
allow the data quality to be determined according to the specific needs
of the individual user (e.g., for a particular pesticide in a given
geographic area or over a specified period of time). The results of
duplicate specimen analysis are apparently not reported in the computer
data file. Again, this is valuable information lost to computer file
users.

* Special Pesticide Reference Material.

-91-
-------

RTI has attempted to review the analytical methodology used to
generate data under the NSMP and compile existing information on the
current quality of the data (accuracy, precision and MDL) in order to
make this information available to Program data file users. The review
is necessarily limited by the provisions outlined in the revised work
plan for this task.

In the interest of clarity and accuracy, the RTI request for detail-
ed information on analytical procedures and data quality was made in
written form. A questionnaire was submitted to William G. Mitchell of
the Toxicant Analysis Center, Bay St. Louis, Mississippi, the laboratory
currently responsible for carrying out chemical analysis under the NSMP.
The cover letter to Mr. Mitchell and the questionnaire are given in
Appendix A. The questions were designed to provide detailed information
on all areas of current analytical methodology pertinent to the quality
of the data generated by the method. It was anticipated that extensive
verbal follow-up (telephone) would be required to obtain additional
information and clarify details. The initial information from the
laboratory has been received by RTI and evaluation of the information
carried out. The results are presented below.

2.2.1 Analytical Methodology

The NSMP currently reports levels in soil for over thirty pesti-
cides and toxic substances (Table 2.1) including several chemical classes
(i.e., organochlorine and organophosphorous pesticides, trifluralin and
heavy metals). All analyses are carried out at the Toxicant Analysis
Center (TAC) in Bay St. Louis, Mississippi. The analytical results for
each soil specimen are reported on a single form (Appendix B) along with
the specific location and date at which the specimen was taken. Indivi-
dual pesticides and metals detected in the specimen are listed along
with their levels in fourteen blank spaces on the form. Reporting units
(i.e., ppm, ppb & ppt) are specified using a value code following the
particular result. Although there are spaces on the form for individual
soil characteristics such as pH, % sand, % silt, % clay and % organic
matter, these characteristics are not currently determined for urban
soil specimens. The % moisture content of each soil specimen is
determined but not reported on the form. The reported results are,
however, corrected for % moisture (i.e., reported on the basis of dry
solid weight). An important point of confusion arises from the use of
the term chlordane in reporting results. The term usually corresponds
specifically to the level of γ-chlordane in a soil specimen as this is
the most commonly found isomer. However, when the α-isomer and
t-nonachlor (and presumably oxychlordane since it is not listed
separately in Table 2.1) are also found, all levels are reported under
the term chlordane as their sum. The term "technical chlordane" is
inappropriate as one of the major components, heptachlor, is reported
separately. In order to avoid confusion in the subsequent interpretation
of the data, all individual components of pesticide mixtures (e.g.,
chlordane, BHC and PCB) should be reported as such. Otherwise
potentially valuable data is lost.

-92-
-------
Table 2.1. Pesticides and Toxic Compounds Analyzed Under NSMP

Organochlorines          Organophosphates       Other

Alachlor                 DEF                    Trifluralin
Aldrin                   Diazinon
BHC                      Ethion                 Heavy Metals
Chlordane                Malathion
DDTs                     Phorate                Mercury
Dieldrin                 Parathion, ethyl       Cadmium
DCPA                     Parathion, methyl      Lead
Dicofol                  Ronnel                 Arsenic
Endosulfan I             Trithion
Endosulfan II
Endosulfan Sulfate
Endrin
Endrin Ketone
Heptachlor
Heptachlor Epoxide
Hexachlorobenzene
Isodrin
Lindane
Methoxychlor
PCBs
Propachlor
Toxaphene

Source: Toxicant Analysis Center (TAC), USEPA, Bay St. Louis, Mississippi.
-93-
-------

Only urban soil specimens are currently collected and analyzed at
TAC, the last rural soil specimens having been analyzed in 1977. The
soil specimens are collected by either EPA or a contracting laboratory
as a pattern of 6-8 core specimens composited in a one quart, wide-mouth,
glass mason jar with a Teflon- or aluminum foil-lined cap. Specimens
are subsequently shipped to TAC at ambient temperature. Specimens
received at TAC are refrigerated until they can be analyzed. The
specimen collection date, the date of receipt at TAC, and the date of
analysis are all stated on the result form, and thus, these data are
presumably available to computer data file users.

The analytical methodology used in the analysis of organochlorine
and organophosphorous pesticides in soil specimens is essentially the
same as that used for the analysis of sediment specimens under the
National Surface Water Monitoring Program (D. Lucas et al. 1980). The
analysis of pesticides in both matrices is performed at TAC. The
specific procedure for the extraction and Florisil clean-up of soil
specimens for analysis of organochlorine and organophosphorous
pesticides is given in Appendix C. The procedure was furnished by TAC
as a result of the RTI questionnaire. The levels of pesticides in
specimen extracts are determined using essentially the same gas
chromatographic techniques applied to water and sediment (D. Lucas et
al. 1980). Although trifluralin is a nitroaniline, its chemical
properties allow it to be analyzed with the organochlorine pesticides.

The general method used in the analysis of organochlorine and
organophosphorous pesticides involves an initial screening of specimen
extracts on a primary GC system. All positive results are then screened
on a secondary GC system that differs from the primary system in the
selectivity of either the column or the detector. Continued positive
results may then be confirmed through the use of additional analytical
techniques depending on the degree of suspected difficulties from inter-
ference, contamination, or low levels (approaching the MDL). The techni-
ques used in the application of this methodology to each pesticide group
are summarized in Table 2.2.

Quantitation of GC results for pesticides is carried out using
external standard procedures with single-point calibration. Calibration
standard concentrations are adjusted to give compound responses similar
to those in specimen extracts in order to reduce the effects of detector
non-linearity. The ECDs used in this program all possess a 63Ni source.
Actual recoveries of pesticides from soil specimens are not monitored
except via the corresponding recoveries for controls. It would be
extremely useful to fortify each specimen with a particular compound(s)
prior to extraction and clean-up and thereby monitor any anomalous
behavior in the extraction, clean-up and GC injection procedures which
may occur from time to time for a particular specimen. This technique
was used in the National Human Monitoring Program for analysis of
organochlorine pesticides in adipose tissue. Aldrin, which is seldom
found, was spiked into fat specimens and then analyzed as though it
were endogenous. The internal standard quantitation method is an
alternate procedure for normalizing recoveries and was briefly examined
at Versar, Inc. for the analysis of s-triazines in water and sediment
specimens (D. Lucas et al. 1980). Promazine was used as the internal
standard and preliminary work showed promise (Bob Martin, Versar, Inc.).

-94-
-------
Table 2.2. Procedures for the GC Analysis of Pesticides for the NSMP

Compound Class     Primary      Secondary        Confirmation        Additional
                   analysis     analysis         techniques          comments

Organochlorine     GC/ECD on    GC/ECD on        GC/HECD             Every 10th soil
                   OV-1         OV-210           GC/ECD on 1.5%      analysis duplicated
                                                 OV-1/1.95% OV-210

Organophosphorus   GC/FPD on    GC/FPD on 1.5%   GC/NPD              Every 10th soil
                   OV-1         OV-1/1.95%       GC/FPD-S mode       analysis duplicated
                                OV-210           GC/ECD

Source: Toxicant Analysis Center (TAC), USEPA, Bay St. Louis, Mississippi.
-------
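
As a rough illustration of the single-point external standard quantitation and the dry-weight correction described in the text, the arithmetic might be sketched as follows. The sketch assumes that detector response is proportional to concentration near the standard level and that % moisture is expressed on a wet-weight basis; the function names, argument names and units are hypothetical and are not taken from the TAC procedure.

```python
def extract_concentration(sample_response, standard_response, standard_conc):
    """Single-point external standard: scale the standard concentration
    by the ratio of sample response to standard response."""
    return standard_conc * sample_response / standard_response


def soil_level_ppb(extract_conc_ng_per_ml, extract_volume_ml,
                   wet_mass_g, moisture_percent):
    """Convert an extract concentration to ng per g of dry soil (ppb).

    moisture_percent is assumed to be on a wet-weight basis.
    """
    dry_mass_g = wet_mass_g * (1.0 - moisture_percent / 100.0)
    return extract_conc_ng_per_ml * extract_volume_ml / dry_mass_g


# Example: peak response of 50 against a 20 ng/mL standard giving 100,
# a 10 mL final extract, and 50 g of wet soil at 20% moisture.
conc = extract_concentration(50.0, 100.0, 20.0)
level = soil_level_ppb(conc, 10.0, 50.0, 20.0)
```

No recovery correction appears in the sketch because the text states that results are not corrected for recoveries.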

All GC methodology used in the NSMP utilizes packed-column
techniques. The improved resolution and sensitivity that could be
obtained by incorporating state-of-the-art capillary GC techniques would
considerably increase the utility of the method and reduce the need for
confirmation. This is particularly evident in the analysis of PCBs where
each individual designation (i.e., Aroclor 1242, Aroclor 1254 and
Aroclor 1260) actually represents a complex mixture of partially
chlorinated biphenyls (e.g., tri-, tetra-, penta- and
hexachlorobiphenyl) and their respective isomers. Considerable overlap
exists between the components present in each PCB. For instance, Aroclor
1242 contains di- to hexachlorinated biphenyls whereas Aroclor 1254
contains tetra- to heptachlorinated biphenyls. There is currently no
reason to expect the individual components to possess the same degree of
stability or toxicity. Thus, the original pattern of components and
their relative amounts may not be preserved in complex environmental and
biological matrices. Yet, it is on the basis of the standard peak
pattern that the presence of PCBs and their levels are currently
determined. Further, as was shown in the analysis of fat specimens under
the National Human Monitoring Program (R.M. Lucas et al. 1980), PCB
components can interfere with the analysis of some chlorinated
pesticides (e.g., p,p'-DDT, t-nonachlor and heptachlor epoxide) at
sufficiently high levels. The degree of resolution that can now be
achieved by capillary GC techniques is demonstrated by the chromatogram
of Aroclor 1242 and 1260 in Figure 2.1. These Aroclors cover nearly the
entire range of PCB components (monochlorobiphenyls to
octachlorobiphenyls and their isomers) and yield over 80 individual
peaks by this method. Typical packed-column GC techniques yield fewer
than 15 peaks for these mixtures.

In general, the analysis of heavy metals in soil specimens was
carried out using flame atomic absorption (AA) techniques for cadmium,
lead and arsenic (T.J. Forehand et al. 1976), and the cold vapor AA
technique for mercury. More specific information on the current
methodology was unavailable for two reasons. First, soil specimens have
not been analyzed for heavy metals since 1979. In view of recent
instrumental acquisitions (i.e., graphite furnace and Zeeman AA) coupled
with the continual refinement of AA procedures in the ongoing analysis
of other matrices (e.g., blood, urine, etc.) at TAC, it is likely that
the original procedures (for soil) will undergo substantial modification
when the analysis of soil specimens is resumed. Second, the individual
responsible for the most recent analysis of soil specimens (1979) is no
longer employed at TAC, and thus detailed information concerning the
methodology and control data is not readily available. It has been
necessary to obtain such information directly from the analyst for
virtually every analytical method used in the five National Monitoring
Programs, a situation which further demonstrates the need for centrally
located documentation of all methodology used in a monitoring program.
In view of the lack of available data on the current analysis of metals
in soil, the quality of the analytical data cannot be determined at this
time.

-96-
-------
Figure 2.1 Capillary GC/ECD Chromatogram of Aroclor 1242 and Aroclor 1260
(~1.2 ng total): 48 m x 0.25 mm i.d. capillary with 0.1 µ Apiezon L on
persilylated Pyrex, 1.5 mL/min helium, 150°-290° @ 1°/min, ECD @ 128 x 10.
Source: Toxicant Analysis Center (TAC), USEPA, Bay St. Louis, Mississippi.
-------

2.2.2 QC/QA

For organochlorine-organophosphorus pesticides in soil, individual
specimens are analyzed in sets of 10-15 with each set containing a
method blank (reagent blank) and a control. The control consists of a
fortified soil specimen (SPRM), which is generated internally. Checks
are also run on the elution pattern of pesticides from Florisil columns.

Information regarding the primary and secondary analytical techni-
ques and confirmation techniques that may be used in the analysis of
soil specimens has been summarized in Table 2.2. Decisions concerning
the adequacy of the clean-up procedure (if used), the validity of the
standards and controls, and the confirmation techniques used are reviewed
by the supervisor (i.e., William Mitchell) and the TAC QC/QA officer
(Dr. Joe Yonan).

2.2.3 Accuracy and Precision

Information on the accuracy and precision of an analytical method
is required in order to define the relationship between the analytical
result (estimate) and the "actual" analyte level in the specimen.
Although this information may be produced as part of the initial method
development and validation, it is by itself insufficient as the charac-
teristics of the method can (and frequently do) change with time, analyte
level, and matrix. Environmental matrices are particularly complex and
variable.

The replicate analysis of a specimen containing a known level of
analyte (i.e., a control) over a period of time can provide useful
information about the accuracy and precision of the method, and the
method stability. RTI has attempted to compile such information, where
it is available, for each pesticide and toxic substance listed in Table
2.1.

The accuracy of the analytical methodology is probably best reflect-
ed in the recovery of the analyte. This is particularly true for the
toxic substances monitored in the NSMP since the analytical results are
not corrected for losses during workup (i.e., recoveries). The available
recovery data for analysis of toxic substances in soil is given in Table
2.3. These values represent averages over a period of several months
and are therefore more useful as general indications of method accuracy
than a corresponding single value. Unfortunately, this information is
far from complete with respect to the number of toxic substances listed
(Table 2.3) versus the number analyzed (Table 2.1). Recovery data must
be generated for all substances analyzed. While a single value may be
found to hold for a number of similar substances, the similarity must be
demonstrated (and the substances thus grouped must be specified) and
does not obviate the need for subsequent monitoring via controls.

-98-
-------
Table 2.3. Average Recoveries for Some
Organochlorine Pesticides from Soil

                      Fortification    Average       Average    Reported
Pesticide             level (ppb)      % recovery    error      MDL (ppb)

γ-Chlordane                 60             80          -20%         10
o,p'-DDE                    90             81          -19%         20
p,p'-DDE                    75             84          -16%         20
p,p'-DDD                   150             87          -13%         20
o,p'-DDT                   240             84          -16%         20
p,p'-DDT                   240             88          -12%         20
Dieldrin                    90             80          -20%         20
Aldrin                      30            127          +27%         10
Heptachlor                  30             80          -20%         10
Heptachlor Epoxide          60             80          -20%         10
Endrin                     120             89          -11%         20

Source: Toxicant Analysis Center (TAC), USEPA, Bay St. Louis, Mississippi.
-99-
-------

Available information on the analytical precision for toxic sub-
stances in soil is given in Table 2.4 in terms of the coefficient of
variation (CV). The CV is calculated as follows:

    CV = (std. deviation / mean) x 100 = % relative standard deviation.

As with recoveries, data was available for only a small number of
organochlorine pesticides and the fortification levels were variable
(3-12 times the MDL).
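
The recovery, average error and CV figures can be computed from replicate control results along the following lines. This is only a sketch: it assumes the sample (n - 1) standard deviation is intended, and the function name and the replicate values in the example are invented for illustration.

```python
import statistics


def control_summary(measured_ppb, fortified_ppb):
    """Summarize replicate analyses of a control fortified at a known level."""
    mean = statistics.mean(measured_ppb)
    recovery = 100.0 * mean / fortified_ppb            # average % recovery
    error = recovery - 100.0                           # average error, in %
    cv = 100.0 * statistics.stdev(measured_ppb) / mean  # % relative std. deviation
    return {"recovery_pct": recovery, "error_pct": error, "cv_pct": cv}


# Hypothetical replicate results (ppb) for a control fortified at 60 ppb.
summary = control_summary([48.0, 50.0, 52.0], 60.0)
```

For these invented replicates the mean is 50 ppb, so the sketch reports a recovery of about 83%, an average error of about -17% and a CV of 4%.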

No indication of specific interferences between pesticides has been
given. This is particularly interesting since PCB levels greater than 1
ppm were found to significantly interfere in the GC/ECD analysis of
p,p'-DDT, t-nonachlor and heptachlor epoxide in human fat (R.M. Lucas
et al. 1980). The GC methodology used for the analysis of organochlorine
pesticides in water and sediment does not appear to differ significantly
from that for human adipose tissue. Thus the ubiquitous nature of PCBs
would be expected to cause interference problems regardless of the
specimen matrix. The use of high resolution capillary GC techniques
would contribute significantly to the elimination of such difficulties,
as well as increase the sensitivity of GC/MS as a highly specific confir-
mation procedure.

2.2.4 Minimum Detectable Levels

All analytical techniques are characterized by an inherent limit of
sensitivity below which the technique cannot reliably discern the
presence or absence of a particular component. Thus the procedures used
in the analysis of soil specimens must be similarly characterized by a
minimum amount of specific analytes which produce a signal response
statistically discernible from background. This analyte concentration
is defined as the minimum detectable level (MDL) and is important in the
assessment of the analytical data since concentrations reported below
the MDLs lack validity and must be considered unreliable.

The MDLs associated with the GC analysis of specific pesticides in
soil specimens are a function of instrumental operating conditions and
the amount of background introduced by residual matrix material in the
injected specimen extract. Tentative detection limits have been establ-
ished by TAC and are shown in Table 2.5 for the organochlorine and
organophosphorous pesticides.

The MDL corresponds to the amount of analyte producing a signal
equal to 5% of full scale deflection with a maximum of 1% noise (signal
to noise ratio =5:1). In cases where the chromatographic background is
significant, the MDL is taken as that amount of analyte producing a
signal equal to twice the noise level in the vicinity of the peak.
-100-
-------
Table 2.4. Precision for Some Organochlorine
Pesticides in Soil

                           Fortification
Pesticide                  level (ppb)       CV

γ-Chlordane                      60          3%
o,p'-DDE                         90          2%
p,p'-DDE                         75          3%
p,p'-DDD                        150          3%
o,p'-DDT                        240          2%
p,p'-DDT                        240          4%
Dieldrin                         90          2%
Aldrin                           30         10%
Heptachlor                       30          3%
Heptachlor Epoxide (HE)          60          4%
Endrin                          120          5%

Source: Toxicant Analysis Center (TAC), USEPA, Bay St. Louis, Mississippi.
-101-
-------
Table 2.5. Detection Limits of Pesticides in Soils

Compound                              Detection limit

Organochlorine pesticides

Early eluting                         10 ppb
(BHCs, Aldrin, Heptachlor
Epoxide, Chlordane)

Late eluting                          20 ppb
(DDTs, Dieldrin, Endrin)

All multicomponent pesticides         50 ppb

Organophosphorus pesticides           10-50 ppb

Source: Toxicant Analysis Center (TAC), USEPA, Bay St. Louis, Mississippi.
-102-
-------
Any response lower than the MDL is reported as not detected (ND). There
can be significant variation in analytical sensitivity, even among
specimens of the same type. Consequently, the reported MDLs are typical
or expected levels realized for the majority (75-80%) of specimens.
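
The detection rule described in this section can be sketched as a simple threshold test. The sketch works in chart (signal) units rather than concentration, and the decision on whether the chromatographic background is "significant" is left to a flag supplied by the analyst; both choices, and the function names, are assumptions made for illustration.

```python
def detection_threshold(full_scale, noise, background_significant=False):
    """Smallest signal treated as a detection.

    Normal case: 5% of full scale deflection.
    Significant background: twice the noise level near the peak.
    """
    if background_significant:
        return 2.0 * noise
    return 0.05 * full_scale


def report(peak_signal, full_scale, noise, background_significant=False):
    """Return the signal if it reaches the threshold, otherwise "ND"."""
    threshold = detection_threshold(full_scale, noise, background_significant)
    return peak_signal if peak_signal >= threshold else "ND"
```

With a full scale of 100 units and 1 unit of noise, a 4-unit peak would be reported as ND and a 6-unit peak would be reported as a detection.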

2.3 Fate of Pesticides in Soil

After the application of pesticides to agricultural land, a number
of processes may occur which lead to their transport in the environment
or their removal by chemical or biological degradation. Both of these
mechanisms depend to a large extent on the chemical structure of the
pesticide and to a lesser extent on the soil type (clay, clay loam,
sandy loam, sand, etc.).

Chlorinated hydrocarbons have a well earned reputation for
persistence. Kearney, Nash and Isensee (1) have compared the persistence
of pesticides within each general pesticide type and report that the
persistence of chlorinated hydrocarbons varies from 5 years for
chlordane to 2 years for heptachlor and aldrin. By contrast, the
persistence of phosphate insecticides is measured in weeks. Diazinon
persists for some 12 weeks compared to Malathion and Parathion, which
persist for only a few weeks. Trichloroacetic acid persists for 12 weeks
compared to 2 weeks for Barban. Intermediate between the two extremes
are a wide range of herbicides. The urea, triazine and picloram
herbicides range from 3 months for Prometryne to 18 months for Picloram
and Propazine. The benzoic acid and amide herbicides range from 2 to 12
months and the phenoxy, toluidine and nitrile herbicides range from 1 to
6 months persistence.

The migration of pesticides in soils is again very closely related
to their chemical structure. Such factors as water solubility and
adsorption on soil particles affect their migration. Compounds such as
aldicarb have migrated through the soil overburden to shallow aquifers.
Halogenated hydrocarbon pesticides such as benzene hexachloride, which
has a low water solubility, remain entirely in the upper soil layer (2).

The effect of soil type on the fate of pesticides has been studied
by a number of workers. In very general terms, adsorption is greater on
clays than on sand. The organic content of the soil also affects
adsorption (3).

(1) Kearney, P. C., R. G. Nash and A. R. Isensee 1979. Persistence
of Pesticide Residues in Soils. In M. W. Miller and G. G.
Berg (ed). Chemical Fallout, Current research on persistent
pesticides, Charles C. Thomas, Publisher, Springfield, Ill.

(2) Kawahara, T., M. Matsui and H. Nakamura, "BHC in Soil of Paddy
Field" Bull. Agric. Chem. Inspec. Stn. 12:42-45 (1972).

(3) Bristow, P. R., J. Katan and J. L. Lockwood. "Control of
Rhizoctonia solani by Pentachloronitrobenzene Accumulated from
Soil by Bean Plants," Phytopathology 63:808-813 (1973).
-103-
-------
2.4 Recommendations

Limited review of analytical methodology used in the NSMP and an
attempt to compile data for the average accuracy, precision, and MDL in
soil for each toxic substance monitored under this program provide a
basis for the following recommendations:

1. Accuracy (that is, recoveries) and precision data must be
generated for all pesticides monitored in the NSMP. The data
should be generated at two different levels (e.g., at the MDL
and at ten times the MDL). The results for controls analyzed
with each set of specimens would be the best means of providing
this information since it is necessary that control data be
made accessible to computer data file users in any event.
Controls must be run with each set of specimens and should
consist of a blank (unfortified soil free from the analytes of
interest) and two fortified blanks (one fortified at the MDL
and another at ten times the MDL). The analytical results for
the controls should be reported on a separate form (especially
designed for control data) and encoded such that there is a
one-to-one association with the particular set of specimens
with which they were analyzed. The encoding should allow
later computer retrieval of control data for any particular
specimen set or group of sets (for example, geographic area,
over a specified period of time, or for a particular pesti-
cide). The availability of this information in a retrievable
form to data file users would provide the means for assessing
data reliability now lacking. Further, any duplicate specimen
analyses must be reported in the computer data file as they
provide the best means of assessing method precision on a
continuous basis. Duplicate results must be specifically
encoded such that they are retrieved as a group (e.g., all
duplicates for a particular matrix and pesticide over a speci-
fied period of time) as well as with the initial analytical
results for the specimen. The need to make routine control
data available to program data file users cannot be overempha-
sized. This does not preclude the use of specialized controls
(e.g., SPRMs); however, these results should also be included
in the computer file encoded to allow facile retrieval both as
a group and with their particular specimen set.

2. The pesticides included on the routine monitoring list must be
reviewed on a regular basis and appropriate deletions or
additions made. Specifically, the need for routine analysis
of organophosphorous pesticides in soil should be reviewed as
this class of compounds is known to be unstable and has seldom
been reported in either soil or sediment. Once the baseline
has been established for such compounds, three choices are
possible: 1) cease to analyze for the compound(s) except
under special circumstances (e.g., after a chemical spill or
when contamination is suspected from a recent application); 2)
analyze for the compound(s) on a more infrequent basis; and 3)
concentrate efforts on the analysis of degradation products of
known toxicity where these exist. Decisions concerning the
analysis of toxic substances under the NSMP should be based on
information generated in other agency data files (e.g., USDA,
USGS, etc.) as well as data generated within EPA.

3. Soil specimens should be characterized as to the percent
carbon or percent inorganic residue. This information must be
included on the report form (along with moisture content) as
part of the specimen characterization (source). Significant
trends may otherwise be missed with respect to the soil type
and its effects on toxic substance accumulation, degradation
and transport.

4. Control specimens (in the matrix of interest) should be
included with any specimens either stored for extended periods
or shipped to another site for analysis. This is particularly
important for toxic compounds which are known to be unstable;
i.e., organophosphorus pesticides. The results of these
"storage controls" must also be included in the computer data
file with appropriate encoding for specific retrieval.

5. Analytical methodology should be updated to include state-of-
the-art capillary GC techniques. This would provide a higher
degree of confidence in the resulting data through increased
resolution and sensitivity. The use of higher resolution
analytical techniques is a move toward the quantitation of
PCBs (and technical chlordane) as their individual isomers.
This approach is far more useful than the present method of
attempting to identify patterns and averaging components,
since the toxicity and biodegradation of the individual isomers
are not identical.

6. The pesticide recoveries should be monitored for each specimen
analyzed by initial fortification of the specimen with appro-
priate compound(s). Subsequent analysis of the compound level
should enable comparison of data between specimens with
increased confidence that anomalous results will be detected.
The use of internal standard quantitation techniques would
normalize recoveries between specimens and should be
considered.

7. Detailed information on all analytical procedures used under
the NSMP should be documented in one source. The procedures
must then be maintained current with ongoing improvements and
modifications made by the analytical laboratories. Such
updating requires both flexibility and regular review by
program management.
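
Recommendations 1 and 6 call for recovery and precision figures that can be retrieved by specimen set. A minimal sketch of how such control records might be summarized follows. The record layout, the set identifiers, and the numerical values are hypothetical and are not taken from the NSMP data file; relative percent difference is one common way to express duplicate precision and is used here as an assumption, not as the program's stated method.

```python
from statistics import mean, stdev


def percent_recovery(measured, fortified):
    """Recovery of a fortified control as a percentage of the amount added."""
    return 100.0 * measured / fortified


def relative_percent_difference(first, second):
    """Precision estimate from one pair of duplicate analyses."""
    return 100.0 * abs(first - second) / ((first + second) / 2.0)


# Hypothetical control records, one per fortified blank:
# (specimen set, pesticide, fortification level, amount added, amount found), ppm
controls = [
    ("SET-001", "dieldrin", "MDL", 0.01, 0.008),
    ("SET-001", "dieldrin", "10xMDL", 0.10, 0.092),
    ("SET-002", "dieldrin", "MDL", 0.01, 0.011),
    ("SET-002", "dieldrin", "10xMDL", 0.10, 0.085),
]


def recoveries_for(records, pesticide, level):
    """Retrieve recoveries for one pesticide at one fortification level."""
    return [
        percent_recovery(found, added)
        for (_set_id, name, lvl, added, found) in records
        if name == pesticide and lvl == level
    ]


high_level = recoveries_for(controls, "dieldrin", "10xMDL")
mean_recovery = mean(high_level)       # accuracy across specimen sets
recovery_spread = stdev(high_level)    # set-to-set variation in recovery
```

Because each record carries its specimen set identifier, the same list can be filtered by set, by pesticide, or by period, which is the kind of retrieval the recommendation asks for.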
-105-
-------
References
(Analytical Section)

Hartwell TD, Piserchia P, White SB et al. 1979. Analysis of EPA
Pesticides Monitoring Networks. Washington, DC: U.S. Environmental
Protection Agency. EPA-560/13-79-014.

USEPA. 1977. U.S. Environmental Protection Agency. Manual of Analytical
Methods for the Analysis of Pesticide Residues in Human and
Environmental Samples. Revised June 1977 under EPA Contract No.
68-02-2474.

USEPA. 1979. U.S. Environmental Protection Agency. Manual for Analytical
Quality Control for Pesticides and Related Compounds in Human and
Environmental Samples. Revised January 1979 under EPA Contract No.
68-02-2474.

Lucas D, Mason RE, Rosenzweig M et al. 1980. Recommendations for the
National Surface Water Monitoring Program: Report Two. Research
Triangle Park, NC: Research Triangle Institute.
RTI/1864/14/01-02I.

Lucas RM, Rosenzweig MS, William SR et al. 1980. Evaluation of and
Alternate Designs for National Human Monitoring Program's Adipose
Tissue Survey. Research Triangle Park, NC: Research Triangle
Institute. RTI/1864/14/02-2I.

Forehand TJ, Dupuy AE, Tai H. 1976. Determination of arsenic in sandy
soils. Analytical Chemistry 48(7): 999-1001.
-106-
-------
APPENDIX A

Questionnaire on Chemical Analysis of Soil
-------
RESEARCH TRIANGLE INSTITUTE
POST OFFICE BOX 12194
RESEARCH TRIANGLE PARK, NORTH CAROLINA 27709

CHEMISTRY AND LIFE SCIENCES GROUP                    October 28, 1980

Mr. William Mitchell
Toxicant Analysis Center
US Environmental Protection Agency
1105, NSTL
NSTL Station, Miss. 39529

Dear Mr. Mitchell:

The Research Triangle Institute (RTI), under contract with the
Environmental Protection Agency (EPA), is conducting an assessment of
the five National Pesticide Monitoring Programs. The Statistical Sciences
Group at RTI has been analyzing the data generated by the Network Programs
and is responsible for conducting this review. The Chemistry and Life
Sciences Group is assuming a supportive role in this effort.

We have been asked to review the current analytical methodology
being used and to evaluate the quality of data being generated in each
Monitoring Program. The main objective of this review is not to criticize
or find fault with the laboratories involved in these programs but to
identify the strengths and limitations inherent in the analytical methodo-
logy. It is important to define the state-of-the-art as it is practiced
by participating laboratories and to establish reliability factors for
the reported data. The statisticians are particularly interested in
assessing measurement error and in developing the best means for document-
ing estimates of accuracy.

We have prepared a list of questions relating to different aspects
of the analytical procedure. Some questions are concerned with procedural
matters and others are directed toward defining the scope of the
methodology. We hope you will assist us by responding to these queries and by
suggesting possible approaches or solutions to the issues mentioned
above.

Since this evaluation must be based to some extent on your experience
and view of the capabilities of the method, your cooperation is essential
to the success of this evaluation. Your prompt response would be greatly
appreciated.

Thank you.

Sincerely,
John W. Hines, Ph.D.
Chemist

JWH/lfo

A-1
(919) 541-6000 FROM RALEIGH, DURHAM AND CHAPEL HILL
-------
National Soil Monitoring Program

-Analytical Methodology Issues-

1. It is presumed that current laboratory procedures follow a written
analytical protocol. Please furnish a detailed copy of current
laboratory protocol along with its source (e.g., EPA Manual of
Analytical Methods). Include information on sample storage conditions
(i.e., time, temperature, compositing) prior to analysis. Also
include information on any procedural modifications required due to
individual matrix or sample characteristics (e.g., emulsions,
interferences or specific analytical requirements which might
preclude the necessity for performing certain operations).
A-2
-------
2. The following list represents compounds which have been monitored
under the National Soil Monitoring Program. Please indicate which
components are currently monitored on a routine basis, and which
are no longer monitored or are only monitored under special circum-
stances (e.g., by request, in samples from particular geographic
areas, in particular types of samples).
Organochlorines        Organophosphates       Heavy Metals
Alachlor               DEF                    Mercury
Aldrin                 Diazinon               Cadmium
BHC                    Ethion                 Lead
Chlordane              Malathion              Arsenic
DDTs                   Phorate
Dieldrin               Parathion, ethyl
DCPA                   Parathion, methyl
Dicofol                Ronnel
Endosulfan I           Trithion
Endosulfan II
Endosulfan Sulfate
Endrin
Endrin Ketone          Other
Heptachlor             Trifluralin
Heptachlor Epoxide
Hexachlorobenzene
Isodrin
Lindane
Methoxychlor
PCBs
Propachlor
Toxaphene
A-3
-------
3. Which of the above analytes are never or very seldom found (<1%
analyses) in general soil samples?
A-4
-------
4. Do certain individuals perform specific aspects of the program
(e.g., organophosphate assays, data interpretation, QA/QC
assessments)?
A-5
-------
5. Are there "decision points" in your procedure where judgement is
used in selecting procedural alternatives (e.g., column cleanup,
choice of GC conditions, data interpretation)?
A-6
-------
6. Describe your daily calibration and QC procedures (standards,
spiked samples, blanks, other). Please indicate how many of each
type of control sample are used with each sample set and their
concentration levels (typical levels for standards, spiked samples).
A-7
-------
7. Describe any additional QC/QA procedures which are part of your
protocol (duplicate or split analysis, confirmatory analysis, use
of multiple GC columns, interlaboratory programs, other). Please
indicate how often these procedures are used.
A-8
-------
8. What is the sample concentration range analyzed by direct injection
on the GC, AA, etc (i.e., before further concentration or dilution
becomes necessary)? Please indicate the method of reporting
results at various analyte levels (i.e., above and below limit for
quantitation, below limit of detection).
A-9
-------
9. What are the estimates of the minimum quantitatable level (MQL) of
individual analytes in real samples? How are they determined and
to what extent does the sample matrix affect these values? What is
the criterion used in reporting a specific analyte as "not detected"
and in what manner are these results reported (zero, not detected,
less than a certain value, less than the MQL)? Is the lower limit
of quantitation different from the instrumental limit of detection?
If so, what is their relationship?
A-10
-------
10. What is your estimate of the analytical precision associated with
each component and the dependence of this parameter on the analyte
concentration in the sample? How is precision estimated? If
available, please give the precision for analysis of replicate
SPRMs or similar controls over a period of time for each analyte.
A-11
-------
11. What is your estimate of the analytical accuracy associated with
each component and the dependence of this parameter on the analyte
concentration in the sample? How is accuracy estimated?
A-12
-------
12. What is the analyte recovery during sample workup and is the
reported concentration corrected for recovery?
A-13
-------
13. What method(s) do you use for qualitative analysis of the data?
A-14
-------
14. What method(s) do you use for quantitative analysis of the data?
A-15
-------
15. What suggestions do you have for quantitating measurement error and
documenting this information?
A-16
-------
16. What suggestions do you have for making the Monitoring Program more
efficient and meaningful (e.g., analytical modifications, choice of
analytes for analysis, cost effectiveness)?
A-17
-------
17. What are the number of person-hours (and costs, if possible) allocated
for the sample workup, sample analysis, and data interpretation
aspects of this program based on a set of samples? How many samples
per set?
A-18
-------
APPENDIX B

National Soil Monitoring Program
Pesticide Analysis Report Form
-------
Table 1.3
PESTICIDE ANALYSIS WORKSHEET
EPA Form 8550-2 (Rev. 2-75); previous editions obsolete

[Form reproduced in the original report; the card-column layout is not
recoverable here. The fields that can be read are listed below.]

SECTION 1. SAMPLE IDENTIFICATION DATA
   Date received at lab

SECTION 2. SAMPLING DATA
   Sampled by
   Date sampled
   Site; station/site number
   State
   County or region
   System: NS = National Soils Monitoring
           NE = National Estuarine Monitoring
           NW = National Water Monitoring
   Material
   Crop number (if applicable)
   Pesticides used (check or specify): 2,4-D, Aldrin, Atrazine,
      Chlordane, DDT, Diazinon, Dieldrin, Endrin, Heptachlor,
      Malathion, Parathion, Toxaphene, Trifluralin
   Sampling remarks

SECTION 3. SPECIFIC SAMPLE CHARACTERISTICS (coded)
   Date analysis completed

SECTION 4. RESIDUES DETECTED
   Repeated entries of pesticide code and amount, with unit codes
   (M = p.p.m., B = p.p.b., and a further code for p.p.t.)
   Remarks
   Date
   Analyst
B-1
-------
APPENDIX C

Analytical Methodology for Organochlorine and
Organophosphorous Pesticides and Trifluralin
-------
Attached Methods

4.1 Extraction-Soil and Sediment

1. Weigh a 100 g specimen in a 500 ml Erlenmeyer flask and add 25 ml

of distilled water.

2. Add 50 ml of nanograde acetone and place a teflon stopper in the

flask. Shake specimen for 1/2 hour. Add 150 ml of nanograde hexane

and continue shaking for \\ hours more.

3. Decant specimen into a 500 ml separatory funnel through hexane-

washed glasswool that has been baked at 350°C.

4. Wash the specimen 3 times with separate 100 ml portions of hexane-

washed water. Discard the water (bottom layer) each time.

5. Pour the extract through a filter tube containing glasswool and a

1-inch layer of sodium sulfate that has been oven baked at 350°C.

The filtrate is collected in a screw-capped test tube.

6. Store specimens in refrigerator until ready for use. The filtrate

collected in step 5 is analyzed, without cleanup, for organophos-

phorus pesticides. Florisil cleanup is necessary for detection of

organochlorine pesticides on the electron capture type of detector.

7. The moisture content of each specimen is determined by placing 100

g of soil sample in an oven at 125°C for 24 hours and then noting

the weight loss of the sample.

Notes:

1. Run a solvent check with each group of specimens.

2. Run a fortified specimen with each group. The fortification proce-

dure is as follows: Pipet 1.0 ml of the organochlorine "Soil

Fortification Standard" A or 3.0 ml of a 1:3 dilution of "Soil

Fortification Standard" A into 100 g of soil or sediment specimen.

Pipet 3 ml of the organophosphate "Soil Fortification Standard"

into the same specimen. Mix the standards with the specimen and

allow to stand overnight. The specimen is then extracted by the

above procedure.

3. Dry weight = weight of specimen after heating overnight at 125°C.
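
The moisture determination in step 7 and the dry weight in note 3 reduce to simple arithmetic, sketched below. Expressing moisture as a percentage of the wet weight, and converting a wet-basis concentration to a dry-weight basis, are assumptions made for illustration; the procedure above states only that the weight loss is noted. The function names are hypothetical.

```python
def moisture_percent(wet_weight_g, dry_weight_g):
    """Weight lost on oven drying, as a percentage of the wet weight (assumed basis)."""
    return 100.0 * (wet_weight_g - dry_weight_g) / wet_weight_g


def dry_weight_concentration(conc_wet_basis, wet_weight_g, dry_weight_g):
    """Convert a concentration per unit wet soil to per unit oven-dry soil."""
    return conc_wet_basis * wet_weight_g / dry_weight_g


# Illustrative values only: a 100 g specimen that weighs 85 g after drying.
moisture = moisture_percent(100.0, 85.0)
dry_basis_ppm = dry_weight_concentration(0.085, 100.0, 85.0)
```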
C-2
-------
A. Florisil Cleanup Procedure

1. Quantitatively transfer the specimen extract onto the top of

the column and collect the elution from the column into a 250

ml flask.

2. When the sample extract drains down to the top of the upper

layer of Na2SO4, add 100 ml of a mixture consisting of 10%

methylene chloride in hexane and continue collecting until the

liquid level reaches the upper Na2SO4 layer. This elution is

labeled the "first fraction."

3. Replace the first 250 ml flask with a second flask and then

add 100 ml of 100% nanograde methylene chloride to the Florisil

column. Continue collecting the elution until the column

drains dry. Label the eluted portion, "fraction two."

4. To each flask add 1.0 ml of 0.01% Nujol (in hexane) and 3 to 4

glass beads. Attach a 3-ball Snyder column and place on a

steam bath or hotplate. Concentrate to ca 5 ml. Add 50 ml

nanograde hexane and concentrate to about 5 ml. Repeat the

last concentration step once more. This will remove essential-

ly all methylene chloride.

5. Pour 5 ml of hexane through the top of the Snyder column (for

rinsing) and collect in the flask.

6. Transfer specimens quantitatively into 15 ml graduated centri-

fuge tubes and place into a water bath that is maintained at

40°C.

7. Direct a purge of air into the centrifuge tube above the

liquid level until the volume of liquid is reduced to 2.5 ml.

8. Samples are now ready for GC determination.
C-3
-------
F. Concentration of Specimens on Hot Plate

1. Swirl the flasks containing glass beads until boiling occurs.

2. Do not allow the flasks to evaporate to dryness.

G. Pouring of Extracts Into Graduated Centrifuge Tubes

1. Use a small funnel to avoid losses due to direct pouring.

H. Concentration of Samples In Centrifuge Tubes With A Stream of Dry Air

1. Water bath should remain at a constant temperature.

2. Stream of air to all samples should be about the same flow

rate.

3. Concentrate all samples to approximately the same volume.

I. Column Cleanup

1. It is important that the adsorbent (Florisil) have consistent

mesh size and moisture content.

2. Exactly the same weight of adsorbent should be used for each

sample.

3. Good column technique is essential for adequate separations.
C-4
-------
C. Florisil Column Separation of Pesticides in Standards A and B

1. Components eluting in the first fraction (150 ml of 10%

methylene chloride in hexane) are:

aldrin

heptachlor

gamma chlordane

o,p'-DDE

p,p'-DDE

o,p'-DDT

p,p'-DDT

p,p'-TDE

2. Components eluting in the second fraction (100 ml of methylene

chloride) are:

endrin

dieldrin

*heptachlor epoxide

*Occasionally heptachlor epoxide may split between the two fractions.
C-5
-------
D. Florisil Column Separation of Other Common Pesticides

1. First fraction

trifluralin

toxaphene

PCB's

lindane (BHC)

PCNB

chlordane

methoxychlor

mirex

2. Second fraction

endosulfan I

endosulfan II

endosulfan sulfate

endrin, aldehyde form

endrin, ketone form

Note:

Most organophosphorus pesticides elute in the second fraction.
C-6
-------
B. Each Batch of Florisil Should Be Checked As Follows:

1. Add known volume of bench standard to Florisil column, and

take off fractions, as in the above procedure.

2. Concentrate volumes of fractions 1 and 2 to the same volume as

that originally added to the Florisil column.

3. Compare recoveries in each fraction with the bench standard.

This allows the chromatographer to determine which fraction

contains each component and the percent loss on the Florisil

column, if any.
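
The batch check in steps 1 to 3 can be summarized as a small calculation, sketched here. The detector responses and the dictionary layout are hypothetical; the only assumption carried over from the text is that both fractions are brought to the same volume as the bench standard, so that responses can be compared directly.

```python
def florisil_check(bench, fraction1, fraction2):
    """Percent of each bench-standard component found in each fraction,
    and the percent not recovered from the column."""
    report = {}
    for compound, expected in bench.items():
        pct1 = 100.0 * fraction1.get(compound, 0.0) / expected
        pct2 = 100.0 * fraction2.get(compound, 0.0) / expected
        report[compound] = {
            "fraction1_pct": pct1,
            "fraction2_pct": pct2,
            "loss_pct": 100.0 - pct1 - pct2,
        }
    return report


# Illustrative peak responses (arbitrary units), not measured data.
bench_standard = {"aldrin": 100.0, "heptachlor epoxide": 100.0}
first_fraction = {"aldrin": 95.0, "heptachlor epoxide": 20.0}
second_fraction = {"heptachlor epoxide": 75.0}

batch_report = florisil_check(bench_standard, first_fraction, second_fraction)
```

A component that appears in both fractions, as heptachlor epoxide does in this made-up example, shows up as two nonzero percentages, which is the split the footnote to section C warns about.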
C-7
-------
APPENDIX D

Sampling Weights for the Rural Soils Network (RSN)
-------
0. NOTATION

1967 Conservation Needs Inventory (CNI)

National Soil Monitoring Program (NSMP)

Rural Soils Network (RSN)

Rural Soils Network Cropland Sample (RSN1)

Rural Soils Network Noncropland Sample (RSN2)

Let i = 1, . . . , 48 denote the States of the conterminous
United States.

Let j = 1, . . . , s(i) denote the counties of State i that are
not strictly metropolitan in character.

Let k = 1, . . . , t(i,j) denote the strata in county j of State i. 1/

Let ℓ = 1, . . . , U(i,j,k) denote the primary sampling units (PSU's),
typically 160-acre plots, in stratum k of county j in State i.

Let ℓ = 1, . . . , u(i,j,k) denote the sample PSU's in stratum k of
county j in State i.

[There are uncountably many secondary sampling units (SSU's), i.e.
possible sampling points, in each PSU, so it is not possible to index
the population of SSU's within any PSU.]

Let m = 1, . . . , v(i,j,k,ℓ) denote the actual SSU's selected by
spinning the sampling template once for PSU ℓ in stratum k of county j
in State i.

1/ Although townships or their equivalent are used to stratify the sample
within counties, the township, within township, and other levels of
stratification are treated herein as a single level without loss of
generality.

D-l
-------
Let V(i,j,k,ℓ) be the random variable representing the number of SSU's
selected by spinning the sampling template for PSU ℓ in stratum k.
Note that v(i,j,k,ℓ) is a realization of V(i,j,k,ℓ).

Let m1 = 1, . . . , v1(i,j,k,ℓ) denote the realized cropland SSU's
for PSU ℓ in stratum k of county j in State i.

Let m2 = 1, . . . , v2(i,j,k,ℓ) denote the realized noncropland SSU's
for PSU ℓ in stratum k of county j in State i.

Of course, v1(i,j,k,ℓ) + v2(i,j,k,ℓ) = v(i,j,k,ℓ).

1. PHASE ONE — THE CNI SAMPLE

1.1 CNI PSU Probability

Since u(i,j,k) PSU's are selected at random and without replacement
from the U(i,j,k) PSU's in stratum (i,j,k),

p(i,j,k) = Overall probability of selection into the CNI for each
PSU in stratum (i,j,k)

\[
p(i,j,k) = \frac{u(i,j,k)}{U(i,j,k)} . \tag{1}
\]

For the standard sampling procedure, in which one PSU was selected at
random from a stratum containing 48 PSU's,

\[
p(i,j,k) = \frac{u(i,j,k)}{U(i,j,k)} = \frac{1}{48} \approx 0.02 .
\]

1.2 Conditional Probability for SSU's in the CNI

Recall that m = 1, . . . , v(i,j,k,ℓ) indexes the CNI sample

points in PSU ℓ of stratum k. Also recall that there are infinitely

many such points available for sampling in each PSU. If the points are
D-2
-------
considered to have no dimensions and hence no area, any point picked at

random must have zero probability of being selected into the CNI sample.

This is because there are infinitely many mutually exclusive points.

However, a point with no dimensions cannot be assigned a land use

other than that of a small undefined physical area surrounding that

point. Thus, in fact, a small undefined area centered at each CNI

sampling point was sampled rather than a point, per se. Let us then

assume that each CNI sample point is effectively a sampling unit with

area a, where the area a does not depend upon PSU or stratum. A probabil-

ity density for sample selection can then be distributed over each PSU,

resulting in a positive probability for each SSU.

A reasonable simplification seems to be to assume that the proba-

bility density for selection is uniform over each PSU. This assumption

would imply, among other things, that there is no border effect. That

is, areas near the edge of the PSU are neither over- nor under-repre-

sented in the sample, both as selected and as implemented in the field.

In this case, if a single SSU were to be selected at random within a

sampled PSU, its conditional probability of selection would be a/A,

where A is the area of the PSU and a is the area of the SSU.

A random number V(i,j,k,ℓ) of SSU's were selected from
PSU(i,j,k,ℓ). Letting A(i,j,k,ℓ) denote the area of PSU(i,j,k,ℓ), the
conditional probability, given selection of PSU(i,j,k,ℓ), for the
selection of SSU(i,j,k,ℓ,m) is then

\[
\mathrm{Prob}[\mathrm{SSU}(i,j,k,\ell,m) \text{ is selected} \mid \mathrm{PSU}(i,j,k,\ell) \text{ is selected}]
= \frac{a}{A(i,j,k,\ell)} \, E[V(i,j,k,\ell) \mid \mathrm{PSU}(i,j,k,\ell) \text{ is selected}] . \tag{2}
\]
D-3
-------
The expected number of sample SSU's in a PSU, i.e. E[V] in (2), is
proportional to the area, A(i,j,k,ℓ), of the PSU (except for 640-acre
PSU's). The density of the sampling template for 640-acre PSU's,
adjusted to a common photograph scale, was one-fourth that for all other
PSU's. Hence, the proportionality constant for 640-acre PSU's is
one-fourth of that for all other PSU's; that is,

\[
E[V(i,j,k,\ell) \mid \mathrm{PSU}(i,j,k,\ell) \text{ is selected}] =
\begin{cases}
0.25\, c\, A(i,j,k,\ell) & \text{for 640-acre PSU's} \\
1.00\, c\, A(i,j,k,\ell) & \text{for all other PSU's}
\end{cases} \tag{3}
\]

Thus, from (2) and (3),

\[
\mathrm{Prob}[\mathrm{SSU}(i,j,k,\ell,m) \text{ is selected} \mid \mathrm{PSU}(i,j,k,\ell) \text{ is selected}] =
\begin{cases}
0.25\, c\, a & \text{for 640-acre PSU's} \\
1.00\, c\, a & \text{for all other PSU's}
\end{cases} \tag{4}
\]

That is, the conditional probability of selection of an SSU is a constant
that depends only upon size of the PSU.

1.3 CNI Sampling Weights

Combining the results of 1.1 and 1.2, we can determine the overall

probability of selection for the ultimate sampling units, the SSU's, for

the CNI sample. In particular, it follows from (1) and (4) that

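\[
\begin{aligned}
&\mathrm{Prob}[\mathrm{SSU}(i,j,k,\ell,m) \text{ is selected into the CNI sample}] \\
&\quad = \mathrm{Prob}[\mathrm{PSU}(i,j,k,\ell) \text{ is selected into the CNI sample}] \\
&\qquad \times \mathrm{Prob}[\mathrm{SSU}(i,j,k,\ell,m) \text{ is selected} \mid \mathrm{PSU}(i,j,k,\ell) \text{ is selected}] \\
&\quad = \begin{cases}
0.25\, a\, c\, p(i,j,k) & \text{for 640-acre PSU's} \\
1.00\, a\, c\, p(i,j,k) & \text{for all other PSU's}
\end{cases}
\end{aligned} \tag{5}
\]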

Thus, for estimation of means, a proper sampling weight for each SSU
record in the CNI sample is simply

\[
W =
\begin{cases}
\dfrac{4}{p(i,j,k)} & \text{for 640-acre PSU's} \\[2ex]
\dfrac{1}{p(i,j,k)} & \text{for all other PSU's}
\end{cases} \tag{6}
\]
The constant factor, ac, cancels in any estimation of means. Of course,

this weight reflects only the unequal probabilities of selection due to

the sampling design and can be further modified to reflect missing data,

failure to accurately locate sampling points, etc.
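
As an illustration of how a weight of this kind would be used, the sketch below takes the reciprocal of the selection probability in (5) with the constant factor ac dropped, and then forms a weighted mean. The function names and the residue values are hypothetical. The final check shows the point made above: rescaling every weight by the same constant leaves the estimated mean unchanged.

```python
def cni_weight(p, is_640_acre):
    """Reciprocal of the CNI selection probability, up to the constant a*c.

    p is the PSU selection probability u/U for the stratum.
    """
    relative_probability = (0.25 if is_640_acre else 1.0) * p
    return 1.0 / relative_probability


def weighted_mean(values, weights):
    """Weighted estimate of a mean from SSU-level observations."""
    return sum(w * y for y, w in zip(values, weights)) / sum(weights)


# Two illustrative SSU records from strata with p = 1/48.
weights = [cni_weight(1 / 48, False), cni_weight(1 / 48, True)]
residues = [1.0, 3.0]

estimate = weighted_mean(residues, weights)
rescaled = weighted_mean(residues, [7.5 * w for w in weights])
```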

1.4 Weighting the CNI to Estimate Total Land Area

It seems reasonable that if each SSU of the CNI is to be regarded

as having area equal to one (unit free), then the sampling weight to be

assigned to an SSU is
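\[
\begin{aligned}
WT(i,j,k,\ell,m)
&= \frac{E[\text{Area (in acres) represented by the SSU}]}{\mathrm{Prob}[\mathrm{PSU}(i,j,k,\ell)]} \\[1ex]
&= \frac{\text{Area of } \mathrm{PSU}(i,j,k,\ell) \,/\, E[V(i,j,k,\ell)]}{\mathrm{Prob}[\mathrm{PSU}(i,j,k,\ell)]} \\[1ex]
&= \begin{cases}
\dfrac{A(i,j,k,\ell)}{0.25\, c\, A(i,j,k,\ell)\, p(i,j,k)} & \text{for 640-acre PSU's} \\[2ex]
\dfrac{A(i,j,k,\ell)}{c\, A(i,j,k,\ell)\, p(i,j,k)} & \text{for all other PSU's}
\end{cases} \\[1ex]
&= \begin{cases}
\dfrac{1}{0.25\, c\, p(i,j,k)} & \text{for 640-acre PSU's} \\[2ex]
\dfrac{1}{c\, p(i,j,k)} & \text{for all other PSU's}
\end{cases}
\end{aligned} \tag{7}
\]

Of course, the proportionality constant, c, or equivalently,
E[V(i,j,k,ℓ)], would have to be explicitly determined, probably
empirically, to actually use (7) in estimation of total acreage. Although we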

will not need (7) explicitly, since we are only interested in estimating

means or rates, it is reassuring that the weights (6) and (7) are of the

same form.
D-5
-------
2. THE RSN SAMPLE

2.1 Preliminaries for the RSN

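The contribution of sampled CNI cropland PSU(i,j,k,ℓ) to the
cropland accumulation used by the USDA for selecting the RSN subsample
is the adjusted cropland ratio

\[
\frac{v_1(i,j,k,\ell)}{v(i,j,k,\ell)} \cdot \frac{0.02}{p(i,j,k)} . \tag{8}
\]

Thus, the total of the cropland accumulation used by the USDA in
State i is

\[
N_1(i) = \sum_{j=1}^{s(i)} \sum_{k=1}^{t(i,j)} \sum_{\ell=1}^{u(i,j,k)}
\frac{v_1(i,j,k,\ell)}{v(i,j,k,\ell)} \cdot \frac{0.02}{p(i,j,k)}
= 0.02 \sum_{j=1}^{s(i)} \sum_{k=1}^{t(i,j)} \frac{1}{p(i,j,k)}
\sum_{\ell=1}^{u(i,j,k)} \frac{v_1(i,j,k,\ell)}{v(i,j,k,\ell)} . \tag{9}
\]

Similarly, the total of the noncropland accumulation in State i is

\[
N_2(i) = 0.02 \sum_{j=1}^{s(i)} \sum_{k=1}^{t(i,j)} \frac{1}{p(i,j,k)}
\sum_{\ell=1}^{u(i,j,k)} \frac{v_2(i,j,k,\ell)}{v(i,j,k,\ell)} . \tag{10}
\]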

2.2 Estimation of Proportion of Cropland Acreage in
the Rural Area of State i.

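The estimate used was

\[
\frac{N_1(i)}{N_1(i) + N_2(i)}
= \frac{\displaystyle \sum_{j=1}^{s(i)} \sum_{k=1}^{t(i,j)} \frac{1}{p(i,j,k)}
\sum_{\ell=1}^{u(i,j,k)} \frac{v_1(i,j,k,\ell)}{v(i,j,k,\ell)}}
{\displaystyle \sum_{j=1}^{s(i)} \sum_{k=1}^{t(i,j)} \frac{1}{p(i,j,k)}
\sum_{\ell=1}^{u(i,j,k)} \frac{v_1(i,j,k,\ell) + v_2(i,j,k,\ell)}{v(i,j,k,\ell)}}
= \frac{\displaystyle \sum_{j=1}^{s(i)} \sum_{k=1}^{t(i,j)} \frac{1}{p(i,j,k)}
\sum_{\ell=1}^{u(i,j,k)} \frac{v_1(i,j,k,\ell)}{v(i,j,k,\ell)}}
{\displaystyle \sum_{j=1}^{s(i)} \sum_{k=1}^{t(i,j)} U(i,j,k)} \tag{11}
\]

= (Estimated total of the cropland proportions for all
PSU's in State i) ÷ (Total number of PSU's in State i)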
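
A small numerical sketch may help in checking the estimate. It assumes, as the accumulations (9) and (10) are read here, that each sample PSU contributes (v1/v)(0.02/p) to the cropland accumulation and (v2/v)(0.02/p) to the noncropland accumulation, and that the estimate is the ratio N1(i)/(N1(i) + N2(i)). The record layout and the counts are hypothetical.

```python
def accumulations(psus, base=0.02):
    """Cropland and noncropland accumulations for one State.

    Each record is (p, v1, v2): the PSU selection probability and the
    realized numbers of cropland and noncropland SSU's in the PSU.
    """
    n1 = sum(base / p * v1 / (v1 + v2) for p, v1, v2 in psus)
    n2 = sum(base / p * v2 / (v1 + v2) for p, v1, v2 in psus)
    return n1, n2


def cropland_proportion(psus):
    """Estimated proportion of cropland acreage in the rural area of the State."""
    n1, n2 = accumulations(psus)
    return n1 / (n1 + n2)


# Two illustrative sample PSU's, each the single draw from a 48-PSU stratum.
example_psus = [(1 / 48, 3, 1), (1 / 48, 0, 4)]
n1_example, n2_example = accumulations(example_psus)
proportion = cropland_proportion(example_psus)
```

In this example the weighted cropland proportions total 48 x 0.75 = 36 and the two strata contain 96 PSU's, so the ratio is 36/96 = 0.375, in agreement with the verbal description of the estimate.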

D-6
-------
2.3 More Preliminaries for the RSN
Let n1(i) denote the number of 10-acre RSN1 sites to be selected in
State i. Recall that n1(i) is chosen so that 0.025% of the cropland
acreage in State i is sampled.

The procedure for selecting n1(i) starting points for the n1(i)
RSN1 sample sites was:

1) Select a random number from the interval (0, w1(i)), where
w1(i) = N1(i)/n1(i). Call this random number q1(i).

2) Select as an RSN1 starting point the first CNI cropland SSU
whose contribution to the cropland accumulation causes the
accumulation to equal or exceed q1(i).

3) Repeat step (2) with q1(i) replaced by q1(i) + w1(i),
q1(i) + 2 w1(i), . . . , q1(i) + [n1(i) - 1] w1(i).

It should be noted that an RSN1 starting point did not uniquely
determine an RSN1 sample site. In particular, an adjacent cropland SSU had
to be found, and the RSN1 sample site was centered about these two
cropland SSU's from the CNI. Moreover, substitution procedures were
employed when an adjacent cropland SSU did not exist. In addition, a
substitute RSN1 site was selected if the selected site either could not
be surveyed for some reason or could no longer be considered a cropland
site.

Let us consider an alternate, and perhaps more useful, representation
of identically the same procedure for selecting the n1(i) starting
points for the RSN1 sample. The contribution of each cropland SSU in
PSU(i,j,k,ℓ) to the cropland accumulation is

\[
\frac{1}{v(i,j,k,\ell)} \cdot \frac{0.02}{p(i,j,k)} . \tag{12}
\]
D-7
-------
Let λ = 1, . . . , Λ(i) denote the SSU's selected into the CNI sample in
State i. The cropland accumulation may then be represented as

\[
N_1(i) = \sum_{\lambda=1}^{\Lambda(i)} \pi(\lambda) , \tag{13}
\]

where π(λ) is given by (12) for cropland SSU's and is zero for
noncropland SSU's. Thus, the cropland accumulation for State i may be
thought of as partitioned into Λ(i) zones, where each zone has width π(λ).
PSU's that are entirely noncropland will contribute a null zone with
zero width to the cropland accumulation. The RSN1 procedure for selecting
a CNI SSU as a starting point for a 10-acre RSN1 site may then be
illustrated as:

         q1(i)                                  q1(i) + γ w1(i)
           |                                          |
   | π(1) | π(2) | π(3) | π(4) | π(5) | π(6) | π(7) | . . . | π(λ-1) | π(λ) | π(λ+1) | . . .
   0

A cropland SSU is selected as a starting point for locating an RSN1
site, if the sequence number

\[
q_1(i) + \gamma\, w_1(i) \quad \text{for } \gamma = 0, 1, \ldots, n_1(i) - 1 \tag{14}
\]

hits the zone representing the SSU.
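
The zone representation amounts to systematic selection with a random start along the cumulative cropland accumulation. The sketch below is one way to express it under that reading. The function name, the argument names, and the example widths are hypothetical, and the optional argument q exists only so that the example can be reproduced without a random draw.

```python
import bisect
import random
from itertools import accumulate


def select_starting_points(zone_widths, n_sites, q=None):
    """Return indices of the zones hit by q, q + w, ..., q + (n_sites - 1) w.

    zone_widths holds one width per CNI SSU in list order (zero for
    noncropland SSU's); w is the total accumulation divided by n_sites.
    """
    cumulative = list(accumulate(zone_widths))
    interval = cumulative[-1] / n_sites
    if q is None:
        q = random.uniform(0.0, interval)  # random start in (0, w)
    hits = []
    for gamma in range(n_sites):
        sequence_number = q + gamma * interval
        # First zone whose cumulative total equals or exceeds the sequence number.
        index = bisect.bisect_left(cumulative, sequence_number)
        hits.append(min(index, len(cumulative) - 1))  # guard against rounding
    return hits


# Illustrative widths; the second SSU is noncropland and has zero width.
widths = [0.5, 0.0, 1.0, 0.5, 2.0]
hits_a = select_starting_points(widths, n_sites=2, q=0.25)
hits_b = select_starting_points(widths, n_sites=2, q=0.75)
```

With these widths the interval between sequence numbers is 2.0, so a start of 0.25 hits the first and fifth zones and a start of 0.75 hits the third and fifth. The zero-width zone is never selected, which mirrors the statement that noncropland contributes a null zone.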

2.4 Conditional probabilities for SSU's in the RSN1 given the CNI

The SSU(i,j,k,ℓ,m1) is selected as an RSN1 starting point only if
the single random number q1(i) results in a sequence number given by
(14) that hits the zone representing SSU(i,j,k,ℓ,m1). The chance of
multiple hits on this zone is almost identically zero since the width of
the zone representing a cropland SSU, given by (12), is very much
smaller than the distance w1(i) = N1(i)/n1(i) between cropland sequence
numbers. 2/ Thus,

2/ Multiple hits within the same PSU have occurred in the RSN sample
occasionally, however, due to the inadvertent repetition of some PSU
records in the State lists used to select the RSN subsamples.

D-8
-------
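\[
\begin{aligned}
&\mathrm{Prob}(\mathrm{SSU}(i,j,k,\ell,m_1) \text{ will be selected as a starting point for an RSN1 site} \mid
\mathrm{SSU}(i,j,k,\ell,m_1) \text{ is in the CNI sample}) \\[1ex]
&\quad = \frac{\text{Size of the zone representing } \mathrm{SSU}(i,j,k,\ell,m_1)}{w_1(i)}
= \frac{\dfrac{1}{v(i,j,k,\ell)} \cdot \dfrac{0.02}{p(i,j,k)}}{N_1(i)/n_1(i)}
= \frac{0.02\, n_1(i)}{v(i,j,k,\ell)\, p(i,j,k)\, N_1(i)}
\end{aligned} \tag{15}
\]

from (12) and the fact that q1(i) is a random number from the interval
(0, w1(i)).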

2.5 RSN sampling weights

It will be recalled that the selection of a starting point for
locating a 10-acre RSN1 site did not uniquely specify the site. There
was a procedure for determining the sampling site based on any cropland
starting point, however, as long as an appropriate site could be found
within the PSU containing the starting point. To the extent that this
procedure was strictly applied, most RSN1 sites were uniquely determined.
However, it is apparent from considering several maps of RSN1 sites that
the specified procedure was only adhered to loosely.

It should also be noted that the procedure for determining an RSN1
site based upon a cropland starting point was that the starting point
not be included in the resulting sample site if the starting point was
an isolated cropland point. Thus, there was an intentional bias away
from isolated cropland SSU's in the RSN1.

If the non-uniqueness of the RSN1 site determined by selection of a
starting point, and the bias away from isolated cropland SSU's, are
ignored, we obtain from (5) and (15),

Prob (the RSN1 site resulting from selection of SSU(i,j,k,ℓ,m1) as a
starting point will be selected into the RSN1 sample)
   = Prob (SSU(i,j,k,ℓ,m1) will be selected as a starting point for
     an RSN1 site | SSU(i,j,k,ℓ,m1) is in the CNI sample)
     × Prob (SSU(i,j,k,ℓ,m1) is selected into the CNI sample)

D-9
-------
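$$= \frac{n_1(i)}{N_1(i)} \cdot \frac{0.02}{v(i,j,k,\ell)\, p(i,j,k)} \times \begin{cases} 0.25\, a\, c\, p(i,j,k) & \text{for 640-acre PSU's} \\ a\, c\, p(i,j,k) & \text{for all other PSU's} \end{cases}$$

$$= \begin{cases} \dfrac{0.005\, a\, c\, n_1(i)}{v(i,j,k,\ell)\, N_1(i)} & \text{for 640-acre PSU's} \\[2ex] \dfrac{0.02\, a\, c\, n_1(i)}{v(i,j,k,\ell)\, N_1(i)} & \text{for all other PSU's} \end{cases} \qquad (16)$$
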
Thus, if we are willing to accept the simplifying assumptions at the
beginning of this section, a sampling weight for estimation of means for
the RSN1 site resulting from selection of SSU(i,j,k,ℓ,m1) as a starting
point is

$$\begin{cases} 4\, v(i,j,k,\ell)\, \dfrac{N_1(i)}{n_1(i)} & \text{for 640-acre PSU's} \\[2ex] v(i,j,k,\ell)\, \dfrac{N_1(i)}{n_1(i)} & \text{for all other PSU's.} \end{cases} \qquad (17)$$

It should be noted that the weight given by (17) will be approximately
the same for all RSN1 sites within those States where only one PSU size
was used. This is because v(i,j,k,ℓ) will be very nearly the same for
all PSU's within such a State.

Of course, a sampling weight for estimation of means for the RSN2
site resulting from selection of SSU(i,j,k,ℓ,m2) as a starting point is

$$W = \begin{cases} 4\, v(i,j,k,\ell)\, \dfrac{N_2(i)}{n_2(i)} & \text{for 640-acre PSU's} \\[2ex] v(i,j,k,\ell)\, \dfrac{N_2(i)}{n_2(i)} & \text{for all other PSU's.} \end{cases} \qquad (18)$$
D-10
-------
3. Comments

A strict accounting of the bias away from isolated cropland SSU's
in the RSN1 would be quite difficult. It would be necessary to determine,
for each RSN1 site, the number of SSU's that would have resulted in
selection of the site had they been chosen as a starting point. This
number could theoretically be any positive integer. A value of one would
presumably be predominant, giving exact agreement with (17) and (18).
However, values of two and three would surely occur also.

Consider the conditional probabilities for the SSU's in the RSN1,
given the CNI, as considered in Section 2.4. In particular, the sum
over all SSU's sampled by the CNI of the probability that SSU(i,j,k,ℓ,m1)
will be selected as a starting point for an RSN1 site is as follows from
(15):

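$$\sum_{j=1}^{s(i)} \sum_{k=1}^{t(i,j)} \sum_{\ell=1}^{u(i,j,k)} \sum_{m_1=1}^{v_1(i,j,k,\ell)\,*} \frac{n_1(i)}{N_1(i)} \cdot \frac{0.02}{v(i,j,k,\ell)\, p(i,j,k)}$$

$$= \frac{n_1(i)}{N_1(i)} \sum_{j=1}^{s(i)} \sum_{k=1}^{t(i,j)} \sum_{\ell=1}^{u(i,j,k)} \frac{0.02\, v_1(i,j,k,\ell)}{v(i,j,k,\ell)\, p(i,j,k)}$$

$$= \frac{0.02\, n_1(i)}{N_1(i)} \sum_{j=1}^{s(i)} \sum_{k=1}^{t(i,j)} \frac{1}{p(i,j,k)} \sum_{\ell=1}^{u(i,j,k)} \frac{v_1(i,j,k,\ell)}{v(i,j,k,\ell)}$$

$$= n_1(i)$$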

from (9). Thus, the sum of the SSU probabilities for the RSN1,
conditional on the CNI sample being regarded as fixed, is the RSN1 sample
size for State i, namely n1(i). This result lends additional credence
to the correctness of the sampling weights as described by (17).
* Summation is over the sample cropland points of the CNI sample because
these points constitute the population with regard to the conditional
RSN probabilities.
D-ll
-------
4. Approximation to the RSN Sampling Weights

Exact implementation of the sampling weights given by (17) and (18)

is not a simple task. The sample sizes n1(i) and n2(i) for the cropland

and noncropland samples are readily available (See Table 1.3). However,

the State accumulations of the adjusted cropland and noncropland ratios,

N1(i) and N2(i), are only available from the hard-copy computer records

of the RSN sample selection. These records are not entirely reliable,

since there is no guarantee that the copy available was the final copy

from which the sample was selected. Dummy records were added to obtain

coverage of federal croplands for the noncropland sample, and the data

set was otherwise edited before sample selection. The number of sampling

points, v(i,j,k,ℓ), is again available from the hard-copy computer

records. However, it would be a monumental task to go through the

hard-copy computer records to obtain v(i,j,k,ℓ) for each RSN sampling

site. A perusal of these sampling records reveals that individual CNI

sites were sometimes entered more than once, doubling the probability

that these sites would enter the RSN sample.

If the sampling design had been implemented exactly as described in

the text for a particular State and all PSU's were 160 acres, partial

PSU's would still occur around the boundaries of counties and other

large scale geographic strata, e.g., irrigated areas. These partial

PSU's would be "nominal" 160-acre PSU's, but would receive fewer than

the usual number of sampling points.

The full 160-acre PSU's each receive approximately 36 sampling

points. The random variation in v(i,j,k,ℓ) may be small for the full

PSU's, and a good approximation to the sampling weights given by (17)

and (18) is achieved by using the mean number of points assigned in

place of v(i,j,k,ℓ).

D-12
-------
Identification of the "nominal" 160-acre PSU's is not a simple

task. It requires close examination of the CNI sampling maps, at least.

It should be noted also that actual PSU's were, in practice, sometimes

larger than their "nominal" size. These larger PSU's occurred mostly in

States that used 40-acre PSU's in "irrigated" strata, where the "nominal"

40-acre PSU's were sometimes larger than 40 acres around the stratum

boundaries. Due to the problem of identifying PSU's considerably larger

or smaller than their "nominal" size, no adjustment in the sampling

weights (17) and (18) is being proposed for RSN sites occurring in these

PSU's.

The sampling weights given by (17) and (18) are only appropriate if

the sampling design is implemented as described in the text. Examination

of the numbers of points assigned to CNI sites reveals, however, that

this was not the case. The assignment of sampling points within PSU's

was done at local USDA offices, and the design sampling protocol was not

consistently followed. For example, nearly all sites in Nevada received

approximately 36 sampling points, whether the PSU size was 40 acres or

160 acres. Moreover, it appears that the sampling template may not have

been spun for Nevada sites since most received exactly 36 sampling

points. Also, the scales of the sampling template and the aerial photo-

graph were often not properly matched, resulting in consistently more or

fewer sampling points than expected from the design protocol. For

example, many 160-acre PSU's in New Mexico received approximately 18

sampling points, rather than 36 sampling points. Thus, a single sampling

protocol was not consistently applied throughout the United States. It

is probably not possible to determine exactly what protocol was used for

each sampling site. The consequences of these variations in sampling

protocol will presently be investigated.

D-13
-------
The investigation of the effects of variations in the CNI sampling
procedure upon the RSN sampling weights will be aided by considering the
sampling weight given by (17) as a product. In particular, the sampling
weight (17) may be written as
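$$\left[ v(i,j,k,\ell)\, p(i,j,k)\, \frac{N_1(i)}{n_1(i)} \right] \cdot \begin{cases} \dfrac{4}{p(i,j,k)} & \text{for 640-acre PSU's} \\[2ex] \dfrac{1}{p(i,j,k)} & \text{for all other PSU's} \end{cases}$$

$$= \begin{cases} v(i,j,k,\ell)\, \dfrac{N_1(i)}{n_1(i)} \cdot 4 & \text{for 640-acre PSU's} \\[2ex] v(i,j,k,\ell)\, \dfrac{N_1(i)}{n_1(i)} \cdot 1 & \text{for all other PSU's} \end{cases} \qquad (19)$$
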

where the first factor is the conditional RSN weight and the second
factor is the CNI weighting factor. The CNI weighting factor for 640-
acre PSU's is four times that for all other PSU's because each such
point represents four times as much land area as points in PSU's of
other sizes.
A specific case may help to clarify the effects of variations in
the CNI sampling procedure upon the RSN sampling weights. Once again,
consider the case of New Mexico where many 160-acre PSU's were sampled
at the design rate of about 36 sampling points per PSU, while many other
160-acre PSU's were sampled at the lower rate of about 18 points per
PSU. For those PSU's sampled at the proper intensity, the appropriate
sampling weight is approximately
$$36\, \frac{N_1(i)}{n_1(i)} .$$
The sampling weight formula given by (19) would have to be modified for
the PSU's receiving only about 18 sampling points, since each point
represents twice as much land area. The CNI factor of the sampling
weight is doubled resulting in an approximate RSN sampling weight of

D-14
-------
$$18\, \frac{N_1(i)}{n_1(i)} \cdot 2 = 36\, \frac{N_1(i)}{n_1(i)} ,$$
which is exactly the same as in the first case. When half the usual
number of sampling points was assigned, the conditional RSN weight was
halved. However, each sampling point then represented twice as much land
area, doubling the CNI weighting factor. In terms of probability, the
conditional RSN probability was doubled for each point, since it
contributed twice as much to the accumulation N1(i), but the
unconditional CNI probability was halved, since half as many sampling
points were being assigned within the PSU. Thus, the procedural
variations in the CNI sampling protocol result in no change in the
appropriate mean weight for the RSN. A single mean sampling weight is
then appropriate for all 160-acre PSU's. This sampling weight is

$$\bar{v}_{160}\, \frac{N_1(i)}{n_1(i)} ,$$

where $\bar{v}_{160}$ is the average number of sampling points per PSU
when the design sampling procedure described in the text is applied.
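
The New Mexico argument can be checked numerically. The sketch below writes the weight for a PSU other than 640 acres as the product of the conditional RSN weight and the CNI weighting factor, following the factored form (19). The function and variable names are illustrative, and the New Mexico values of N1(i) and n1(i) are those listed in Table D-6 (where n1(i) is a computed sample size); their use here is an assumption made only to have concrete numbers.

```python
import math


def factored_weight(points_assigned, design_points, accumulation, n_sites):
    """Weight for a PSU other than 640 acres, as a product of two factors."""
    # Conditional RSN weight: v * N1(i) / n1(i).
    conditional_rsn = points_assigned * accumulation / n_sites
    # CNI weighting factor: land area represented by each point,
    # relative to the design rate (1 at 36 points, 2 at 18 points).
    cni_factor = design_points / points_assigned
    return conditional_rsn * cni_factor


N1_NM, n1_NM = 204.96760, 40          # New Mexico entries in Table D-6

full_rate = factored_weight(36, 36, N1_NM, n1_NM)   # PSU sampled at design rate
half_rate = factored_weight(18, 36, N1_NM, n1_NM)   # PSU given about 18 points
same_weight = math.isclose(full_rate, half_rate)
```

The two factors move in opposite directions by the same proportion, so the product does not depend on how many points were actually assigned.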

This weight fails to reflect the random variation in the number of

sampling points, v, assigned to a PSU within any given sampling protocol.

However, it is a proper mean sampling weight regardless of the sampling

protocol. The alternative is not feasible, requiring precise knowledge

of the sampling procedure used to assign the sampling points as well as

the number of points assigned for each PSU containing an RSN sample

site. Since this detailed information is not available, the mean sampl-

ing weights appear to be most appropriate.

The following mean weights are suggested for the RSN cropland

sample:
D-15
-------
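$$\begin{aligned}
\text{40-acre PSU's:}\quad  & \bar{v}_{40}\, \frac{N_1(i)}{n_1(i)} \\
\text{100-acre PSU's:}\quad & \bar{v}_{100}\, \frac{N_1(i)}{n_1(i)} \\
\text{160-acre PSU's:}\quad & \bar{v}_{160}\, \frac{N_1(i)}{n_1(i)} \\
\text{640-acre PSU's:}\quad & 4\, \bar{v}_{640}\, \frac{N_1(i)}{n_1(i)}
\end{aligned}$$

where $\bar{v}_A$ is the mean number of sampling points assigned to PSU's
of area A under the sampling protocol specified by the design. It should
be noted, however, that this protocol results in $\bar{v}_A$ being
directly proportional to the size, A, of the PSU, except for 640-acre
PSU's where $\bar{v}_{640}$ is identical to $\bar{v}_{160}$. Thus, the
above mean RSN sampling weights may be expressed as follows:

$$\begin{aligned}
\text{40-acre PSU's:}\quad  & 4\, \bar{v}_{10}\, \frac{N_1(i)}{n_1(i)} \\
\text{100-acre PSU's:}\quad & 10\, \bar{v}_{10}\, \frac{N_1(i)}{n_1(i)} \\
\text{160-acre PSU's:}\quad & 16\, \bar{v}_{10}\, \frac{N_1(i)}{n_1(i)} \\
\text{640-acre PSU's:}\quad & 64\, \bar{v}_{10}\, \frac{N_1(i)}{n_1(i)}
\end{aligned}$$

where $\bar{v}_{10}$ is the mean number of sampling points per 10 acres
assigned to all but 640-acre PSU's under the sampling protocol specified
by the design.

Since only relative sampling weights are required for estimation of
means, the constant factor, $\bar{v}_{10}$, in the above sampling weights
may be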
D-16
-------
cancelled. Moreover, the cropland and noncropland samples of the RSN

can be regarded as two strata in the RSN sample of the rural areas of

the conterminous United States. As seen before, the derivation of the

noncropland sampling weights parallels that for the cropland sampling

weights in all respects. The ratio N1(i)/n1(i) for the cropland sampling

weights in state i is replaced by N2(i)/n2(i) for the noncropland sample.

Otherwise, the conditional RSN factor of the sampling weights and the

unconditional CNI factor remain unchanged. Thus, the final recommended

sampling weights for the RSN are as given in Table D-l. The cropland

sampling rate was 0.025 percent of the cropland acreage within each

state, and the noncropland sampling rate was 0.0025 percent of the

noncropland acreage, which is reflected by N2(i)/n2(i) being approxi-

mately 10 times as large as N1(i)/n1(i).

Table D-1: Recommended RSN Sampling Weights

PSU Size       Cropland            Noncropland
 40 acres       4 N1(i)/n1(i)       4 N2(i)/n2(i)
100 acres      10 N1(i)/n1(i)      10 N2(i)/n2(i)
160 acres      16 N1(i)/n1(i)      16 N2(i)/n2(i)
640 acres      64 N1(i)/n1(i)      64 N2(i)/n2(i)

Notation:  n1(i) = Number of cropland sample sites in state i
           n2(i) = Number of noncropland sample sites in state i
           N1(i) = Total "cropland accumulation" for state i
           N2(i) = Total "noncropland accumulation" for state i
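
As an illustration of how Table D-1 might be applied to a sample record, the following Python sketch looks up the multiplier for the PSU size and applies the State ratio. The function name and dictionary are hypothetical; the Alabama figures are taken from Table D-6, where the sample sizes shown are computed values, so the result is an example rather than the weight stored on the analysis file.

```python
# Multipliers from Table D-1, keyed by PSU size in acres.
PSU_MULTIPLIER = {40: 4, 100: 10, 160: 16, 640: 64}


def table_d1_weight(psu_acres, accumulation, n_sites):
    """Relative sampling weight: multiplier times N(i)/n(i) for the stratum."""
    if psu_acres not in PSU_MULTIPLIER:
        raise ValueError(f"no Table D-1 entry for {psu_acres}-acre PSU's")
    return PSU_MULTIPLIER[psu_acres] * accumulation / n_sites


# Alabama uses only 160-acre PSU's (Table D-2); values from Table D-6.
alabama_cropland = table_d1_weight(160, 467.42123, 92)
alabama_noncropland = table_d1_weight(160, 3619.41845, 72)
```

The noncropland weight comes out roughly ten times the cropland weight, in line with the tenfold difference in sampling rates noted above.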

It should be emphasized that the sampling weights shown in Table

D-l reflect only the mean differences in the portion of the selection

probabilities of the RSN sites that depend upon the size of the PSU. It

has been argued that the selection probabilities would be fairly constant

for a given size of PSU, since the total number of CNI sampling points

would be fairly constant. Undocumented variations in the CNI sampling
D-17
-------
protocol make it virtually impossible to quantify the smaller variations

in selection probabilities for RSN sample sites within the group of

PSU's of a given size. There are many other factors that may be reflect-

ed in sampling weights, but are presently ignored. Some of these factors

are:

1) Duplicate entry of some CNI sites in the list from which the
RSN sample was selected, doubling the chance of selection for
all potential RSN sites within such PSU's.

2) CNI sites missed when the RSN sample was selected.

3) Inclusion of some CNI sites that fell outside the partial PSU
being sampled.

4) Loss of some CNI site maps.

5) PSU's substantially over or under their "nominal" size.

6) Border effects, or PSU size effects, on the number of sampling
points assigned within a PSU.

7) Random variation in the total number of CNI sampling points
assigned to a PSU.

8) Failure to accurately locate the selected RSN sites, CNI
sites, and/or CNI sampling points in the field.

9) Uncertainty associated with the values found for the cropland
and the noncropland accumulations for each state.

10) The existence of multiple CNI sampling points that would all
lead to selection of the same RSN site.

11) The use of substitute RSN sites.

5. Implementation of the RSN Sampling Weights

Implementation of the approximate RSN sampling weights shown in

Table D-l required that information be gathered that was not available

on the data records. The size of the PSU from the 1967 CNI sample into

which each RSN site fell was obtained from the Statistical Laboratory at

Iowa State University. These findings are shown in Tables D-2 through

D-5. The State accumulations of the adjusted cropland and noncropland
D-18
-------
Table D-2: States With Only 160-Acre PSU's

State Code   State Name        State Code   State Name
01           Alabama           28           Mississippi
06           California        29           Missouri
08           Colorado          30           Montana
12           Florida           37           North Carolina
13           Georgia           38           North Dakota
16           Idaho             39           Ohio
17           Illinois          40           Oklahoma
18           Indiana           41           Oregon
19           Iowa              45           South Carolina
20           Kansas            47           Tennessee
21           Kentucky          48           Texas
22           Louisiana         53           Washington
26           Michigan          55           Wisconsin
27           Minnesota

Source: Statistical Laboratory, Iowa State University

Table D-3: States With Only 100-Acre PSU's

State Code   State Name        State Code   State Name
09           Connecticut       36           New York
10           Delaware          42           Pennsylvania
24           Maryland          44           Rhode Island
25           Massachusetts     50           Vermont
33           New Hampshire     51           Virginia
34           New Jersey        54           West Virginia

Source: Statistical Laboratory, Iowa State University

Table D-4: States With Constant PSU Size Within Counties

State Code   State Name     PSU Size    County Codes
05           Arkansas       160 acres   9-15,23,39,49,53,59,61,89,
                                        99,103,109,129,133-137
                             40 acres   All others

46           South Dakota   640 acres   7,19,31,33,41,47,55,63,71,
                                        75,81,85,93,95,103,105,113,
                                        117,121,131,137
                            160 acres   All others

Source: Statistical Laboratory, Iowa State University
D-19
-------
Table D-5: States With Varying PSU Size Within Counties*

State Code   State Name    PSU Size    RSN Site Numbers 1/
23           Maine         400 acres   7,27,29,32-34,36-39,48,61,67
                           100 acres   All others
04           Arizona       160 acres   1,3,10-55,107,108,160,163
                            40 acres   All others
31           Nebraska      640 acres   67,68,70,180,194,195,243,
                                       246-248,321-324,434,448,449,
                                       and all sites in counties:
                                       5,9,31,41,45,63,73,75,91,
                                       117,161,165,171
                           160 acres   All others
32           Nevada        160 acres   96,143
                            40 acres   All others
35           New Mexico    640 acres   65,67,179
                            40 acres   1,4,59,62,64,117,120,123,125,
                                       177,183,184
                           160 acres   All others
49           Utah          640 acres   12,51,91,97
                           160 acres   2,3,9,45,46,48,52,54,56,90,
                                       92,95,134,135,139,141
                            40 acres   All others
56           Wyoming       640 acres   4-6,9,15,17-20,22-25,30-33,
                                       35-43,46,48-54,57,60,61,66,
                                       71,112,165,167,168,174,176
                           160 acres   All others

1/ Only sites for which data was collected were classified. Completion
of this table for all RSN sample sites in these states would be very
time consuming.

* Source: CNI site numbers corresponding to the RSN site numbers were
obtained from the EPA Field Studies Branch, Washington, D.C. The PSU
size for each of these CNI sites was obtained from the Statistical
Laboratory at Iowa State University.
D-20
-------
ratios, N1(i) and N2(i), were obtained from the hard-copy computer

records of the RSN sample selection. The information obtained is shown

in Table D-6. The information in Tables D-2 through D-6 was then used

for the sampling weight computations shown in Table D-l for each RSN

sample record, with the exceptions noted below.

As shown in Table D-5, the RSN sample sites in the State of Maine

fell in PSU's of two sizes--100 acres and 400 acres. Actually, Maine

had a few 200-acre PSU's, but none of these were in the RSN sample. It

appears, however, from the RSN sampling documents preserved by the EPA

that each 400-acre PSU was treated as four 100-acre PSU's when the RSN

sample was selected. The effect of this treatment of 200-acre and

400-acre PSU's in Maine can be seen by considering the factored form of

the RSN sampling weight given by (19). It appears from the number of

CNI sampling points assigned to the 200-acre and 400-acre PSU's that

they were sampled at the same rate as all other PSU's, except for the

640-acre PSU's. Thus, the unconditional CNI factor in the sampling

weight (19) is one. The conditional RSN factor is the same, on the

average, as that for 100-acre PSU's, since the total points, v, for the

100-acre portion of the 400-acre PSU is the same, on the average, as that

for 100-acre PSU's. Thus, the mean sampling weight was computed for all

sites in Maine as shown for 100-acre PSU's in Table D-l.

The State accumulations of cropland and noncropland ratios, N1(i)
and N2(i), shown in Table D-6 were checked for logical consistency.
This check was felt to be necessary since these values were based upon
hard-copy computer output from the RSN sample selection. 3/ This hard-
3/ Only a hand written copy could be found for Maine.
D-21
-------
Table D-6: State Accumulations* of Cropland Ratios, N1(i), and
Noncropland Ratios, N2(i), Together with Computed Sample Sizes and Total Land Area**

State                                  Computed                  Computed   Total Land Area
Code   State Name       N1(i)          n1(i)      N2(i)          n2(i)      (1000's of acres)
01     Alabama           467.42123      92         3619.41845     72         32,597
04     Arizona           179.53666      36         9063.14655    176         72,680
05     Arkansas         3656.18536     216 1/     10472.11792     64         33,468
06     California       1284.14536     268        10698.46741    224        100,076
08     Colorado         1256.23608     240         7390.47656    140         66,486
09     Connecticut        58.31892       8          601.65381      8          3,127
10     Delaware          120.95724      12          154.04031      4          1,266
12     Florida           361.80749      72         3980.34901     80         34,721
13     Georgia           587.92669     120         4066.30930     80         37,263
16     Idaho             655.14071     132         5990.26966    120         52,933
17     Illinois         3050.59595     568         1747.62573     32         35,766
18     Indiana          1616.49756     312         1398.49561     28         23,132
19     Iowa             3160.09155     608         1515.03809     28         35,839
20     Kansas           3426.67927     684         3131.41689     64         52,425
21     Kentucky          718.47949     124         2969.72778     52         25,511
22     Louisiana         546.63631     108         3060.22173     60         28,596
23     Maine             228.28275      32         3735.32919     48         19,848
24     Maryland          423.85901      52          868.16471     12          6,319
25     Massachusetts      56.72992       8          949.29687     12          5,033
26     Michigan         1158.50806     220         3655.05437     68         36,515
27     Minnesota        2557.73877     488         4150.38281     80         51,201
28     Mississippi       627.86577     124         3151.17520     64         30,250
29     Missouri         1702.23242     328         4035.95068     76         44,235
-------
Table D-6: State Accumulations* of Cropland Ratios, N1(i), and
Noncropland Ratios, N2(i), Together with Computed Sample Sizes and Total Land Area**
(continued)

State                                  Computed                  Computed   Total Land Area
Code   State Name       N1(i)          n1(i)      N2(i)          n2(i)      (1000's of acres)
30     Montana          1819.92407     340        10579.96484    200         93,098
31     Nebraska           38.16203      40 1/      1145.83691    120 1/      49,021
32     Nevada             66.32464      12         8475.24410    176         70,264
33     New Hampshire      57.27583       8         1047.69238     12          5,769
34     New Jersey        157.60745      20          811.71170     12          4,810
35     New Mexico        204.96760      40         9663.70741    192         77,688
36     New York         1212.80716     152         4933.72984     60         30,670
37     North Carolina    694.07227     124         3684.40039     68         31,331
38     North Dakota     3396.55322     636         2544.91626     48         44,442
39     Ohio             1390.60275     276         1889.30123     36         26,206
40     Oklahoma         1310.92419     260         4200.03636     84         43,819
41     Oregon            774.27002     152         7003.43794    140         61,587
42     Pennsylvania      806.77905     152 1/      3020.83643     56         28,804
44     Rhode Island        7.95469       4          128.59467      4            676
45     South Carolina    378.91528      68         2251.41602     40         19,338
46     South Dakota     2268.04346     420 1/      4287.50000     80         48,612
47     Tennessee         617.46338     112         3029.04150     56         26,444
48     Texas            3732.70468     744        17409.54254    344        168,001
49     Utah              250.68592      48         6318.30648    128         52,722
50     Vermont           142.21658      20         1017.74146     12          5,937
51     Virginia          742.42676      84         4225.02734     56         25,458
53     Washington        879.95784     180         4461.17404     88 1/      42,616
54     West Virginia     200.94192      24         2878.29848     36         15,402
55     Wisconsin        1425.55981     272         3141.84985     60         35,013
56     Wyoming           365.49951      68         7940.41016    148         62,306

1/ These values differ from the actual sample sizes in Table 1.3.

* Source: Hard copy computer records of the RSN sampling maintained by the EPA Field Studies Branch,
Washington, D.C.

** Source: Basic Statistics: National Inventory of Soil and Water Conservation Needs, 1967.
-------
copy record was believed to be the computer record of the final sample
selection for each State, but this was not verified. The check made was
to compute, as described in Section 1.2.2, the cropland and noncropland
sample sizes, $\hat{n}_1(i)$ and $\hat{n}_2(i)$, from the accumulations,
N1(i) and N2(i), and the total land area of each State as shown in
Table D-6. The computed sample sizes, $\hat{n}_1(i)$ and $\hat{n}_2(i)$,
differ from the actual sample sizes shown in Table 1.3 by no more than
four for all States except Arkansas and Nebraska. Small differences
between the computed and actual sample sizes can be explained by the
fact that the total land area for each State that was used to compute
the RSN sample sizes did not agree exactly with the figures shown in
Table D-6. 4/ The relatively large discrepancies for Arkansas and
Nebraska were interpreted as meaning that the accumulations N1(i) and
N2(i) shown in Table D-6 for these States are incorrect. Thus, the
ratios N1(i)/n1(i) and N2(i)/n2(i) which were used to compute the
sampling weights for these two States, from the formulas shown in
Table D-1, were the averages of N1(i)/n1(i) and N2(i)/n2(i) for all
other States, except Rhode Island. The Rhode Island data was also
excluded from this average because the very small size of Rhode Island
resulted in its cropland sample size being rounded up to 4 even though
its computed value was approximately one, which deflated the value of
N1(i)/n1(i) for Rhode Island.
4/ This is evident from hand computations of the RSN sample sizes
preserved by the EPA for some States. The source of the land areas
actually used is not known.
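
The substitution described for Arkansas and Nebraska amounts to averaging the State ratios after setting aside the three excluded States. A minimal Python sketch follows; it uses only six States from Table D-6 to keep the example short, whereas the report averaged over all remaining States, and it assumes a simple unweighted mean of the ratios and the computed sample sizes shown in Table D-6. The function name is illustrative.

```python
def mean_ratio(accumulation, sample_size, exclude=()):
    """Unweighted mean of N(i)/n(i) over States not listed in `exclude`."""
    ratios = [accumulation[s] / sample_size[s]
              for s in accumulation if s not in exclude]
    return sum(ratios) / len(ratios)


# Cropland accumulations and computed sample sizes from Table D-6 (subset).
N1 = {"Alabama": 467.42123, "Iowa": 3160.09155, "Kansas": 3426.67927,
      "Arkansas": 3656.18536, "Nebraska": 38.16203, "Rhode Island": 7.95469}
n1 = {"Alabama": 92, "Iowa": 608, "Kansas": 684,
      "Arkansas": 216, "Nebraska": 40, "Rhode Island": 4}

# Ratio substituted for Arkansas and Nebraska in the Table D-1 formulas.
substitute_ratio = mean_ratio(
    N1, n1, exclude={"Arkansas", "Nebraska", "Rhode Island"})
```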
D-24
-------
APPENDIX E

Construction of an Analysis Data File
E-l
-------
Construction of an Analysis Data File

The EPA computer records for the Rural Soils Network (RSN) are

structured for simple entry of the data from laboratory analyses. For

example, a laboratory test that results in less than detectable levels

of a category of compounds produces only a single entry into the data

file. In order to simultaneously analyze the data for more than one

compound it is useful to restructure the data file so that it contains a

distinct variable representing the amount detected for each of the

compounds to be analyzed. Thus, a SAS 1/ data set with this structure
was created for analysis purposes. 2/ The contents of this data set are
shown in Exhibit E-1.

Each detection of a pesticide residue for a sample specimen resulted
in an entry into the EPA computer record for each of four variables: a
Residue Classification Code (RCC), an Individual Residue Code (IRC), an
amount, and a unit. It was found that all amounts were in units of
parts per million; thus the unit variable was not included in the SAS
file constructed. Only specific residues were tested for on a regular
basis. These compounds are listed in Table E-1. Other residues may
have been tested occasionally, but such data cannot be used for
inferential purposes. Only data for the pesticides shown in Table E-1
1/ Statistical Analysis System (SAS) User's Guide, SAS Institute,
Inc., Raleigh, North Carolina, 1979.
2/ For those readers interested in using this data set, it is stored
on a user disk at the EPA North Carolina Computing Center. The fully
qualified data set name is

CN.EPAROY.SADD.PEST.SASFILE,

and the data is located in the data set member called TOTAL.

E-2
-------
Exhibit E-1. Contents of the SAS data set created*

[SAS "CONTENTS OF SAS DATA SET" listing for CN.EPAROY.SADD.PEST.SASFILE,
printed Friday, March 6, 1981; largely illegible in this copy. The
listing is an alphabetic list of variables giving, for each variable, its
number, name, type, length, position, format, informat, and label.
Legible entries include identification and site variables (for example
CLAY, COUNTY, FY, and LAB, with labels such as ACCESSION NUMBER, ANALYSIS
DATE, COUNTY NAME, and LAND USE) and one numeric amount variable per
pesticide compound, apparently named in the form PESTnnn and labeled with
the compound name.]
*Source: Computer files supplied by the E?A Field Studies Branch, Washington D.C.
-------
Exhibit E-1 (continued)

[Remainder of the alphabetic variable listing; largely illegible in this
copy. Legible entries include further pesticide amount variables and the
variables PH, RAIN, ROUND, SAND, SILT, STATE, STRATUM, and YEAR.]
-------
Table E-1: Pesticide Residues Tested on a Regular Basis*

Residue Classification   Individual Residue
Code (RCC)               Code (IRC)           Compound
2                        499                  Alachlor
2                        002                  Aldrin
2                        160                  Chlordane
2                        237                  DCPA
2                        240                  o,p'-DDE
2                        241                  p,p'-DDE
2                        243                  p,p'-DDT
2                        244                  o,p'-DDT
2                        786                  p,p'-TDE
2                        787                  o,p'-TDE
2                        258                  Dicofol
2                        260                  Dieldrin
2                        638                  Photodieldrin
2                        336                  Endosulfan I
2                        337                  Endosulfan II
2                        338                  Endosulfan Sulfate
2                        341                  Endrin
2                        342                  Endrin Aldehyde
2                        343                  Endrin Ketone
2                        420                  Heptachlor
2                        421                  Heptachlor Epoxide
2                        448                  Isodrin
2                        497                  Lindane
2                        080                  Benzene Heptachloride
2                        526                  Methoxychlor
2                        646                  PCNB
2                        687                  Propachlor
2                        688                  Ronnel
2                        795                  Toxaphene
2                        799                  Trifluralin
2                        534                  Mirex
2                        620                  Ovex
2                        672                  PCB
2 1/                     670                  Prolan
2 1/                     091                  Bulan
2                        161                  Gamma Chlordane
(cont.)

[The IRC entries for PCB, Prolan, Bulan, and Gamma Chlordane carry
footnote markers that are illegible in this copy.]

1/ Shown on the computer records to have RCC = 7, but corrected to 2 by
personal communication with EPA Field Studies Branch, Washington, D.C.

2/ Tested for fiscal year 1972 specimens, and thereafter.

3/ Tested for fiscal year 1974 specimens, and thereafter.

E-5
-------
Table E-1: Pesticide Residues Tested on a Regular Basis*
(continued)

Residue Classification   Individual Residue
Code (RCC)               Code (IRC)           Compound
3                        149                  Carbophenothion
3                        246                  DEF
3                        248                  Diazinon
3                        348                  Ethion
3                        380                  Folex
3                        518                  Malathion
3                        531                  Methyl Parathion
3                        643                  Ethyl Parathion
3                        650                  Phorate
4                        235                  2,4-D
5                        013                  Arsenic
6                        016 4/               Atrazine

4/ Tested in fiscal years 1969, 1972, and 1973 for specimens from
"cornbelt States" only, i.e., South Dakota, Nebraska, Kansas, Missouri,
Iowa, Minnesota, Wisconsin, Illinois, Indiana, Ohio, and Michigan.

* Source: Personal communication from the EPA Field Studies Branch,
Washington, D.C.
E-6
-------
Table E-2: Residue Classification Codes*

Residue Classification   IRC for
Code (RCC)               "none found"   Compound Category
2                        905            Chlorinated hydrocarbons
3                        910            Organophosphorus insecticides
4                        911            Phenoxy acid derivative herbicides
5                        901            Arsenic compounds
6                        914            Triazines

* Source: Personal communication with EPA Field Studies Branch,
Washington, D.C.

were included in the SAS data file. Table E-1 also shows the RCC for
each of the compounds tested regularly, and Table E-2 gives the
description of each of these RCC categories. The RCC categories are
crucial to proper analysis of the data because all compounds with a
common RCC are tested simultaneously. If the test is performed for
compounds with RCC = 2, for example, there are two possible types of
entry into the computer record. Either each positive detection is
entered, or an IRC code is entered to indicate no positive detections,
as shown in Table E-2. Each record in the EPA computer file corresponds
to a specific sample specimen and contains 40 repetitions of fields for
IRC, RCC, amount, and unit. All of these fields were replaced in the SAS
data set by 48 pesticide amount variables, one for each of the 48
compounds listed in Table E-1. A zero was entered for each compound for
which less than detectable levels were found. 2/ A decimal point, the SAS
2/ Indicated by one or more detectable amounts for the same RCC or a
9XX code as shown in Table E-2.

E-7
-------
missing value symbol, was entered for the amount of compound detected
whenever the test for that compound was not performed. 3/
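
The restructuring rule just described can be sketched as follows. This is not the program used to build the SAS data set; it is an illustration in Python that assumes each entry of a specimen record is available as an (RCC, IRC, amount) triple, that a "none found" entry carries the RCC of the category tested, and that amount variables are named PESTnnn after the IRC. Only a few compounds from Table E-1 are included.

```python
# IRC code signalling "none found" for each RCC category (Table E-2).
NONE_FOUND_IRC = {2: 905, 3: 910, 4: 911, 5: 901, 6: 914}

# Illustrative subset of Table E-1, as IRC -> RCC.
COMPOUND_RCC = {2: 2, 260: 2, 248: 3, 235: 4, 13: 5, 16: 6}


def to_wide(entries):
    """Turn one specimen's (rcc, irc, amount_ppm) entries into one value per compound.

    A compound receives its detected amount, 0.0 if its RCC category was
    tested without detecting it, or None (missing) if the category was
    not tested at all.
    """
    none_found = set(NONE_FOUND_IRC.values())
    tested = {rcc for rcc, _irc, _amount in entries}
    detected = {irc: amount for _rcc, irc, amount in entries
                if irc not in none_found}
    row = {}
    for irc, rcc in COMPOUND_RCC.items():
        name = f"PEST{irc:03d}"          # hypothetical variable name
        row[name] = detected.get(irc, 0.0) if rcc in tested else None
    return row


# Dieldrin detected at 0.05 ppm; organophosphates tested, none found.
wide = to_wide([(2, 260, 0.05), (3, 910, None)])
```

Keeping "tested but not detected" (zero) distinct from "not tested" (missing) matters for estimation, since only the former should enter a mean.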

There are several variables, included for analysis purposes, on the

SAS data file that were not on the original EPA data file. Among these

new variables are STNAME and CONAME, the State and county names. A

variable called ROUND was constructed which has a value of one for

records from the first round of data collection, and a value of two for

the second round when the sites were revisited. Also, the sampling year

within round is given by YEAR, e.g., YEAR = 1 for first-year sample

sites. If the information were available, it would also be useful to

have an indicator variable to identify when site substitutions were

made. This is especially important when substitutions were made in the

second round; the second round data for such a site should not be directly
compared to the first round data for that site. 4/ This is an important

consideration when estimating differences in residue levels from the

first round to the second round.
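If a substitution indicator were available, round-to-round differences could be restricted to sites that were not substituted. The sketch below assumes this, in Python. Only `ROUND` is a variable named in the text; the `SITE` and `AMOUNT` keys and the `substituted_sites` set are hypothetical names introduced for illustration.

```python
def round_differences(records, substituted_sites):
    """Return {site: second-round amount minus first-round amount}.

    records: iterable of dicts with keys SITE, ROUND (1 or 2), and AMOUNT.
    substituted_sites: set of site identifiers where a substitution was
    made in the second round; those sites are excluded from the comparison.
    """
    by_site = {}
    for rec in records:
        by_site.setdefault(rec["SITE"], {})[rec["ROUND"]] = rec["AMOUNT"]

    differences = {}
    for site, rounds in by_site.items():
        if site in substituted_sites:
            continue  # second-round data are not comparable for this site
        if 1 in rounds and 2 in rounds:
            differences[site] = rounds[2] - rounds[1]
    return differences


records = [
    {"SITE": 39, "ROUND": 1, "AMOUNT": 0.25},
    {"SITE": 39, "ROUND": 2, "AMOUNT": 0.5},
    {"SITE": 113, "ROUND": 1, "AMOUNT": 1.0},
    {"SITE": 113, "ROUND": 2, "AMOUNT": 0.0},
]
diffs = round_differences(records, substituted_sites={113})
```

Here site 113 is dropped from the comparison because it is flagged as substituted, so only site 39 contributes a difference.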

Two other variables added to the data file for analysis purposes

are STRATUM and WT. The variable STRATUM is used to identify large-scale

geographic strata within States as described in section 1.7.7. The

STRATUM codes and their meanings are given in Table E-3. The variable

WT is the approximate sampling weight, which was constructed as

shown in Appendix D. This variable is, of course, essential for a

weighted analysis of the data that incorporates the sampling design

implications.

3/ Indicated by no detectable amounts for the same RCC and no corresponding
9XX code as shown in Table E-2.

4/ EPA Field Studies Branch, Washington, D.C., assured RTI that such
substitutions in the second round amount to no more than 5 percent of
all second round sites.

Table E-3: STRATUM Codes and Their Meaning*

STRATUM Code      Meaning

40                Irrigated stratum
100 or 160 1/     Remainder stratum 2/
400 or 640 1/     Sandhills, desert, or other relatively
                  homogeneous stratum

1/ Code used depends on PSU sizes used in the State.

2/ All sites in many States fall into the remainder stratum.

* Source: Constructed by RTI from: a) Data files supplied by the
EPA Field Studies Branch, Washington, D.C. b) Data and personal
communications from J. Jeffery Goebel, Statistical Laboratory, Iowa
State University.
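As an illustration of how WT enters a weighted analysis, the Python sketch below computes a weighted mean residue level as the sum of weight times amount divided by the sum of weights. This is a generic ratio-type estimator chosen for illustration, not necessarily the estimator used in this report, and it omits variance estimation. The `AMOUNT` key is a hypothetical name; records whose amount is missing (test not performed) are skipped.

```python
def weighted_mean(records, variable="AMOUNT"):
    """Weighted mean of `variable`, using WT as the sampling weight.

    Records with a missing value (None) are excluded from both the
    numerator and the denominator. Returns None if nothing remains.
    """
    weighted_total = 0.0
    weight_total = 0.0
    for rec in records:
        value = rec[variable]
        if value is None:
            continue
        weighted_total += rec["WT"] * value
        weight_total += rec["WT"]
    if weight_total == 0:
        return None
    return weighted_total / weight_total


sample = [
    {"WT": 100, "AMOUNT": 0.0},
    {"WT": 300, "AMOUNT": 0.5},
    {"WT": 50, "AMOUNT": None},  # test not performed at this site
]
estimate = weighted_mean(sample)
```

The unweighted mean of the two tested sites would be 0.25, while the weighted estimate gives the second site three times the influence of the first.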

Some quality assurance checks of the EPA computer files for the RSN

were made prior to creation of the SAS data set for analysis. Twenty-three

inconsistencies were discovered. These inconsistencies are summarized

in Table E-4, and their resolution is discussed below. Most of

these inconsistencies were resolved by consulting microfilm copies of

the Analysis Worksheets, Form 6-7, and the Sample Data Sheets, Form 6-
Maintained by the EPA Field Studies Branch, Washington, D.C.
-------
Table E-4: Data Inconsistencies in the Rural Soils Network Files

Case Number: 1
State Name (State Number): California (06); Idaho (16); Missouri (29);
North Carolina (37); Ohio (39); Virginia (51); Illinois (17);
New York (36); Alabama (01); Mississippi (28); Iowa (19); Oregon (41)
Site Number: 39; 138; 113; 559; 103
Fiscal Year: 69
Sample Material Code (SMC): 1

Accession     Individual Residue          Residue Class
Number        Codes (IRC)                 Codes (RCC)

3196          13,244,243,241,260,786      2,5
3407          911                         4
1470          905                         2
11470         13                          5
1471          160,260                     2
11471         13                          5
1226          13                          5
11226         905                         2
4063          13,241,786,243,240,787      2,5
4061          911                         4
3566          241,260,2,243               2
3568          13                          5
3468          13,905                      2,5
4049          911                         4
795           13,911                      4,5
3476          905                         2
3017          13,160,914                  2,5,6
3017 1/       911                         4
10049         905                         2
100049        910                         3
204007        13,241,243,910              2,3,5
204117        914                         6
204298        241,244,243                 2
204298        13                          5
312655        799                         2
372655        910                         3
310110        910                         3
316110        905                         2
(continued)
-------
Table E-4: Data Inconsistencies in the Rural Soils Network Files (continued)

Case Number: 15; 22; 23
State Name (State Number): Pennsylvania (42); Nebraska (31); Illinois (17);
Louisiana (22); Mississippi (28); Alabama (01); New York (36);
West Virginia (54); Mississippi (28)
Site Number: 164; 151; 105; 194; 54 2/; 49
Fiscal Year: 73; 73; 69
Sample Material Code (SMC): 1; 138; 1; 1

Accession     Individual Residue                        Residue Class
Number        Codes (IRC)                               Codes (RCC)

314025        16,241,243                                2,6
340250        910                                       3
426298        260                                       2
427298        910                                       3
3028          13,905                                    2,5
3017 1/       911                                       4
3621          13,2,260                                  2,5
4365          910                                       3
8652          795,244,243,241,786,910                   2,3
8675          241,244,243,795,246                       2,3
204111        13,241,910                                2,3,5
204014        13,260,910                                2,3,5
314144        240,244,786,241,243,914                   2,6
314085        910                                       3
314097        905,910                                   2,3
781           13,243,244,786,241,341,795,246,799,240    2,5 3/

1/ Case 17 becomes case 9 after the site number for case 17 is corrected to 138.

2/ Noncropland site number; land use changed from cropland (1) to noncropland (2).

3/ RCC8 changed to 3 as IRC8 = 246. See Table E-1.

* Source: Computer files supplied by EPA Field Studies Branch, Washington, D.C.
-------
The final change to the data file was to correct the cropping

region code for several counties. Valid codes for the cropping regions

are the integers from 1 through 8. Several records in the computer

file showed cropping region codes of 0 or 9. These records were

corrected as shown in Table E-5.
-------
Table E-5: Resolution of Invalid Cropping Region Codes

State Name              County Name              Cropping Region
(State Number)          (County Number)          Original    Corrected

Iowa (19)               Scott (163)              0           1
Kentucky (21)           Scott (209)              0           6
Minnesota (27)          Scott (139)              0           5
Mississippi (28)        Alcorn (3)               0           3
Missouri (29)           Scott (201)              0           4
Nebraska (31)           Hayes (85)               0           2
Nebraska (31)           Scotts Bluff (157)       0           5
Nebraska (31)           Thayer (169)             0           1
Oklahoma (33)           Cotton (33)              0           2
South Carolina (35)     Dorchester (35)          0           4
Tennessee (47)          Haywood (75)             0           3
Tennessee (47)          Scott (151)              0           6
California (6)          Alpine (3)               9           Missing
Georgia (13)            Invalid (4)              9           Missing
Maryland (24)           Invalid (18)             9           Missing
New York (36)           Bronx (5)                9           Missing
New York (36)           Invalid (32)             9           Missing
New York (36)           New York (61)            9           Missing
North Carolina (37)     Dare (55)                9           Missing
Virginia (51)           Invalid (39)             9           Missing
Virginia (51)           Invalid (74)             9           Missing
Virginia (51)           Norfolk (129)            9           Missing
Virginia (51)           Princess Anne (151)      9           Missing
West Virginia (54)      McDowell (47)            9           Missing

Source: Personal communication with EPA Field Studies Branch, Washington, D.C.
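The corrections in Table E-5 amount to a lookup keyed by State and county number. The Python sketch below shows one way such a correction might be applied; it is an illustration, not the code used to build the SAS data set. The function name and the use of `None` for a missing value are assumptions, and the lookup entries are the twelve code-0 rows of Table E-5 exactly as printed there. Any invalid code without an entry, including every code of 9, is set to missing.

```python
VALID_REGIONS = range(1, 9)  # valid cropping region codes are 1 through 8

# (State number, county number) -> corrected cropping region,
# copied from the code-0 rows of Table E-5.
CORRECTIONS = {
    (19, 163): 1, (21, 209): 6, (27, 139): 5, (28, 3): 3,
    (29, 201): 4, (31, 85): 2, (31, 157): 5, (31, 169): 1,
    (33, 33): 2, (35, 35): 4, (47, 75): 3, (47, 151): 6,
}


def corrected_region(state, county, region):
    """Return a valid cropping region code, or None if it must be missing."""
    if region in VALID_REGIONS:
        return region  # already valid, leave unchanged
    # Invalid code (0 or 9): use the tabulated correction if there is one.
    return CORRECTIONS.get((state, county))


iowa_scott = corrected_region(19, 163, 0)
california_alpine = corrected_region(6, 3, 9)
```

Keying on the State and county pair rather than the county name avoids ambiguity among the several counties named Scott.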
-------