United States Environmental Protection Agency
Office of Water (4503F)
EPA 841-B-97-010
September 1997

TECHNIQUES FOR TRACKING, EVALUATING, AND REPORTING THE IMPLEMENTATION OF NONPOINT SOURCE CONTROL MEASURES

I. AGRICULTURE

Final, September 1997

Prepared for Steve Dressing, Nonpoint Source Pollution Control Branch, United States Environmental Protection Agency

Prepared by Tetra Tech, Inc., EPA Contract No. 68-C3-0303, Work Assignment No. 4-51

TABLE OF CONTENTS

Chapter 1  Introduction
  1.1 Purpose of Guidance
  1.2 Background
  1.3 Types of Monitoring
  1.4 Quality Assurance and Quality Control
  1.5 Data Management

Chapter 2  Sampling Design
  2.1 Introduction
    2.1.1 Study Objectives
    2.1.2 Probabilistic Sampling
    2.1.3 Measurement and Sampling Errors
    2.1.4 Estimation and Hypothesis Testing
  2.2 Sampling Considerations
    2.2.1 Farm Ownership and Size
    2.2.2 Location and Other Physical Characteristics
    2.2.3 Farm Type and Agricultural Practices
    2.2.4 Sources of Information
  2.3 Sample Size Calculations
    2.3.1 Simple Random Sampling
    2.3.2 Stratified Random Sampling
    2.3.3 Cluster Sampling
    2.3.4 Systematic Sampling

Chapter 3  Methods for Evaluating Data
  3.1 Introduction
  3.2 Comparing the Means from Two Independent Random Samples
  3.3 Comparing the Proportions from Two Independent Samples
  3.4 Comparing More Than Two Independent Random Samples
  3.5 Comparing Categorical Data

Chapter 4  Conducting the Evaluation
  4.1 Introduction
  4.2 Choice of Variables
  4.3 Expert Evaluations
    4.3.1 Site Evaluations
    4.3.2 Rating Implementation of Management Measures and Best Management Practices
    4.3.3 Rating Terms
    4.3.4 Consistency Issues
    4.3.5 Postevaluation Onsite Activities
  4.4 Self-Evaluations
    4.4.1 Methods
    4.4.2 Cost
    4.4.3 Questionnaire Design
  4.5 Aerial Reconnaissance and Photography

Chapter 5  Presentation of Evaluation Results
  5.1 Introduction
  5.2 Audience Identification
  5.3 Presentation Format
    5.3.1 Written Presentations
    5.3.2 Oral Presentations
  5.4 For Further Information

References
Glossary
Index
Appendix A: Statistical Tables

LIST OF TABLES

Table 2-1  Applications of four sampling designs for implementation monitoring
Table 2-2  Errors in hypothesis testing
Table 2-3  Acres of harvested cropland in Virginia from USDOC's 1992 Census of Agriculture
Table 2-4  Definitions used in sample size calculation equations
Table 2-5  Comparison of sample size as a function of various parameters
Table 2-6  Common values of (Zα + Z2β)² for estimating sample size
Table 2-7  Allocation of samples
Table 2-8  Number of farms implementing recommended BMPs
Table 3-1  Contingency table of observed operator type and implemented BMP
Table 3-2  Contingency table of expected operator type and implemented BMP
Table 3-3  Contingency table of implemented BMP and rating of installation and maintenance
Table 3-4  Contingency table of implemented BMP and sample year
Table 4-1  General types of information obtainable with self-evaluations and expert evaluations
Table 4-2  Example variables for management measure implementation analysis

LIST OF FIGURES

Figure 2-1  Simple random sampling from a list and a map
Figure 2-2  Stratified random sampling from a list and a map
Figure 2-3  Cluster sampling from a list and a map
Figure 2-4  Systematic sampling from a list and a map
Figure 2-5  Graphical presentation of the relationship between bias, precision, and accuracy
Figure 2-6  Example route for a county transect survey
Figure 4-1  Potential variables and examples of implementation standards and specifications
Figure 4-2  Sample draft survey for confined animal facility management evaluation
Figure 5-1  Example of presentation of information in a written slide
Figure 5-2  Example of representation of data using a combination of a pie chart and a horizontal bar chart
Figure 5-3  Example representation of data in the form of a pie chart

CHAPTER 1. INTRODUCTION

1.1 PURPOSE OF GUIDANCE

This guidance is intended to assist state, regional, and local environmental professionals in tracking the implementation of best management practices (BMPs) used to control agricultural nonpoint source pollution. Information is provided on methods for selecting sites for evaluation, sample size estimation, sampling, and results evaluation and presentation. The focus of the guidance is on the statistical approaches needed to properly collect and analyze data that are accurate and defensible.

A properly designed BMP implementation monitoring program can save both time and money. For example, there are over 37,000 farms in the state of Virginia. Determining the status of BMP implementation on each of those farms would easily exceed most budgets, so statistical sampling of sites is needed. This document provides guidance for sampling representative farms to yield summary statistics at a fraction of the cost of a comprehensive inventory.

Some nonpoint source projects and programs combine BMP implementation monitoring with water quality monitoring to evaluate the effectiveness of BMPs at protecting water quality (Meals, 1988; Rashin et al., 1994; USEPA, 1993b). For this type of monitoring to be successful, the scale of the project must be small (e.g., a watershed of a few hundred to a few thousand acres). Accurate records of all the sources of pollutants of concern and a census of how all BMPs are operating are very important for this type of monitoring effort.
Otherwise, it can be extremely difficult to correlate BMP implementation with changes in stream water quality.

The focus of this guide is on the design of monitoring programs to assess agricultural management measure and best management practice implementation, with particular emphasis on statistical considerations.

This guidance does not address monitoring the implementation and effectiveness of all BMPs in a watershed. This guidance does provide information to help program managers gather statistically valid information to assess implementation of BMPs on a more general (e.g., statewide) basis. The benefits of implementation monitoring are presented in Section 1.3.

1.2 BACKGROUND

Pollution from nonpoint sources—sediment deposition, erosion, contaminated runoff, hydrologic modifications that degrade water quality, and other diffuse sources of water pollution—is the largest cause of water quality impairment in the United States (USEPA, 1995). Congress passed the Coastal Zone Act Reauthorization Amendments of 1990 (CZARA) to help address nonpoint source pollution in coastal waters. CZARA provides that each state with an approved coastal zone management program develop and submit to the U.S. Environmental Protection Agency (EPA) and National Oceanic and Atmospheric Administration (NOAA) a Coastal Nonpoint Pollution Control Program (CNPCP). State programs must "provide for the implementation" of management measures in conformity with the EPA Guidance Specifying Management Measures For Sources Of Nonpoint Pollution In Coastal Waters, developed pursuant to section 6217(g) of CZARA (USEPA, 1993a).
Management measures (MMs), as defined in CZARA, are economically achievable measures to control the addition of pollutants to coastal waters, which reflect the greatest degree of pollutant reduction achievable through the application of the best available nonpoint pollution control practices, technologies, processes, siting criteria, operating methods, or other alternatives. Many of EPA's MMs are combinations of BMPs. For example, depending on site characteristics, implementation of the Confined Animal Facility MM might involve use of the following BMPs: construction of a waste storage pond, installation of grassed waterways, protection of heavily used areas, management of roof runoff, and construction of a composting facility.

CZARA does not specifically require that states monitor the implementation of MMs and BMPs as part of their CNPCPs. State CNPCPs must, however, provide for technical assistance to local governments and the public for implementing the MMs and BMPs. Section 6217(b) states:

Each State program . . . shall provide for the implementation, at a minimum, of management measures . . . and shall also contain ... (4) The provision of technical and other assistance to local governments and the public for implementing the measures . . . which may include assistance ... to predict and assess the effectiveness of such measures . . . .

EPA and NOAA also have some responsibility under section 6217 for providing technical assistance to implement state CNPCPs. Section 6217(d), Technical assistance, states:

[NOAA and EPA] shall provide technical assistance ... in developing and implementing programs. Such assistance shall include: ... (4) methods to predict and assess the effects of coastal land use management measures on coastal water quality and designated uses.
This guidance document was developed to provide the technical assistance described in CZARA sections 6217(b)(4) and 6217(d), but the techniques described can be used for other similar programs and projects. For instance, monitoring projects funded under Clean Water Act (CWA) section 319(h) grants, efforts to implement total maximum daily loads developed under CWA section 303(d), stormwater permitting programs, and other programs could all benefit from knowledge of BMP implementation. Methods to assess the implementation of MMs and BMPs, then, are a key focus of the technical assistance to be provided by EPA and NOAA.

Implementation assessments can be done on several scales. Site-specific assessments can be used to assess individual BMPs or MMs, and watershed assessments can be used to look at the cumulative effects of implementing multiple MMs. With regard to "site-specific" assessments, individual BMPs must be assessed at the appropriate scale for the BMP of interest. For example, to assess the implementation of MMs and BMPs for animal waste handling and disposal on a farm, only the structures, areas, and practices implemented specifically for animal waste management (e.g., dikes, diversions, storage ponds, composting facility, and manure application records) would need to be inspected. In this instance the animal waste storage facility would be the appropriate scale and "site." To assess erosion control, the proper scale might be fields over 10 acres, and the site could be 100-meter transect measurements of crop residue. For nutrient management, the scale and site might be an entire farm. Site-specific measurements can then be used to extrapolate to a watershed or statewide assessment. It is recognized that some studies might require a complete inventory of MM and BMP implementation across an entire watershed or other geographic area.
1.3 TYPES OF MONITORING

The term monitor is defined as "to check or evaluate something on a constant or regular basis" (Academic Press, 1992). It is possible to distinguish among various types of monitoring. Two types, implementation and trend (i.e., trends in implementation) monitoring, are the focus of this guidance. These types of monitoring can be used to address the following goals:

• Determine the extent to which MMs and BMPs are implemented in accordance with relevant standards and specifications.

• Determine whether there has been a change in the extent to which MMs and BMPs are being implemented.

In general, implementation monitoring is used to determine whether goals, objectives, standards, and management practices are being implemented as detailed in implementation plans. In the context of BMPs within state CNPCPs, implementation monitoring is used to determine the degree to which MMs and BMPs required or recommended by the CNPCPs are being implemented. If CNPCPs call for voluntary implementation of MMs and BMPs, implementation monitoring can be used to determine the success of the voluntary program (1) within a given monitoring period (e.g., 1 or 2 years); (2) during several monitoring periods, to determine any temporal trends in BMP implementation; or (3) in various regions of the state.

Trend monitoring involves long-term monitoring of changes in one or more parameters. As discussed in this guidance, public attitudes, land use, or the use of different agricultural practices are examples of parameters that could be measured with trend monitoring. For example, the Conservation Technology Information Center tracks trends in the implementation of different tillage practices from year to year (CTIC, 1994). Isolating the impacts of MMs and BMPs on water quality requires tracking MM and BMP implementation over time, i.e., trend monitoring.
Because trend monitoring involves measuring a change (or lack thereof) in some parameter over time, it is necessarily of longer duration and requires that a baseline, or starting point, be established. Any changes in the measured parameter are then detected in reference to the baseline.

Implementation and the related trend monitoring can be used to determine (1) which MMs and BMPs are being implemented, (2) whether MMs and BMPs are being implemented as designed, and (3) the need for increased efforts to promote or induce use of MMs and BMPs. Data from implementation monitoring, used in combination with other types of data, can be useful in meeting a variety of other objectives, including the following (Hook et al., 1991; IDDHW, 1993; Schultz, 1992):

• To evaluate BMP effectiveness for protecting soil and water resources.

• To identify areas in need of further investigation.

• To establish a reference point of overall compliance with BMPs.

• To determine whether farmers are aware of BMPs.

• To determine whether farmers are using the advice of agricultural BMP experts.

• To identify any BMP implementation problems specific to a category of farm.

• To evaluate whether any agricultural practices cause environmental damage.

• To compare the effectiveness of alternative BMPs.

MacDonald et al. (1991) describe additional types of monitoring, including effectiveness monitoring, baseline monitoring, project monitoring, validation monitoring, and compliance monitoring. As emphasized by MacDonald and others, these monitoring types are not mutually exclusive, and the distinctions among them are usually determined by the purpose of the monitoring. Effectiveness monitoring is used to determine whether MMs or BMPs, as designed and implemented, are effective in meeting management goals and objectives.
Effectiveness monitoring is a logical follow-up to implementation monitoring, because it is essential that effectiveness monitoring include an assessment of the adequacy of the design and installation of MMs and BMPs. For example, the objective of effectiveness monitoring could be to evaluate the effectiveness of MMs and BMPs as designed and installed, or to evaluate the effectiveness of MMs and BMPs that are designed and installed adequately or to standards and specifications. Effectiveness monitoring is not addressed in this guide, but is the subject of another EPA guidance document, Monitoring Guidance for Determining the Effectiveness of Nonpoint Source Controls (USEPA, 1997).

1.4 QUALITY ASSURANCE AND QUALITY CONTROL

An integral part of the design phase of any nonpoint source pollution monitoring project is quality assurance and quality control (QA/QC). Development of a quality assurance project plan (QAPP) is the first step of incorporating QA/QC into a monitoring project. The QAPP is a critical document for the data collection effort inasmuch as it integrates the technical and quality aspects of the planning, implementation, and assessment phases of the project. The QAPP documents how QA/QC elements will be implemented throughout a project's life. It contains statements about the expectations and requirements of those for whom the data is being collected (i.e., the decision maker) and provides details on project-specific data collection and data management procedures that are designed to ensure that these requirements are met. Development and implementation of a QA/QC program, including preparation of a QAPP, can require 10 to 20 percent of project resources (Cross-Smiecinski and Stetzenback, 1994), but this cost is recaptured in lower overall costs because the project is well planned and executed.
A thorough discussion of QA/QC is provided in Chapter 5 of EPA's Monitoring Guidance for Determining the Effectiveness of Nonpoint Source Controls (USEPA, 1997).

1.5 DATA MANAGEMENT

Data management is a key component of a successful MM or BMP implementation monitoring effort. The data management system that is used—which includes the quality control and quality assurance aspects of data handling, how and where data are stored, and who manages the stored data—determines the reliability, longevity, and accessibility of the data. Provided that the data collection effort was planned and executed well, an organized and efficient data management system will ensure that the data can be used with confidence by those who must make decisions based upon it, that the data will be useful as a baseline for similar data collection efforts in the future, that the data will not become obsolete (or be misplaced!) quickly, and that the data will be available to a variety of users for a variety of applications.

Serious consideration is often not given to a data management system before a data collection effort begins, which is precisely why it is so important to recognize the long-term value of a small investment of time and money in proper data management. Data management competes with other agency priorities for money, staff, and time; if its importance and long-term value are recognized early in a project's development, it is more likely to receive sufficient funding. Overall, data management might account for only a small portion of a project's total budget, but the return on the investment is great when it is considered that the larger investment in data collection can be rendered virtually useless unless the data is managed adequately. Two important aspects of data that should be considered when planning the initial data collection effort and a data management system are data life cycle and data accessibility.
The data life cycle can be characterized by the following stages: (1) data is collected; (2) data is checked for quality; (3) data is entered into a data base; (4) data is used; and (5) data eventually becomes obsolete. The expected usefulness and life span of the data should be considered during the initial stages of planning a data collection effort, when the money, staff, and time devoted to data collection must be weighed against the data's usefulness and longevity. Data with limited use that is likely to become obsolete soon after it is collected is a poorer investment than data with multiple applications and a long life span. If a data collection effort involves the collection of data of limited use and a short life span, it might be necessary to modify the data collection effort—either by changing its goals and objectives or by adding new ones—to increase the breadth and length of the data's applicability. A good data management system will ensure that any data that are collected will be useful for the greatest number of applications for the longest possible time.

Data accessibility is a critical factor in determining data's usefulness. Data attains its highest value if it is as widely accessible as possible, if access to it requires as little staff effort as possible, and if it can be used by others conveniently. If data are stored where those who might need it can obtain it with little assistance, it is more likely to be shared and used. The format for data storage determines how conveniently the data can be used. Electronic storage in a widely available and commonly used data storage format makes data convenient to use. Storage as only a paper copy buried in a report, where any analysis requires entry into an electronic format or time-consuming manipulation, makes data extremely inconvenient to use and makes it unlikely that the data will be used.
The following should be considered in the development of a data management strategy:

• What level of quality control should the data be subject to? Data that will be used for a variety of purposes or that will be used for important decisions should receive a careful quality control check.

• Where and how will the data be stored? The options for data storage range from a printed final report on a bookshelf to an electronic data base accessible to government agencies and the public. Determining where and how data will be stored therefore also requires careful consideration of the question: How accessible should the data be?

• Who will maintain the data base? Data stored in a large data base might be managed by a professional data manager, while data kept in agency files might be managed by people with various backgrounds over the course of time.

• How much will data management cost? As with all other aspects of a data collection effort, data management costs money, and this cost must be balanced with all other costs involved in the project.

CHAPTER 2. SAMPLING DESIGN

2.1 INTRODUCTION

This chapter discusses recommended methods for designing sampling programs to track and evaluate the implementation of nonpoint source control measures. This chapter does not address sampling to determine whether the management measures (MMs) or best management practices (BMPs) are effective, since no water quality sampling is done. Because of the variation in agricultural practices and related nonpoint source control measures implemented throughout the United States, the approaches taken by various states to track and evaluate nonpoint source control measure implementation will differ. Nevertheless, all approaches can be based on sound statistical methods for selecting sampling strategies, computing sample sizes, and evaluating data.
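The sample size calculations detailed in Section 2.3 can be previewed with a short sketch. The formula below is the standard normal-approximation sample size for estimating a proportion to within a margin of error d at roughly 95 percent confidence; the farm numbers are hypothetical, and the sketch is illustrative only, not a substitute for the statistical consultation recommended in this chapter.

```python
import math

def sample_size_proportion(p_expected, margin, z=1.96):
    """Farms to sample so the estimated proportion falls within
    +/- margin of the true value at ~95% confidence (z = 1.96).
    Normal-approximation formula: n = z^2 * p * (1 - p) / d^2."""
    return math.ceil(z**2 * p_expected * (1.0 - p_expected) / margin**2)

# Worst case (p = 0.5) for a +/- 5 percent margin of error:
n = sample_size_proportion(0.5, 0.05)
print(n)  # 385 farms, far fewer than a census of 37,000
```

Note that 385 is the conservative (p = 0.5) answer; a better prior estimate of the proportion, or a finite population correction for small populations, reduces the required sample.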
EPA recommends that states consult with a trained statistician to be certain that the approach, design, and assumptions are appropriate to the task at hand. As described in Chapter 1, implementation monitoring is the focus of this guidance. Effectiveness monitoring is the focus of another guidance prepared by EPA, Monitoring Guidance for Determining the Effectiveness of Nonpoint Source Controls (USEPA, 1997). The recommendations and examples in this chapter address two primary monitoring goals:

• Determine the extent to which MMs and BMPs are implemented in accordance with relevant standards and specifications.

• Determine whether there is a change in the extent to which MMs and BMPs are being implemented.

For example, state or county agriculture personnel might be interested in whether regulations for the exclusion of livestock from riparian areas are being adhered to in regions with particular water quality problems. State or county personnel might also be interested in whether, in response to an intensive statewide effort to improve pesticide use practices and increase the use of integrated pest management practices, there is a detectable change in the pesticide practices being used by farmers.

2.1.1 Study Objectives

To develop a study design, clear, quantitative monitoring objectives must be developed. For example, the objective might be to estimate the percentage of farm owners or managers that use integrated pest management (IPM) to within ±5 percent. Or perhaps a state is getting ready to perform an extensive 2-year outreach and cost-share effort to promote a fence-out or other program to reduce cattle wading through streams. In this case, detecting a 10 percent change in the farms that permit their cattle direct access to streams might be of interest.
In the first example, summary statistics are developed to describe the current status, whereas in the second example, some sort of statistical analysis (hypothesis testing) is performed to determine whether a significant change has really occurred. This choice has an impact on how the data are collected. As an example, summary statistics might require unbalanced sample allocations to account for variability such as farm size, type, and ownership, whereas balanced designs (e.g., two sets of data with the same number of observations in each set) are more typical for hypothesis testing.

2.1.2 Probabilistic Sampling

Most study designs that are appropriate for tracking and evaluating implementation are based on a probabilistic approach, since tracking every farm is not cost-effective. In a probabilistic approach, individuals are randomly selected from the entire group. The selected individuals are evaluated, and the results from the individuals provide an unbiased assessment of the entire group. Applying the results from randomly selected individuals to the entire group is statistical inference. Statistical inference enables one to determine, for example, in terms of probability, the percentage of farms using IPM without visiting every farm. One could also determine whether a change in the number of farms with appropriate nutrient management is within the range of what could occur by chance or is large enough to indicate a real modification of farmer practices.

The group about which inferences are made is the population, or target population, which consists of population units. The sample population is the set of population units that are directly available for measurement.
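The distinction between summary statistics and hypothesis testing can be made concrete with the cattle-access example from Section 2.1.1. The sketch below computes the sample proportion for each survey period and then a pooled two-sample z statistic for testing whether the proportion has changed; the survey counts are hypothetical, and the pooled z-test shown is only one of the analysis options a statistician might recommend (Chapter 3 discusses methods for comparing proportions).

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sample z statistic for testing H0: p1 = p2, where
    p1 and p2 are, e.g., the proportions of farms permitting cattle
    direct access to streams before and after an outreach effort."""
    p1, p2 = x1 / n1, x2 / n2            # summary statistic for each survey
    p_pool = (x1 + x2) / (n1 + n2)       # pooled proportion under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical balanced design: 200 randomly selected farms per period.
# 90 of 200 (45%) permitted direct access before; 70 of 200 (35%) after.
z = two_proportion_z(90, 200, 70, 200)
print(round(z, 2))  # 2.04, which exceeds 1.96 (two-sided alpha = 0.05)
```

Here the 10 percent drop is unlikely to be due to chance alone at the 0.05 level; with substantially smaller samples, the same drop might not be statistically detectable.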
For example, if the objective is to determine the degree to which adequate animal waste management has been established in agricultural operations, the population to be sampled would be agricultural operations for which animal waste management is an appropriate BMP (i.e., farms with livestock). Statistical inferences can be made only about the target population available for sampling. For example, if implementation of grazing management is being assessed and only public grazing lands can be sampled, inferences cannot be made about the management of private grazing lands.

Another example to consider is a mail survey. In most cases, only a percentage of survey forms is returned. The extent to which nonrespondents bias the survey findings should be examined: Do the nonrespondents represent those less likely to use IPM? Typically, a second mailing, phone calls, or visits to those who do not respond might be necessary to evaluate the impact of nonrespondents.

The most common types of sampling that should be used for implementation monitoring are summarized in Table 2-1. In general, probabilistic approaches are preferred. However, there might be circumstances under which targeted sampling should be used. Targeted sampling refers to using best professional judgement for selecting sample locations. For example, state or county agriculture personnel deciding to evaluate all farms in a given watershed would be targeted sampling. The choice of a sampling plan depends on study objectives, patterns of variability in the target population, cost-effectiveness of alternative plans, types of measurements to be made, and convenience (Gilbert, 1987).

Table 2-1. Applications of four sampling designs for implementation monitoring.

  Simple Random Sampling: Each population unit has an equal probability of being selected.

  Stratified Random Sampling: Useful when a sample population can be broken down into groups, or strata, that are internally more homogeneous than the entire sample population. Random samples are taken from each stratum, although the probability of being selected might vary from stratum to stratum depending on cost and variability.

  Cluster Sampling: Useful when there are a number of methods for defining population units and when individual units are clumped together. In this case, clusters are randomly selected and every unit in the cluster is measured.

  Systematic Sampling: This sampling has a random starting point, with each subsequent observation a fixed interval (space or time) from the previous observation.

Simple random sampling is the most elementary type of sampling. Each unit of the target population has an equal chance of being selected. This type of sampling is appropriate when there are no major trends, cycles, or patterns in the target population (Cochran, 1977). Random sampling can be applied in a variety of ways, including farm or field selection. Random samples can also be taken at different times at a single farm. Figure 2-1 provides an example of simple random sampling from a listing of farms and from a map.

If the pattern of MM and BMP implementation is expected to be uniform across the state, simple random sampling is appropriate to estimate the extent of implementation. If, however, implementation is homogeneous only within certain categories (e.g., federal, state, or private lands), stratified random sampling should be used. In stratified random sampling, the target population is divided into groups called strata for the purpose of obtaining a better estimate of the mean or total for the entire population. Simple random sampling is then used within each stratum. Stratification involves the use of categorical variables to group observations into more homogeneous units, thereby reducing the variability of observations within each unit.
For example, in a state with federal, state, and private rangelands that are used for grazing, there might be different patterns of BMP implementation. Lands in the state could be divided into federal, state, and private as separate strata from which samples would be taken. In general, a larger number of samples should be taken in a stratum if the stratum is more variable, larger, or less costly to sample than other strata. For example, if BMP implementation is more variable on private rangelands, a greater number of sampling sites might be needed in that stratum to increase the precision of the overall estimate.

Figure 2-1a. Simple random sampling from a listing of farms. In this listing, all farms are presented as a single list, and farms are selected randomly from the entire list. Shaded farms represent those selected for sampling.

Figure 2-1b. Simple random sampling from a map. Dots represent farms. All farms of interest are represented on the map, and the farms to be sampled (open dots) were selected randomly from all of those on the map. The shaded lines on the map could represent county, watershed, hydrologic, or some other boundary, but they are ignored for the purposes of simple random sampling.

Cochran (1977) found that stratified random sampling provides a better estimate of the mean for a population with a trend, followed in order by systematic sampling (discussed later) and simple random sampling.
He also noted that stratification typically results in a smaller variance for the estimated mean or total than that which results from comparable simple random sampling. If the state believes that there will be a difference between two or more subsets of farms, such as between types of ownership or crop, the farms can first be stratified into these subsets and a random sample taken within each subset (McNew, 1990). The goal of stratification is to increase the accuracy of the estimated mean values over what could have been obtained using simple random sampling of the entire population. The method makes use of prior information to divide the target population into subgroups that are internally homogeneous. There are a number of ways to "select" farms (e.g., by farm ownership, farm size, farm type, hydrologic unit, soil type, or county), or sets of farms, to be certain that important information will not be lost, or that MM or BMP use will not be misrepresented as a result of treating all potential survey farms as equal. Figure 2-2 provides an example of stratified random sampling from a listing of farms and from a map. It might also be of interest to compare the relative percentages of cropland classified as having high, medium, and low erosion potentials that are under conservation tillage. Highly erodible land might be responsible for a larger share of sediment losses, and it would usually be desirable to track the extent to which conservation tillage practices have been implemented on these land areas. A stratified random sampling procedure could be used to estimate the percentage of total cropland with different erosion potentials under conservation tillage. Cluster sampling is applied in cases where it is more practical to measure randomly selected groups of individual units than to measure randomly selected individual units (Gilbert, 1987). 
In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters, and then some of the subdivisions are randomly selected for sampling. For one-stage cluster sampling, the selected clusters are sampled totally. In two-stage cluster sampling, random sampling is performed within each cluster (Gaugush, 1987). For example, this approach might be useful if a state wants to estimate the proportion of farms less than 800 meters from a stream that are following state-approved nutrient management plans. All farms less than 800 meters from a particular stream (or portion of a stream) can be regarded as a single cluster. Once all clusters have been identified, specific clusters can be randomly chosen for sampling. Freund (1973) notes that estimates based on cluster sampling are generally not as good as those based on simple random samples, but they are more cost-effective. As a result, Gaugush (1987) believes that the difficulty associated with analyzing cluster samples is compensated for by the reduced sampling requirements and cost. Figure 2-3 provides an example of cluster sampling from a listing of farms and from a map.

[Figure 2-2: a catalog listing of the same 128 farms grouped by farm type (crop, livestock, crop/livestock), and a corresponding map.]

Figure 2-2a. Stratified random sampling from a listing of farms. Within this listing, farms are subdivided by type. Then, considering only one farm type (e.g., crop farms), some farms are selected randomly.
The process of random sampling is then repeated for the other farm types (i.e., livestock, crop/livestock). Shaded farms represent those selected for sampling.

Figure 2-2b. Stratified random sampling from a map. Letters represent farms, subdivided by type (C = crop, CL = crop/livestock, L = livestock). All farms of interest are represented on the map. From all farms in one type category, some were randomly selected for sampling (highlighted farms). The process was repeated for each farm type category. The shaded lines on the map could represent county, soil type, or some other boundary, and could have been used as a means for separating the farms into categories for the sampling process.

[Figure 2-3: a catalog listing of the same 128 farms grouped by nearby waterbody type (stream, pond, river, lake, bay), and a corresponding map.]

Figure 2-3a. One-stage cluster sampling from a listing of farms. Within this listing, farms are subdivided by the type of waterbody near them. Some of the waterbody types were then randomly selected (in this case streams and bays) and all farms with those waterbodies were selected for sampling. Shaded farms represent those selected for sampling.

Figure 2-3b. Cluster sampling from a map. All farms in the area of interest are represented on the map (closed and open dots). Waterbody types were selected randomly, and farms with those waterbodies (closed dots) were selected for sampling.
Shaded lines could represent a type of boundary, such as soil type, county, or watershed, and could have been used as the basis for the sampling process as well.

Systematic sampling is used extensively in water quality monitoring programs because it is relatively easy to do from a management perspective. In systematic sampling, the first sample is taken from a random starting point and each subsequent sample is taken at a constant interval from the previous sample. For example, if a sample size of 70 is desired from a mailing list of 700 farm owners, the first sample would be randomly selected from among the first 10 people, say the seventh person. Subsequent samples would then be based on the 17th, 27th, ..., 697th person. In comparison, a stratified random sampling approach might be to sort the mailing list by county and then to randomly select farm owners from each county. Figure 2-4 provides an example of systematic sampling from a listing of farms and from a map. In general, systematic sampling is superior to stratified random sampling when only one or two samples per stratum are taken for estimating the mean (Cochran, 1977) or when there is a known pattern of management measure implementation. Gilbert (1987) reports that systematic sampling is equivalent to simple random sampling in estimating the mean if the target population has no trends, strata, or correlations among the population units. Cochran (1977) notes that, on the average, simple random sampling and systematic sampling have equal variances. However, Cochran (1977) also states that for any single population for which the number of sampling units is small, the variance from systematic sampling is erratic and might be smaller or larger than the variance from simple random sampling. Gilbert (1987) cautions that any periodic variation in the target population should be known before establishing a systematic sampling program.
Sampling intervals that equal, or are multiples of, the target population's cycle of variation might result in biased estimates of the population mean. Systematic sampling can be designed to capitalize on a periodic structure if that structure can be characterized sufficiently (Cochran, 1977). A simple or stratified random sample is recommended, however, in cases where the periodic structure is not well known or if the randomly selected starting point is likely to have an impact on the results (Cochran, 1977). Gilbert (1987) notes that assumptions about the population are required in estimating population variance from a single systematic sample of a given size. However, there are systematic sampling approaches that do support unbiased estimation of population variance, including multiple systematic sampling, systematic stratified sampling, and two-stage sampling (Gilbert, 1987). In multiple systematic sampling, more than one systematic sample is taken from the target population. Systematic stratified sampling involves the collection of two or more systematic samples within each stratum.

2.1.3 Measurement and Sampling Errors

In addition to making sure that samples are representative of the sample population, it is also necessary to consider the types of bias or error that might be introduced into the study. Measurement error is the deviation of a measurement from the true value (e.g., the percent residue cover for a field was estimated as 23 percent and the true value was 26 percent). A consistent under- or overestimation of the true value is referred to as measurement bias. Random sampling error arises from the variability from one population unit to the next (Gilbert, 1987), explaining
why the proportion of farm owners using a certain BMP differs from one survey to another. The goal of sampling is to obtain an accurate estimate by reducing the sampling and measurement errors to acceptable levels, while explaining as much of the variability as possible to improve the precision of the estimates (Gaugush, 1987). Precision is a measure of how closely individual measurements of the same population agree with one another. The accuracy of a measurement refers to how close the measurement is to the true value.

[Figure 2-4: a catalog listing of the 128 farms with every fifth farm shaded, and a corresponding map.]

Figure 2-4a. Systematic sampling from a listing of farms. From a listing of all farms of interest, an initial site (Farm No. 3) was selected randomly from among the first ten on the list. Every fifth farm listed was subsequently selected for sampling. Shaded farms represent those selected for sampling.

Figure 2-4b. Systematic sampling from a map. Dots represent farms of interest. A single point on the map and one of the farms were randomly selected. A line was stretched outward from the point to (and beyond) the selected farm. The line was then rotated about the map, and every fifth dot that it touched was selected for sampling (open dots). The direction of rotation was determined prior to selection of the point of the line's origin and the initial farm. The shaded lines on the map could represent county boundaries, soil type, watershed, or some other boundary, but were not used for the sampling process.
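The systematic selection illustrated in Figure 2-4 and in the mailing-list example (70 samples from 700 farm owners, an interval of 10) can be sketched as follows; the fixed starting point of 7 is chosen only to reproduce the example in the text:

```python
import random

def systematic_sample(num_units, n, start=None, seed=None):
    """Select every k-th unit (k = num_units // n) after a start in the first k."""
    k = num_units // n                               # sampling interval
    if start is None:
        start = random.Random(seed).randint(1, k)    # random start among first k units
    return [start + i * k for i in range(n)]

# 700 farm owners, sample of 70 (interval of 10); a start of 7 selects
# positions 7, 17, 27, ..., 697, as in the text.
positions = systematic_sample(700, 70, start=7)
```

In practice the start would be drawn randomly (leave `start=None`); the fixed value here only makes the illustration reproducible.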
If a study has low bias and high precision, the results will have high accuracy. Figure 2-5 illustrates the relationship between bias, precision, and accuracy. As suggested earlier, numerous sources of variability should be accounted for in developing a sampling design. Sampling errors are introduced by virtue of the natural variability within any given population of interest. As sampling errors relate to MM or BMP implementation, the most effective method for reducing such errors is to carefully determine the target population and to stratify the target population to minimize the nonuniformity in each stratum. Measurement errors can be minimized by ensuring that interview questions or surveys are well designed. If a survey is used as a data collection tool, for example, the investigator should evaluate the nonrespondents to determine whether there is a bias in who returned the results (e.g., whether the nonrespondents were more or less likely to implement MMs or BMPs). If data are collected by sending staff out to inspect randomly selected fields, the approach for inspecting the fields should be consistent. For example, how do survey personnel determine that at least 40 percent of the ground is covered by crop residue, or what is the basis for determining whether a BMP has been properly implemented? Reducing sampling errors below a certain point (relative to measurement errors) does not necessarily benefit the resulting analysis because total error is a function of the two types of error. For example, if measurement errors such as response or interviewing errors are large, there is little point in taking a very large sample to reduce the sampling error of the estimate, since the total error will be determined primarily by the measurement error. Measurement error is of particular concern when farmer surveys are used for implementation monitoring.
Likewise, reducing measurement errors would not be worthwhile if only a small sample size were available for analysis because there would be a large sampling error (and therefore a large total error) regardless of the size of the measurement error. A proper balance between sampling and measurement errors should be maintained because research accuracy limits effective sample size and vice versa (Blalock, 1979).

[Figure 2-5: four target diagrams illustrating combinations of bias and precision.]

Figure 2-5. Graphical representation of the relationship between bias, precision, and accuracy (after Gilbert, 1987). (a): high bias + low precision = low accuracy; (b): low bias + low precision = low accuracy; (c): high bias + high precision = low accuracy; and (d): low bias + high precision = high accuracy.

2.1.4 Estimation and Hypothesis Testing

Rather than presenting every observation collected, the data analyst usually summarizes major characteristics with a few descriptive statistics. Descriptive statistics include any characteristic designed to summarize an important feature of a data set. A point estimate is a single number that represents the descriptive statistic. Statistics common to implementation monitoring include proportions, means, medians, and totals. When estimating parameters of a population, such as the proportion or mean, it is useful to estimate the confidence interval. The confidence interval indicates the range in which the true value lies for a stated confidence level. For example, if it is estimated that 65 percent of soybeans were planted using no-till and the 90 percent confidence limit is ±5 percent, there is a 90 percent chance that between 60 and 70 percent of the soybeans were planted using no-till.

Hypothesis testing should be used to determine whether the level of MM and BMP implementation has changed over time. The null hypothesis (H0) is the root of hypothesis testing.
Traditionally, H0 is a statement of no change, no effect, or no difference; for example, "the proportion of farm owners using IPM after the cost-share program is equal to the proportion of farm owners using IPM before the cost-share program." The alternative hypothesis (Ha) is counter to H0, traditionally being a statement of change, effect, or difference. If H0 is rejected, Ha is accepted. Regardless of the statistical test selected for analyzing the data, the analyst must select the significance level (α) of the test. That is, the analyst must determine what error level is acceptable. There are two types of errors in hypothesis testing:

Type I: H0 is rejected when H0 is really true.
Type II: H0 is accepted when H0 is really false.

Table 2-2 depicts these errors, with the magnitude of Type I errors represented by α and the magnitude of Type II errors represented by β. The probability of making a Type I error is equal to the α of the test and is selected by the data analyst. In most cases, managers or analysts will define 1-α to be in the range of 0.90 to 0.99 (e.g., a confidence level of 90 to 99 percent), although there have been applications where 1-α has been set as low as 0.80. Selecting a 95 percent confidence level implies that the analyst will reject H0 when H0 is true (i.e., a false positive) 5 percent of the time. The same notion applies to the confidence interval for point estimates described above: if α is set to 0.10, there is a 10 percent chance that the true percentage of soybeans planted using no-till is outside the 60 to 70 percent range. This implies that if the decisions to be made based on the analysis are major (i.e., affect many people in adverse or costly ways), the confidence level needs to be greater. For less significant decisions (i.e., those with low-cost ramifications), the confidence level can be lower. Type II error depends on the significance level, sample size, variability, and which alternative hypothesis is true.
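Both calculations above (the no-till confidence interval and a before/after comparison of IPM proportions) can be sketched with the normal approximation. This is an illustration only: the sample size of 246 fields and the survey counts are invented, and the Z values are the familiar two-sided entries from Table A1:

```python
import math

def proportion_ci(p_hat, n, z):
    """Normal-approximation confidence interval for a proportion."""
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half, p_hat + half

def two_proportion_z(a1, n1, a2, n2):
    """z statistic for H0: p1 = p2, using the pooled-proportion standard error."""
    p1, p2 = a1 / n1, a2 / n2
    pooled = (a1 + a2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (p2 - p1) / se

Z_90, Z_95 = 1.645, 1.96   # two-sided Z values (Table A1)

# 65 percent of an assumed 246 sampled fields were planted using no-till;
# the 90 percent interval is roughly 60 to 70 percent (half-width about 0.05).
lower, upper = proportion_ci(0.65, 246, Z_90)

# Invented survey counts: 40 of 200 owners used IPM before a cost-share
# program and 60 of 200 used it afterward; reject H0 if |z| exceeds 1.96.
z = two_proportion_z(40, 200, 60, 200)
reject_h0 = abs(z) > Z_95
```

Chapter 3 describes comparisons of this kind in more detail; the point here is only that the choice of α fixes the Z value against which the computed statistic is judged.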
Power (1-β) is defined as the probability of correctly rejecting H0 when H0 is false. In general, for a fixed sample size, α and β vary inversely. For a fixed α, β can be reduced by increasing the sample size (Remington and Schork, 1970).

Table 2-2. Errors in hypothesis testing.

                    State of Affairs in the Population
Decision            H0 is True                  H0 is False
Accept H0           1-α (Confidence level)      β (Type II error)
Reject H0           α (Significance level)      1-β (Power)
                      (Type I error)

2.2 SAMPLING CONSIDERATIONS

In a document of this brevity, it is not possible to address all of the issues that face technical staff who are responsible for developing and implementing studies to track and evaluate the implementation of nonpoint source control measures. For example, when is the best time to implement a survey or do on-site visits? In reality, it is difficult to pinpoint a single time of the year. Some BMPs can be checked any time of the year, whereas others have a small window of opportunity. In northern areas, the time between fall harvest and winter snows might be the most effective time of year to assess implementation of a large number of erosion control practices. If the goal of the study is to determine the effectiveness of a farmer education program, sampling should be timed to ensure that there was sufficient time for outreach activities and for the farmers to implement the desired practices. Also, farmers are more receptive to visits and participation in a survey during off-peak business times (i.e., not during planting, harvesting, livestock birthing, etc.). Furthermore, field personnel must have permission to perform site visits from each affected farm owner or manager prior to arriving at the farms. Where access is denied, a replacement farm is needed. This farm is selected in accordance with the type of farm selection being used, i.e., simple random, stratified random, cluster, or systematic.
From a study design perspective, all of these issues (study objectives, sampling strategy, allowable error, and formulation of hypotheses) must be considered together. This section describes common issues that technical staff might consider in targeting their sampling efforts or in deciding whether to stratify their sampling efforts. In general, if there is reason to believe that there are different rates of BMP or MM implementation in different groups, stratified random sampling should increase overall accuracy. Following the discussion, a list of resources that can be used to evaluate these issues is presented.

2.2.1 Farm Ownership and Size

Farm ownership can be divided (i.e., stratified) into multiple categories for sampling purposes depending on the MM implementation being tracked. The 1992 Census of Agriculture (USDOC, 1994) provides information by state on:

• Farms by type of ownership (individual or family, partnership, corporation, and other).
• Farms owned versus rented or leased.
• Farm owner characteristics.
• Farm gross income.
• Average farm size.
• Number of farms by size (1 to 9, 10 to 49, 50 to 179, 180 to 499, 500 to 999, 1,000 to 1,999, and 2,000 acres or more).

The Economic Research Service of the U.S. Department of Agriculture (USDA) also provides information on farm ownership, as do many state programs. For example, a sampling plan to determine the percentage of acres on which erosion control practices had been implemented could be designed based on the data shown in Table 2-3. (The units of interest are acres of harvested cropland.) However, if there is reason to believe that implementation of erosion control practices is not uniform among owners of farms of differing sizes, more intense sampling of one or more subpopulations (strata) might be warranted.
Table 2-3. Acres of harvested cropland in Virginia from USDOC's 1992 Census of Agriculture.

Total Farm Size (acres)    Number of Farms    Harvested Cropland (acres)
1 to 49                     9,802                88,488
50 to 99                    7,690               158,089
100 to 499                 16,125               965,178
500 to 999                  2,515               551,639
1,000 to 1,999                943               428,572
2,000 or more                 257               215,010
Total                      37,332             2,406,976

2.2.2 Location and Other Physical Characteristics

Selection of farms for sampling should ensure a representative sample of all appropriate areas of a state or coastal zone. Stratifying by county, watershed, hydrologic unit, or any other geographically or physically based area might increase overall accuracy. Other important considerations for selecting areas from which to sample include:

• Areas with different soil types.
• Areas with different erosion potentials (see USDA's National Resources Inventory).
• Areas with different climates (i.e., differences in total rainfall or storm frequency).
• Areas with known degraded water quality conditions.

2.2.3 Farm Type and Agricultural Practices

To obtain a representative sample, data must first be collected on the types of agricultural practices that occur in a designated sampling area. Once farms have been stratified by the types of MMs they should be implementing, farms can be selected for sampling. For example, if grazing management were the only practice being evaluated, farms with only cropland would be removed from the sample population. Alternatively, if the investigator is interested in agricultural practices that affect the delivery of nitrogen to surface waters, only farms where MMs or BMPs that affect nitrogen movement are being implemented would be selected. Numerous sources of information can be used to define the sample population. These sources should be consulted before designing a monitoring plan. The U.S.
Department of Commerce's (USDOC) Census of Agriculture provides information by state on:

• Acres of harvested cropland.
• Acres of irrigated cropland.
• Types of livestock (cattle, milk cows, hogs and pigs, chickens, etc.).
• Types of crops (corn, wheat, tobacco, soybeans, peanuts, hay, land in orchards, etc.).

USDA's National Resources Inventory provides statistical information by U.S. Geological Survey (USGS) cataloging unit on the acreage of different crop types and other land uses.

2.2.4 Sources of Information

For a truly random selection of population units, it is necessary to access or develop a database that includes the entire target population. The Census of Agriculture (USDOC, 1994) is a good source, but it is limited to some extent by confidentiality constraints. (Certain data are not included, except at the state level, for counties that have only a few operations or are dominated by a single operation.) Other currently available national databases generally include only agricultural entities that participate in cost-share programs. A more inclusive source presently available is county land maps. These maps, however, generally lack data regarding the specific type of farm operation and therefore do not provide the information needed to perform simple random site selection. The following are possible sources of information on farms, which can be used for identifying potential monitoring farms and obtaining other information for farm selection. Positive and negative attributes of each information source are included.

1992 National Resources Inventory (USDA, 1994a): The National Resources Inventory (NRI) is a database of the natural resources on the nonfederal lands of the United States, which make up 74 percent of the Nation's land area. Its focus is on the soil, water, and related resources of farms, nonfederal forests, and grazing lands.
The data were collected from more than 800,000 sample sites nationwide and are statistically reliable for analysis at the national, regional, state, major land resource area, or multiple-county level, though not at the county level. Data elements include land cover/use (cropland, pasture land, rangeland and its condition, forest land, barren land, rural land, and urban and built-up areas), land ownership, soil information, irrigation, water bodies, conservation practices, and cropping history. Data are available on CD-ROMs and can be integrated with other data through spatial linkages in a geographic information system (GIS). To obtain the NRI database, contact: NRCS National Cartography and Geospatial Center, Fort Worth Federal Center, Building 23, Room 60, P.O. Box 6567, Fort Worth, TX 76115-0567; 1-800-672-5559; http://www.ncg.nrcs.usda.gov.

Census of Agriculture (USDOC, 1994): The Census of Agriculture is the leading source of statistics about the Nation's agricultural production and the only source for consistent, comparable data at the county, state, and national levels. Data are collected on a 5-year cycle in years ending in "2" and "7" and are available on computer tapes and CD-ROMs. Data elements include farms (number and size), harvested cropland, irrigated land, market value of products, farm ownership, livestock and poultry, selected crops harvested, and more. The Census of Agriculture has been transferred to the National Agricultural Statistics Service (NASS), which funded the 1997 census. Information on obtaining the Census of Agriculture is available on the Internet at http://www.census.gov.

USDA Farm Numbers: USDA farm numbers are developed when a farmer receives any financial assistance from a USDA organization. Only farms participating in USDA programs are included in the database.
USGS Land Use and Land Cover (USGS, 1990): At the level 2 classification, these data provide information on four categories of agricultural land use: (1) cropland and pasture; (2) orchards, groves, vineyards, nurseries, and ornamental horticulture areas; (3) confined feeding operations; and (4) other agricultural land. Watershed, topography, soil type, and/or political boundary maps could be used in conjunction with this land use information. Information on obtaining land use and land cover maps is available on the Internet at http://www.usgs.gov or at http://www.ncg.nrcs.usda.gov.

County Land Maps: These maps can provide information on farm owners or managers and possibly land use. Selection of farms to determine the type of operations occurring would have to be made randomly.

State Cooperative Extension Service: Farms that received Extension Service grants or participated in Coop programs are included. These programs vary from state to state. As with the USDA farm numbers, nonparticipatory farms are not included, which could result in biased sampling.

Complaint Records: Complaint records could be used in combination with other sources. Such records represent farms that have had problems in the past, which will very likely skew the data set.

National Agricultural Statistics Service (NASS): This agency, a branch of the USDA, issues reports related to national forecasts and estimates of crops, livestock, poultry, dairy, prices, labor, and related agricultural items (USDA, undated). The agency has the most comprehensive national list of farms available. NASS could produce random lists of farmers through one of its two frames. The first frame is an area frame, which randomly selects land segments that average 1 square mile in size.
In most states the area frame is stratified into four broad categories based on land use: (1) areas intensively cultivated for crops, (2) extensive areas used primarily for grazing and producing livestock, (3) residential and business land in cities and towns, and (4) nonagricultural lands such as parks and military complexes. The second frame is the list frame, which consists of names and addresses of producers grouped by size and type of unit. In a list frame sample, names are selected randomly (based on whatever stratification is desired) and questionnaires are mailed to them. Phone calls or visits are made to those farmers who do not respond by mail. A disadvantage of NASS is that it does not release names to other agencies. If this method of selection were chosen, NASS would have to perform the sampling. Information on obtaining data from NASS is available on the Internet at http://www.usda.gov/nass or through the NASS hotline at 1-800-727-9540.

Computer-aided Management Practices System (CAMPS): This database has records of all nutrient management plans developed by the USDA Natural Resources Conservation Service (formerly the Soil Conservation Service, or SCS).

Field Office Computing System (FOCS): The Field Office Computing System (FOCS) replaced CAMPS, and full conversion from CAMPS to FOCS was completed in all field offices of the Natural Resources Conservation Service by January 1996. The system contains information on client businesses, resource inventories, conservation plans, practice cost comparisons, and a variety of specialty applications.
Some of these applications are SOILS, with county-level soils data; PLANTS, with state-level plant data; GLA (Grazing Land Applications), with forage, herd, grazing schedule, and feedstuff data; WEQ (Wind Erosion Equation), a tool to compute wind erosion; Crop Rotation Detail, which includes planting, harvest, and tillage data; RUSLE (Revised Universal Soil Loss Equation), a tool to compute sheet/rill erosion; Nutrient Screening Tool, a tool for evaluating nitrogen and phosphorus leaching and surface runoff; Pesticide Screening Tool, a tool for evaluating potential for pesticide leaching and runoff; and Farm*A*Syst, software for evaluating the potential for surface and groundwater pollution. Information on FOCS is available through the Internet at http://www.itc.nrcs.usda.gov/fchd/focs.

Farm Service Agency (FSA): The Farm Service Agency (FSA), created when the Department of Agriculture reorganized in 1994, incorporates programs from the Agricultural Stabilization and Conservation Service (ASCS), the Federal Crop Insurance Corporation, and the Farmers Home Administration. FSA administers programs for commodity loans, commodity purchases, crop insurance, emergency and disaster relief, farm ownership and operation loans, and farmland conservation. The Conservation Reserve Program assists farmers in conserving and improving soil, water, and wildlife resources on farmland by converting highly erodible and other environmentally sensitive acreage from production to long-term cover. FSA also maintains a collection of aerial photographs of farmlands. Information on FSA can be obtained through the Internet at http://www.fsa.usda.gov, or at the following address: USDA FSA Public Affairs Staff, P.O. Box 2415, STOP 0506, Washington, DC 20013, (202) 720-5237. For information on the collection of aerial photographs maintained by the agency, contact USDA FSA Aerial Photography Field Office, P.O. Box 30010, Salt Lake City, UT 84130-0010, (801) 975-3503.
2.3 SAMPLE SIZE CALCULATIONS

This section describes methods for estimating sample sizes to compute point estimates such as proportions and means, as well as to detect changes with a given significance level. Usually, several assumptions regarding data distribution, variability, and cost must be made to determine the sample size. Some assumptions might result in sample size estimates that are too high or too low. Depending on the cost of sampling and the cost of not collecting enough data, it must be decided whether to make conservative or "best-value" assumptions. Because the cost of visiting any individual farm or group of farms is relatively constant, it is more economical to collect a few extra samples than to discover later that additional data must be collected. In most cases, the analyst should probably evaluate the impact of a range of assumptions on sample size and overall program cost.

For brevity, some terms and definitions that will be used in the remainder of this chapter are summarized in Table 2-4. These terms are consistent with those in most introductory-level statistics texts, where more information can be found. Those with some statistical training will note that some of these definitions include an additional term referred to as the finite population correction term (1-φ), where φ is equal to n/N. In many applications, the number of population units in the sample population (N) is large in comparison to the number of population units sampled (n), and (1-φ) can be ignored. However, depending on the number of units (farms, for example) in a particular population, N can become quite small. N is determined by the definition of the sample population and the corresponding population units. If φ is greater than 0.1, the finite population correction factor should not be ignored (Cochran, 1977).
Applying any of the equations described in this section is difficult when no historical data set exists to quantify initial estimates of proportions, standard deviations, means, or coefficients of variation. To estimate these parameters, Cochran (1977) recommends four sources:

• Existing information on the same population or a similar population.
• A two-step sample. Use the first-step sampling results to estimate the needed factors, for best design, of the second step. Use data from both steps to estimate the final precision of the characteristic(s) sampled.
• A "pilot study" on a "convenient" or "meaningful" subsample. Use the results to estimate the needed factors. Here the results of the pilot study generally cannot be used in the calculation of the final precision because often the pilot sample is not representative of the entire population to be sampled.
• Informed judgment, or an educated guess.

Table 2-4. Definitions used in sample size calculation equations.

N          total number of population units in sample population
n          number of samples
n0         preliminary estimate of sample size
a          number of successes
p          proportion of successes (p = a/n)
q          proportion of failures (q = 1 - p)
xi         ith observation of a sample
x̄          sample mean
s²         sample variance
s          sample standard deviation
X          total amount
μ          population mean
σ²         population variance
σ          population standard deviation
Cv         coefficient of variation (Cv = s/x̄)
s²(x̄)      variance of the sample mean
s(x̄)       standard error (of the sample mean)
1-φ        finite population correction factor; φ = n/N (unless otherwise stated in text)
d          allowable error
dr         relative error
Zα         value corresponding to cumulative area of 1-α using the normal distribution (see Table A1)
t(df,α)    value corresponding to cumulative area of 1-α using the Student's t distribution with df degrees of freedom (see Table A2)

It is important to note that this document only addresses estimating sample sizes with traditional parametric procedures.
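The basic statistics in Table 2-4 can be computed directly; in this sketch the five-value data set and the population size of N = 20 (so that φ = 0.25 exceeds 0.1 and the finite population correction applies) are invented for the illustration:

```python
import math

def sample_stats(data, N=None):
    """Compute the Table 2-4 sample statistics for a list of observations.

    If the population size N is given and phi = n/N exceeds 0.1, the standard
    error of the mean is multiplied by the finite population correction
    sqrt(1 - phi).
    """
    n = len(data)
    mean = sum(data) / n                                  # x-bar
    s2 = sum((x - mean) ** 2 for x in data) / (n - 1)     # sample variance
    s = math.sqrt(s2)                                     # sample standard deviation
    cv = s / mean                                         # coefficient of variation
    se = s / math.sqrt(n)                                 # standard error of x-bar
    if N is not None and n / N > 0.1:
        se *= math.sqrt(1.0 - n / N)                      # finite population correction
    return {"mean": mean, "s2": s2, "s": s, "cv": cv, "se": se}

# Invented data: conservation-tillage acreage (hundreds of acres) on five
# farms sampled from a population of N = 20 farms.
stats = sample_stats([1.0, 2.0, 3.0, 4.0, 5.0], N=20)
```

Note how the correction shrinks the standard error when the sample is a substantial fraction of the population, as discussed above.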
The methods described in this document should be appropriate in most cases, considering the type of data expected. If the data to be sampled are skewed, as with much water quality data, the investigator should plan to transform the data to something symmetric, if not normal, before computing sample sizes (Helsel and Hirsch, 1995). Kupper and Hafner (1989) also note that some of these equations tend to underestimate the necessary sample size because power is not taken into consideration. Again, EPA recommends that analysts without a background in statistics consult a trained statistician to be certain that the approach, design, and assumptions are appropriate to the task at hand.

2.3.1 Simple Random Sampling

What sample size is necessary to estimate the proportion of farms implementing IPM to within ±5 percent? What sample size is necessary to estimate the proportion of farms implementing IPM so that the relative error is less than 5 percent?

In simple random sampling, we presume that the sample population is relatively homogeneous, so no difference in sampling costs or variability is expected. If the cost or variability of any group within the sample population were different, it might be more appropriate to consider a stratified random sampling approach.

To estimate the proportion of farms implementing a certain BMP or MM such that the allowable error, d, meets the study precision requirements (i.e., the true proportion lies between p-d and p+d with a 1-α confidence level), a preliminary estimate of sample size can be computed as (Snedecor and Cochran, 1980)

  n0 = [Z(1-α/2)]² pq / d²   (2-1)

If the proportion is expected to be low, a constant allowable error might not be appropriate; an estimate of 10 percent plus or minus 5 percent, for example, carries a 50 percent relative error. Alternatively, the relative error, dr, can be specified (i.e., the true proportion lies between p-dr·p and p+dr·p with a 1-α confidence level) and a preliminary estimate of sample size can be computed as (Snedecor and Cochran, 1980)

  n0 = [Z(1-α/2)]² q / (dr² p)   (2-2)

In both equations, the analyst must make an initial estimate of p before starting the study. In the first equation, a conservative sample size can be computed by assuming p equal to 0.5. In the second equation the sample size grows as p approaches 0 for constant dr; thus an informed initial estimate of p is needed. Values of α typically range from 0.01 to 0.10. The final sample size is then estimated as (Snedecor and Cochran, 1980)

  n = n0 / (1 + φ)  for φ > 0.1;  n = n0 otherwise   (2-3)

where φ is equal to n0/N. Table 2-5 demonstrates the impact on n of selecting p, α, d, dr, and N. For example, 278 random samples are needed to estimate the proportion of 1,000 farmers using IPM to within ±5 percent (d=0.05) with a 95 percent confidence level, assuming roughly one-half of the farmers are using IPM.

Table 2-5. Comparison of sample size as a function of p, α, d, dr, and N for estimating proportions using Equations 2-1 through 2-3.

                                                 Sample size, n, by number of population
                                                 units in sample population, N
  p     α      d      dr     n0      500    750    1,000   2,000   Large N
  0.1   0.05   0.050  0.500  138     108    117    121     138     138
  0.1   0.05   0.075  0.750  61      55     61     61      61      61
  0.5   0.05   0.050  0.100  384     217    254    278     322     384
  0.5   0.05   0.075  0.150  171     127    139    146     171     171
  0.1   0.10   0.050  0.500  97      82     86     97      97      97
  0.1   0.10   0.075  0.750  43      43     43     43      43      43
  0.5   0.10   0.050  0.100  271     176    199    213     238     271
  0.5   0.10   0.075  0.150  120     97     104    107     120     120

What sample size is necessary to estimate the average number of acres per farm that are under conservation tillage to within ±25 acres? What sample size is necessary to estimate the average number of acres per farm that are under conservation tillage to within ±10 percent?

Suppose the goal is to estimate the average acreage per farm where conservation tillage is used.
The number of random samples required to achieve a desired margin of error, d, when estimating the mean (i.e., the true mean lies between x̄-d and x̄+d with a 1-α confidence level) is (Gilbert, 1987)

  n = [t(1-α/2, n-1) s/d]² / {1 + [t(1-α/2, n-1) s/d]²/N}   (2-4)

If N is large, the above equation can be simplified to

  n = [t(1-α/2, n-1) s/d]²   (2-5)

Since the Student's t value is a function of n, Equations 2-4 and 2-5 are applied iteratively. That is, guess at what n will be, look up t(1-α/2, n-1) from Table A2, and compute a revised n. If the initial guess of n and the revised n are different, use the revised n as the new guess, and repeat the process until the computed value of n converges with the guessed value. If the population standard deviation is known (not too likely), rather than estimated, the above equation can be further simplified to

  n = [Z(1-α/2) σ/d]²   (2-6)

To keep the relative error of the mean estimate below a certain level (i.e., the true mean lies between x̄-dr·x̄ and x̄+dr·x̄ with a 1-α confidence level), the sample size can be computed with (Gilbert, 1987)

  n = [t(1-α/2, n-1) Cv/dr]² / {1 + [t(1-α/2, n-1) Cv/dr]²/N}   (2-7)

In the County X example, the investigator wants to keep the relative error under 15 percent (i.e., dr < 0.15) with a 90 percent confidence level. Unfortunately, this is the first study that County X has done, and there is no information about the coefficient of variation, Cv. The investigator, however, is familiar with a recent study done by another company and, based on that study, estimates Cv as 0.6 and s as 30. As a first-cut approximation, Z(1-α/2) equal to 1.645 is used in place of t, assuming N is large:

  n = (1.645 × 0.6/0.15)² = 43.3 ≈ 44 samples

Cv is usually less variable from study to study than are estimates of the standard deviation, which are used in Equations 2-4 through 2-6. Professional judgment and experience, typically based on previous studies, are required to estimate Cv. Had Cv been known, Z(1-α/2) would have been used in place of t(1-α/2, n-1) in Equation 2-7.
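The iterative procedure just described can be sketched in Python. This is a minimal sketch, assuming SciPy is available to supply the Table A2 t values; the function name and the repeated-substitution loop are illustrative, and the check reproduces the County X example worked out below (41 samples).

```python
from scipy.stats import t as t_dist

def mean_sample_size_relative(cv, dr, alpha, N, n_guess):
    """Iterate Equation 2-7: n = (t*Cv/dr)^2 / (1 + (t*Cv/dr)^2 / N).

    The Student's t value depends on n, so the equation is solved by
    repeated substitution until the rounded sample size converges.
    """
    while True:
        t_val = t_dist.ppf(1 - alpha / 2, n_guess - 1)  # Table A2 lookup
        num = (t_val * cv / dr) ** 2
        n_new = round(num / (1 + num / N))
        if n_new == n_guess:
            return n_new
        n_guess = n_new

# County X: Cv = 0.6, dr = 0.15, 90 percent confidence (alpha = 0.10),
# N = 430 farms; the first-cut guess of 44 comes from the normal
# approximation above.
print(mean_sample_size_relative(0.6, 0.15, 0.10, 430, n_guess=44))  # 41
```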
If N is large, Equation 2-7 simplifies to:

  n = [t(1-α/2, n-1) Cv/dr]²   (2-8)

For County X, farms range in size from 20 to 4,325 acres, although most are less than 500 acres. The goal of the sampling program is to estimate the average number of cropland acres under minimum tillage. The investigator, however, is concerned about skewing the mean estimate with the few large farms. As a result, the sample population for this analysis is the 430 cropland farms with less than 500 total acres of cropland.

Since n/N is greater than 0.1 and Cv is estimated (i.e., not known), it is best to reestimate n with Equation 2-7 using 44 samples as the initial guess of n. In this case, t(1-α/2, n-1) is obtained from Table A2 as 1.6811:

  n = (1.6811 × 0.6/0.15)² / [1 + (1.6811 × 0.6/0.15)²/430] = 40.9 ≈ 41 samples

Notice that the revised sample size is somewhat smaller than the initial guess of n. In this case it is recommended to reapply Equation 2-7 using 41 samples as the revised guess of n. Now t(1-α/2, n-1) is obtained from Table A2 as 1.6839:

  n = (1.6839 × 0.6/0.15)² / [1 + (1.6839 × 0.6/0.15)²/430] = 41.0 ≈ 41 samples
In either case, two independent random samples are taken and a hypothesis test is used to determine whether there has been a significant change in implementation. (See Snedecor and Cochran (1980) for sample size calculations for matched data.) Consider an example in which the proportion of highly erodible land under conservation tillage will be estimated at two time periods. What sample size is needed? To compute sample sizes for comparing two proportions, p, andp2, it is necessary to provide a best estimate forp} andp2, as well as specifying the significance level and power (7- (3). Recall that power is equal to the probability of rejecting H0 when H0 is false. Given this information, the analyst substitutes these values into (Snedecor and Cochran, 1980) n = V (2-9) where Za and Z2p correspond to the normal deviate. Although this equation assumes that N large, it is acceptable for practical use (Snedecor and Cochran, 1980). Common values of (Za andZ2ft)2 are summarized in Table 2-6. To account forp} andp2 being Table 2-6. Common values of (Za + Z2p)2 for estimating sample size for use with equations 2-9 and 2-10. Power, 1-P 0.80 0.85 0.90 0.95 0.99 a for One-sided Test 0.01 10.04 11.31 13.02 15.77 21.65 0.05 6.18 7.19 8.56 10.82 15.77 0.10 4.51 5.37 6.57 8.56 13.02 a for Two-sided Test 0.01 11.68 13.05 14.88 17.81 24.03 0.05 7.85 8.98 10.51 12.99 18.37 0.10 6.18 7.19 8.56 10.82 15.77 ------- Sampling Design Chapter 2 estimated, Z could be substituted with t. In lieu of an iterative calculation, Snedecor and Cochran (1980) propose the following approach: (1) compute n0 using Equation 2-9; (2) round n0 up to the next highest integer,/; and (3) multiply n0 by (f+3)/(f+l) to derive the final estimate of n. 
To detect a difference in proportions of 0.20 with a two-sided test, α equal to 0.05, 1-β equal to 0.90, and estimates of p1 and p2 equal to 0.4 and 0.6, n0 is computed as

  n0 = 10.51 × [(0.4)(0.6) + (0.6)(0.4)] / (0.6 - 0.4)² = 126.1

Rounding 126.1 to the next highest integer, f is equal to 127, and n is computed as 126.1 × 130/128, or 128.1. Therefore 129 samples in each random sample, or 258 total samples, are needed to detect a difference in proportions of 0.2. Beware of other sources of information that give significantly lower estimates of sample size. In some cases the other sources do not specify 1-β; in any event, be sure that an "apples-to-apples" comparison is being made.

To compare the averages from two random samples to detect a change of d (i.e., x̄2 - x̄1), the following equation is used:

  n = (Zα + Z2β)² (s1² + s2²) / d²   (2-10)

Common values of (Zα + Z2β)² are summarized in Table 2-6. To account for s1 and s2 being estimated, Z should be replaced with t. In lieu of an iterative calculation, Snedecor and Cochran (1980) propose the same approach as before: (1) compute n0 using Equation 2-10; (2) round n0 up to the next highest integer, f; and (3) multiply n0 by (f+3)/(f+1) to derive the final estimate of n.

Continuing the County X example above, where s was estimated as 75 acres, the investigator will also want to compare the average number of cropland acres using minimum tillage now to the average number of minimum tillage acres in a few years. To demonstrate success, the investigator believes that it will be necessary to detect a 50-acre increase. Although the standard deviation might change after the cost-share program, there is no particular reason to propose a different s after the cost-share program. To detect a difference of 50 acres with a two-sided test, α equal to 0.05, 1-β equal to 0.90, and estimates of s1 and s2 equal to 75, n0 is computed as

  n0 = 10.51 × (75² + 75²) / 50² = 47.3   (2-11)

Rounding 47.3 to the next highest integer, f is equal to 48, and n is computed as 47.3 × 51/49, or 49.2.
Therefore 50 samples in each random sample, or 100 total samples, are needed to detect a difference of 50 acres.

2.3.2 Stratified Random Sampling

What sample size is necessary to estimate the average number of acres per farm that are under conservation tillage when there is a wide variety of farm sizes?

The key reason for selecting a stratified random sampling strategy over simple random sampling is to divide a heterogeneous population into more homogeneous groups. If populations are grouped based on size (e.g., farm size) when there are a large number of small units and a few larger units, a large gain in precision can be expected (Snedecor and Cochran, 1980). Stratifying also allows the investigator to allocate sampling resources efficiently based on cost.

The stratum mean, x̄h, is computed using the standard approach for estimating the mean. The overall mean, x̄st, is computed as

  x̄st = Σ(h=1 to L) Wh x̄h   (2-12)

where L is the number of strata and Wh is the relative size of the hth stratum. Wh can be computed as Nh/N, where Nh and N are the number of population units in the hth stratum and the total number of population units across all strata, respectively. Assuming that simple random sampling is used within each stratum, the variance of x̄st is estimated as (Gilbert, 1987)

  s²(x̄st) = Σ(h=1 to L) Wh² (1 - nh/Nh) sh²/nh   (2-13)

where nh is the number of samples in the hth stratum and sh² is computed as (Gilbert, 1987)

  sh² = Σ(i=1 to nh) (xhi - x̄h)² / (nh - 1)   (2-14)

There are several procedures for computing sample sizes. The method described below allocates samples based on stratum size, variability, and unit sampling cost. If s²(x̄st) is specified as V for a design goal, n can be obtained from (Gilbert, 1987)

  n = [Σ(h=1 to L) Wh sh √ch] [Σ(h=1 to L) Wh sh/√ch] / [V + (1/N) Σ(h=1 to L) Wh sh²]   (2-15)

where ch is the per-unit sampling cost in the hth stratum and nh is estimated as (Gilbert, 1987)

  nh = n (Wh sh/√ch) / Σ(h=1 to L) (Wh sh/√ch)   (2-16)

In the discussion above, the goal is to estimate an overall mean.
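Equations 2-15 and 2-16 can be sketched in Python with the standard library. This is a minimal sketch; the design goal V = 20 used in the check is an assumption, chosen because it reproduces the totals printed in the notes to Table 2-7 (scenario B, the Neyman allocation) for the County X strata introduced below.

```python
import math

def stratified_allocation(W, s, c, N, V):
    """Equations 2-15 and 2-16: total sample size for a target variance
    V of the stratified mean, then the per-stratum allocation based on
    stratum size (W), variability (s), and unit sampling cost (c)."""
    term1 = sum(w * sd * math.sqrt(ci) for w, sd, ci in zip(W, s, c))
    term2 = sum(w * sd / math.sqrt(ci) for w, sd, ci in zip(W, s, c))
    n = term1 * term2 / (V + sum(w * sd ** 2 for w, sd in zip(W, s)) / N)
    alloc = [math.ceil(n * (w * sd / math.sqrt(ci)) / term2)
             for w, sd, ci in zip(W, s, c)]       # round up within strata
    return n, alloc

# Neyman allocation scenario: equal unit costs, standard deviation
# increasing across the four County X strata (N = 600 farms in all).
W = [0.7167, 0.1667, 0.0833, 0.0333]
s = [30, 45, 60, 75]
c = [1, 1, 1, 1]
n, alloc = stratified_allocation(W, s, c, N=600, V=20)
print(round(n, 1), alloc, sum(alloc))  # 59.3 [35, 13, 9, 5] 62
```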
To apply a stratified random sampling approach to estimating proportions, substitute ph, pst, phqh, and s²(pst) for x̄h, x̄st, sh², and s²(x̄st) in the above equations, respectively.

To demonstrate the above approach, consider the County X example again. In addition to the 430 farms that are less than 500 acres, there are 100 farms that range in size from 501 to 1,000 acres, 50 farms that range in size from 1,001 to 2,000 acres, and 20 farms that range in size from 2,001 to 4,500 acres. Table 2-7 presents three basic scenarios for estimating sample size.

Table 2-7. Allocation of samples.

A) Proportional allocation (sh and ch are constant)
  Farm size     Number of   Relative   Standard        Unit sample   Sample allocation
  (acres)       farms (Nh)  size (Wh)  deviation (sh)  cost (ch)     Number    %
  20-500        430         0.7167     30              1             31        70.5
  501-1,000     100         0.1667     30              1             7         15.9
  1,001-2,000   50          0.0833     30              1             4         9.1
  2,001-4,500   20          0.0333     30              1             2         4.5
  Using Equation 2-15, n is equal to 41.9. Applying Equation 2-16 to each stratum yields a total of 44 samples after rounding up to the next integer.

B) Neyman allocation (ch is constant)
  20-500        430         0.7167     30              1             35        56.5
  501-1,000     100         0.1667     45              1             13        21.0
  1,001-2,000   50          0.0833     60              1             9         14.5
  2,001-4,500   20          0.0333     75              1             5         8.1
  Using Equation 2-15, n is equal to 59.3. Applying Equation 2-16 to each stratum yields a total of 62 samples after rounding up to the next integer.

C) Allocation where sh and ch are not constant
  20-500        430         0.7167     30              1.00          38        61.3
  501-1,000     100         0.1667     45              1.25          12        19.4
  1,001-2,000   50          0.0833     60              1.50          8         12.9
  2,001-4,500   20          0.0333     75              2.00          4         6.5
  Using Equation 2-15, n is equal to 60.0. Applying Equation 2-16 to each stratum yields a total of 62 samples after rounding up to the next integer.

In the first scenario, sh and ch are assumed equal among all strata. Applying Equation 2-15 with a design goal for V of 20 yields a total sample size of 41.9, or 42. Since sh and ch are uniform, these samples are allocated proportionally to Wh, which is referred to as proportional allocation. This allocation can be verified by comparing the percent sample allocation to Wh. Because each stratum allocation is rounded up, a total of 44 samples are allocated.

Under the second scenario, referred to as the Neyman allocation, the variability between strata changes, but unit sample cost is constant. In this example, sh increases by 15 between strata. Because of the increased variability in the last three strata, a total sample size of 59.3, or 62 samples after rounding up within strata, is needed to meet the same design goal. So while more samples are taken in every stratum, proportionally fewer samples are needed in the smallest farm size group: under proportional allocation, about 70 percent of the samples are taken in the 20- to 500-acre farm size stratum, whereas approximately 56 percent of the samples are taken in the same stratum under the Neyman allocation.

Finally, introducing sample cost variation also affects the sample allocation. The last scenario assumes that it is twice as expensive to evaluate a farm from the largest farm size stratum as to evaluate a farm from the smallest. In this example, roughly the same total number of samples is needed to meet the design goal, yet more samples are taken in the smallest size stratum.

2.3.3 Cluster Sampling

Cluster sampling is commonly used when there is a choice between sizes of sampling unit (e.g., fields versus farms). In general, it is cheaper to sample larger units than smaller units, but the results tend to be less accurate (Snedecor and Cochran, 1980). Thus, if there is no unit sampling cost advantage to cluster sampling, it is probably better to use simple random sampling. To decide whether to perform a cluster sample, it will probably be necessary to perform a special investigation to quantify sampling errors and costs under the two approaches.
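The arithmetic of that comparison can be sketched with a short calculation, using the summary statistics from the example that follows (Table 2-8): 30 sites (clusters) of 10 farms each, a mean of 5.6 farms per site with BMPs, and a standard deviation of 1.923.

```python
import math

m, farms_per_site = 30, 10      # 30 clusters of 10 farms = 300 farms
mean, sd = 5.6, 1.923           # summary statistics from Table 2-8

p = mean / farms_per_site                                   # 0.56
# Correct standard error, treating each 10-farm site as one sample:
se_cluster = round((sd / farms_per_site) / math.sqrt(m), 3)  # 0.035
# Incorrect standard error from the simple random sampling formula:
se_srs_wrong = math.sqrt(p * (1 - p) / (m * farms_per_site))  # 0.0287

# Equation 2-17: simple random sample size with the same precision
n_equivalent = p * (1 - p) / se_cluster ** 2                 # about 201
print(se_cluster, round(se_srs_wrong, 4), round(n_equivalent))
```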
Perhaps the best way to explain the difference between simple random sampling and cluster sampling is to consider an example set of results. In this example, the investigator performed a field evaluation of BMP implementation along a stream to evaluate whether recommended BMPs had been implemented and maintained. Since the watershed was quite large, the investigator elected to inspect 10 farms at each of 30 sites. Table 2-8 presents the number of farms at each site that had implemented and maintained recommended BMPs. The overall mean is 5.6; a little more than one-half of the farms have implemented recommended BMPs. However, note that since the population unit corresponds to the 10 farms collectively, there are only 30 samples, and the standard error for the proportion of farmers using recommended BMPs is 0.035. Had the investigator incorrectly calculated the standard error using the simple random sampling equations, he or she would have computed 0.0287, nearly a 20 percent error.

Table 2-8. Number of farms (out of 10) implementing recommended BMPs.

  3  9  5  7  6  4
  5  7  7  4  7  5
  8  4  7  4  5  3
  6  3  5  3  8  4
  3  9  9  5  6  7

  Grand total = 168;  x̄ = 5.6;  p = 5.6/10 = 0.560;  s = 1.923;  s/10 = 0.1923
  Standard error using cluster sampling: s(p) = 0.1923/(30)^0.5 = 0.035
  Standard error if the simple random sampling assumption had been incorrectly used: s(p) = [(0.56)(1-0.56)/300]^0.5 = 0.0287

Since the standard error from the cluster sampling example is 0.035, the simple random sample size needed to achieve the same precision can be estimated using

  n = pq/[s(p)]² = (0.56)(0.44)/0.035² = 201   (2-17)

Is collecting 300 samples using a cluster sampling approach cheaper than collecting about 200 simple random samples? If so, cluster sampling should be used; otherwise simple random sampling should be used.

2.3.4 Systematic Sampling

It might be necessary to obtain a baseline estimate of the proportion of farms where nutrient management practices have been implemented using a mailed questionnaire or phone survey. Assuming a record of farms in the state is available in a sequence unrelated to the manner in which nutrient management plans are implemented by individual farms (e.g., in alphabetical order by the farm owner's name), a systematic sample can be obtained in the following manner (Casley and Lury, 1982):

1. Select a random number r between 1 and N/n, where n is the number required in the sample.
2. The sampling units are then r, r + (N/n), r + (2N/n), ..., r + (n-1)(N/n), where N is the total number of available records.

If the population units are in random order (e.g., no trends, no natural strata, no correlation), systematic sampling is, on average, equivalent to simple random sampling. Once the sampling units (in this case, specific farms) have been selected, a questionnaire can be mailed to the farm owner or a telephone inquiry made about the nutrient management practices being followed.

In another example, the Conservation Technology Information Center (CTIC), with the assistance of the Natural Resources Conservation Service (NRCS, formerly the Soil Conservation Service), randomly selects approximately 3,100 sites for its annual National Crop Residue Management Survey (CTIC, 1994). A method for randomly selecting sites to fit local data needs was recently developed for assessing implementation of conservation tillage practices (CTIC, 1995). This method, the county transect survey, involves establishing a driving route that passes through all regions heavily used for crop production. Large urbanized areas and heavily traveled federal and state highways are avoided where possible. The direction of the route is not significant. In a recent application of the method in Illinois, the route was 110 miles long and included 456 cropland observation sites. Data were collected at set predetermined intervals.
Data on rainfall, slope, soil erodibility, soil loss tolerance (T), contouring, ephemeral erosion, and the crop rotation/tillage system employed were also collected. Figure 2-6 presents the type of random route used in the survey. The county transect survey method has also been used successfully in Minnesota, Ohio, and Indiana (CTIC, 1995), and is being considered for use in Pennsylvania.

Figure 2-6. Example route for a county transect survey (CTIC, 1995).

CHAPTER 3. METHODS FOR EVALUATING DATA

3.1 INTRODUCTION

Once data have been collected, it is necessary to summarize and analyze the data statistically. EPA recommends that the data analysis methods be selected before collecting the first sample. Many statistical methods have been computerized in easy-to-use software that is available for personal computers. Inclusion or exclusion in this section does not imply an endorsement or lack thereof by the U.S. Environmental Protection Agency. Commercial off-the-shelf software that covers a wide range of statistical and graphical support includes SAS, Statistica, Statgraphics, Systat, Data Desk (Macintosh only), BMDP, and JMP. Numerous spreadsheets, database management packages, and other graphics software can also be used to perform many of the needed analyses. In addition, the following programs, written specifically for environmental analyses, are also available:

• SCOUT: A Data Analysis Program, EPA, NTIS Order Number PB93-505303.
• WQHYDRO (Water Quality/Hydrology Graphics/Analysis System), Eric R. Aroner, Environmental Engineer, P.O. Box 18149, Portland, OR 97218.
• WQSTAT, Jim C. Loftis, Department of Chemical and Bioresource Engineering, Colorado State University, Fort Collins, CO 80524.

Computing the proportion of sites implementing a certain BMP or the average number of acres under a certain BMP follows directly from the equations presented in Section 2.3 and is not repeated here.
The remainder of this section focuses on evaluating changes in BMP implementation. The methods provided here offer only a cursory overview of the types of analyses that might be of interest. For a more thorough discussion of these methods, the reader is referred to Gilbert (1987), Snedecor and Cochran (1980), and Helsel and Hirsch (1995).

The data collected for evaluating changes will typically come as two or more sets of random samples. In this case, the analyst will test for a shift or step change. Depending on the objective, it is appropriate to select a one- or two-sided test. For example, if the analyst knows that BMP implementation can only go up as a result of a cost-share program, a one-sided test can be formulated. Alternatively, if the analyst does not know whether implementation will go up or down, a two-sided test is necessary. To simply compare two random samples to decide whether they are significantly different, a two-sided test is used. Typical null hypotheses (H0) and alternative hypotheses (Ha) for one- and two-sided tests are provided below:

One-sided test
  H0: BMP implementation (post cost share) ≤ BMP implementation (pre cost share)
  Ha: BMP implementation (post cost share) > BMP implementation (pre cost share)

Two-sided test
  H0: BMP implementation (post cost share) = BMP implementation (pre cost share)
  Ha: BMP implementation (post cost share) ≠ BMP implementation (pre cost share)

Selecting a one-sided test instead of a two-sided test results in increased power for the same significance level (Winer, 1971). That is, if the conditions are appropriate, a corresponding one-sided test is more desirable than a two-sided test given the same α and sample size. The manager and analyst should take great care in selecting one- or two-sided tests.
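As an illustration of the step-change comparisons developed in the sections that follow, the pooled two-sample t statistic (Equations 3-1 and 3-2 below) can be computed directly with the standard library. This is a sketch only: the function name and the pre/post data arrays are hypothetical.

```python
import math
from statistics import mean, variance

def two_sample_t(x1, x2, delta0=0.0):
    """Pooled two-sample t statistic (Equations 3-1 and 3-2)."""
    n1, n2 = len(x1), len(x2)
    # Equation 3-2: pooled standard deviation
    sp = math.sqrt(((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2))
                   / (n1 + n2 - 2))
    # Equation 3-1, with the difference quantity delta0 set to zero
    t = (mean(x1) - mean(x2) - delta0) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2      # statistic and degrees of freedom

# Hypothetical pre- and post-program implementation measurements:
pre = [1, 2, 3, 4, 5]
post = [3, 4, 5, 6, 7]
t, df = two_sample_t(post, pre)
print(t, df)   # 2.0, 8 -- compare to the Table A2 t value
```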
3.2 COMPARING THE MEANS FROM TWO INDEPENDENT RANDOM SAMPLES

The Student's t test for two samples and the Mann-Whitney test are the most appropriate tests for these types of data. Assuming the data meet the assumptions of the t test, the two-sample t statistic with n1+n2-2 degrees of freedom is (Remington and Schork, 1970)

  t = (x̄1 - x̄2 - Δ0) / [sp √(1/n1 + 1/n2)]   (3-1)

where n1 and n2 are the sample sizes of the first and second data sets and x̄1 and x̄2 are the estimated means of the first and second data sets, respectively. The pooled standard deviation, sp, is defined by

  sp = {[(n1-1)s1² + (n2-1)s2²] / (n1+n2-2)}^0.5   (3-2)

where s1² and s2² correspond to the estimated variances of the first and second data sets, respectively. The difference quantity (Δ0) can be any value, but here it is set to zero. Δ0 can be set to a nonzero value to test whether the difference between the two data sets is greater than a selected value. If the variances are not equal, refer to Snedecor and Cochran (1980) for methods of computing the t statistic. In a two-sided test, the value from Equation 3-1 is compared to the t value from Table A2 with α/2 and n1+n2-2 degrees of freedom.

Tests for two independent random samples and their key assumptions:
• Two-sample t test: both data sets must be normally distributed, and the data sets should have equal variances (the variance homogeneity assumption can be relaxed).
• Mann-Whitney test: none.
The standard forms of these tests require independent random samples.

The Mann-Whitney test can also be used to compare two independent random samples. This test is very flexible since there are no assumptions about the distribution of either sample or whether the distributions have to be the same (Helsel and Hirsch, 1995). Wilcoxon (1945) first introduced this test for equal-sized samples. Mann and Whitney (1947) modified the original Wilcoxon test to apply to different sample sizes. Here, it is determined whether one data set tends to have larger observations than the other.
If the distributions of the two samples are similar except for location (i.e., similar spread and skew), Ha can be refined to imply that the median concentration from one sample is ------- Chapter Methods for Evaluating Data "greater than," "less than," or "not equal to" the median concentration from the second sample. To achieve this greater detail in Ha, transformations such as logs can be used. Tables of Mann-Whitney test statistics (e.g., Conover, 1980) may be consulted to determine whether to reject H0 for small sample sizes. If n, and n2 are greater than or equal to 10 observations, the test statistic can be computed from the following equation (Conover, 1980): T - n, N (3-3) i=\ where n = T = number of observations in sample with fewer observations, number of observations in sample with more observations, sum of ranks for sample with fewer observations, and rank for the ith ordered observation used in both samples. T] is normally distributed and Table Al can be used to determine the appropriate quantile. Helsel and Hirsch (1995) and USEPA (1997) provide detailed examples for both of these tests. 3.3 COMPARING THE PROPORTIONS FROM Two INDEPENDENT SAMPLES Consider the example in which the proportion of highly erodible land under conservation tillage has been estimated during two time periods to be/>; andp2 using sample sizes of n, and n2, respectively. Assuming a normal approximation is valid, the test statistic under a null hypothesis of equivalent proportions (no change) is \ 1 1 (3-4) where p is a pooled estimate of proportion and is equal to (x1+x2)/(n1+n2) and x} and x2 are the number of successes during the two time periods. An estimator for the difference in proportions is simply p, -p2. In an earlier example, it was determined that 129 observations in each sample were needed to detect a difference in proportions of 0.20 with a two-sided test, a equal to 0.05, and 1-P equal to 0.90. 
Assuming that 130 samples were taken and p1 and p2 were estimated from the data as 0.6 and 0.4, the test statistic would be

  Z = (0.6 - 0.4) / [0.5(0.5)(1/130 + 1/130)]^0.5 = 3.22   (3-5)

Comparing this value to the t value from Table A2 (α/2 = 0.025, df = 258) of 1.96, H0 is rejected.

3.4 COMPARING MORE THAN TWO INDEPENDENT RANDOM SAMPLES

The analysis of variance (ANOVA) and the Kruskal-Wallis test are extensions of the two-sample t and Mann-Whitney tests, respectively, and can be used for analyzing more than two independent random samples when the data are continuous (e.g., mean acreage). Unlike the t test described earlier, the ANOVA can have more than one factor or explanatory variable. The Kruskal-Wallis test accommodates only one factor, whereas the Friedman test can be used for two factors. In addition to applying one of the above tests to determine whether one of the samples is significantly different from the others, it is also necessary to perform postevaluations to determine which of the samples differ. This section recommends Tukey's method, applied to the raw or rank-transformed data, only if one of the previous tests (ANOVA, rank-transformed ANOVA, Kruskal-Wallis, Friedman) indicates a significant difference between groups. Tukey's method can be used for equal or unequal sample sizes (Helsel and Hirsch, 1995). The reader is cautioned, when performing an ANOVA using standard software, to be sure that the ANOVA test used matches the data. See USEPA (1997) for a more detailed discussion of comparing more than two independent random samples.

3.5 COMPARING CATEGORICAL DATA

In comparing categorical data, it is important to distinguish between nominal categories (e.g., land ownership, county location, type of BMP) and ordinal categories (e.g., BMP implementation rankings, low-medium-high scales). The starting point for all evaluations is the development of a contingency table.
In Table 3-1, the preference for three BMPs is compared to operator type in a contingency table. In this case both categorical variables are nominal. In this example, 45 of the 102 operators that own the land they till used BMP1. There were a total of 174 observations. To test for independence, the sum of the squared differences between the expected (Eij) and observed (Oij) counts, summed over all cells, is computed as (Helsel and Hirsch, 1995)

  χ² = Σ(i=1 to m) Σ(j=1 to k) (Oij - Eij)² / Eij   (3-6)

where Eij is equal to Ai Cj/N. χ² is compared to the 1-α quantile of the χ² distribution with (m-1)(k-1) degrees of freedom (see Table A3). In the example presented in Table 3-1, the symbols listed in parentheses correspond to the above equation. Note that k corresponds to the three types of BMPs and m corresponds to the three types of operators.

Table 3-1. Contingency table of observed operator type and implemented BMP.

  Operator Type      BMP1        BMP2        BMP3        Row total, Ai
  Rent               10 (O11)    30 (O12)    17 (O13)    57 (A1)
  Own                45 (O21)    32 (O22)    25 (O23)    102 (A2)
  Combination        8 (O31)     3 (O32)     4 (O33)     15 (A3)
  Column total, Cj   63 (C1)     65 (C2)     46 (C3)     174 (N)

Key to symbols:
  Oij = number of observations for the ith operator type and jth BMP type
  Ai  = row total for the ith operator type (total number of observations for a given operator type)
  Cj  = column total for the jth BMP type (total number of observations for a given BMP type)
  N   = total number of observations

Table 3-2 shows the computed values of Eij, with (Oij-Eij)²/Eij in parentheses, for the example data. χ² is equal to 14.60. From Table A3, the 0.95 quantile of the χ² distribution with 4 degrees of freedom is 9.488. H0 is rejected; the selection of BMP is not random among the different operator types. The largest values in parentheses in Table 3-2 give an idea as to which combinations of operator type and BMP are noteworthy. In this example, it appears that BMP2 is preferred over BMP1 by operators that rent the land they till.

Table 3-2. Contingency table of expected operator type and implemented BMP. (Values in parentheses correspond to (Oij-Eij)²/Eij.)

  Operator Type   BMP1           BMP2           BMP3           Row total
  Rent            20.64 (5.48)   21.29 (3.56)   15.07 (0.25)   57
  Own             36.93 (1.76)   38.10 (0.98)   26.97 (0.14)   102
  Combination     5.43 (1.22)    5.60 (1.21)    3.97 (0.00)    15
  Column total    63             65             46             174
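Equation 3-6 can be checked directly with a short script using the Table 3-1 counts (a minimal sketch, standard library only):

```python
# Chi-square test of independence (Equation 3-6) for Table 3-1.
observed = [[10, 30, 17],   # rent
            [45, 32, 25],   # own
            [8,  3,  4]]    # combination

row_tot = [sum(row) for row in observed]        # A_i
col_tot = [sum(col) for col in zip(*observed)]  # C_j
N = sum(row_tot)

# Sum (O_ij - E_ij)^2 / E_ij over all cells, with E_ij = A_i * C_j / N.
chi2 = sum((observed[i][j] - row_tot[i] * col_tot[j] / N) ** 2
           / (row_tot[i] * col_tot[j] / N)
           for i in range(3) for j in range(3))

df = (3 - 1) * (3 - 1)
print(round(chi2, 2), df)  # 14.6 4 -- exceeds 9.488, so reject H0
```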
Table 3-2. Contingency table of expected operator type and implemented BMP. (Values in parentheses correspond to (O_ij - E_ij)²/E_ij.)

Operator Type       BMP1           BMP2           BMP3           Row Total
Rent                20.64 (5.48)   21.29 (3.56)   15.07 (0.25)    57
Own                 36.93 (1.76)   38.10 (0.98)   26.97 (0.14)   102
Combination          5.43 (1.22)    5.60 (1.21)    3.97 (0.00)    15
Column Total        63             65             46             174

Now consider that, in addition to recording the operator and BMP type, we also recorded a value from 1 to 5 indicating how well the BMP was installed and maintained, with 5 indicating the best results. In this case, the BMP implementation rating is ordinal. Using the same notation as before, the average rank of observations in row i, R_i, is equal to (Helsel and Hirsch, 1995)

    R_i = \sum_{x=1}^{i-1} A_x + \frac{A_i + 1}{2}        (3-7)

where A_i corresponds to the row total. The average rank of observations in column j, D_j, is equal to

    D_j = \frac{1}{C_j} \sum_{i=1}^{m} O_{ij} R_i        (3-8)

where C_j corresponds to the column total. The Kruskal-Wallis test statistic is then computed as

    K = (N - 1) \frac{\sum_{j=1}^{k} C_j D_j^2 \;-\; N \left( \frac{N+1}{2} \right)^2}{\sum_{i=1}^{m} A_i R_i^2 \;-\; N \left( \frac{N+1}{2} \right)^2}        (3-9)

where K is compared to the χ² distribution with k-1 degrees of freedom. This is the most general form of the Kruskal-Wallis test since it is a comparison of distribution shifts rather than shifts in the median (Helsel and Hirsch, 1995).

Table 3-3 is a continuation of the previous example, indicating the BMP implementation rating for each BMP type. For example, 29 of the 70 observations that were given a rating of 4 are associated with BMP2. The terms inside the parentheses of Table 3-3 correspond to the terms used in Equations 3-7 to 3-9. Note that k corresponds to the three types of BMPs and m corresponds to the five different levels of BMP implementation. Using Equation 3-9 for the data in Table 3-3, K is equal to 14.86. Comparing this value to 5.991 obtained from Table A3, there is a significant difference in the quality of implementation among the three BMPs.
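Equation 3-9 is the tie-corrected Kruskal-Wallis statistic, so the same value can be obtained by expanding the counts in Table 3-3 into raw ratings and using a standard routine. A sketch with SciPy (assuming SciPy is available; the counts are the ones from Table 3-3):

```python
from scipy.stats import kruskal

# Expand the Table 3-3 counts into one list of 1-5 ratings per BMP
bmp1 = [1]*1 + [2]*7 + [3]*15 + [4]*32 + [5]*8
bmp2 = [1]*2 + [2]*3 + [3]*16 + [4]*29 + [5]*15
bmp3 = [1]*2 + [2]*5 + [3]*26 + [4]*9 + [5]*4

K, p_value = kruskal(bmp1, bmp2, bmp3)  # tie correction matches Eq. 3-9
print(round(K, 2))  # 14.86 -- exceeds 5.991 (chi-square, 2 df), so quality differs among BMPs
```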
Table 3-3. Contingency table of implemented BMP and rating of installation and maintenance.

BMP Implementation Rating   BMP1         BMP2         BMP3         Row Total, A_i
1                            1 (O11)      2 (O12)      2 (O13)       5 (A1)
2                            7 (O21)      3 (O22)      5 (O23)      15 (A2)
3                           15 (O31)     16 (O32)     26 (O33)      57 (A3)
4                           32 (O41)     29 (O42)      9 (O43)      70 (A4)
5                            8 (O51)     15 (O52)      4 (O53)      27 (A5)
Column Total, C_j           63 (C1)      65 (C2)      46 (C3)      174 (N)

Key to Symbols:
O_ij = number of observations for the ith BMP implementation rating and jth BMP type
A_i = row total for the ith BMP implementation rating (total number of observations for a given BMP implementation rating)
C_j = column total for the jth BMP type (total number of observations for a given BMP type)
N = total number of observations

The last type of categorical data evaluation considered in this chapter is when both variables are ordinal. The Kendall τ_b for tied data can be used for this analysis. The statistic τ_b is calculated as (Helsel and Hirsch, 1995)

    \tau_b = \frac{S}{\sqrt{SS_a \, SS_c}}        (3-10)

where S, SS_a, and SS_c are computed as

    S = \sum_{i=1}^{m} \sum_{j=1}^{k} O_{ij} \left( \sum_{x>i} \sum_{y>j} O_{xy} \;-\; \sum_{x>i} \sum_{y<j} O_{xy} \right)        (3-11)

    SS_a = \frac{N^2}{2} \left( 1 - \sum_{i=1}^{m} a_i^2 \right)        (3-12)

    SS_c = \frac{N^2}{2} \left( 1 - \sum_{j=1}^{k} c_j^2 \right)        (3-13)

To determine whether τ_b is significant, S is modified to a normal statistic using

    Z_s = \begin{cases} (S - 1)/\sigma_S & \text{if } S > 0 \\ (S + 1)/\sigma_S & \text{if } S < 0 \end{cases}        (3-14)

where

    \sigma_S = \sqrt{ \frac{N^3}{9} \left( 1 - \sum_{i=1}^{m} a_i^3 \right) \left( 1 - \sum_{j=1}^{k} c_j^3 \right) }        (3-15)

and where Z_s is zero if S is zero. The values of a_i and c_j are computed as A_i/N and C_j/N, respectively.

Table 3-4 presents the BMP implementation ratings that were taken in three separate years. For example, 15 of the 57 observations that were given a rating of 3 are associated with Year 2. Using Equations 3-11 and 3-15, S and σ_S are equal to 2,509 and 679.75, respectively. Therefore, Z_s is equal to (2,509 - 1)/679.75, or 3.69. Comparing this value to the value of 1.96 obtained from Table A1 (α/2 = 0.025) indicates that BMP implementation is improving with time.
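SciPy's `kendalltau` computes the tie-corrected τ_b directly from paired observations. A sketch using the counts from Table 3-4 (assuming SciPy is available; note that SciPy's p-value uses a somewhat different tie-corrected variance than the approximation in Equation 3-15, so it will not match Z_s exactly):

```python
from scipy.stats import kendalltau

# Table 3-4 counts: keys are (rating, year) pairs
counts = {(1, 1): 2, (1, 2): 1, (1, 3): 2,
          (2, 1): 5, (2, 2): 7, (2, 3): 3,
          (3, 1): 26, (3, 2): 15, (3, 3): 16,
          (4, 1): 9, (4, 2): 32, (4, 3): 29,
          (5, 1): 4, (5, 2): 8, (5, 3): 15}

# Expand the table into paired (rating, year) observations
ratings, years = [], []
for (rating, year), n in counts.items():
    ratings += [rating] * n
    years += [year] * n

tau_b, p_value = kendalltau(ratings, years)  # tau-b handles the heavy ties
print(round(tau_b, 3))  # 0.244; the small p-value indicates ratings trend upward over the years
```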
Table 3-4. Contingency table of implemented BMP and sample year.

BMP Implementation Rating   Year 1       Year 2       Year 3       Row Total, A_i   a_i
1                            2 (O11)      1 (O12)      2 (O13)       5 (A1)         0.029
2                            5 (O21)      7 (O22)      3 (O23)      15 (A2)         0.086
3                           26 (O31)     15 (O32)     16 (O33)      57 (A3)         0.328
4                            9 (O41)     32 (O42)     29 (O43)      70 (A4)         0.402
5                            4 (O51)      8 (O52)     15 (O53)      27 (A5)         0.155
Column Total, C_j           46 (C1)      63 (C2)      65 (C3)      174 (N)
c_j                          0.264        0.362        0.374

Key to Symbols:
O_ij = number of observations for the ith BMP implementation rating and jth year
A_i = row total for the ith BMP implementation rating (total number of observations for a given BMP implementation rating)
C_j = column total for the jth year (total number of observations for a given year)
N = total number of observations
a_i = A_i/N
c_j = C_j/N

CHAPTER 4. CONDUCTING THE EVALUATION

4.1 INTRODUCTION

This chapter addresses the process of determining whether agricultural MMs or BMPs are being implemented and whether they are being implemented according to approved standards or specifications. Guidance is provided on what should be measured to assess MM and BMP implementation, as well as on methods for collecting the information, including physical farm or field evaluations, mail- and/or telephone-based surveys, personal interviews, and aerial reconnaissance and photography. Designing survey instruments to avoid error and rating MM and BMP implementation are also discussed.

Evaluation methods are separated into two types: expert evaluations and self-evaluations. Expert evaluations are those in which actual field investigations are conducted by trained personnel to gather information on MM or BMP implementation. Self-evaluations are those in which answers to a predesigned questionnaire or survey are provided by the person being surveyed, usually a farm owner or manager. The answers provided are used as survey results. Self-evaluations might also include examination of materials related to a farm, such as applications for cost-share programs or crop histories.
Extreme caution should be exercised when using data from self-evaluations as the basis for assessing MM or BMP compliance, since they are not typically reliable for this purpose. Each of these evaluation methods has advantages and disadvantages that should be considered before deciding which one to use or in what combination to use them. Aerial reconnaissance and photography can be used to support either evaluation method.

Self-evaluations are useful for collecting information on the level of awareness that farm owners or managers have of MMs or BMPs, dates of planting or harvest, field or crop conditions, which MMs or BMPs were implemented, and whether the assistance of a state or county agriculture professional was used. However, the type or level of detail of information that can be obtained from self-evaluations might be inadequate to satisfy the objectives of a MM or BMP implementation survey. If this is the case, expert evaluations might be called for. Expert evaluations are necessary if the information on MM or BMP implementation must be more detailed or more reliable than that which can be obtained with self-evaluations. Examples of information that would be obtained reliably only through an expert evaluation include an objective assessment of the adequacy of MM or BMP implementation, the degree to which site-specific factors (e.g., type of crop, soil type, or presence of a water body) influenced MM or BMP implementation, and the need for changes in standards and specifications for MM or BMP implementation. Sections 4.3 and 4.4 discuss expert evaluations and self-evaluations, respectively, in more detail.

Other important factors to consider when choosing variables include the time of year when the BMP compliance survey will be conducted and when the BMPs were installed.
Some agricultural BMPs, or aspects of their implementation that can be analyzed, vary with the time of year and the phase of farming operations. Variables that are appropriate to these factors should be chosen. The nutrient management and pesticide management MMs in particular might not lend themselves to direct on-site analysis except at specific times of year, such as during or soon after fertilizer and pesticide applications, respectively. For BMPs that have been in place for some time, the adequacy of implementation might be of less interest than the adequacy of the operation and maintenance of the BMP. For example, it might be of more interest to examine fences along streams for structural integrity (i.e., holes that would allow cattle to pass through) than to calculate the miles of stream along which the fences were installed. Similarly, waste storage structures might be inspected for the amount of freeboard when operating at capacity rather than analyzed for adherence to construction specifications. If numerous BMPs are being analyzed during a single farm visit, variables that relate to different aspects of BMP installation, operation, and maintenance might be chosen separately for each BMP to be inspected.

Aerial reconnaissance and photography are another means available for collecting information on farming practices, though some of the MMs and BMPs employed for agriculture might be difficult if not impossible to identify on aerial photographs. Aerial reconnaissance and photography are discussed in detail in Section 4.5.

The general types of information obtainable with self-evaluations and expert evaluations are listed in Table 4-1. Regardless of the approach(es) used, proper and thorough preparation for the evaluation is the key to success.
4.2 CHOICE OF VARIABLES

Once the objectives of a BMP implementation or compliance survey have been clearly defined, the most important factor in the assessment of MM or BMP implementation is the determination of which variable(s) to measure. A good variable provides a direct measure of how well a BMP was or is being implemented. Individual variables should provide measures of different factors related to BMP implementation. The best variables are those that measure the adequacy of MM or BMP implementation and are based on quantifiable expressions of conformance with state standards and specifications. As the variables used become less directly related to actual MM or BMP implementation, their accuracy as measures of BMP implementation decreases.

Examples of useful variables include the tons per day and percentage of animal manure captured and treated by wastewater facilities associated with confined animal facilities, and the cattle-hours per day during which livestock are excluded from stream banks, both of which would be expressed in terms of conformance with applicable state standards and specifications. Less useful variables measure factors that are related to MM and BMP implementation but do not necessarily provide an accurate measure of their implementation. Examples of these types of variables are the number of manure storage facilities constructed in a year and the number of farms with approved pesticide management plans. Other poor variables would be the passage of legislation requiring MM or BMP application on farms, the development of an information and education program for nutrient management, or the number of requests for information on nutrient management. Although these variables relate to MM or BMP implementation, they provide no real information on whether MMs or BMPs are actually being implemented or whether they are being implemented properly.

Variables generally will not directly relate to MM implementation, as most agriculture MMs are combinations of several BMPs. Measures of MM implementation, therefore, usually will be based on separate assessments of two or more BMPs, and the implementation of each BMP will be based on a unique set of variables.

Table 4-1. General types of information obtainable with self-evaluations and expert evaluations.

Information Obtainable from Self-Evaluations

Background Information
• Type of facility installed (e.g., confined animal facility, wastewater storage and/or treatment facility)
• Capacity of facility
• Square feet of facilities
• Type and number of animals and/or crops on farm
• Cropping history
• Yield data and estimates
• Field limitations
• Pest problems on farm
• Soil test results
• Map

Management Measures/Best Management Practices
• Management measures used on farm
• BMPs installed
• Dates of MM/BMP installation
• Design specifications of BMPs
• Type of waterbody or area protected
• Previous management measures used

Management Plans
• Preparation of management plans (e.g., nutrient, grazing, pesticide, irrigation water)
• Dates of plan preparation and revisions
• Date of initial plan implementation
• Total acreage under management

Equipment
• Types of equipment used on farm
• Dates of equipment calibration
• Application rates
• Timing of applications
• Substances applied (e.g., pesticides, fertilizers)
• Ambient conditions during applications
• Location of mixing, loading, and storage areas

Information Requiring Expert Evaluations
• Design sufficiency
• Installation sufficiency
• Adequacy of operation/management
• Confirmation of information from self-evaluation
Some examples of BMPs related to EPA's Grazing Management Measure, variables for assessing compliance with the BMPs, and related standards and specifications that might be required by state agriculture authorities are presented in Figure 4-1.

Because farm owners and managers choose to implement or not implement MMs or BMPs based on site-specific conditions, it is also appropriate to apply varying weights to the variables chosen to assess MM and BMP implementation to correspond to site-specific conditions. For example, variables related to animal waste disposal practices might be de-emphasized (and other, more applicable variables emphasized more) on farms with relatively few animals. Similarly, on a farm with a water body, variables related to livestock access to the water body, sediment runoff, and chemical deposition (pesticide use, fertilizer use) might be emphasized over other variables to arrive at a site-specific rating of the adequacy of MM or BMP implementation.

The purpose for which the information collected during a MM or BMP implementation survey will be used is another important consideration when selecting variables. An implementation survey can serve many purposes beyond the primary purpose of assessing MM and BMP implementation. For instance, variables might be selected to assess compliance with each category of BMP that is of interest and to assess overall compliance with BMP specifications and standards. In addition, other variables might be selected to assess the effect that each has on the ability or willingness of farm owners or managers to comply with BMP implementation standards or specifications. The information obtained from evaluations using the latter type of variable could be useful for modifying MM or BMP implementation standards and specifications for application to particular farm types or conditions.
Table 4-2 provides examples of good and poor variables for assessing implementation of the agricultural MMs developed by EPA (USEPA, 1993a). The variables listed in the table are only examples, and local or regional conditions should ultimately dictate which variables should be used.

GRAZING MANAGEMENT MEASURE

Protect range, pasture, and other grazing lands:

(1) By implementing one or more of the following to protect sensitive areas (such as stream banks, wetlands, estuaries, ponds, lake shores, and riparian zones):
  (a) Exclude livestock,
  (b) Provide stream crossings or hardened watering access for drinking,
  (c) Provide alternative drinking water locations,
  (d) Locate salt and additional shade, if needed, away from sensitive areas, or
  (e) Use improved grazing management (e.g., herding) to reduce the physical disturbance and reduce direct loading of animal waste and sediment caused by livestock; and

(2) By achieving either of the following on all range, pasture, and other grazing lands not addressed under (1):
  (a) Implement the range and pasture components of a Conservation Management System (CMS) as defined in the Field Office Technical Guide of the USDA-NRCS by applying the progressive planning approach of the USDA-NRCS to reduce erosion, or
  (b) Maintain range, pasture, and other grazing lands in accordance with activity plans established by either the Bureau of Land Management of the U.S. Department of the Interior or the Forest Service of the USDA.
Related BMPs, measurement variables, and standards and specifications:

Management Measure Practice: Postpone grazing or rest grazing land for a prescribed period
• Potential measurement variables: percent ground cover; stubble height
• Example related standards and specifications: recommended percent ground cover for grazing; recommended stubble height for grazing

Management Measure Practice: Alternate water source installed to convey water away from riparian areas
• Potential measurement variables: presence of alternative water source; distance from water body of water provided to livestock
• Example related standards and specifications: guidelines for provision of alternative sources of water on farms with water bodies

Management Measure Practice: Livestock excluded from an area not intended for grazing
• Potential measurement variables: cattle-hours per day of exclusion of livestock from water bodies
• Example related standards and specifications: guidelines for protection of water quality for specific types of water bodies

Management Measure Practice: Range seeded to establish adapted plants on native grazing land
• Potential measurement variables: percent ground cover; plant species
• Example related standards and specifications: recommended amount of ground cover for grazing; acceptable plant species for the region

Figure 4-1. Potential variables and examples of implementation standards and specifications that might be useful for evaluating compliance with the Grazing Management Measure.

Table 4-2. Example variables for management measure implementation analysis.
Erosion and Sediment Control (appropriate sampling units: field, acre)
• Useful variables: area on which reduced tillage or terrace systems are installed; area of runoff diversion systems or filter strips per acre of cropland; area of highly erodible cropland converted to permanent cover
• Less useful variables: number of approved farm soil and erosion management plans; number of grassed waterways, grade stabilization structures, and filter strips installed

Facility Wastewater and Runoff from Confined Animal Facilities (appropriate sampling units: confined animal facility, animal unit)
• Useful variables: quantity and percentage of total facility wastewater and runoff that is collected by a waste storage or treatment system
• Less useful variables: number of manure storage facilities

Nutrient Management (appropriate sampling units: farm, field, application)
• Useful variables: number of farms following, and acreage covered by, approved nutrient management plans; percent of farmers keeping records and applying nutrients at rates consistent with management recommendations; quantity and percent reduction in fertilizer applied; amount of fertilizer and manure spread between spreader calibrations
• Less useful variables: number of farms with approved nutrient management plans

Pesticide Management (appropriate sampling units: field, farm, application)
• Useful variables: number of farms with complete records of field surveys and pesticide applications and following approved pest management plans; number of pest field surveys performed on a weekly (or other time frame) basis; quantity and percent reduction in pesticide use
• Less useful variables: number of farms with approved pesticide management plans

Grazing Management (appropriate sampling units: stream mile, animal unit)
• Useful variables: number of cattle-hours of access to riparian areas per day; miles of stream from which grazing animals are excluded
• Less useful variables: miles of fence installed

4.3 Expert Evaluations

4.3.1 Site Evaluations

Expert evaluations are the best way to collect reliable information on MM and BMP implementation.
They involve a person or team of people visiting individual farms and speaking with farm owners and/or managers to obtain information on MM and BMP implementation. For many of the MMs, assessing and verifying compliance will require a farm visit and evaluation. The following should be considered before expert evaluations are conducted:

• Obtaining permission of the farm owner or manager. Without proper authorization from a farm owner or manager to visit a farm, the relationship between farmers and the agriculture agency, and any future regulatory or compliance action, could be jeopardized.

• The type(s) of expertise needed to assess proper implementation. For some MMs, a team of trained personnel might be required to determine whether MMs have been implemented properly.

• The activities that should occur during an expert evaluation. This information is necessary for proper and complete preparation for the farm visit, so that it can be completed in a single visit and at the proper time.

• The method of rating the MMs and BMPs. MM and BMP rating systems are discussed below.

• Consistency among evaluation teams and between farm evaluations. Proper training and preparation of expert evaluation team members are crucial to ensure accuracy and consistency.

• The collection of information while at a farm. Information collection should be facilitated by preparing data collection forms that include any MM and BMP rating information needed by the evaluation team members.

• The content and format of postevaluation discussions. Site evaluation team members should bear in mind the value of postevaluation discussion among team members. Notes can be taken during the evaluation concerning any items that would benefit from group discussion.

Evaluators might range from a single person suitably trained in agricultural expert evaluation to a group of professionals with various kinds of expertise.
The composition of evaluation teams will depend on the types of MMs or BMPs being evaluated. Potential team members could include:

• Agricultural engineer
• Agriculture extension agent
• Agronomist
• Hydrologist
• Pesticide specialist
• Soil scientist
• Water quality expert

The composition of evaluation teams can vary depending on the purpose of the evaluation, available staff and other resources, and the geographic area being covered. All team members should be familiar with the required MMs and BMPs, and each team should have a member who has previously participated in an expert evaluation. This will ensure familiarity both with the technical aspects of the MMs and BMPs that will be rated during the evaluation and with the expert evaluation process. Training might be necessary to bring all team members to the level of proficiency needed to conduct the expert evaluations. State or county agricultural personnel should be familiar with agriculture regulations, state BMP standards and specifications, and proper BMP implementation, and are therefore generally well qualified to teach these topics to evaluation team members who are less familiar with them. Agricultural agents or other specialists who have participated in BMP implementation surveys might be enlisted to train evaluation team members in the actual conduct of expert evaluations. This training should include identification of BMPs particularly critical to water quality protection, analysis of erosion potential, and other aspects of BMP implementation that require professional judgment, as well as any standard methods for measurements used to judge BMP implementation against state standards and specifications.
Alternatively, if only one or two individuals will be conducting expert evaluations, their training in the various specialties necessary to evaluate the quality of MM and BMP implementation, such as those listed above, could be provided by a team of specialists who are familiar with agricultural practices and nonpoint source pollution.

In the interest of consistency among the evaluations and among team members, it is advisable that one or more mock evaluations take place prior to visiting the selected sample farms. These "practice sessions" provide team members with an opportunity to become familiar with MMs and BMPs as they should be implemented under different farm conditions, to gain familiarity with the evaluation forms and the meanings of the terms and questions on them, and to learn from other team members with different expertise. Mock evaluations are valuable for ensuring that all evaluators have a similar understanding of the intent of the questions, especially for questions whose responses involve a degree of subjectivity on the part of the evaluator.

Where expert evaluation teams are composed of more than two or three people, it might be helpful to divide the various responsibilities for conducting the expert evaluations among team members ahead of time to avoid confusion at the farm and to be certain that all tasks are completed but not duplicated. Having a spokesperson for the group who will be responsible for communicating with the farm owner or manager (prior to the expert evaluation, at the expert evaluation if they are present, and afterward) might also be helpful. A county agriculture representative is generally a good choice as spokesperson because he or she can represent the state and county agriculture authorities. Newly formed evaluation teams might benefit most from a division of labor and the selection of a team leader or team coordinator with experience in expert evaluations who will be responsible for the quality of the expert evaluations.
Smaller teams might find that a division of responsibilities is not necessary, as might larger teams that have experience working together. If responsibilities are to be assigned, mock evaluations can be a good time to work out the details.

4.3.2 Rating Implementation of Management Measures and Best Management Practices

Many factors influence the implementation of MMs and BMPs, so it is sometimes necessary to use best professional judgment (BPJ) to rate their implementation, and BPJ will almost always be necessary when rating overall BMP compliance at a farm. Site-specific factors such as soil type, crop rotation history, topography, tillage, and harvesting methods affect the implementation of erosion and sediment control BMPs, for instance, and must be taken into account by evaluators when rating MM or BMP implementation.

Implementation of MMs will often be based on the implementation of more than one BMP, and this makes rating MM implementation similar to rating overall BMP implementation at a farm or ranch. Determining an overall rating involves grouping the ratings of implementation of individual BMPs into a single rating, which introduces more subjectivity than rating the implementation of individual BMPs based on standards and specifications. The choice of a rating system and rating terms, which are aspects of proper evaluation design, is therefore important in minimizing the level of subjectivity associated with overall BMP compliance and MM implementation ratings. When creating overall ratings, it is still important to record the detailed ratings of individual BMPs as supporting information.

Individual BMPs, overall BMP compliance, and MMs can be rated using a binary approach (e.g., pass/fail, compliant/noncompliant, or yes/no) or on a scale with more than two choices, such as 1 to 5 or 1 to 10 (where 1 is the worst; see Example). The simplest method of rating MM and BMP implementation is the use of a binary approach.
Using a binary approach, either an entire farm or individual MMs or BMPs are rated as being in compliance or not in compliance with respect to specified criteria. Scale systems can take the form of ratings from poor to excellent, inadequate to adequate, low to high, 1 to 3, 1 to 5, and so forth. Whatever form of scale is used, the factors that would individually or collectively qualify a farm, MM, or BMP for one of the rankings should be clearly stated. The more choices that are added to the scale, the smaller the differences between them become, and each choice must therefore be defined more specifically and accurately. This is especially important if different teams or individuals rate farms separately. Consistency among the ratings then depends on each team or individual evaluator knowing precisely what the criteria for each rating option mean. Clear and precise explanations of the rating scale can also help avoid or reduce disagreements among team members.

Example of a rating scale (adapted from Rossman and Phillips, 1992). A possible rating scale from 1 to 5 might be:

5 = Implementation exceeds requirements
4 = Implementation meets requirements
3 = Implementation has a minor departure from requirements
2 = Implementation has a major departure from requirements
1 = Implementation is in gross neglect of requirements

where a minor departure is defined as "small in magnitude or localized," a major departure is defined as "significant magnitude or where the BMPs are consistently neglected," and gross neglect is defined as "potential risk to water resources is significant and there is no evidence that any attempt is made to implement the BMP."

These considerations apply equally to a binary approach. The factors, individually or collectively, that would cause a farm, MM, or BMP to be rated as not being in compliance with design specifications should be clearly stated on the evaluation form or in supporting documentation.
Rating farms or MMs and BMPs on a scale requires a greater degree of analysis by the evaluation team than does using a binary approach. Each higher number represents a better level of MM or BMP implementation. In effect, a binary rating approach is a scale with two choices; a scale of low, medium, and high (compliance) is a scale with three choices. Use of a scale system with more than two rating choices can provide more information to program managers than a binary rating approach, and this benefit must be weighed against the greater complexity involved in using such a system. For instance, a survey that uses a scale of 1 to 5 might result in one MM with a ranking of 1, five with a ranking of 2, six with a ranking of 3, eight with a ranking of 4, and five with a ranking of 5. Precise criteria would have to be developed to ensure consistency within and between survey teams in rating the MMs, but the information that only one MM was implemented poorly, 11 were implemented below standards, 13 met or were above standards, and 5 were implemented very well might be more valuable than the information that 18 MMs were found to be in compliance with design specifications, which is the only information that would be obtained with a binary rating approach.

If a rating system with more than two ratings is used to collect data, the data can be analyzed either by using the original rating data or by first transforming the data into a binomial (i.e., two-choice rating) system. For instance, ratings of 1 through 5 could be reduced to two ratings by grouping the 1s, 2s, and 3s together into one group (e.g., inadequate) and the 4s and 5s into a separate group (e.g., adequate). If this approach is used, it is best to retain the original rating data for the detailed information they contain and to reduce the data to a binomial system only for the purpose of statistical analysis. Chapter 3, Section 3.5, contains information on the analysis of categorical data.
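The transformation from a 1-to-5 scale to a binomial system can be sketched as follows. The survey counts are the ones from the example above; the 4-or-above cutoff and the group labels are illustrative assumptions, not prescribed by the guidance:

```python
# Ratings from the example survey: one 1, five 2s, six 3s, eight 4s, five 5s
ratings = [1]*1 + [2]*5 + [3]*6 + [4]*8 + [5]*5

# Collapse to a binomial system: ratings 1-3 -> "inadequate", 4-5 -> "adequate"
# (retain the original ratings; the collapsed data are for statistical analysis only)
binary = ["adequate" if r >= 4 else "inadequate" for r in ratings]

print(binary.count("adequate"), binary.count("inadequate"))  # 13 12
```

The counts of 13 and 12 match the example's "13 met or were above standards"; the original five-level data should be kept alongside the collapsed version.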
4.3.3 Rating Terms

The choice of rating terms used on the evaluation forms is an important factor in ensuring consistency and reducing bias, and the terms used to describe and define the rating options should be as objective as possible. For a rating system with a large number of options, the meaning of each option should be clearly defined. It is suggested that terms such as "major" and "minor" be avoided when describing erosion or pollution effects or deviations from prescribed MM or BMP implementation criteria because they might have different connotations for different evaluation team members. It is easier for an evaluation team to agree on meaning if options are described in terms of measurable criteria and examples are provided to clarify the intended meaning.

It is also suggested that terms that carry negative connotations not be used. Evaluators might be disinclined to rate a MM or BMP as having a "major deviation" from an implementation criterion, even if justified, because of the negative connotation carried by the term. Rather than using such a term, observable conditions or effects of the quality of implementation can be listed and specific ratings (e.g., 1-5 or compliant/noncompliant for the criterion) can be associated with the conditions or effects. For example, instead of rating an animal waste management facility as having a "major deficiency," a specific deficiency could be described and ascribed an associated rating (e.g., "Waste storage structure is designed for no more than 70% of the confined animals = noncompliant").

Evaluation team members will often have to take specific notes on farms, MMs, or BMPs during the evaluation, either to justify the ratings they have ascribed to variables or for discussion with other team members after the survey. When recording notes about the farms, MMs, or BMPs, evaluation team members should be as specific as the criteria for the ratings.
A rating recorded as "MM deviates highly from implementation criteria" is highly subjective and loses specific meaning when read by anyone other than the person who wrote the note. Notes should therefore be as objective and specific as possible.

An overall farm rating is useful for summarizing information in reports, identifying the level of implementation of MMs and BMPs, indicating the likelihood that environmental protection is being achieved, identifying additional training or education needs, and conveying information to program managers, who are often not familiar with MMs or BMPs. To preserve the valuable information contained in the original ratings of farms, MMs, or BMPs, however, overall ratings should summarize, not replace, the original data. Analysis of year-to-year variations in MM or BMP implementation, the factors involved in MM or BMP program implementation, and factors that could improve MM or BMP implementation and MM or BMP program success is only possible if the original, detailed farm, MM, or BMP data are used.

Approaches commonly used for determining final BMP implementation ratings include calculating a percentage based on individual BMP ratings, consensus, compilation of aggregate scores by an objective party, voting, and voting only where consensus on a farm or MM or BMP rating cannot be reached. Not all systems for arriving at final ratings are applicable to all circumstances.

4.3.4 Consistency Issues

Consistency among evaluators and between evaluations is important, and because of the potential for subjectivity to play a role in expert evaluations, consistency should be thoroughly addressed in the quality assurance and quality control (QA/QC) aspects of planning and conducting an implementation survey. Consistency arises as a QA/QC concern in the planning phase of an implementation survey in the choice of evaluators, the selection of the size of evaluation teams, and in evaluator training.
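One of the approaches to final ratings mentioned earlier, calculating a percentage from individual BMP ratings, might be sketched as follows. The BMP names, ratings, and the threshold of 4 are hypothetical, chosen only to illustrate the idea.

```python
# Hypothetical criterion ratings for one farm's BMPs on a 1-5 scale;
# here a BMP is treated as adequately implemented if rated 4 or higher
# (the threshold is an assumption, not a value from the guidance).
bmp_ratings = {
    "waste storage structure": 5,
    "filter strip": 4,
    "nutrient management plan": 2,
    "fencing along riparian area": 4,
}

adequate = sum(1 for r in bmp_ratings.values() if r >= 4)
overall_pct = 100 * adequate / len(bmp_ratings)

# The overall rating summarizes, but does not replace, the per-BMP data.
print(f"{adequate} of {len(bmp_ratings)} BMPs adequate ({overall_pct:.0f}%)")
```

Note that the per-BMP dictionary is retained; only the summary percentage is derived from it, consistent with the recommendation that overall ratings summarize rather than replace the original data.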
It arises as a QA/QC concern while conducting an implementation survey in whether evaluations are conducted by individuals or teams, how MM and BMP implementation on individual fields or farms is documented, how evaluation team discussions of issues are conducted, how problems are resolved, and how individual MMs and BMPs or whole farms are rated.

Consistency is likely to be best if only one or two evaluators conduct the expert evaluations and the same individuals conduct all of the evaluations. If, for statistical purposes, many farms (e.g., 100 or more) need to be evaluated, use of only one or two evaluators might also be the most efficient approach. In this case, having a team of evaluators revisit a subsample of the farms that were originally evaluated by one or two individuals might be useful for quality control purposes. If teams of evaluators conduct the evaluations, consistency can be achieved by keeping the membership of the teams constant. Differences of opinion, which are likely to arise among team members, can be settled through discussions held during evaluations, and the experience of team members who have done past evaluations can help guide decisions. Pre-evaluation training sessions, such as the mock evaluations discussed above, will help ensure that the first few expert evaluations are not "learning" experiences to such an extent that those farms must be revisited to ensure that they receive the same level of scrutiny as farms evaluated later. If different farms are visited by different teams of evaluators or if individual evaluators are assigned to different farms, it is especially important that consistency be established before the evaluations are conducted. For best results, discussions among evaluators should be held periodically during the evaluations to discuss any potential problems.
For instance, evaluators could visit some farms together at the beginning of the evaluations to promote consistency in ratings, followed by expert evaluations conducted by individual evaluators. Then, after a few farm or MM evaluations, evaluators could gather again to discuss results and to share any knowledge gained to ensure continued consistency. As mentioned above, consistency can be established during mock evaluations held before the actual evaluations begin. These mock evaluations are excellent opportunities for evaluators to discuss the meaning of terms on rating forms, differences between rating criteria, and differences of opinion about proper MM or BMP implementation. A member of the evaluation team should be able to represent the state's position on the definition of terms and clarify areas of confusion.

Descriptions of MMs and BMPs should be detailed enough to support any ratings given to individual features and to the MM or BMP overall. Sketching a diagram of the MM or BMP helps identify design problems, promotes careful evaluation of all features, and provides a record of the MM or BMP for future reference. A diagram is also valuable when discussing the MM or BMP with the farm owner or identifying features in need of improvement or alteration. Farm owners or managers can also use a copy of the diagram and evaluation when discussing their operations with state or county agriculture personnel. Photographs of MM or BMP features are valuable reference material and should be taken whenever an evaluator feels that a written description or a diagram could be inadequate. Photographs of what constitutes both good and poor MM or BMP implementation are valuable for explanatory and educational purposes, for example, in presentations to managers and the public.
4.3.5 Postevaluation Onsite Activities

It is important to complete all pertinent tasks as soon as possible after the completion of an expert evaluation to avoid extra work later and to reduce the chances of introducing error attributable to inaccurate or incomplete memory or confusion. All evaluation forms for each farm should be filled out completely before leaving the farm. Information not filled in at the beginning of the evaluation can be obtained from the farm owner or manager if necessary. Any questions that evaluators had about the MMs and BMPs during the evaluation can be discussed, and notes written during the evaluation can be shared and used to help clarify details of the evaluation process and ratings. The opportunity to revisit the farm will still exist if there are points that cannot be agreed upon among evaluation team members. Also, while the evaluation team is still on the farm, the farm owner or manager should be informed about what will follow; for instance, whether he/she will receive a copy of the report, when to expect it, what the results mean, and his/her responsibility in light of the evaluation, if any. Immediately following the evaluation is also an excellent time to discuss the findings with the farm owner or manager if he/she was not present during the evaluation.

4.4 SELF-EVALUATIONS

4.4.1 Methods

Self-evaluations, while often not a reliable source of MM or BMP implementation data, can be used to augment data collected through expert evaluations or in place of expert evaluations where the latter cannot be conducted. In some cases, state agriculture authority staff might have been involved directly with BMP selection and implementation and will be a source of useful information even if an expert evaluation is not conducted.
Self-evaluations are an appropriate survey method for obtaining background information from farmers or persons associated with farming operations, such as county extension agents. Mail, telephone, and mail with telephone follow-up are common self-evaluation methods. Mail and telephone surveys are useful for collecting general information, such as the management measures that specific agricultural operations should be implementing. County extension agents or other state or local agricultural agents can be interviewed or sent a questionnaire that requests very specific information. Recent advances in and increasing access to electronic means of communication (i.e., e-mail and the Internet) might make these viable survey instruments in the future.

Mail surveys with a telephone follow-up and/or farm visit are an efficient method of collecting information. The USDA National Agricultural Statistics Service (NASS) has found that 10 to 20 percent of farm owners or managers will respond to crop production questionnaires that are mailed. Approximately two-thirds of the questionnaires that are not returned are completed by telephone, and the remainder are completed by personal visits to farms (USDA, undated). The entire NASS survey effort, from designing the questionnaire to reporting the results, takes approximately 6 months. The level of response obtained by NASS is probably higher than would be obtained for MM or BMP implementation monitoring because NASS has developed a high level of trust with farmers through years of cooperation. In addition, NASS is prohibited by law from releasing information on individual farm operations, a fact of which most farmers are aware.

To ensure comparability of results, information that is collected as part of a self-evaluation—whether collected through the mail, over the phone, or during farm visits—should be collected in a manner that does not favor one method over the others.
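The NASS response pattern implies a follow-up workload that can be estimated directly. A rough sketch follows; the 1,000-farm sample size is hypothetical, and the 15 percent mail response rate is an assumption chosen from within the 10 to 20 percent range reported.

```python
# Estimate follow-up effort for a mail survey with telephone and
# farm-visit follow-up, using the NASS pattern described in the text.
farms = 1000          # hypothetical sample size
mail_rate = 0.15      # assumed mail response rate (NASS reports 10-20%)

by_mail = farms * mail_rate
not_returned = farms - by_mail
by_phone = not_returned * 2 / 3      # about two-thirds completed by phone
by_visit = not_returned - by_phone   # remainder completed by farm visit

print(f"mail: {by_mail:.0f}, phone: {by_phone:.0f}, visits: {by_visit:.0f}")
```

Even under an optimistic mail response, most of the effort falls on the follow-up steps, which is worth keeping in mind when budgeting (see Section 4.4.2 on cost).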
Ideally, telephone follow-up and on-site interviews should consist of no more than reading the questions on the questionnaire, without providing any additional explanation or information that would not have been available to those who responded through the mail. This approach eliminates as much as possible any bias associated with the different means of collecting the information. Figure 4-2 presents an example of an animal waste management survey questionnaire modeled after a NASS crop production questionnaire. Questionnaire design is discussed in Section 4.4.3.

It is important that the accuracy of information received through mail and phone surveys be checked. Inaccurate or incomplete responses to questions on mail and/or telephone surveys commonly result from survey respondents misinterpreting questions and thus providing misleading information, not including all relevant information in their responses, not wanting to provide some types of information, or deliberately providing some inaccurate responses. Therefore, the accuracy of information received through mail and phone surveys should be checked by selecting a subsample of the farmers surveyed and conducting follow-up farm visits.

4.4.2 Cost

Cost can be an important consideration when selecting an evaluation method. Farm visits can cost several hundred dollars per farm visited, depending on the type of farming involved, the information to be collected, and the number of evaluators used. Mail and/or telephone surveys can be an inexpensive means of collecting information, but their cost must be balanced against the type and accuracy of information that can be collected through them. Other costs also need to be figured into the overall cost of mail and/or telephone surveys, including follow-up phone calls and farm visits to make up for a poor response to mailings and for accuracy checks. NASS has found that a mail survey with a telephone follow-up costs $6 to $10 per farm.
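Selecting the accuracy-check subsample can be as simple as drawing a simple random sample of respondents. A minimal sketch follows; the respondent list, the subsample size of 20, and the fixed seed are all hypothetical choices for illustration.

```python
import random

# Hypothetical list of 200 farms that responded by mail or telephone.
respondents = [f"farm-{i:03d}" for i in range(1, 201)]

# Draw a simple random subsample for follow-up accuracy-check visits.
# A fixed seed is used here only so the example is reproducible.
rng = random.Random(42)
subsample = rng.sample(respondents, k=20)

print(sorted(subsample))
```

Sample size calculations for such subsamples follow the same logic as those for the survey itself (Section 2.3).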
Farm visits can cost several hundred dollars per farm depending on the complexity of the operation and the desired information. Additionally, the cost of questionnaire design must be considered, as a well-designed questionnaire is extremely important to the success of self-evaluations. Questionnaire design is discussed in the next section.

The number of evaluators used for farm visits has an obvious impact on the cost of a MM or BMP implementation survey. Survey costs can be minimized by having one or two evaluators visit farms instead of having multiple-person teams visit each farm. If the expertise of many specialists is desired, it might be cost-effective to have multiple-person teams check the quality of evaluations conducted by one or two evaluators. This can usually be done at a subsample of farms after they have been surveyed. An important factor to consider when determining the number of evaluators to include on farm visitation teams, and how to balance the use of one or two evaluators versus multiple-person teams, is the objectives of the survey. Cost notwithstanding, the teams conducting the expert evaluations must be sufficient to meet the objectives of the survey, and if the required teams would be too costly, then the objectives of the survey might need to be modified.

Another factor that contributes to the cost of a MM or BMP implementation survey is the number of farms to be surveyed. Once again, a balance must be reached between cost, the objectives of the survey, and the number of farms to be evaluated. Generally, once the objectives of the study have been specified, the number of farms to be evaluated is determined statistically to meet required data quality objectives. If the number of farms that is determined in this way would be too costly, then it would be necessary to modify the study objectives or the data quality objectives. Statistical determination of the number of farms to evaluate is discussed in Section 2.3.

Animal Waste Management Survey

Purpose of Survey: To determine conformity with the following criteria for the control of runoff from confined animal facilities (states would put their standards here):

Limit the discharge from the confined animal facility to surface waters by:
(1) Storing both the facility wastewater and the runoff from confined animal facilities that are caused by storms up to and including a 25-year, 24-hour frequency storm. Storage structures should:
    (a) Have an earthen lining or plastic membrane lining, or
    (b) Be constructed with concrete, or
    (c) Be a storage tank.
(2) Managing stored runoff and accumulated solids from the facility through an appropriate waste utilization system.

Population of interest: Farms in the coastal zone with new or existing confined animal facilities that contain the following number of animals or more:

    Animal Type         Number    Animal Units
    Beef Cattle         300       300
    Stables (horses)    200       400
    Dairy Cattle        70        98
    Layers              15,000    150¹ 495²
    Broilers            15,000    150¹ 495²
    Turkeys             13,750    2,475
    Swine               200       80

    ¹ If facility has a liquid manure system.
    ² If facility has continuous overflow watering.

Facilities that have been required by federal regulation 40 CFR 122.23 to obtain an NPDES discharge permit are excluded.

Level of analysis: States should determine the level of analysis necessary.

Items of interest: These may vary depending on the type of facilities found within a state and the state's program for addressing this issue.

Land Use and Ownership
    Total acres operated nnn
    Land owned nnn
    Land rented nnn

Demographic Characteristics of Farm Operators
    Years farming nnn
    Years farming this operation nnn
    Years of formal education nnn
    Age nnn

Peak Number of Livestock
    Beef cattle nnn
    Horses nnn
    Dairy cattle nnn
    Layers (in facility with liquid manure system) nnn
    Layers (in facility with continuous overflow watering) nnn
    Broilers (in facility with liquid manure system) nnn
    Broilers (in facility with continuous overflow watering) nnn
    Turkeys nnn
    Swine nnn

Animal Waste Management Practices
    Do you have a facility for wastewater and runoff from your animal operation? y/n
    Did an engineer, extension agent, or other professional assist in the design of the facility? y/n/na
    Was the facility designed to accommodate the peak amount of waste entering it? y/n/na
    Does the facility store both the wastewater and runoff caused by a 25-year, 24-hour frequency storm? y/n/na/unknown
    Does the facility have an earthen lining or plastic membrane? y/n/na
    Is the facility constructed with concrete? y/n/na
    Is the facility a storage tank? y/n/na
    Are the stored runoff and accumulated solids used as fertilizer? y/n/na
    If yes, what type of system is used? nnnnnnnnnnnnnn

Figure 4-2. Sample draft survey for confined animal facility management evaluation.

4.4.3 Questionnaire Design

Many books have been written on the design of data collection forms and questionnaires (e.g., Churchill, 1983; Ferber et al., 1964; Tull and Hawkins, 1990), and these can provide good advice for the creation of simple questionnaires that will be used for a single survey. For complex questionnaires or ones that will be used for initial surveys as part of a series of surveys (i.e., trend analysis), however, it is strongly advised that a professional in questionnaire design be consulted. Although designing a questionnaire might seem a simple task, small details such as the order of questions, the selection of one word or phrase over a similar one, and the tone of the questions can significantly affect survey results. A professionally designed questionnaire can yield information beyond that contained in the responses to the questions themselves, while a poorly designed questionnaire can invalidate the results.
The objective of a questionnaire, which should be closely related to the objectives of the survey, should be extremely well thought out before the questionnaire is designed. Questionnaires should also be designed at the same time as the information to be collected is selected to ensure that the questions address the objectives as precisely as possible. Conducting these activities simultaneously also provides immediate feedback on the attainability of the objectives and the detail of information that can be collected. For example, an investigator might want information on the extent of grazing in riparian areas, but might discover while designing the questionnaire that the desired information could not be obtained through the use of a questionnaire, or that the information that could be collected would be insufficient to fully address the chosen objectives. In such a situation the investigator could revise the objectives and questions before going further with questionnaire design.

Tull and Hawkins (1990) identified seven major elements of questionnaire construction:

1. Preliminary decisions
2. Question content
3. Question wording
4. Response format
5. Question sequence
6. Physical characteristics of the questionnaire
7. Pretest and revision

Preliminary decisions include determining exactly what type of information is required, determining the target audience, and selecting the method of communication (e.g., mail, telephone, farm visit). These subjects are addressed in other sections of this guidance.

The second step is to determine the content of the questions. Each question should generate one or more of the information requirements identified in the preliminary decisions, and the ability of each question to elicit the necessary data needs to be assessed. "Double-barreled" questions, in which two or more questions are asked as one, should be avoided.
Questions that require the respondent to aggregate several sources of information should be subdivided into several specific questions or parts. The ability of the respondent to answer accurately should also be considered when preparing questions. Some respondents might be unfamiliar with the type of information requested or the terminology used, might have forgotten some of the information of interest, or might be unable to verbalize an answer.

Consideration should also be given to the willingness of respondents to answer the questions accurately. If a respondent feels that a particular answer might be embarrassing or personally harmful (e.g., might lead to fines or increased regulation), he or she might refuse to answer the question or might deliberately provide inaccurate information. For this reason, answers to questions that might lead to such responses should be checked for accuracy whenever possible.

The next step is the specific phrasing of the questions. Simple, easily understood language is preferred, and the wording should not bias the answer or be too subjective. For instance, a question should not ask if grazing in riparian areas is a problem on the farm. Instead, a series of questions could ask if cattle are kept on the farm, if the farm has any riparian areas (which should be defined), if any means are provided along the riparian areas to exclude grazing animals, and what those means are. These questions all request factual information that a farmer should have, and they progress from simple to more complex. All alternatives and assumptions should be clearly stated on the questionnaire, and the respondent's frame of reference should be considered.

Fourth, the type of response format should be selected. Various types of information can best be obtained using open-ended, multiple-choice, or dichotomous questions. An open-ended question allows respondents to answer in any way they feel is appropriate.
Multiple-choice questions tend to reduce some types of bias and are easier to tabulate and analyze; however, good multiple-choice questions can be more difficult to formulate. Dichotomous questions allow only two responses, such as "yes-no" or "agree-disagree." Dichotomous questions are suitable for determining points of fact, but they must be very precisely stated and must unequivocally solicit only a single piece of information.

The fifth step in questionnaire design is the ordering of the questions. The first questions should be simple to answer, objective, and interesting in order to relax the respondent. The questionnaire should move from topic to topic in a logical manner without confusing the respondent. Early questions that could bias the respondent should be avoided. There is evidence that response quality declines near the end of a long questionnaire (Tull and Hawkins, 1990); therefore, more important information should be solicited early. Before presenting the questions, the questionnaire should explain how long (on average) it will take to complete and the types of information that will be solicited. The questionnaire should not present the respondent with any surprises.

The sixth element is the physical layout of the questionnaire, which should make it easy to use and should minimize recording mistakes. The layout should clearly show the respondent all possible answers. For mail surveys, a pleasant appearance is important for securing cooperation.

The final step in the design of a questionnaire is the pretest and possible revision. A questionnaire should always be pretested with members of the target audience. This will preclude expending a large amount of effort only to discover that the questionnaire produces biased or incomplete information.
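The relative ease of tabulating closed-form responses can be seen in a short sketch. The response lists below are hypothetical, loosely modeled on the Figure 4-2 questions.

```python
from collections import Counter

# Hypothetical responses to a dichotomous question
# ("Do you have a facility for wastewater and runoff?" y/n)
# and to a multiple-choice question on waste utilization.
dichotomous = ["y", "y", "n", "y", "n", "y", "y"]
multiple_choice = ["irrigation", "spreading", "irrigation", "other",
                   "spreading", "irrigation", "spreading"]

# Closed-form answers tabulate directly into counts, ready for the
# categorical analyses described in Chapter 3, Section 3.5.
print(Counter(dichotomous))
print(Counter(multiple_choice))
```

Open-ended responses, by contrast, must first be coded into categories by hand before any such tabulation is possible, which is one reason they are more expensive to analyze.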
4.5 AERIAL RECONNAISSANCE AND PHOTOGRAPHY

Aerial reconnaissance and photography can be useful tools for gathering physical farm information quickly and comparatively inexpensively, and they are used in conservation for a variety of purposes. Aerial photography has proven helpful for agricultural conservation practice identification (Pelletier and Griffin, 1988); rangeland monitoring (BLM, 1991); terrain stratification, inventory site identification, planning, and monitoring in mountainous regions (Hetzel, 1988; Born and Van Hooser, 1988); forest regeneration assessment (Hall and Aldred, 1992); and forest inventory and analysis (Hackett, 1988). Factors such as the characteristics of what is being monitored, scale, and camera format determine how useful aerial photography can be for a particular purpose.

Pelletier and Griffin (1988) investigated the use of aerial photography for the identification of agricultural conservation practices. They found that practices that occupy a large area and have an identifiable pattern, such as contour cropping, strip cropping, terraces, and windbreaks, were readily identified even at a small scale (1:80,000), but that smaller, single-unit practices, such as sediment basins and sediment diversions, were difficult to identify at a small scale. They estimated that 29 percent of practices could be identified at a scale of 1:80,000, 45 percent at 1:30,000, 70 percent at 1:15,000, and over 90 percent at 1:10,000. Photographic scale and resolution must therefore be taken into consideration when deciding whether to use aerial photography, and a photographic scale that produces good resolution of the items of importance to the monitoring effort must be chosen.

The Bureau of Land Management (BLM) uses low-level, large-scale (1:1,000 to 1:1,500) aerial photography to monitor rangeland vegetation (BLM, 1991).
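Choosing the coarsest (smallest) scale that still meets a target identification rate can be sketched from the Pelletier and Griffin figures. The target rates in the example calls are hypothetical.

```python
# Approximate conservation-practice identification rates by photographic
# scale, from Pelletier and Griffin (1988) as summarized in the text.
# Scales are expressed by their denominators (1:80,000 -> 80000).
id_rates = {80000: 29, 30000: 45, 15000: 70, 10000: 90}

def coarsest_adequate_scale(target_pct):
    """Return the largest scale denominator (i.e., smallest, cheapest
    scale) whose identification rate meets the target, or None if no
    listed scale qualifies."""
    adequate = [d for d, pct in id_rates.items() if pct >= target_pct]
    return max(adequate) if adequate else None

print(coarsest_adequate_scale(70))   # a 1:15,000 scale suffices for 70%
print(coarsest_adequate_scale(95))   # no listed scale reaches 95%
```

Because smaller scales cover more ground per frame and cost less per acre, selecting the coarsest adequate scale is usually the economical choice.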
The agency reports that scales smaller than 1:1,500 (e.g., 1:10,000, 1:30,000) are too small to monitor the classes of land cover (shrubs, grasses and forbs, bare soil, rock) on rangeland. Born and Van Hooser (1988) found that a scale of 1:58,000 was marginal for use in forestry resource inventorying and monitoring.

Camera format must also be considered. Large-format cameras are generally preferred over small-format cameras (e.g., 35 mm) but are more costly to purchase and operate. The large negative (23 cm x 23 cm) produced by a large-format camera provides the resolution and detail necessary for accurate photo interpretation. Large-format cameras can be used from higher altitudes than small-format cameras, and the image area covered by a large-format image at a given scale (e.g., 1:1,500) is much larger than the image area captured by a small-format camera at the same scale. Small-format cameras (i.e., 35 mm) can be used for identifications that involve large features, such as mining areas, the extent of burning, and large animals in censuses, and they are less costly to purchase and use than large-format cameras, but they are limited in the altitudes from which photographs can be taken and in the resolution that they provide when enlarged (Owens, 1988). BLM recommends the use of a large-format camera because the images provide the photo interpreter with more geographical reference points, allow flexibility to increase sample plot size, and permit modest navigational errors during overflight (BLM, 1991). Also, if the photography is contracted out, most photo contractors will have large-format equipment for the purpose.
A drawback to the use of aerial photography is that conservation practices that do not meet implementation or operational standards can be indistinguishable in an aerial photograph from similar practices that do (Pelletier and Griffin, 1988). Also, practices that are defined by managerial concepts rather than physical criteria, such as irrigation water management or nutrient management, cannot be detected with aerial photographs.

Regardless of scale, format, or item being monitored, photo interpreters should receive 2 to 3 days of training on the basic fundamentals of photo interpretation and should be thoroughly familiar with the vegetation and landforms in the areas where the photographs they will be interpreting were taken (BLM, 1991). A visit to the farms in the photographs is recommended to improve correlation between the interpretation and actual farm characteristics. Generally, after a few visits and interpretations of photographs of those farms, photo interpreters will be familiar with the photographic characteristics of the vegetation in the area, and the farm visits can be reserved for verification of items in doubt. A change in the type of vegetation or physiography in the photographs normally requires new visits until photo interpreters are familiar with the characteristics of the new vegetation in the photographs.

Information on obtaining aerial photographs is available from the Farm Service Agency and the Natural Resources Conservation Service. Contact the Farm Service Agency at: USDA FSA Aerial Photography Field Office, P.O. Box 30010, Salt Lake City, UT 84130-0010; (801) 975-3503. The Farm Service Agency's Internet address is http://www.fsa.usda.gov. Contact the Natural Resources Conservation Service at: NRCS National Cartography and Geospatial Center, Fort Worth Federal Center, Bldg. 23, Room 60, P.O. Box 6567, Fort Worth, TX 76115-0567; 1-800-672-5559. NRCS's Internet address is http://www.ncg.nrcs.usda.gov.
CHAPTER 5. PRESENTATION OF EVALUATION RESULTS

5.1 INTRODUCTION

The first three chapters of this guidance presented techniques for the collection of information. Data analysis and interpretation are addressed in detail in Chapter 4 of EPA's Monitoring Guidance for Determining the Effectiveness of Nonpoint Source Controls (USEPA, 1997). This chapter provides ideas for the presentation of results.

The presentation of MM or BMP compliance survey results, whether written or oral, is an integral part of a successful monitoring study. The quality of the presentation of results is an indication of the quality of the compliance survey, and if the presentation fails to convey important information from the compliance survey to those who need the information, the compliance survey itself might be considered a failure. The quality of the presentation of results depends on at least four criteria—it must be complete, accurate, clear, and concise (Churchill, 1983). Completeness means that the presentation provides all necessary information to the audience in language that it understands; accuracy is determined by how well an investigator handles the data, phrases findings, and reasons; clarity is the result of clear and logical thinking and a precision of expression; and conciseness is the result of selecting for inclusion only that which is necessary.

Throughout the process of preparing the results of a MM or BMP compliance survey for presentation, it must be kept in mind that the study was initially undertaken to provide information for management purposes—specifically, to help make a decision (Tull and Hawkins, 1990). The presentation of results should be built around the decision that the compliance survey was undertaken to support, and the message of the presentation must be tailored to that decision.
It must be realized that there will be a time lag between the compliance survey and the presentation of the results, and the results should be presented in light of their applicability to the management decision to be made based on them. The length of the time lag is a key factor in determining this applicability. If the time lag is significant, it should be made clear during the presentation that the situation might have changed since the survey was conducted. If reliable trend data are available, the person making the presentation might be able to provide a sense of the likely magnitude of any change in the situation. If the change in status is thought to be insignificant, evidence should be presented to support this claim. For example, state that "At the time that the compliance survey was conducted, farmers were using BMPs with increasing frequency, and the lack of any changes in program implementation coupled with continued interaction with farmers provides no reason to believe that this trend has changed since that time." It would be misleading to state "The monitoring study indicates that farmers are using BMPs with increasing frequency." The validity and force of the message will be enhanced further through use of the active voice (we believe) rather than the passive voice (it is believed).

Three major factors must be considered when presenting the results of MM and BMP implementation studies: identifying the target audience, selecting the appropriate medium (printed word, speech, pictures, etc.), and selecting the most appropriate format to meet the needs of the audience.

5.2 AUDIENCE IDENTIFICATION

Identification of the audience(s) to which the results of the MM and BMP compliance survey will be presented determines the content and format of the presentation.
For results of compliance survey studies, there are typically six potential audiences:

• Interested/concerned citizens
• Farm owners and managers
• Media/general public
• Policy makers
• Resource managers
• Scientists

These audiences have different information needs, interests, and abilities to understand complex data. It is the job of the person(s) preparing the presentation to analyze these factors before preparing a presentation. The four criteria for presentation quality apply regardless of the audience. Other elements of a comprehensive presentation, such as discussion of the objectives and limitations of the study and necessary details of the method, must also be part of the presentation and must be tailored to the audience. For instance, details of the sampling plan, why the plan was chosen over others, and the statistical methods used for analysis might be of interest to other investigators planning a similar study, but they are best not included in a presentation to management. Such details should nevertheless be recorded, even if they are not part of any presentation of results, because of their value for future reference when the monitoring is repeated or similar studies are undertaken.

5.3 PRESENTATION FORMAT

Regardless of whether the results of a compliance survey are presented in writing, orally, or both, the information being presented must be understandable to the audience. Consideration of who the audience is will help ensure that the presentation is suited to its needs, and choice of the correct format for the presentation will ensure that the information is conveyed in a manner that is easy to comprehend. Most reports will have to be presented both in writing and orally. Written reports are valuable for peer review, public information dissemination, and future reference.
Oral presentations are often necessary for managers, who usually do not have time to read an entire report, need only the results of the study, and are usually not interested in the finer details of the study. Different versions of a report might well have to be written, for the public, for scientists, and for managers (i.e., an executive summary), and separate oral presentations might have to be prepared for different audiences, such as the public, farmers, managers, and scientists at a conference.

Most information can be presented most effectively in the form of tables, charts, and diagrams (Tull and Hawkins, 1990). These graphic forms of data and information presentation can help simplify the presentation, making it easier for an audience to comprehend than if explained exhaustively with words. Words are important for pointing out significant ideas or findings and for interpreting the results where appropriate. Words should not be used to repeat what is already adequately explained in graphics, and slides or transparencies that are composed largely of words should contain only a few essential ideas each. Presentation of too much written information on a single slide or transparency only confuses the audience. Written slides or transparencies should also be free of decorative graphics, such as clever logos or background highlights, unless the pictures are essential to understanding the information presented, since they only make the slides or transparencies more difficult to read. Examples of graphics and written slides are presented in Figures 5-1 through 5-3.

Different types of graphics have different uses as well. Information presented in a tabular format can be difficult to interpret because the reader has to spend some time with the information to extract the essential points from it.
The same information presented in a pie chart or bar graph can convey essential information immediately and avoid the inclusion of background data that are not essential to the point. When preparing information for a report, an investigator should organize the information in various ways and choose the form that conveys only the information essential for the audience in the least complicated manner.

5.3.1 Written Presentations

The following criteria should be considered when preparing written material:

• Reading level or level of education of the target audience.
• Level of detail necessary to make the results understandable to the target audience. Different audiences require various levels of background information to fully understand the study's results.
• Layout. The integration of text, graphics, color, white space, columns, sidebars, and other design elements is important in the production of material that the target audience will find readable and visually appealing.
• Graphics. Photos, drawings, charts, tables, maps, and other graphic elements can be used to effectively present information that the reader might otherwise not understand.

5.3.2 Oral Presentations

An effective oral presentation requires special preparation. Tull and Hawkins (1990) recommend three steps:

1. Analyze the audience, as explained above;
2. Prepare an outline of the presentation, and preferably a written script;
3. Rehearse it. Several dry runs of the presentation should be made, and if

Leading Sources of Water Quality Impairment in Various Types of Water Bodies

RANK  RIVERS                 LAKES                  ESTUARIES
1     Agriculture            Agriculture            Urban Runoff
2     STPs                   STPs                   STPs
3     Habitat Modification   Urban Runoff           Agriculture
4     Urban Runoff           Other NPS              Industry Point Sources
5     Resource Extraction    Habitat Modification   Petroleum Activities

Figure 5-1. Example of presentation of information in a written slide.
(Source: USEPA, 1995)

possible the presentation should be recorded on videotape and analyzed. These steps are extremely important if an oral presentation is to be effective. Remember that oral presentations of ½ to 1 hour are often all that is available for the presentation of the results of months of research to managers who are poised to make decisions based on the presentation. Adequate preparation is essential if the oral presentation is to accomplish its purpose.

5.4 FOR FURTHER INFORMATION

The provision of specific examples of effective and ineffective presentation graphics, writing styles, and organization is beyond the scope of this document. A number of resources that contain suggestions for how study results should be presented are available, however, and should be consulted. A listing of some references is provided below.

• The New York Public Library Writer's Guide to Style and Usage (NYPL, 1994) has information on design, layout, and presentation in addition to guidance on grammar and style.
• Good Style: Writing for Science and Technology (Kirkman, 1992) provides techniques for presenting technical material in a coherent, readable style.
• The Modern Researcher (Barzun and Graff, 1992) explains how to turn research into readable, well-organized writing.

[Figure 5-2. Example of representation of data using a combination of a pie chart and a horizontal bar chart. (Source: USEPA, 1995) The pie chart shows river miles impaired by agriculture; the bar chart shows the percent of river miles impacted by agricultural subcategories, including nonirrigated crop production, irrigated crop production, rangeland, feedlots, pastureland, and animal holding areas.]

[Figure 5-3. Example of representation of data in the form of a pie chart, showing leading sources of pollution by the relative quantity of lake acres affected by each source, including municipal point sources.]
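The chapter's point that a chart conveys a ranking faster than a table can be made concrete in software. The sketch below (Python, standard library only) renders ranking data as a horizontal text bar chart in the spirit of Figure 5-2; the percentage values are hypothetical, invented for illustration, and are not taken from USEPA (1995).

```python
# Render impairment-source data as a horizontal text bar chart.
# The percentages below are hypothetical illustration values only.
sources = [
    ("Agriculture",          60),
    ("STPs",                 17),
    ("Habitat Modification", 11),
    ("Urban Runoff",          7),
    ("Resource Extraction",   5),
]

def bar_chart(rows, width=40):
    """Return a text bar chart; bar length is proportional to the value."""
    top = max(value for _, value in rows)
    lines = []
    for label, value in rows:
        bar = "#" * round(width * value / top)
        lines.append(f"{label:<22} {bar} {value}%")
    return "\n".join(lines)

print(bar_chart(sources))
```

A reader extracts the ranking from the bar lengths at a glance, whereas the same five numbers in a table require line-by-line comparison, which is the chapter's argument for graphic presentation.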
• Writing with Precision: How to Write So That You Cannot Possibly Be Misunderstood, 6th ed. (Bates, 1993) addresses communication problems of the 1990s.
• Designer's Guide to Creating Charts & Diagrams (Holmes, 1991) gives tips for combining graphics with statistical information.
• The Elements of Graph Design (Kosslyn, 1993) shows how to create effective displays of quantitative data.

REFERENCES

Academic Press. 1992. Dictionary of Science and Technology. Academic Press, Inc., San Diego, California.

Barzun, J., and H.F. Graff. 1992. The Modern Researcher. 5th ed. Houghton Mifflin.

Bates, J. 1993. Writing with Precision: How to Write So That You Cannot Possibly Be Misunderstood. 6th ed. Acropolis.

Blalock, H.M., Jr. 1979. Social Statistics. Rev. 2nd ed. McGraw-Hill Book Company, New York, NY.

BLM. 1991. Inventory and Monitoring Coordination: Guidelines for the Use of Aerial Photography in Monitoring. Technical Report TR 1734-1. Department of the Interior, Bureau of Land Management.

Born, J.D., and D.D. Van Hooser. 1988. Intermountain Research Station remote sensing use for resource inventory, planning, and monitoring. In Remote Sensing for Resource Inventory, Planning, and Monitoring. Proceedings of the Second Forest Service Remote Sensing Applications Conference, Slidell, Louisiana, and NSTL, Mississippi, April 11-15, 1988.

Casley, D.J., and D.A. Lury. 1982. Monitoring and Evaluation of Agriculture and Rural Development Projects. The Johns Hopkins University Press, Baltimore, MD.

Churchill, G.A., Jr. 1983. Marketing Research: Methodological Foundations. 3rd ed. The Dryden Press, New York, New York.

Cochran, W.G. 1977. Sampling Techniques. 3rd ed. John Wiley and Sons, New York, New York.

Cross-Smiecinski, A., and L.D. Stetzenback. 1994. Quality Planning for the Life Science Researcher: Meeting Quality Assurance Requirements. CRC Press, Boca Raton, Florida.

CTIC. 1994. 1994 National Crop Residue Management Survey.
Conservation Technology Information Center, West Lafayette, IN.

CTIC. 1995. Conservation IMPACT, vol. 13, no. 4, April 1995. Conservation Technology Information Center, West Lafayette, IN.

Ferber, R., D.F. Blankertz, and S. Hollander. 1964. Marketing Research. The Ronald Press Company, New York, NY.

Freund, J.E. 1973. Modern Elementary Statistics. Prentice-Hall, Englewood Cliffs, New Jersey.

Gaugush, R.F. 1987. Sampling Design for Reservoir Water Quality Investigations. Instruction Report E-87-1. Department of the Army, U.S. Army Corps of Engineers, Washington, DC.

Gilbert, R.O. 1987. Statistical Methods for Environmental Pollution Monitoring. Van Nostrand Reinhold, New York, NY.

Hackett, R.L. 1988. Remote sensing at the North Central Forest Experiment Station. In Remote Sensing for Resource Inventory, Planning, and Monitoring. Proceedings of the Second Forest Service Remote Sensing Applications Conference, Slidell, Louisiana, and NSTL, Mississippi, April 11-15, 1988.

Hall, R.J., and A.H. Aldred. 1992. Forest regeneration appraisal with large-scale aerial photographs. The Forestry Chronicle 68(1):142-150.

Helsel, D.R., and R.M. Hirsch. 1995. Statistical Methods in Water Resources. Elsevier, Amsterdam.

Hetzel, G.E. 1988. Remote sensing applications and monitoring in the Rocky Mountain region. In Remote Sensing for Resource Inventory, Planning, and Monitoring. Proceedings of the Second Forest Service Remote Sensing Applications Conference, Slidell, Louisiana, and NSTL, Mississippi, April 11-15, 1988.

Holmes, N. 1991. Designer's Guide to Creating Charts & Diagrams. Watson-Guptill.

Hook, D., W. McKee, T. Williams, B. Baker, L. Lundquist, R. Martin, and J. Mills. 1991. A Survey of Voluntary Compliance of Forestry BMPs. South Carolina Forestry Commission, Columbia, SC.

IDDHW. 1993. Forest Practices Water Quality Audit 1992. Idaho Department of Health and Welfare, Division of Environmental Quality, Boise, ID.

Kirkman, J. 1992.
Good Style: Writing for Science and Technology. Chapman and Hall.

Kosslyn, S.M. 1993. The Elements of Graph Design. W.H. Freeman.

Kupper, L.L., and K.B. Hafner. 1989. How appropriate are popular sample size formulas? Am. Stat. 43:101-105.

MacDonald, L.H., A.W. Smart, and R.C. Wissmar. 1991. Monitoring Guidelines to Evaluate the Effects of Forestry Activities on Streams in the Pacific Northwest and Alaska. EPA/910/9-91-001. U.S. Environmental Protection Agency Region 10, Seattle, WA.

Mann, H.B., and D.R. Whitney. 1947. On a test of whether one of two random variables is stochastically larger than the other. Annals of Mathematical Statistics 18:50-60.

McNew, R.W. 1990. Sampling and estimating compliance with BMPs. In Workshop on Implementation Monitoring of Forestry Best Management Practices, Southern Group of State Foresters, USDA Forest Service, Southern Region, Atlanta, GA, January 23-25, 1990, pp. 86-105.

Meals, D.W. 1988. LaPlatte River Watershed Water Quality Monitoring & Analysis Program. Program Report No. 10. Vermont Water Resources Research Center, School of Natural Resources, University of Vermont, Burlington, VT.

NYPL. 1994. The New York Public Library Writer's Guide to Style and Usage. A Stonesong Press book. HarperCollins Publishers, New York, NY.

Owens, T. 1988. Using 35mm photographs in resource inventories. In Remote Sensing for Resource Inventory, Planning, and Monitoring. Proceedings of the Second Forest Service Remote Sensing Applications Conference, Slidell, Louisiana, and NSTL, Mississippi, April 11-15, 1988.

Pelletier, R.E., and R.H. Griffin. 1988. An evaluation of photographic scale in aerial photography for identification of conservation practices. J. Soil Water Conserv. 43(4):333-337.

Rashin, E., C. Clishe, and A. Loch. 1994. Effectiveness of forest road and timber harvest best management practices with respect to sediment-related water quality impacts. Interim Report No. 2.
Washington State Department of Ecology, Environmental Investigations and Laboratory Services Program, Watershed Assessments Section. Ecology Publication No. 94-67. Olympia, Washington.

Remington, R.D., and M.A. Schork. 1970. Statistics with Applications to the Biological and Health Sciences. Prentice-Hall, Englewood Cliffs, New Jersey.

Rossman, R., and M.J. Phillips. 1991. Minnesota Forestry Best Management Practices Implementation Monitoring: 1991 Forestry Field Audit. Minnesota Department of Natural Resources, Division of Forestry.

Schultz, B. 1992. Montana Forestry Best Management Practices Implementation Monitoring: The 1992 Forestry BMP Audits Final Report. Montana Department of State Lands, Forestry Division, Missoula, MT.

Snedecor, G.W., and W.G. Cochran. 1980. Statistical Methods. 7th ed. The Iowa State University Press, Ames, Iowa.

Tull, D.S., and D.I. Hawkins. 1990. Marketing Research: Measurement and Method. 5th ed. Macmillan Publishing Company, New York, New York.

USDA. 1994a. 1992 National Resources Inventory. U.S. Department of Agriculture, Natural Resources Conservation Service, Resources Inventory and Geographical Information Systems Division, Washington, DC.

USDA. 1994b. Agricultural Resources and Environmental Indicators. Agricultural Handbook No. 705. U.S. Department of Agriculture, Economic Research Service, Natural Resources and Environmental Division, Herndon, VA.

USDA. Undated. Preparing Statistics for Agriculture. U.S. Department of Agriculture, National Agricultural Statistics Service, Washington, DC.

USDOC. 1994. 1992 Census of Agriculture. U.S. Department of Commerce, Bureau of the Census. U.S. Government Printing Office, Washington, DC.

USEPA. 1993a. Guidance Specifying Management Measures for Sources of Nonpoint Pollution in Coastal Waters. EPA 840-B-92-002. U.S. Environmental Protection Agency, Office of Water, Washington, DC.

USEPA. 1993b. Evaluation of the Experimental Rural Clean Water Program.
EPA 841-R-93-005. U.S. Environmental Protection Agency, Office of Water, Washington, DC.

USEPA. 1995. National Water Quality Inventory: 1994 Report to Congress. EPA 841-R-95-005. U.S. Environmental Protection Agency, Office of Water, Washington, DC.

USEPA. 1997. Monitoring Guidance for Determining the Effectiveness of Nonpoint Source Controls. EPA 841-B-96-004. U.S. Environmental Protection Agency, Office of Water, Washington, DC. August.

USGS. 1990. Land Use and Land Cover Digital Data from 1:250,000- and 1:100,000-Scale Maps: Data Users Guide. National Mapping Program Technical Instructions Data Users Guide 4. U.S. Department of the Interior, U.S. Geological Survey, Reston, VA.

Wilcoxon, F. 1945. Individual comparisons by ranking methods. Biometrics Bulletin 1:80-83.

Winer, B.J. 1971. Statistical Principles in Experimental Design. McGraw-Hill Book Company, New York.

GLOSSARY

accuracy: the extent to which a measurement approaches the true value of the measured quantity

aerial photography: the practice of taking photographs from an airplane, helicopter, or other aviation device while it is airborne

allocation, Neyman: stratified sampling in which the unit cost of sampling is similar in each stratum but variability differs between strata, so samples are allocated in proportion to both stratum size and stratum variability

allocation, proportional: stratified sampling in which the variability and unit cost of sampling are similar in each stratum, so samples are allocated in proportion to stratum size alone

allowable error: the level of error acceptable for the purposes of a study

ANOVA: see analysis of variance

analysis of variance: a statistical test used to determine whether two or more sample means could have been obtained from populations with the same parametric mean

assumptions: characteristics of a population or of a sampling method taken to be true without proof

bar graph: a representation of data wherein data are grouped and represented as vertical or horizontal bars over an axis

best professional judgement: an informed opinion made by a professional in the
appropriate field of study or expertise

best management practice: a practice or combination of practices determined to be the most effective and practicable means of controlling point and nonpoint pollutants at levels compatible with environmental quality goals

bias: a characteristic of samples such that, when they are taken from a population with a known parameter, their average does not give the parametric value

binomial: an algebraic expression that is the sum or difference of two terms

camera format: the size of the negative taken by a camera; 35mm is a small camera format

chi-square distribution: a scaled quantity whose distribution provides the distribution of the sample variance

coefficient of variation: a statistical measure used to compare the relative amounts of variation in populations having different means

confidence interval: a range of values about a measured value in which the true value is presumed to lie

conservation tillage: a method of conservation in which plant material is left on the ground after harvest to control erosion

consistency: conforming to a regular method or style; an approach that keeps all factors of measurement similar from one measurement to the next to the extent possible

contour farming: a farming method in which fields are tilled along the topographic contours of the land

cumulative effects: the total influences attributable to numerous individual influences

degrees of freedom: the number of residuals (the differences between measured values and the sample average) required to completely determine the others

design, balanced: a sampling design wherein the separate sets of data to be used are similar in quantity and type

distribution: the allocation or spread of values of a given parameter among its possible values

e-mail: an electronic system for correspondence

erosion potential: a measure of the ease with which soil can be carried away in storm water runoff or irrigation runoff

error: the fluctuation
that occurs from one repetition to another; also experimental error

estimate, baseline: an estimate of baseline, or actual, conditions

estimate, pooled: a single estimate obtained by grouping individual estimates and using them to obtain a single value

finite population correction term: a correction term used when the sample size is not small relative to the population size

Friedman test: a nonparametric test that can be used for analysis when two variables are involved

hydrologic modification: the alteration of the natural circulation or distribution of water by the placement of structures or other activities

hypothesis, alternative: the hypothesis that is contrary to the null hypothesis

hypothesis, null: the hypothesis or conclusion assumed to be true prior to any analysis

Internet: an electronic data transmission system

Kruskal-Wallis test: a nonparametric test recommended for the general case of several independent samples, possibly with different numbers of variates per sample

management measure: an economically achievable measure for the control of the addition of pollutants from existing and new categories and classes of nonpoint sources of pollution, which reflects the greatest degree of pollutant reduction achievable through the application of the best available nonpoint pollution control practices, technologies, processes, siting criteria, operating methods, or other alternatives

Mann-Whitney test: a nonparametric test for use when a test is made between only two samples

mean, estimated: a value of the population mean arrived at through sampling

mean, overall: the measured average of a population

mean, stratum: the measured average within a sample subgroup or stratum

measurement bias: a consistent under- or overestimation of the true value of something being measured, often due to the method of measurement

measurement error: the deviation of a measurement from the true value of that which is being measured

median: the value of the middle term when data are arranged in order of size; a measure of
central tendency

monitoring, baseline: monitoring conducted to establish initial knowledge about the actual state of a population

monitoring, compliance: monitoring conducted to determine whether those who must implement programs, best management practices, or management measures, or who must conduct operations according to standards or specifications, are doing so

monitoring, project: monitoring conducted to determine the impact of a project, activity, or program

monitoring, validation: monitoring conducted to determine how accurately a model reflects reality

navigational error: error in determining the actual location (altitude or latitude/longitude) of an airplane or other aviation device due to instrumentation or the operator

nominal: referred to by name; describing variables that cannot be measured but must be expressed qualitatively

nonparametric method: distribution-free method; any of various inferential procedures whose conclusions do not rely on assumptions about the distribution of the population of interest

normal approximation: an assumption that a population has the characteristics of a normally distributed population

normal deviate: deviation from the mean expressed in units of the standard deviation

nutrient management plan: a plan for managing the quantity of nutrients applied to crops to achieve maximum plant nutrition and minimum nutrient waste

ordinal: ordered such that the position of an element in a series is specified

parametric method: any statistical method whose conclusions rely on assumptions about the distribution of the population of interest

physiography: a description of the surface features of the Earth; a description of landforms

pie chart: a representation of data wherein data are grouped and represented as wedge-shaped sections of a circle, with the entire circle representing the total

population, sample: the members of a population that are actually sampled or measured

population, target: the population about which inferences are made; the group
of interest, from which samples are taken

population unit: an individual member of a target population that can be measured independently of other members

power: the probability of correctly rejecting the null hypothesis when the alternative hypothesis is true

precision: a measure of the similarity of individual measurements of the same population

question, dichotomous: a question that allows for only two responses, such as "yes" and "no"

question, double-barreled: two questions asked as a single question

question, multiple-choice: a question with two or more predetermined responses

question, open-ended: a question format that requires a response beyond "yes" or "no"

remote sensing: methods of obtaining data from a location distant from the object being measured, such as from an airplane or satellite

resolution: the sharpness of a photograph

sample size: the number of population units measured

sampling, cluster: sampling in which small groups of population units are selected for sampling and each unit in each selected group is measured

sampling, simple random: sampling in which each unit of the target population has an equal chance of being selected

sampling, stratified random: sampling in which the target population is divided into separate subgroups, each of which is more internally similar than the overall population is, prior to sample selection

sampling, systematic: sampling in which population units are chosen in accordance with a predetermined sample selection system

sampling error: error attributable to actual variability in population units not accounted for by the sampling method

scale (aerial photography): the proportion of the image size of an object (such as a land area) to its actual size, e.g., 1:3000.
The smaller the second number, the larger the scale.

scale system: a system for ranking measurements or members of a population on a scale, such as 1 to 5

significance level: the probability of making a Type I error; the probability of rejecting a null hypothesis that is in fact true

standard deviation: a measure of spread; the positive square root of the variance

standard error: an estimate of the standard deviation of means that would be expected if a collection of means based on equal-sized samples of n items from the same population were obtained

statistical inference: conclusions drawn about a population using statistics

statistics, descriptive: measurements of population characteristics designed to summarize important features of a data set

stratification: the process of dividing a population into internally similar subgroups

stratum: one of the subgroups created prior to sampling in stratified random sampling

streamside management area: a designated area that consists of a waterbody (e.g., a stream) and an adjacent area of varying width where management practices that might affect water quality, fish, or other aquatic resources are modified to protect the waterbody and its adjacent resources and to reduce the pollution effect of an activity on the waterbody

Student's t test: a statistical test used to test for significant differences between means when only two samples are involved

subjectivity: a characteristic of analysis that requires personal judgement on the part of the person doing the analysis

target audience: the group of people for whom a report or presentation is intended

tillage: the operation of implements through the soil to prepare seedbeds and rootbeds, control weeds and brush, aerate the soil, and cause faster breakdown of organic matter and minerals to release plant foods

total maximum daily load: the total allowable addition of pollutants from all affecting sources to an individual waterbody over a 24-hour period

transformation, data: manipulation of data such that
it will meet the assumptions required for analysis

Tukey's test: a test to ascertain whether the interaction found in a given set of data can be explained in terms of multiplicative main effects

unit sampling cost: the cost attributable to sampling a single population unit

variance: a measure of the spread of data around the mean

watershed assessment: an investigation of numerous characteristics of a watershed in order to describe its actual condition

Wilcoxon's test: a nonparametric test for use when a test is made between only two samples

INDEX

accuracy 2-10, 4-14
allocation
  Neyman 2-25, 2-27
  proportional 2-25
analysis of variance 3-4
  rank-transformed 3-4
best professional judgement 2-2
bias, see error
BMP
  pass/fail rating system 4-9
  scale rating system 4-9
BMP implementation assessments
  site-specific 1-2
  watershed 1-2
camera format 4-20
Census of Agriculture 2-13, 2-15, 2-16
Clean Water Act
  Section 303(d) 1-2
  Section 319(h) 1-2
Coastal Nonpoint Pollution Control Program 1-1
Coastal Zone Act Reauthorization Amendments of 1990 1-1
  Section 6217(b) 1-2
  Section 6217(d) 1-2
  Section 6217(g) 1-2
complaint records 2-16
Computer-aided Management Practices System 2-17
Conservation Technology Information Center 2-28
consistency 4-8, 4-12
Cooperative Extension Service 2-16
cost of evaluations 4-17
County Transect Survey 2-28
County X example 2-22, 2-24, 2-25
data
  accessibility 1-5, 1-6
  electronic storage 1-6
  historical 2-18
  life cycle 1-5
  longevity 1-5
  management 1-5
  reliability 1-5
  transformation 4-10
Economic Research Service, USDA 2-14
error 2-8
  due to nonrespondents 2-10
  measurement 2-8
  reducing 2-10
  sampling 2-10
  Type I 2-12
  Type II 2-12
estimate
  point 2-11
  pooled 3-3
estimation 2-11
evaluations
  expert 4-1, 4-7
  information obtainable from 4-1
  mock 4-8
  self 4-1, 4-13
  site 4-7
  teams 4-7
  training for 4-8
  variable selection 4-4
  variables 4-2
farm numbers, USDA 2-16
Farm Service Agency 2-17, 4-21
  Aerial Photography Field Office 4-21
  Field Office Computing System 2-17
finite population correction term 2-18
Friedman test 3-4
hypothesis
  alternative 2-12
  null 2-12
hypothesis testing 2-12
implementation rating 4-9
interviews, personal 4-1
Kruskal-Wallis test 3-4
Land Maps, county 2-16
Land Use and Land Cover, USGS 2-16
management measures 1-2
Mann-Whitney test 3-2
monitoring 1-3
  and CNPCPs 1-3
  baseline 1-4
  compliance 1-4
  effectiveness 1-4
  implementation 1-3, 2-1
  objectives 1-4, 2-1
  project 1-4
  trend 1-3
  uses 1-4
  validation 1-4
Monitoring Guidance for Determining the Effectiveness of Nonpoint Source Control Measures 1-4, 1-5, 2-1, 5-1
National Agricultural Statistics Service 2-16, 4-14
National Crop Residue Management Survey 2-28
National Oceanic and Atmospheric Administration 1-1
National Resources Inventory 2-15
Natural Resources Conservation Service 4-21
nonpoint source pollution, sources 1-1
photographs 4-13
  aerial 2-18
photography
  aerial 4-1, 4-20
  resolution 4-20
  scale 4-20
population
  assumptions about 2-8
  sample, definition 2-2
  target, definition 2-2
  units, definition 2-2
  variation 2-8
precision 2-10, 2-19
presentations 5-1
  and time lag 5-1
  audience 5-2
  criteria 5-1
  format 5-2
  graphics 5-3
  major factors 5-2
  oral 5-2, 5-3
  resources 5-4
  written 5-2, 5-3
quality assurance and quality control 1-4, 4-12
quality assurance project plan 1-4
questionnaires
  content 4-18
  design 4-17
  dichotomous 4-19
  elements 4-18
  layout 4-19
  multiple-choice 4-19
  objective 4-18
  open-ended 4-19
  ordering of questions 4-19
  phrasing 4-19
  pretest 4-19
  response format 4-19
rating systems
  binary 4-9
  consistency 4-10
  overall rating 4-11
  scale 4-9
  terms 4-10
sample size, estimation 2-18
sampling
  cluster 2-5, 2-27
  per unit cost 2-25
  probabilistic 2-2
  simple random 2-3, 2-20
  strategy 2-13
  stratified random 2-3, 2-24
  systematic 2-8, 2-27
  timing 2-13
scale, appropriate 1-3
standard deviation, pooled 3-2
statistical inference 2-2
statistics
  confidence interval 2-11
  descriptive 2-11
  difference quantity 3-2
  overall mean 2-25
  parametric 2-19
  relative error 2-20
  significance level 2-12
  software 3-1
  stratum mean 2-25
Student's t test 2-21, 3-2
  two-sample 3-2
surveys
  accuracy of information 4-14
  mail 4-1, 4-14
  telephone 4-1, 4-14
tests
  one-sided, hypotheses 3-1
  two-sided, hypotheses 3-1
Tukey's test 3-4
U.S. Environmental Protection Agency 1-1
Wilcoxon's test 3-3

[Note: Italicized page numbers indicate location of definitions of terms.]

APPENDIX A
Statistical Tables

Table A1. Cumulative areas under the Normal distribution (values of p corresponding to Zp)
Table A2. Percentiles of the t(alpha, df) distribution (values of t such that 100(1-alpha)% of the distribution is less than t)

[Table A2 lists t for df = 1 through 30, 35, 40, 50, 60, 80, 100, 150, 200, and infinity, at alpha = 0.40, 0.30, 0.20, 0.10, 0.05, 0.025, 0.010, and 0.005. For example, t = 1.8125 at alpha = 0.05 with df = 10, and t = 1.9600 at alpha = 0.025 with df = infinity.]
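The percentiles in Table A2 can likewise be computed. A sketch assuming the third-party SciPy package is available (`scipy.stats.t.ppf` is SciPy's quantile function for the t distribution; the wrapper name `t_percentile` is illustrative):

```python
from scipy.stats import t

def t_percentile(alpha, df):
    """Value of t such that 100*(1 - alpha)% of the t distribution
    with df degrees of freedom lies below it (Table A2)."""
    return t.ppf(1.0 - alpha, df)

# Spot checks against the printed table:
print(round(t_percentile(0.05, 10), 4))   # 1.8125
print(round(t_percentile(0.025, 20), 4))  # 2.086
```

For the infinite-df row, the t distribution coincides with the standard Normal, so the z-based values of Table A1 apply.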
Table A3. Upper and lower percentiles of the Chi-square distribution

[Table A3 lists chi-square values for df = 1 through 30, 35, 40, 50, 60, 70, 80, 90, 100, and 200, at left-tail areas p = 0.001, 0.005, 0.010, 0.025, 0.050, 0.100, 0.900, 0.950, 0.975, 0.990, 0.995, and 0.999. For example, chi-square = 18.307 at p = 0.950 with df = 10, and chi-square = 3.940 at p = 0.050 with df = 10.]
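The chi-square percentiles of Table A3 can be computed the same way. A sketch assuming SciPy is available (`scipy.stats.chi2.ppf` is SciPy's chi-square quantile function; the wrapper name is illustrative):

```python
from scipy.stats import chi2

def chi_square_percentile(p, df):
    """Value x such that the chi-square distribution with df degrees
    of freedom has probability p to the left of x (Table A3)."""
    return chi2.ppf(p, df)

# Spot checks against the printed table:
print(round(chi_square_percentile(0.950, 10), 3))  # 18.307
print(round(chi_square_percentile(0.050, 10), 3))  # 3.94
```

Upper percentiles (p near 1) are the critical values used in the chi-square tests of Chapter 3; the lower percentiles (p near 0) are used for two-sided interval estimates.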