United States Environmental Protection Agency
Office of Water (4503F)
EPA 841-B-97-010
September 1997
TECHNIQUES FOR TRACKING,
EVALUATING, AND REPORTING
THE IMPLEMENTATION OF
NONPOINT SOURCE CONTROL
MEASURES
AGRICULTURE
EPA 841-B-97-010
September 1997
TECHNIQUES FOR TRACKING, EVALUATING,
AND REPORTING THE IMPLEMENTATION
OF NONPOINT SOURCE CONTROL
MEASURES
I. AGRICULTURE
Final
September 1997
Prepared for
Steve Dressing
Nonpoint Source Pollution Control Branch
United States Environmental Protection Agency
Prepared by
Tetra Tech, Inc.
EPA Contract No. 68-C3-0303
Work Assignment No. 4-51
TABLE OF CONTENTS
Chapter 1 Introduction
1.1 Purpose of Guidance 1-1
1.2 Background 1-1
1.3 Types of Monitoring 1-3
1.4 Quality Assurance and Quality Control 1-4
1.5 Data Management 1-5
Chapter 2 Sampling Design
2.1 Introduction 2-1
2.1.1 Study Objectives 2-1
2.1.2 Probabilistic Sampling 2-2
2.1.3 Measurement and Sampling Errors 2-8
2.1.4 Estimation and Hypothesis Testing 2-11
2.2 Sampling Considerations 2-13
2.2.1 Farm Ownership and Size 2-13
2.2.2 Location and Other Physical Characteristics 2-14
2.2.3 Farm Type and Agricultural Practices 2-15
2.2.4 Sources of Information 2-15
2.3 Sample Size Calculations 2-18
2.3.1 Simple Random Sampling 2-20
2.3.2 Stratified Random Sampling 2-24
2.3.3 Cluster Sampling 2-27
2.3.4 Systematic Sampling 2-27
Chapter 3 Methods for Evaluating Data
3.1 Introduction 3-1
3.2 Comparing the Means from Two Independent Random Samples 3-2
3.3 Comparing the Proportions from Two Independent Samples 3-3
3.4 Comparing More Than Two Independent Random Samples 3-4
3.5 Comparing Categorical Data 3-4
Chapter 4 Conducting the Evaluation
4.1 Introduction 4-1
4.2 Choice of Variables 4-2
4.3 Expert Evaluations 4-7
4.3.1 Site Evaluations 4-7
4.3.2 Rating Implementation of Management Measures and Best
Management Practices 4-9
4.3.3 Rating Terms 4-10
4.3.4 Consistency Issues 4-12
4.3.5 Postevaluation Onsite Activities 4-13
4.4 Self-Evaluations 4-13
4.4.1 Methods 4-13
4.4.2 Cost 4-14
4.4.3 Questionnaire Design 4-17
4.5 Aerial Reconnaissance and Photography 4-19
Chapter 5 Presentation of Evaluation Results
5.1 Introduction 5-1
5.2 Audience Identification 5-2
5.3 Presentation Format 5-2
5.3.1 Written Presentations 5-3
5.3.2 Oral Presentations 5-3
5.4 For Further Information 5-4
References R-1
Glossary G-1
Index I-1
Appendix A: Statistical Tables A-1
List of Tables
Table 2-1 Applications of four sampling designs for implementation
monitoring 2-3
Table 2-2 Errors in hypothesis testing 2-12
Table 2-3 Acres of harvested cropland in Virginia from USDOC's 1992
Census of Agriculture 2-14
Table 2-4 Definitions used in sample size calculation equations 2-19
Table 2-5 Comparison of sample size as a function of various parameters 2-21
Table 2-6 Common values of (Zα + Z2β)² for estimating sample size 2-23
Table 2-7 Allocation of Samples 2-26
Table 2-8 Number of farms implementing recommended BMPs 2-28
Table 3-1 Contingency table of observed operator type and
implemented BMP 3-5
Table 3-2 Contingency table of expected operator type and implemented BMP 3-6
Table 3-3 Contingency table of implemented BMP and rating of
installation and maintenance 3-7
Table 3-4 Contingency table of implemented BMP and sample year 3-8
Table 4-1 General types of information obtainable with self-evaluations
and expert evaluations 4-3
Table 4-2 Example variables for management measure implementation analysis 4-6
List of Figures
Figure 2-1 Simple random sampling from a list and a map 2-4
Figure 2-2 Stratified random sampling from a list and a map 2-6
Figure 2-3 Cluster sampling from a list and a map 2-7
Figure 2-4 Systematic sampling from a list and a map 2-9
Figure 2-5 Graphical presentation of the relationship between bias,
precision, and accuracy 2-11
Figure 2-6 Example route for a county transect survey 2-29
Figure 4-1 Potential variables and examples of implementation
standards and specifications 4-5
Figure 4-2 Sample draft survey for confined animal facility management
evaluation 4-15
Figure 5-1 Example of presentation of information in a written slide 5-4
Figure 5-2 Example of representation of data using a combination of a pie
chart and a horizontal bar chart 5-5
Figure 5-3 Example representation of data in the form of a pie chart 5-6
CHAPTER 1. INTRODUCTION
1.1 PURPOSE OF GUIDANCE
This guidance is intended to assist state,
regional, and local environmental professionals
in tracking the implementation of best
management practices (BMPs) used to control
agricultural nonpoint source pollution.
Information is provided on methods for
selecting sites for evaluation, sample size
estimation, sampling, and results evaluation
and presentation. The focus of the guidance is
on the statistical approaches needed to
properly collect and analyze data that are
accurate and defensible. A properly designed
BMP implementation monitoring program can
save both time and money. For example, there
are over 37,000 farms in the state of Virginia.
To determine the status of BMP
implementation on each of those farms would
easily exceed most budgets and thus statistical
sampling of sites is needed. This document
provides guidance for sampling representative
farms to yield summary statistics at a fraction
of the cost of a comprehensive inventory.
Some nonpoint source projects and programs
combine BMP implementation monitoring with
water quality monitoring to evaluate the
effectiveness of BMPs at protecting water
quality (Meals, 1988; Rashin et al., 1994;
USEPA, 1993b). For this type of monitoring
to be successful, the scale of the project must
be small (e.g., a watershed of a few hundred to
a few thousand acres). Accurate records of all
the sources of pollutants of concern and a
census of how all BMPs are operating are very
important for this type of monitoring effort.
Otherwise, it can be extremely difficult to
correlate BMP implementation with changes in
stream water quality. This guidance does not
address monitoring the implementation and
effectiveness of all BMPs in a watershed. It
does, however, provide information to help
program managers gather statistically valid
information to assess implementation of BMPs
on a more general (e.g., statewide) basis. The
benefits of implementation monitoring are
presented in Section 1.3.

The focus of this guide is on the design of
monitoring programs to assess agricultural
management measure and best management
practice implementation, with particular
emphasis on statistical considerations.
1.2 BACKGROUND
Pollution from nonpoint sources—sediment
deposition, erosion, contaminated runoff,
hydrologic modifications that degrade water
quality, and other diffuse sources of water
pollution—is the largest cause of water quality
impairment in the United States (USEPA,
1995). Congress passed the Coastal Zone Act
Reauthorization Amendments of 1990
(CZARA) to help address nonpoint source
pollution in coastal waters. CZARA provides
that each state with an approved coastal zone
management program develop and submit to
the U.S. Environmental Protection Agency
(EPA) and National Oceanic and Atmospheric
Administration (NOAA) a Coastal Nonpoint
Pollution Control Program (CNPCP). State
programs must "provide for the
implementation" of management measures in
conformity with the EPA Guidance Specifying
Management Measures For Sources Of
Nonpoint Pollution In Coastal Waters,
developed pursuant to section 6217(g) of
CZARA (USEPA, 1993a). Management
measures (MMs), as defined in CZARA, are
economically achievable measures to control
the addition of pollutants to coastal waters,
which reflect the greatest degree of pollutant
reduction achievable through the application of
the best available nonpoint pollution control
practices, technologies, processes, siting
criteria, operating methods, or other
alternatives. Many of EPA's MMs are
combinations of BMPs. For example,
depending on site characteristics,
implementation of the Confined Animal Facility
MM might involve use of the following BMPs:
construction of a waste storage pond,
installation of grassed waterways, protection of
heavily-used areas, management of roof runoff,
and construction of a composting facility.
CZARA does not specifically require that
states monitor the implementation of MMs and
BMPs as part of their CNPCPs. State
CNPCPs must, however, provide for technical
assistance to local governments and the public
for implementing the MMs and BMPs. Section
6217(b) states:
Each State program . . . shall provide for
the implementation, at a minimum, of
management measures . . . and shall also
contain ... (4) The provision of technical
and other assistance to local governments
and the public for implementing the
measures . . . which may include assistance
... to predict and assess the effectiveness
of such measures ....
EPA and NOAA also have some responsibility
under section 6217 for providing technical
assistance to implement state CNPCPs.
Section 6217(d), Technical assistance, states:
[NOAA and EPA] shall provide technical
assistance ... in developing and
implementing programs. Such assistance
shall include: ... (4) methods to predict
and assess the effects of coastal land use
management measures on coastal water
quality and designated uses.
This guidance document was developed to
provide the technical assistance described in
CZARA sections 6217(b)(4) and 6217(d), but
the techniques described can be used for other
similar programs and projects. For instance,
monitoring projects funded under Clean Water
Act (CWA) section 319(h) grants, efforts to
implement total maximum daily loads
developed under CWA Section 303(d),
stormwater permitting programs, and other
programs could all benefit from knowledge of
BMP implementation.
Methods to assess the implementation of MMs
and BMPs, then, are a key focus of the
technical assistance to be provided by EPA and
NOAA. Implementation assessments can be
done on several scales. Site-specific
assessments can be used to assess individual
BMPs or MMs, and watershed assessments can
be used to look at the cumulative effects of
implementing multiple MMs. With regard to
"site-specific" assessments, individual BMPs
must be assessed at the appropriate scale for
the BMP of interest. For example, to assess
the implementation of MMs and BMPs for
animal waste handling and disposal on a farm,
only the structures, areas, and practices
implemented specifically for animal waste
management (e.g., dikes, diversions, storage
ponds, composting facility, and manure
application records) would need to be
inspected. In this instance the animal waste
storage facility would be the appropriate scale
and "site." To assess erosion control, the
proper scale might be fields over 10 acres and
the site could be 100-meter transect
measurements of crop residue. For nutrient
management, the scale and site might be an
entire farm. Site-specific measurements can
then be used to extrapolate to a watershed or
statewide assessment. It is recognized that
some studies might require a complete
inventory of MM and BMP implementation
across an entire watershed or other geographic
area.
1.3 TYPES OF MONITORING
The term monitor is defined as "to check or
evaluate something on a constant or regular
basis" (Academic Press, 1992). It is possible
to distinguish among various types of
monitoring. Two types, implementation and
trend (i.e., trends in implementation)
monitoring, are the focus of this guidance.
These types of monitoring can be used to
address the following goals:
• Determine the extent to which MMs and
BMPs are implemented in accordance with
relevant standards and specifications.
• Determine whether there has been a change
in the extent to which MMs and BMPs are
being implemented.
In general, implementation monitoring is used
to determine whether goals, objectives,
standards, and management practices are being
implemented as detailed in implementation
plans. In the context of BMPs within state
CNPCPs, implementation monitoring is used to
determine the degree to which MMs and BMPs
required or recommended by the CNPCPs are
being implemented. If CNPCPs call for
voluntary implementation of MMs and BMPs,
implementation monitoring can be used to
determine the success of the voluntary program
(1) within a given monitoring period (e.g., 1 or
2 years); (2) during several monitoring periods,
to determine any temporal trends in BMP
implementation; or (3) in various regions of the
state.
Trend monitoring involves long-term
monitoring of changes in one or more
parameters. As discussed in this guidance,
public attitudes, land use, or the use of
different agricultural practices are examples of
parameters that could be measured with trend
monitoring. For example, the Conservation
Technology Information Center tracks trends
in the implementation of different tillage
practices from year to year (CTIC, 1994).
Isolating the impacts of MMs and BMPs on
water quality requires tracking MM and BMP
implementation over time, i.e., trend
monitoring.
Because trend monitoring involves measuring a
change (or lack thereof) in some parameter
over time, it is necessarily of longer duration
and requires that a baseline, or starting point,
be established. Any changes in the measured
parameter are then detected in reference to the
baseline.
Implementation and the related trend
monitoring can be used to determine
(1) which MMs and BMPs are being
implemented, (2) whether MMs and BMPs are
being implemented as designed, and
(3) the need for increased efforts to promote or
induce use of MMs and BMPs. Data from
implementation monitoring, used in
combination with other types of data, can be
useful in meeting a variety of other objectives,
including the following (Hook et al., 1991;
IDDHW, 1993; Schultz, 1992):
• To evaluate BMP effectiveness for
protecting soil and water resources.
• To identify areas in need of further
investigation.
• To establish a reference point of overall
compliance with BMPs.
• To determine whether farmers are aware of
BMPs.
• To determine whether farmers are using the
advice of agricultural BMP experts.
• To identify any BMP implementation
problems specific to a category of farm.
• To evaluate whether any agricultural
practices cause environmental damage.
• To compare the effectiveness of alternative
BMPs.
MacDonald et al. (1991) describes additional
types of monitoring, including effectiveness
monitoring, baseline monitoring, project
monitoring, validation monitoring, and
compliance monitoring. As emphasized by
MacDonald and others, these monitoring types
are not mutually exclusive and the distinctions
among them are usually determined by the
purpose of the monitoring.
Effectiveness monitoring is used to determine
whether MMs or BMPs, as designed and
implemented, are effective in meeting
management goals and objectives.
Effectiveness monitoring is a logical follow-up
to implementation monitoring, because it is
essential that effectiveness monitoring include
an assessment of the adequacy of the design
and installation of MMs and BMPs. For
example, the objective of effectiveness
monitoring could be to evaluate the
effectiveness of MMs and BMPs as designed
and installed, or to evaluate the effectiveness
of MMs and BMPs that are designed and
installed adequately or to standards and
specifications. Effectiveness monitoring is not
addressed in this guide, but is the subject of
another EPA guidance document, Monitoring
Guidance for Determining the Effectiveness of
Nonpoint Source Controls (USEPA, 1997).
1.4 QUALITY ASSURANCE AND QUALITY
CONTROL
An integral part of the design phase of any
nonpoint source pollution monitoring project is
quality assurance and quality control (QA/QC).
Development of a quality assurance project
plan (QAPP) is the first step of incorporating
QA/QC into a monitoring project. The QAPP
is a critical document for the data collection
effort inasmuch as it integrates the technical
and quality aspects of the planning,
implementation, and assessment phases of the
project. The QAPP documents how QA/QC
elements will be implemented throughout a
project's life. It contains statements about the
expectations and requirements of those for
whom the data is being collected (i.e., the
decision maker) and provides details on
project-specific data collection and data
management procedures that are designed to
ensure that these requirements are met.
Development and implementation of a QA/QC
program, including preparation of a QAPP, can
require 10 to 20 percent of project
resources (Cross-Smiecinski and Stetzenback,
1994), but this cost is recaptured in lower
overall costs due to the project being well
planned and executed. A thorough discussion
of QA/QC is provided in Chapter 5 of EPA's
Monitoring Guidance for Determining the
Effectiveness of Nonpoint Source Controls
(USEPA, 1997).
1.5 DATA MANAGEMENT
Data management is a key component of a
successful MM or BMP implementation
monitoring effort. The data management
system that is used—which includes the quality
control and quality assurance aspects of data
handling, how and where data are stored, and
who manages the stored data—determines the
reliability, longevity, and accessibility of the
data. Provided that the data collection effort
was planned and executed well, an organized
and efficient data management system will
ensure that the data can be used with
confidence by those who must make decisions
based upon it, the data will be useful as a
baseline for similar data collection efforts in the
future, the data will not become obsolete (or be
misplaced!) quickly, and the data will be
available to a variety of users for a variety of
applications.
Serious consideration is often not given to a
data management system prior to a data
collection effort, which is precisely why it is so
important to recognize the long-term value of a
small investment of time and money in proper
data management. Data management competes
with other agency priorities for money, staff,
and time; the earlier its importance and
long-term value are recognized in a project's
development, the more likely it is to receive
sufficient funding.
Overall, data management might account for
only a small portion of a project's total budget,
but the return on the investment is great when
it is considered that the larger investment in
data collection can be rendered virtually useless
unless data is managed adequately.
Two important aspects of data that should be
considered when planning the initial data
collection effort and a data management system
are data life cycle and data accessibility. The
data life cycle can be characterized by the
following stages:
(1) Data is collected; (2) data is checked for
quality; (3) data is entered into a data base;
(4) data is used; and (5) data eventually becomes
obsolete. The expected usefulness and life
span of the data should be considered during
the initial stages of planning a data collection
effort, when the money, staff, and time that are
devoted to data collection must be weighed
against its usefulness and longevity. Data with
limited use that is likely to become
obsolete soon after it is collected is a poorer
investment decision than data with multiple
applications and a long life span. If a data
collection effort involves the collection of data
of limited use and a short life span, it might be
necessary to modify the data collection
effort—either by changing its goals and
objectives or by adding new ones—to increase
the breadth and length of the data's
applicability. A good data management system
will ensure that any data that are collected will
be useful for the greatest number of
applications for the longest possible time.
Data accessibility is a critical factor in
determining the data's usefulness. Data attains
its highest value if it is as widely accessible
as possible, if access to it requires as little
staff effort as possible, and if it can
be used by others conveniently. If data is
stored where those who might need it can
obtain it with little assistance, it is more likely
to be shared and used. The format for data
storage determines how conveniently the data
can be used. Electronic storage in a widely
available and used data storage format makes it
convenient to use. Storage as only a paper
copy buried in a report, where any analysis
requires entry into an electronic format or
time-consuming manipulation, makes data
extremely inconvenient to use and unlikely that
it will be used.
The following should be considered for the
development of a data management strategy:
• What level of quality control should the
data be subject to? Data that will be used
for a variety of purposes or that will be
used for important decisions should receive
a careful quality control check.
• Where and how will the data be stored?
The options for data storage range from a
printed final report on a bookshelf to an
electronic data base accessible to
government agencies and the public.
Determining where and how data will be
stored therefore also requires careful
consideration of the question: How
accessible should the data be?
• Who will maintain the data base? Data
stored in a large data base might be
managed by a professional data manager,
while data kept in agency files might be
managed by people with various
backgrounds over the course of time.
• How much will data management cost? As
with all other aspects of a data collection
effort, data management costs money and
this cost must be balanced with all other
costs involved in the project.
CHAPTER 2. SAMPLING DESIGN
2.1 INTRODUCTION
This chapter discusses recommended methods
for designing sampling programs to track and
evaluate the implementation of nonpoint
source control measures. This chapter does
not address sampling to determine whether the
management measures (MMs) or best
management practices (BMPs) are effective
since no water quality sampling is done.
Because of the variation in agricultural
practices and related nonpoint source control
measures implemented throughout the United
States, the approaches taken by various states
to track and evaluate nonpoint source control
measure implementation will differ.
Nevertheless, all approaches can be based on
sound statistical methods for selecting
sampling strategies, computing sample sizes,
and evaluating data. EPA recommends that
states consult a trained statistician to
ensure that the approach, design, and
assumptions are appropriate to the task at
hand.
As described in Chapter 1, implementation
monitoring is the focus of this guidance.
Effectiveness monitoring is the focus of
another guidance prepared by EPA,
Monitoring Guidance for Determining the
Effectiveness of Nonpoint Source Controls
(USEPA, 1997). The recommendations and
examples in this chapter address two primary
monitoring goals:
• Determine the extent to which MMs and
BMPs are implemented in accordance with
relevant standards and specifications.
• Determine whether there is a change in the
extent to which MMs and BMPs are being
implemented.
For example, state or county agriculture
personnel might be interested in whether
regulations for the exclusion of livestock from
riparian areas are being adhered to in regions
with particular water quality problems. State
or county personnel might also be interested in
whether, in response to an intensive state-wide
effort to improve pesticide use practices and
increase the use of integrated pest management
practices, there is a detectable change in the
pesticide practices being used by farmers.
2.1.1 Study Objectives
A study design requires clear, quantitative
monitoring objectives. For
example, the objective might be to estimate
the percent of farm owners or managers that
use integrated pest management (IPM) to
within ±5 percent. Or perhaps a state is
getting ready to perform an extensive 2-year
outreach and cost-share effort to promote a
fence-out or other program to reduce cattle
wading through streams. In this case,
detecting a 10 percent change in the farms that
permit their cattle direct access to streams
might be of interest. In the first example,
summary statistics are developed to describe
the current status, whereas in the second
example, some sort of statistical analysis
(hypothesis testing) is performed to determine
whether a significant change has really
occurred. This choice has an impact on how
the data are collected. As an example,
summary statistics might require unbalanced
sample allocations to account for variability
such as farm size, type, and ownership,
whereas balanced designs (e.g., two sets of
data with the same number of observations in
each set) are more typical for hypothesis
testing.
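The "±5 percent" objective above implies a specific number of farms to visit. The following sketch (not part of the original guidance; sample size formulas are covered in Section 2.3) uses the standard formula for estimating a proportion, n = z²p(1−p)/d², with p = 0.5 as the conservative assumption when the true adoption rate is unknown:

```python
import math

def sample_size_proportion(margin, p=0.5, z=1.96):
    """Number of farms to sample so the estimated proportion is
    within +/- `margin` at roughly 95 percent confidence (z = 1.96).

    p = 0.5 is the worst case: it maximizes p*(1-p), so the result
    is conservative when the true proportion (e.g., of IPM users)
    is unknown.
    """
    n = (z ** 2) * p * (1 - p) / margin ** 2
    return math.ceil(n)

# Estimate the percent of farm operators using IPM to within +/-5%:
n = sample_size_proportion(0.05)   # -> 385 farms
```

Note how quickly the requirement grows as the margin tightens: a ±10 percent margin needs about 97 farms, while ±5 percent needs 385.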
2.1.2 Probabilistic Sampling
Most study designs that are appropriate for
tracking and evaluating implementation are
based on a probabilistic approach since
tracking every farm is not cost-effective. In a
probabilistic approach, individuals are
randomly selected from the entire group. The
selected individuals are evaluated, and the
results from the individuals provide an
unbiased assessment about the entire group.
Applying the results from randomly selected
individuals to the entire group is statistical
inference. Statistical inference enables one to
determine, for example, in terms of
probability, the percentage of farms using IPM
without visiting every farm. One could also
determine whether the change in the number
of farms with appropriate nutrient
management is within the range of what could
occur by chance or the change is large enough
to indicate a real modification of farmer
practices.
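The second kind of question, whether an observed change is larger than chance alone would produce, is answered with a hypothesis test (methods are detailed in Chapter 3). As an informal illustration with hypothetical counts, a pooled two-proportion z statistic can be computed as follows:

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Z statistic comparing two independent proportions, e.g.,
    the fraction of farms with appropriate nutrient management
    in two different survey years. Uses the pooled estimate of
    the common proportion under the null hypothesis of no change."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)                      # pooled proportion
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical survey: 60 of 150 farms lacked nutrient management
# in year 1 versus 40 of 150 in year 2. |z| > 1.96 suggests a real
# change at the 5 percent significance level.
z = two_proportion_z(60, 150, 40, 150)   # about 2.45
```

With |z| ≈ 2.45 the change would be judged unlikely to be due to chance alone at the 5 percent level.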
The group about which inferences are made is
the population, or target population, which
consists of population units. The sample
population is the set of population units that
are directly available for measurement. For
example, if the objective is to determine the
degree to which adequate animal waste
management has been established in
agricultural operations, the population to be
sampled would be agricultural operations for
which animal waste management is an
appropriate BMP (i.e., farms with livestock).
Statistical inferences can be made only about
the target population available for sampling.
For example, if implementation of grazing
management is being assessed and only public
grazing lands can be sampled, inferences
cannot be made about the management of
private grazing lands. Another example to
consider is a mail survey. In most cases, only
a percentage of survey forms is returned. The
extent to which nonrespondents bias the
survey findings should be examined: Do the
nonrespondents represent those less likely to
use IPM? Typically, a second mailing, phone
calls, or visits to those who do not respond
might be necessary to evaluate the impact of
nonrespondents.
The most common types of sampling that
should be used for implementation monitoring
are summarized in Table 2-1. In general,
probabilistic approaches are preferred.
However, there might be circumstances under
which targeted sampling should be used.
Targeted sampling refers to using best
professional judgment for selecting sample
locations. For example, state or county
agriculture personnel deciding to evaluate all
farms in a given watershed would be targeted
sampling. The choice of a sampling plan
depends on study objectives, patterns of
variability in the target population, cost-
effectiveness of alternative plans, types of
measurements to be made, and convenience
(Gilbert, 1987).
Table 2-1. Applications of four sampling designs for implementation monitoring.

Simple Random Sampling: Each population unit has an equal probability of
being selected.

Stratified Random Sampling: Useful when a sample population can be broken
down into groups, or strata, that are internally more homogeneous than the
entire sample population. Random samples are taken from each stratum,
although the probability of being selected might vary from stratum to
stratum depending on cost and variability.

Cluster Sampling: Useful when there are a number of methods for defining
population units and when individual units are clumped together. In this
case, clusters are randomly selected and every unit in the cluster is
measured.

Systematic Sampling: This sampling has a random starting point, with each
subsequent observation a fixed interval (space or time) from the previous
observation.
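The systematic design summarized in Table 2-1 can be illustrated with a short sketch (the 128-farm catalog and the interval of 10 are hypothetical, not from the original guidance):

```python
import random

def systematic_sample(units, interval, seed=None):
    """Systematic sample: pick a random start within the first
    interval, then take every `interval`-th unit after that."""
    rng = random.Random(seed)
    start = rng.randrange(interval)
    return units[start::interval]

farms = list(range(1, 129))             # hypothetical catalog numbers
picked = systematic_sample(farms, 10, seed=7)
```

Every selected farm is a fixed distance (here, 10 catalog positions) from the previous one, which is convenient when units are encountered in order, for example, along a county transect route.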
Simple random sampling is the most
elementary type of sampling. Each unit of the
target population has an equal chance of being
selected. This type of sampling is appropriate
when there are no major trends, cycles, or
patterns in the target population (Cochran,
1977). Random sampling can be applied in a
variety of ways including farm or field
selection. Random samples can also be taken
at different times at a single farm. Figure 2-1
provides an example of simple random
sampling from a listing of farms and from a
map.
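Drawing a simple random sample from a farm list, as in Figure 2-1a, amounts to selecting without replacement so that every farm is equally likely to be chosen. A minimal sketch, using a hypothetical 128-farm catalog:

```python
import random

# Hypothetical catalog of 128 farms, as in Figure 2-1a.
farms = [f"Farm {i}" for i in range(1, 129)]

rng = random.Random(42)           # fixed seed for a repeatable draw
sample = rng.sample(farms, 20)    # each farm equally likely; no farm
                                  # selected twice
```

The same call works whether the sampling frame is a list of farms, fields, or map locations.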
If the pattern of MM and BMP
implementation is expected to be uniform
across the state, simple random sampling is
appropriate to estimate the extent of
implementation. If, however, implementation
is homogeneous only within certain categories
(e.g., federal, state, or private lands), stratified
random sampling should be used.
In stratified random sampling, the target
population is divided into groups called strata
for the purpose of obtaining a better estimate
of the mean or total for the entire population.
Simple random sampling is then used within
each stratum. Stratification involves the use of
categorical variables to group observations
into more homogeneous units, thereby reducing the
variability of observations within each unit.
For example, in a state with federal, state, and
private rangelands that are used for grazing,
there might be different patterns of BMP
implementation. Lands in the state could be
divided into federal, state, and private as
separate strata from which samples would be
taken. In general, a larger number of samples
should be taken in a stratum if the stratum is
more variable, larger, or less costly to sample
than other strata. For example, if BMP
implementation is more variable on private
rangelands, a greater number of sampling sites
might be needed in that stratum to increase the
precision of the overall estimate. Cochran
(1977) found that stratified random sampling
provides a better estimate of the mean for a
population with a trend, followed in order by
systematic sampling (discussed later) and
simple random sampling.

Figure 2-1a. Simple random sampling from a listing of farms. In this
listing, all farms are presented as a single list (with farm catalog
number, waterbody, farm type, and county code), and farms are selected
randomly from the entire list. Shaded farms represent those selected
for sampling.

Figure 2-1b. Simple random sampling from a map. Dots represent farms.
All farms of interest are represented on the map, and the farms to be
sampled (open dots) were selected randomly from all of those on the
map. The shaded lines on the map could represent county, watershed,
hydrologic, or some other boundary, but they are ignored for the
purposes of simple random sampling.

He also noted that
stratification typically results in a smaller
variance for the estimated mean or total than
that which results from comparable simple
random sampling.
If the state believes that there will be a
difference between two or more subsets of
farms, such as between types of ownership or
crop, the farms can first be stratified into these
subsets and a random sample taken within
each subset (McNew, 1990). The goal of
stratification is to increase the accuracy of the
estimated mean values over what could have
been obtained using simple random sampling
of the entire population. The method makes
use of prior information to divide the target
population into subgroups that are internally
homogeneous. There are a number of ways to
"select" farms (e.g., by farm ownership, farm
size, farm type, hydrologic unit, soil type, or
county), or sets of farms, to be certain that
important information will not be lost, or that
MM or BMP use will not be misrepresented as
a result of treating all potential survey farms as
equal. Figure 2-2 provides an example of
stratified random sampling from a listing of
farms and from a map.
It might also be of interest to compare the
relative percentages of cropland classified as
having high, medium, and low erosion
potentials that are under conservation tillage.
Highly erodible land might be responsible for
a larger share of sediment losses, and it would
usually be desirable to track the extent to
which conservation tillage practices have been
implemented on these land areas. A stratified
random sampling procedure could be used to
estimate the percentage of total cropland with
different erosion potentials under conservation
tillage.
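A stratified draw like those described above can be sketched briefly in code. The following Python fragment (the strata and their sizes are hypothetical, loosely following the rangeland ownership example) uses proportional allocation; Neyman allocation would additionally weight each stratum's share by its variability and sampling cost:

```python
import random

def stratified_sample(strata, total_n, seed=None):
    """Proportional allocation: each stratum receives a share of the
    total sample proportional to its size; units are then drawn at
    random within each stratum."""
    rng = random.Random(seed)
    N = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        n_h = max(1, round(total_n * len(units) / N))
        sample[name] = rng.sample(units, min(n_h, len(units)))
    return sample

# Hypothetical rangeland strata by ownership.
strata = {
    "federal": [f"F{i}" for i in range(40)],
    "state": [f"S{i}" for i in range(20)],
    "private": [f"P{i}" for i in range(140)],
}
picked = stratified_sample(strata, total_n=20, seed=1)
# Proportional shares: federal 4 sites, state 2, private 14.
```

Because each stratum is sampled separately, a more variable stratum (here, perhaps the private rangelands) can simply be given a larger n_h than its proportional share.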
Cluster sampling is applied in cases where it is
more practical to measure randomly selected
groups of individual units than to measure
randomly selected individual units (Gilbert,
1987). In cluster sampling, the total
population is divided into a number of
relatively small subdivisions, or clusters, and
then some of the subdivisions are randomly
selected for sampling. For one-stage cluster
sampling, the selected clusters are sampled
totally. In two-stage cluster sampling, random
sampling is performed within each cluster
(Gaugush, 1987). For example, this approach
might be useful if a state wants to estimate the
proportion of farms less than 800 meters from
a stream that are following state-approved
nutrient management plans. All farms less
than 800 meters from a particular stream (or
portion of a stream) can be regarded as a
single cluster. Once all clusters have been
identified, specific clusters can be randomly
chosen for sampling. Freund (1973) notes that
estimates based on cluster sampling are
generally not as good as those based on simple
random samples, but they are more cost-
effective. As a result, Gaugush (1987)
believes that the difficulty associated with
analyzing cluster samples is compensated for
by the reduced sampling requirements and
cost. Figure 2-3 provides an example of
cluster sampling from a listing of farms and
from a map.
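A one-stage version of the stream-proximity example might be sketched as follows; the cluster names and farm identifiers are hypothetical. Every farm in each randomly chosen cluster is surveyed:

```python
import random

def one_stage_cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters; every unit in a selected
    cluster is included in the sample."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical clusters: farms within 800 meters of each stream reach.
clusters = {
    "reach_A": ["farm_1", "farm_2", "farm_3"],
    "reach_B": ["farm_4", "farm_5"],
    "reach_C": ["farm_6", "farm_7", "farm_8", "farm_9"],
}
sample = one_stage_cluster_sample(clusters, n_clusters=2, seed=7)
# Two reaches are chosen at random; all farms along them are visited.
```

Two-stage cluster sampling would replace the "take every unit" step with a second random draw inside each selected reach.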
[Figure 2-2a table: the farm catalog listing regrouped by farm type (crop; livestock;
crop/livestock), with nearby waterbody and county code for each farm; shaded rows mark
the farms selected within each type.]
Figure 2-2a. Stratified random sampling from a listing of farms. Within this listing, farms are
subdivided by type. Then, considering only one farm type (e.g., crop farms), some farms are
selected randomly. The process of random sampling is then repeated for the other farm types
(i.e., livestock, crop/livestock). Shaded farms represent those selected for sampling.
Figure 2-2b. Stratified random sampling from a
map. Letters represent farms, subdivided by type
(C = crop, CL = crop/livestock, L = livestock). All
farms of interest are represented on the map.
From all farms in one type category, some were
randomly selected for sampling (highlighted
farms). The process was repeated for each farm
type category. The shaded lines on the map could
represent county, soil type, or some other
boundary, and could have been used as a means
for separating the farms into categories for the
sampling process.
[Figure 2-3a table: the farm catalog listing regrouped by nearby waterbody (stream; pond;
river; lake; bay; none), with farm type and county code for each farm; shaded rows mark
the farms in the selected waterbody clusters.]
Figure 2-3a. One-stage cluster sampling from a listing of farms. Within this listing, farms are
subdivided by the type of waterbody near them. Some of the waterbody types were then
randomly selected (in this case streams and bays) and all farms with those waterbodies were
selected for sampling. Shaded farms represent those selected for sampling.
Figure 2-3b. Cluster sampling from a map. All
farms in the area of interest are represented on
the map (closed and open dots). Waterbody
types were selected randomly, and farms with
those waterbodies (closed dots) were selected
for sampling. Shaded lines could represent a
type of boundary, such as soil type, county, or
watershed, and could have been used as the
basis for the sampling process as well.
Systematic sampling is used extensively in
water quality monitoring programs because it
is relatively easy to do from a management
perspective. In systematic sampling the first
sample has a random starting point and each
subsequent sample has a constant distance
from the previous sample. For example, if a
sample size of 70 is desired from a mailing list
of 700 farm owners, the first sample would be
randomly selected from among the first 10
people, say the seventh person. Subsequent
samples would then be based on the 17th, 27th,
..., 697th person. In comparison, a stratified
random sampling approach might be to sort
the mailing list by county and then to
randomly select farm owners from each
county. Figure 2-4 provides an example of
systematic sampling from a listing of farms
and from a map.
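The 70-from-700 example above reduces to a random start within the first k = 10 names followed by every tenth name thereafter. A minimal Python sketch (the mailing list is hypothetical):

```python
import random

def systematic_sample(units, n, seed=None):
    """Random start within the first k units, then every k-th unit
    thereafter, where k = len(units) // n."""
    k = len(units) // n
    rng = random.Random(seed)
    start = rng.randrange(k)      # random position among the first k
    return units[start::k][:n]

# Mailing list of 700 hypothetical farm owners.
owners = list(range(1, 701))
picked = systematic_sample(owners, n=70, seed=3)
# If the random start is the 7th owner, the sample is owners
# 7, 17, 27, ..., 697.
```

The constant interval is what makes this design easy to administer, and also what makes it vulnerable to any periodicity in the list, as discussed below.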
In general, systematic sampling is superior to
stratified random sampling when only one or
two samples per stratum are taken for
estimating the mean (Cochran, 1977) or when
there is a known pattern of management
measure implementation. Gilbert (1987)
reports that systematic sampling is equivalent
to simple random sampling in estimating the
mean if the target population has no trends,
strata, or correlations among the population
units. Cochran (1977) notes that on the
average, simple random sampling and
systematic sampling have equal variances.
However, Cochran (1977) also states that for
any single population for which the number of
sampling units is small, the variance from
systematic sampling is erratic and might be
smaller or larger than the variance from simple
random sampling.
Gilbert (1987) cautions that any periodic
variation in the target population should be
known before establishing a systematic
sampling program. Sampling intervals equal
to or multiples of the target population's cycle
of variation might result in biased estimates of
the population mean. Systematic sampling can
be designed to capitalize on a periodic
structure if that structure can be characterized
sufficiently (Cochran, 1977). A simple or
stratified random sample is recommended,
however, in cases where the periodic structure
is not well known or if the randomly selected
starting point is likely to have an impact on the
results (Cochran, 1977).
Gilbert (1987) notes that assumptions about
the population are required in estimating
population variance from a single systematic
sample of a given size. However, there are
systematic sampling approaches that do
support unbiased estimation of population
variance, including multiple systematic
sampling, systematic stratified sampling, and
two-stage sampling (Gilbert, 1987). In
multiple systematic sampling more than one
systematic sample is taken from the target
population. Systematic stratified sampling
involves the collection of two or more
systematic samples within each stratum.
2.1.3 Measurement and Sampling Errors
In addition to making sure that samples are
representative of the sample population, it is
also necessary to consider the types of bias or
error that might be introduced into the study.
Measurement error is the deviation of a
measurement from the true value (e.g., the
percent residue cover for a field was estimated
as 23 percent and the true value was 26
percent). A consistent under- or
overestimation of the true value is referred to
as measurement bias. Random sampling error
arises from the variability from one population
unit to the next (Gilbert, 1987), explaining
[Figure 2-4a table: the full farm catalog listing, numbers 1-8 and 118-128, with nearby
waterbody, farm type, and county code for each farm; shaded rows mark every fifth farm,
beginning with Farm No. 3.]
Figure 2-4a. Systematic sampling from a listing of farms. From a listing of all farms of interest,
an initial site (Farm No. 3) was selected randomly from among the first ten on the list. Every
fifth farm listed was subsequently selected for sampling. Shaded farms represent those
selected for sampling.
Figure 2-4b. Systematic sampling from a map.
Dots represent farms of interest. A single point
on the map and one of the farms were randomly
selected. A line was stretched outward from the
point to (and beyond) the selected farm. The line
was then rotated about the map and every fifth
dot that it touched was selected for sampling
(open dots). The direction of rotation was
determined prior to selection of the point of the
line's origin and the initial farm. The shaded lines
on the map could represent county boundaries, soil
type, watershed, or some other boundary, but were
not used for the sampling process.
why the proportion of farm owners using a
certain BMP differs from one survey to
another.
The goal of sampling is to obtain an accurate
estimate by reducing the sampling and
measurement errors to acceptable levels,
while explaining as much of the variability as
possible to improve the precision of the
estimates (Gaugush, 1987). Precision is a
measure of how close an agreement there is
among individual measurements of the same
population. The accuracy of a measurement
refers to how close the measurement is to the
true value. If a study has low bias and high
precision, the results will have high accuracy.
Figure 2-5 illustrates the relationship between
bias, precision, and accuracy.
As suggested earlier, numerous sources of
variability should be accounted for in
developing a sampling design. Sampling
errors are introduced by virtue of the natural
variability within any given population of
interest. As sampling errors relate to MM or
BMP implementation, the most effective
method for reducing such errors is to carefully
determine the target population and to stratify
the target population to minimize the
nonuniformity in each stratum.
Measurement errors can be minimized by
ensuring that interview questions or surveys
are well designed. If a survey is used as a data
collection tool, for example, the investigator
should evaluate the nonrespondents to
determine whether there is a bias in who
returned the results (e.g., whether the
nonrespondents were more or less likely to
implement MMs or BMPs). If data are
collected by sending staff out to inspect
randomly selected fields, the approach for
inspecting the fields should be consistent. For
example, how do survey personnel determine
that at least 40 percent of the ground is
covered by residuals, or what is the basis for
determining whether a BMP has been properly
implemented?
Reducing sampling errors below a certain
point (relative to measurement errors) does not
necessarily benefit the resulting analysis
because total error is a function of the two
types of error. For example, if measurement
errors such as response or interviewing errors
are large, there is no point in taking a huge
sample to reduce the sampling error of the
estimate since the total error will be primarily
determined by the measurement error.
Measurement error is of particular concern
when farmer surveys are used for
implementation monitoring. Likewise,
reducing measurement errors would not be
worthwhile if only a small sample size were
available for analysis because there would be a
large sampling error (and therefore a large
total error) regardless of the size of the
measurement error. A proper balance between
sampling and measurement errors should be
maintained because research accuracy limits
effective sample size and vice versa (Blalock,
1979).
Figure 2-5. Graphical representation of the relationship between bias, precision, and accuracy
(after Gilbert, 1987). (a): high bias + low precision = low accuracy; (b): low bias + low
precision = low accuracy; (c): high bias + high precision = low accuracy; and (d): low bias +
high precision = high accuracy.
2.1.4 Estimation and Hypothesis Testing
Rather than presenting every observation
collected, the data analyst usually summarizes
major characteristics with a few descriptive
statistics. Descriptive statistics include any
characteristic designed to summarize an
important feature of a data set. A point
estimate is a single number that represents the
descriptive statistic. Statistics common to
implementation monitoring include
proportions, means, medians, totals, and
others. When estimating parameters of a
population, such as the proportion or mean, it
is useful to estimate the confidence interval.
The confidence interval indicates the range in
which the true value lies for a stated
confidence level. For example, if it is
estimated that 65 percent of soybeans were
planted using no-till and the 90 percent
confidence limit is ±5 percent, there is a 90
percent chance that between 60 and 70 percent
of the soybeans were planted using no-till.
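Under a normal approximation, the half-width of such an interval is z multiplied by the standard error of the proportion. A Python sketch of the no-till example (the sample size of 250 fields is hypothetical, chosen so the 90 percent half-width comes to about 5 percentage points):

```python
import math

def proportion_ci(p_hat, n, z=1.645):
    """Normal-approximation confidence interval for a proportion;
    z = 1.645 corresponds to a 90 percent confidence level."""
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# 65 percent no-till observed in a hypothetical sample of 250 fields.
low, high = proportion_ci(0.65, n=250)
# Roughly 0.60 to 0.70, i.e., 65 percent +/- 5 percent at 90 percent
# confidence.
```

Note how the half-width shrinks as n grows: quadrupling the sample size halves the interval.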
Hypothesis testing should be used to determine
whether the level of MM and BMP
implementation has changed over time. The
null hypothesis (H0) is the root of hypothesis
testing. Traditionally, H0 is a statement of no
change, no effect, or no difference; for
example, "the proportion of farm owners using
IPM after the cost-share program is equal to
the proportion of farm owners using IPM
before the cost-share program." The
alternative hypothesis (Ha) is counter to H0,
traditionally being a statement of change,
effect, or difference. If H0 is rejected, Ha is
accepted. Regardless of the statistical test
selected for analyzing the data, the analyst
must select the significance level (a) of the
test. That is, the analyst must determine what
error level is acceptable. There are two types
of errors in hypothesis testing:
Type I: H0 is rejected when H0 is really true.
Type II: H0 is accepted when H0 is really
false.
Table 2-2 depicts these errors, with the
magnitude of Type I errors represented by α
and the magnitude of Type II errors
represented by β. The probability of making a
Type I error is equal to the α of the test and is
selected by the data analyst. In most cases,
managers or analysts will define 1-α to be in
the range of 0.90 to 0.99 (e.g., a confidence
level of 90 to 99 percent), although there have
been applications where 1-α has been set to as
low as 0.80. Selecting a 95 percent confidence
level implies that the analyst will reject the H0
when H0 is true (i.e., a false positive) 5 percent
of the time. The same notion applies to the
confidence interval for point estimates
described above: α is set to 0.10, and there is a
10 percent chance that the true percentage of
soybeans planted using no-till is outside the 60
to 70 percent range. This implies that if the
decisions to be made based on the analysis are
major (i.e., affect many people in adverse or
costly ways) the confidence level needs to be
greater. For less significant decisions (i.e.,
low-cost ramifications) the confidence level
can be lower.
Type II error depends on the significance
level, sample size, variability, and which
alternative hypothesis is true. Power (1-β) is
defined as the probability of correctly rejecting
H0 when H0 is false. In general, for a fixed
sample size, α and β vary inversely. For a
fixed α, β can be reduced by increasing the
sample size (Remington and Schork, 1970).
Table 2-2. Errors in hypothesis testing.

                     State of Affairs in the Population
 Decision        H0 is True                  H0 is False
 Accept H0       1-α (Confidence level)      β (Type II error)
 Reject H0       α (Significance level)      1-β (Power)
                 (Type I error)
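The IPM example above can be evaluated with the two-sample test of proportions covered in Chapter 3. A sketch with hypothetical adoption counts, using the pooled normal approximation:

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sample z statistic for H0: p1 == p2."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical counts: 55 of 100 farms use IPM after the cost-share
# program versus 40 of 100 before it.
z = two_proportion_z(55, 100, 40, 100)
reject_h0 = abs(z) > 1.96      # two-sided test at alpha = 0.05
# z is about 2.12, so H0 (no change in adoption) is rejected.
```

With a smaller true difference or smaller samples, |z| would fall below 1.96 and H0 would be retained, illustrating how power depends on sample size and effect size.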
2.2 SAMPLING CONSIDERATIONS
In a document of this brevity, it is not possible
to address all of the issues that face technical
staff who are responsible for developing and
implementing studies to track and evaluate the
implementation of nonpoint source control
measures. For example, when is the best time
to implement a survey or do on-site visits? In
reality, it is difficult to pinpoint a single time
of the year. Some BMPs can be checked any
time of the year, whereas others have a small
window of opportunity. In northern areas, the
time between fall harvest and winter snows
might be the most effective time of year to
assess implementation of a large number of
erosion control practices.
If the goal of the study is to determine the
effectiveness of a farmer education program,
sampling should be timed to ensure that there
was sufficient time for outreach activities and
for the farmers to implement the desired
practices. Also, farmers are more receptive to
visits and participation in a survey during off-
peak business times (i.e., not during planting,
harvesting, livestock birthing, etc.).
Furthermore, field personnel must have
permission to perform site visits from each
affected farm owner or manager prior to
arriving at the farms. Where access is denied,
a replacement farm is needed. This farm is
selected in accordance with the type of farm
selection being used, i.e., simple random,
stratified random, cluster, or systematic.
From a study design perspective, all of these
issues—study objectives, allowable error, and
formulation of hypotheses—must be
considered together when determining the
sampling strategy. This
section describes common issues that the
technical staff might consider in targeting their
sampling efforts or determining whether to
stratify their sampling efforts. In general, if
there is reason to believe that there are
different rates of BMP or MM implementation
in different groups, stratified random sampling
should increase overall accuracy. Following
the discussion, a list of resources that can be
used to facilitate evaluating these issues is
presented.
2.2.1 Farm Ownership and Size
Farm ownership can be divided (i.e., stratified)
into multiple categories for sampling purposes
depending on the MM implementation being
tracked. The 1992 Census of Agriculture
(USDOC, 1994) provides information by state
on:
• Farms by type of ownership (individual or
family, partnership, corporation, and
other).
• Farms owned versus rented or leased.
• Farm owner characteristics.
• Farm gross income.
• Average farm size.
• Number of farms by size (1 to 9, 10 to 49,
50 to 179, 180 to 499, 500 to 999, 1,000 to
1,999, and 2,000 acres or more).
The Economic Research Service of the U.S.
Department of Agriculture (USDA) also
provides information on farm ownership, as do
many state programs. For example, a
sampling plan to determine the percentage of
acres on which erosion control practices had
been implemented could be designed based on
the data shown in Table 2-3. (The units of
interest are acres of harvested cropland.)
However, it should be noted that if there is
reason to believe that implementation of
erosion control practices is not uniform among
farm owners of farms of differing sizes, more
intense sampling of one or more
subpopulations (strata) might be warranted.
2.2.2 Location and Other Physical
Characteristics
Selection of farms for sampling should ensure
a representative sample of all appropriate areas
of a state or coastal zone. Stratifying by
county, watershed, hydrologic unit, or any
other geographically or physically based area
might increase overall accuracy. Other
important considerations for selecting areas
from which to sample include:
• Areas with different soil types.
• Areas with different erosion potentials (see
USDA's National Resources Inventory).
• Areas with different climates (i.e.,
differences in total rainfall or storm
frequency).
• Areas with known degraded water quality
conditions.
2.2.3 Farm Type and Agricultural Practices
To obtain a representative sample, data must
first be collected on the types of agricultural
Table 2-3. Acres of harvested cropland in Virginia from USDOC's 1992 Census of
Agriculture.

 Total Farm Size (acres)    Number of Farms    Harvested Cropland (acres)
 1 to 49                              9,802                    88,488
 50 to 99                             7,690                   158,089
 100 to 499                          16,125                   965,178
 500 to 999                           2,515                   551,639
 1,000 to 1,999                         943                   428,572
 2,000 or more                          257                   215,010
 Total                               37,332                 2,406,976
practices that occur in a designated sampling
area. Once farms have been stratified by the
types of MMs they should be implementing,
farms can be selected for sampling. For
example, if grazing management were the only
practice being evaluated, farms with only
cropland would be removed from the sample
population. Alternatively, if the investigator is
interested in agricultural practices that affect
the delivery of nitrogen to surface waters, only
farms where MMs or BMPs that affect
nitrogen movement are being implemented
would be selected. Numerous sources of
information can be used to infer the sample
population. These sources should be consulted
before designing a monitoring plan. The U.S.
Department of Commerce's (USDOC) Census
of Agriculture provides information by state
on:
• Acres of harvested cropland.
• Acres of irrigated cropland.
• Types of livestock (cattle, milk cows, hogs
and pigs, chickens, etc.).
• Types of crops (corn, wheat, tobacco,
soybeans, peanuts, hay, land in orchards,
etc.).
USDA's National Resources Inventory
provides statistical information by U.S.
Geological Survey (USGS) cataloging unit on
the acreage of different crop types and other
land uses.
2.2.4 Sources of Information
For a truly random selection of population
units, it is necessary to access or develop a
database that includes the entire target
population. The Census of Agriculture
(USDOC, 1994) is a good source, but it is
limited to some extent by confidentiality
constraints. (Certain data are not included,
except at the state level, for counties that have
only a few operations or are dominated by a
single operation.) Other currently available
national data bases generally include only
agricultural entities that participate in cost-
share programs. A more inclusive source
presently available is county land maps.
These maps, however, generally lack data
regarding the specific type of farm operation
and therefore do not provide the information
needed to perform simple random site
selection.
The following are possible sources of
information on farms, which can be used for
identifying potential monitoring farms and
obtaining other information for farm selection.
Positive and negative attributes of each
information source are included.
1992 National Resources Inventory (USDA,
1994a): The National Resources Inventory
(NRI) is a data base composed of data on the
natural resources on the nonfederal lands of
the United States, which make up 74 percent
of the Nation's land area. Its focus is on the soil, water, and
related resources of farms, nonfederal forests,
and grazing lands. The data were collected
from more than 800,000 sample sites
nationwide and are statistically reliable for
analysis at the national, regional, state, major
land resource area, or multiple county level,
though not at the county level. Data elements
include land cover/use (cropland, pasture land,
rangeland and its condition, forest land, barren
land, rural land, urban, and built-up areas),
land ownership, soil information, irrigation,
water bodies, conservation practices, and
cropping history. Data are available on CD-
ROMs and can be integrated with other data
through spatial linkages in a geographic
information system (GIS). To obtain the NRI
data base, contact: NRCS National
Cartography and Geospatial Center, Fort
Worth Federal Center, Building 23, Room 60,
P.O. Box 6567, Fort Worth, TX 76115-0567;
1-800-672-5559;
http://www.ncg.nrcs.usda.gov.
Census of Agriculture (USDOC, 1994): The
Census of Agriculture is the leading source of
statistics about the Nation's agricultural
production and the only source for consistent,
comparable data at the county, state, and
national levels. Data are collected on a 5-year
cycle in years ending in "2" and "7" and are
available on computer tapes and CD-ROMs.
Data elements include farms (number and
size), harvested cropland, irrigated land,
market value of products, farm ownership,
livestock and poultry, selected crops
harvested, and more. The Census of
Agriculture has been transferred to the
National Agricultural Statistics Service
(NASS), which funded the 1997 census.
Information on obtaining the Census of
Agriculture is available on the Internet at
http://www.census.gov.
USDA Farm Numbers: USDA farm
numbers are developed when a farmer receives
any financial assistance from a USDA
organization. Only farms participating in
USDA programs are included in the data base.
USGS Land Use and Land Cover (USGS,
1990): At the level 2 classification, these data
provide information on four categories of
agricultural land use:
(1) cropland and pasture; (2) orchards, groves,
vineyards, nurseries, and ornamental
horticulture areas; (3) confined feeding
operations; and (4) other agricultural land.
Watershed, topography, soil types, and/or
political boundary maps could be used in
conjunction with this land use information.
Information on obtaining land use and land
cover maps is available on the Internet at
http://www.usgs.gov or at
http://www.ncg.nrcs.usda.gov.
County Land Maps: These maps can
provide information on farm owners or
managers and possibly land use. Selection of
farms to determine the type of operations
occurring would have to be made randomly.
State Cooperative Extension Service: Farms
that received Extension Service grants or
participated in Coop programs are included.
These programs vary from state to state. As
with the USDA farm numbers,
nonparticipatory farms are not included, which
could result in biased sampling.
Complaint Records: Complaint records
could be used in combination with other
sources. Such records represent farms that
have had problems in the past, which will very
likely skew the data set.
National Agriculture Statistics Service
(NASS): This agency, a branch of the USDA,
issues reports related to national forecasts and
estimates of crops, livestock, poultry, dairy,
prices, labor, and related agricultural items
(USDA, undated). The agency has the most
comprehensive national list of farms available.
NASS could produce random lists of farmers
through one of its two frames. The first frame
is an area frame, which randomly selects land
segments that average 1 square mile in size.
In most states the area frame is stratified into
four broad categories based on land use: (1)
areas intensively cultivated for crops, (2)
extensive areas used primarily for grazing and
producing livestock, (3) residential and
business land in cities and towns, and
(4) nonagricultural lands such as parks and
military complexes. The second frame is the
list frame, which consists of names and
addresses of producers grouped by size and
type of unit. In a list frame sample, names are
selected randomly (based on whatever
stratification is desired) and mailed
questionnaires. Phone calls or visits are made
to those farmers who do not respond by mail.
A disadvantage of NASS is that it does not
release names to other agencies. If this
method of selection were chosen, NASS
would have to perform the sampling.
Information on obtaining data from NASS is
available on the Internet at
http://www.usda.gov/nass or through the
NASS hotline at 1-800-727-9540.
Computer-aided Management Practices
System (CAMPS): This data base has records
of all nutrient management plans developed by
the USDA Natural Resource Conservation
Service (formerly the Soil Conservation
Service, or SCS).
Field Office Computing System (FOCS):
The Field Office Computing System (FOCS)
replaced CAMPS, and full conversion from
CAMPS to FOCS was completed in all field
offices of the Natural Resources Conservation
Service by January 1996. The system contains
information on client businesses, resource
inventories, conservation plans, practice cost
comparisons, and a variety of specialty
applications. Some of these applications are
SOILS, with county-level soils data;
PLANTS, with state-level plant data; GLA
(Grazing Land Applications), with forage,
herd, grazing schedule, and feedstuff data;
WEQ (Wind Erosion Equation), a tool to
compute wind erosion; Crop Rotation Detail,
which includes planting, harvest, and tillage
data; RUSLE (Revised Universal Soil Loss
Equation), a tool to compute sheet/rill erosion;
Nutrient Screening Tool, a tool for evaluating
nitrogen and phosphorus leaching and surface
runoff; Pesticide Screening Tool, a tool for
evaluating potential for pesticide leaching and
runoff; and Farm*A*Syst, software for
evaluating the potential for surface and
groundwater pollution. Information on FOCS
is available
through the Internet at
http://www.itc.nrcs.usda.gov/fchd/focs.
Farm Service Agency (FSA): The Farm
Service Agency (FSA), created when the
Department of Agriculture reorganized in
1994, incorporates programs from the
Agricultural Stabilization and Conservation
Service (ASCS), the Federal Crop Insurance
Corporation, and the Farmers Home
Administration. FSA administers programs
for commodity loans, commodity purchases,
crop insurance, emergency and disaster relief,
farm ownership and operation loans, and
farmland conservation. The Conservation
Reserve Program assists farmers in conserving
and improving soil, water, and wildlife
resources on farmland by converting highly
erodible and other environmentally sensitive
acreage from production to long-term cover.
FSA also maintains a collection of aerial
photographs of farmlands. Information on
FSA can be obtained through the Internet at
http://www.fsa.usda.gov, or at the following
address: USDA FSA Public Affairs Staff,
P.O. Box 2415, STOP 0506, Washington, DC,
20013, (202) 720-5237. For information on
the collection of aerial photographs maintained
by the agency, contact USDA FSA Aerial
Photography Field Office, P.O. Box 30010,
Salt Lake City, UT, 84130-0010, (801) 975-
3503.
2.3 SAMPLE SIZE CALCULATIONS
This section describes methods for estimating
sample sizes to compute point estimates such
as proportions and means, as well as detecting
changes with a given significance level.
Usually, several assumptions regarding data
distribution, variability, and cost must be made
to determine the sample size. Some
assumptions might result in sample size
estimates that are too high or too low.
Depending on the sampling cost and cost for
not sampling enough data, it must be decided
whether to make conservative or "best-value"
assumptions. Because the cost of visiting any
individual farm or group of farms is relatively
constant, it is more economical to collect a
few extra samples rather than realize you need
to go back to collect additional data. In most
cases, the analyst should probably consider
evaluating a range of assumptions on the
impact of sample size and overall program
cost.
To maintain document brevity, some terms
and definitions that will be used in the
remainder of this chapter are summarized in
Table 2-4. These terms are consistent with
those in most introductory-level statistics
texts, and more information can be found
there. Those with some statistical training will
note that some of these definitions include an
additional term referred to as the finite
population correction term (1-4)), where > is
equal to n/N. In many applications, the
number of population units in the sample
population (TV) is large in comparison to the
population units sampled (n) and (7-0) can be
ignored. However, depending on the number
of units (farms for example) in a particular
population, TV can become quite small. Nis
determined by the definition of the sample
population and the corresponding population
units. If > is greater than 0.1, the finite
population correction factor should not be
ignored (Cochran, 1977).
Applying any of the equations described in
this section is difficult when no historical data
set exists to quantify initial estimates of
proportions, standard deviations, means, or
coefficients of variation. To estimate these
parameters, Cochran (1977) recommends four
sources:
• Existing information on the same
population or a similar population.
• A two-step sample. Use the first-step
sampling results to estimate the needed
factors, for best design, of the second step.
Use data from both steps to
Table 2-4. Definitions used in sample size calculation equations.

N          total number of population units in sample population
n          number of samples
n0         preliminary estimate of sample size
a          number of successes
p          proportion of successes; p = a/n
q          proportion of failures; q = 1 - p
xi         ith observation of a sample
x̄          sample mean; x̄ = (Σxi)/n
s²         sample variance; s² = Σ(xi - x̄)²/(n - 1)
s          sample standard deviation; s = (s²)^0.5
Nx̄         total amount
μ          population mean
σ²         population variance
σ          population standard deviation
Cv         coefficient of variation; Cv = s/x̄
s²(x̄)      variance of sample mean; s²(x̄) = (1-φ)s²/n
s(x̄)       standard error (of sample mean); s(x̄) = [(1-φ)s²/n]^0.5
φ          n/N (unless otherwise stated in text)
1-φ        finite population correction factor
d          allowable error
dr         relative error
Z1-α       value corresponding to cumulative area of 1-α using the normal
           distribution (see Table A1)
t1-α,df    value corresponding to cumulative area of 1-α using the Student t
           distribution with df degrees of freedom (see Table A2)
estimate the final precision of the
characteristic(s) sampled.
• A "pilot study" on a "convenient" or
"meaningful" subsample. Use the results
to estimate the needed factors. Here the
results of the pilot study generally cannot
be used in the calculation of the final
precision because often the pilot sample is
not representative of the entire population
to be sampled.
• Informed judgment, or an educated guess.
It is important to note that this document only
addresses estimating sample sizes with
traditional parametric procedures. The
methods described in this document should be
appropriate in most cases, considering the type
of data expected. If the data to be sampled are
skewed, as with much water quality data, the
investigator should plan to transform the data
to something symmetric, if not normal, before
computing sample sizes (Helsel and Hirsch,
1995). Kupper and Hafner (1989) also note
that some of these equations tend to
underestimate the necessary sample size because
power is not taken into consideration. Again,
EPA recommends that if you do not have a
background in statistics, you should consult
with a trained statistician to be certain that
your approach, design, and assumptions are
appropriate to the task at hand.
2.3.1 Simple Random Sampling

In simple random sampling, we presume that
the sample population is relatively
homogeneous and we would not expect a
difference in sampling costs or variability. If
the cost or variability of any group within the
sample population were different, it might be
more appropriate to consider a stratified
random sampling approach.

What sample size is necessary to estimate
the proportion of farms implementing IPM to
within ±5 percent?

To estimate the proportion of farms
implementing a certain BMP or MM, such that
the allowable error, d, meets the study
precision requirements (i.e., the true
proportion lies between p-d and p+d with a
1-α confidence level), a preliminary estimate of
sample size can be computed as (Snedecor and
Cochran, 1980)

    n0 = (Z1-α/2)² p q / d²                  (2-1)

What sample size is necessary to estimate
the proportion of farms implementing IPM so
that the relative error is less than 5 percent?

If the proportion is expected to be a low
number, using a constant allowable error
might not be appropriate. Ten percent
plus/minus 5 percent has a 50 percent relative
error. Alternatively, the relative error, dr, can
be specified (i.e., the true proportion lies
between p-dr·p and p+dr·p with a 1-α
confidence level) and a preliminary estimate
of sample size can be computed as (Snedecor
and Cochran, 1980)

    n0 = (Z1-α/2)² q / (dr² p)               (2-2)

In both equations, the analyst must make an
initial estimate of p before starting the study.
In the first equation, a conservative sample
size can be computed by assuming p equal to
0.5. In the second equation the sample size
gets larger as p approaches 0 for constant dr;
thus an informed initial estimate of p is
needed. Values of α typically range from 0.01
to 0.10. The final sample size is then
estimated as (Snedecor and Cochran, 1980)

    n = n0/(1+φ)   for φ > 0.1
    n = n0         otherwise                 (2-3)

where φ is equal to n0/N. Table 2-5
demonstrates the impact on n of selecting p, α,
d, dr, and N. For example, 278 random
samples are needed to estimate the proportion
Table 2-5. Comparison of sample size as a function of p, α, d, dr, and N for estimating
proportions using Equations 2-1 through 2-3.

                                                        Sample Size, n
Probability  Signifi-  Allowable  Relative  Preliminary  (Number of Population Units in
of Success,  cance     error,     error,    sample       Sample Population, N)
    p        level, α     d          dr     size, n0     500    750   1,000  2,000  Large N
   0.1        0.05      0.050      0.500      138        108    117    121    138     138
   0.1        0.05      0.075      0.750       61         55     61     61     61      61
   0.5        0.05      0.050      0.100      384        217    254    278    322     384
   0.5        0.05      0.075      0.150      171        127    139    146    171     171
   0.1        0.10      0.050      0.500       97         82     86     97     97      97
   0.1        0.10      0.075      0.750       43         43     43     43     43      43
   0.5        0.10      0.050      0.100      271        176    199    213    238     271
   0.5        0.10      0.075      0.150      120         97    104    107    120     120
of 1,000 farmers using IPM to within ±5
percent (d=0.05) with a 95 percent confidence
level, assuming roughly one-half of farmers are
using IPM.
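The Table 2-5 calculations can be scripted directly from Equations 2-1 and 2-3. The following Python sketch (the function names are illustrative, not part of any standard package) reproduces the example above: p = 0.5, α = 0.05, d = 0.05, and N = 1,000 yield 278 samples.

```python
import math
from statistics import NormalDist

def preliminary_n(p, d, alpha):
    """Equation 2-1: n0 = (Z1-a/2)^2 * p * q / d^2 for allowable error d."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * z * p * (1 - p) / (d * d)

def final_n(n0, N):
    """Equation 2-3: apply the finite population correction when n0/N > 0.1."""
    phi = n0 / N
    return math.ceil(n0 / (1 + phi) if phi > 0.1 else n0)

n0 = preliminary_n(p=0.5, d=0.05, alpha=0.05)  # about 384
n = final_n(n0, N=1000)                        # 278 random samples
```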
What sample size is necessary to estimate
the average number of acres per farm that
are under conservation tillage to within ±25
acres?
What sample size is necessary to estimate
the average number of acres per farm that
are under conservation tillage to within ±10
percent?
Suppose the goal is to estimate the average
acreage per farm where conservation tillage is
used. The number of random samples
required to achieve a desired margin of error
when estimating the mean (i.e., the true mean
lies between x̄-d and x̄+d with a 1-α
confidence level) is (Gilbert, 1987)

    n = (t1-α/2,n-1 s/d)² / [1 + (t1-α/2,n-1 s/d)²/N]    (2-4)

If N is large, the above equation can be
simplified to

    n = (t1-α/2,n-1 s/d)²                                (2-5)

Since the Student's t value is a function of n,
Equations 2-4 and 2-5 are applied iteratively.
That is, guess at what n will be, look up
t1-α/2,n-1 from Table A2, and compute a revised
n. If the initial guess of n and the revised n
are different, use the revised n as the new
guess, and repeat the process until the
computed value of n converges with the
guessed value. If the population standard
deviation is known (not too likely), rather than
estimated, the above equation can be further
simplified to:

    n = (Z1-α/2 σ/d)²                                    (2-6)

To keep the relative error of the mean estimate
below a certain level (i.e., the true mean lies
between x̄-dr·x̄ and x̄+dr·x̄ with a
1-α confidence level), the sample size can be
computed with (Gilbert, 1987)

    n = (t1-α/2,n-1 Cv/dr)² / [1 + (t1-α/2,n-1 Cv/dr)²/N]    (2-7)

Cv is usually less variable from study to study
than are estimates of the standard deviation,
which are used in Equations 2-4 through 2-6.
Professional judgment and experience,
typically based on previous studies, are
required to estimate Cv. Had Cv been known,
Z1-α/2 would have been used in place of t1-α/2,n-1
in Equation 2-7. If N is large, Equation 2-7
simplifies to:

    n = (t1-α/2,n-1 Cv/dr)²                              (2-8)

For County X, farms range in size from 20 to
4,325 acres, although most are less than 500
acres in size. The goal of the sampling
program is to estimate the average number of
cropland acres using minimum tillage.
However, the investigator is concerned about
skewing the mean estimate with the few large
farms. As a result, the sample population for
this analysis is the 430 cropland farms with
less than 500 total acres of cropland. The
investigator also wants to keep the relative
error under 15 percent (i.e., dr < 0.15) with a
90 percent confidence level.

Unfortunately, this is the first study that
County X has done, and there is no information
about the coefficient of variation, Cv. The
investigator, however, is familiar with a recent
study done by another company. Based on
that study, the investigator estimates the Cv as
0.6 and s as 30. As a first-cut
approximation, Equation 2-8 is applied with
Z1-α/2, equal to 1.645, in place of t and
assuming N is large:

    n = (1.645 x 0.6/0.15)²
      = 43.3 ≈ 44 samples

Since n/N is greater than 0.1 and Cv is
estimated (i.e., not known), it is best to
reestimate n with Equation 2-7 using 44
samples as the initial guess of n. In this case,
t1-α/2,n-1 is obtained from Table A2 as 1.6811.

    n = (1.6811 x 0.6/0.15)² / [1 + (1.6811 x 0.6/0.15)²/430]
      = 40.9 ≈ 41 samples

Notice that the revised sample size is somewhat
smaller than the initial guess of n. In this case
it is recommended to reapply Equation 2-7
using 41 samples as the revised guess of n. In
this case, t1-α/2,n-1 is obtained from Table A2 as
1.6839.

    n = (1.6839 x 0.6/0.15)² / [1 + (1.6839 x 0.6/0.15)²/430]
      = 41.0 ≈ 41 samples
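The iteration just described can be automated. In the sketch below, only the two t quantiles the example needs are tabulated (taken from the document's Table A2); a complete implementation would look up t for any degrees of freedom.

```python
# Iterative application of Equation 2-7 for the County X example.
# T_TABLE holds the two needed quantiles for 1-alpha = 0.95 (df = 43 and 40).
T_TABLE = {43: 1.6811, 40: 1.6839}

def n_relative_error(t, cv, dr, N):
    """Equation 2-7: n = (t*Cv/dr)^2 / [1 + (t*Cv/dr)^2 / N]."""
    core = (t * cv / dr) ** 2
    return core / (1 + core / N)

n = 44  # first-cut guess from Equation 2-8 with Z = 1.645 in place of t
while True:
    t = T_TABLE[n - 1]  # degrees of freedom = n - 1
    n_new = round(n_relative_error(t, cv=0.6, dr=0.15, N=430))
    if n_new == n:
        break           # converged
    n = n_new           # n converges to 41
```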
Since the revised sample size matches the
estimated sample size on which t1-α/2,n-1 was
based, no further iterations are necessary. The
proposed study should include 41 farms
randomly selected from the 430 cropland
farms with less than 500 total acres of
cropland in County X.
What sample size is necessary to determine
whether there is a 20 percent difference in
BMP implementation before and after a
cost-share program?
What sample size is necessary to detect a
30-acre increase in average conservation
tillage acreage per farm between farm
owners that own versus rent land?
When interest is focused on whether the level
of BMP implementation has changed, it is
necessary to estimate the extent of
implementation at two different time periods.
Alternatively, the proportion from two
different populations can be compared. In
either case, two independent random samples
are taken and a hypothesis test is used to
determine whether there has been a significant
change in implementation. (See Snedecor and
Cochran (1980) for sample size calculations
for matched data.) Consider an example in
which the proportion of highly erodible land
under conservation tillage will be estimated at
two time periods. What sample size is
needed?
To compute sample sizes for comparing two
proportions, p1 and p2, it is necessary to
provide a best estimate for p1 and p2, as well as
to specify the significance level and power
(1-β). Recall that power is equal to the
probability of rejecting H0 when H0 is false.
Given this information, the analyst substitutes
these values into (Snedecor and Cochran,
1980)

    n = (Zα + Z2β)² (p1q1 + p2q2) / (p2 - p1)²    (2-9)

where Zα and Z2β correspond to the normal
deviates. Although this equation assumes that
N is large, it is acceptable for practical use
(Snedecor and Cochran, 1980). Common
values of (Zα + Z2β)² are summarized in
Table 2-6. To account for p1 and p2 being
Table 2-6. Common values of (Zα + Z2β)² for estimating sample size for use with Equations 2-9
and 2-10.

Power,       α for One-sided Test         α for Two-sided Test
 1-β        0.01     0.05     0.10       0.01     0.05     0.10
0.80       10.04     6.18     4.51      11.68     7.85     6.18
0.85       11.31     7.19     5.37      13.05     8.98     7.19
0.90       13.02     8.56     6.57      14.88    10.51     8.56
0.95       15.77    10.82     8.56      17.81    12.99    10.82
0.99       21.65    15.77    13.02      24.03    18.37    15.77
estimated, Z could be substituted with t. In
lieu of an iterative calculation, Snedecor and
Cochran (1980) propose the following
approach: (1) compute n0 using Equation
2-9; (2) round n0 up to the next highest
integer, f; and (3) multiply n0 by (f+3)/(f+1) to
derive the final estimate of n.

To detect a difference in proportions of 0.20
with a two-sided test, α equal to 0.05, 1-β
equal to 0.90, and estimates of p1 and p2
equal to 0.4 and 0.6, n0 is computed as

    n0 = 10.51 [(0.4)(0.6) + (0.6)(0.4)] / (0.6 - 0.4)²
       = 126.1

Rounding 126.1 to the next highest integer, f is
equal to 127, and n is computed as 126.1 x
130/128, or 128.1. Therefore 129 samples in
each random sample, or 258 total samples, are
needed to detect a difference in proportions of
0.2. Beware of other sources of information
that give significantly lower estimates of
sample size. In some cases the other sources
do not specify 1-β; otherwise, be sure that an
"apples-to-apples" comparison is being made.
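Equation 2-9 and the (f+3)/(f+1) adjustment can be combined in a few lines. The sketch below (function name is illustrative) computes Zα and Z2β from the standard normal quantile function instead of Table 2-6, and reproduces the 129-samples-per-group result.

```python
import math
from statistics import NormalDist

def n_two_proportions(p1, p2, alpha, power, two_sided=True):
    """Equation 2-9 with the (f+3)/(f+1) small-sample adjustment."""
    z = NormalDist().inv_cdf
    za = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    z2b = z(power)  # normal deviate for the two-tailed probability 2*beta
    n0 = (za + z2b) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p2 - p1) ** 2
    f = math.ceil(n0)
    return math.ceil(n0 * (f + 3) / (f + 1))

n = n_two_proportions(0.4, 0.6, alpha=0.05, power=0.90)  # 129 per sample
```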
To compare the averages from two random
samples to detect a change of d (i.e., x̄2 - x̄1),
the following equation is used:

    n = (Zα + Z2β)² (s1² + s2²) / d²    (2-10)

Common values of (Zα + Z2β)² are
summarized in Table 2-6. To account for s1
and s2 being estimated, Z should be replaced
with t. In lieu of an iterative calculation,
Snedecor and Cochran (1980) propose the
following approach: (1) compute n0 using
Equation 2-10; (2) round n0 up to the next
highest integer, f; and (3) multiply n0 by
(f+3)/(f+1) to derive the final estimate of n.
Continuing the County X example above,
where s was estimated as 75 acres, the
investigator will also want to compare the
average number of cropland acres using
minimum tillage now to the average number
of minimum tillage acres in a few years. To
demonstrate success, the investigator believes
that it will be necessary to detect a 50-acre
increase. Although the standard deviation
might change after the cost-share program,
there is no particular reason to propose a
different s after the cost-share program. To
detect a difference of 50 acres with a two-
sided test, α equal to 0.05, 1-β equal to 0.90,
and estimates of s1 and s2 equal to 75, n0 is
computed as

    n0 = 10.51 (75² + 75²) / 50²    (2-11)
       = 47.3

Rounding 47.3 to the next highest integer, f is
equal to 48, and n is computed as 47.3 x 51/49,
or 49.2. Therefore 50 samples in each random
sample, or 100 total samples, are needed to
detect a difference of 50 acres.
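The same recipe applies to Equation 2-10. The sketch below mirrors the two-proportion version and reproduces the 50-samples-per-group result.

```python
import math
from statistics import NormalDist

def n_two_means(s1, s2, d, alpha, power, two_sided=True):
    """Equation 2-10 with the (f+3)/(f+1) small-sample adjustment."""
    z = NormalDist().inv_cdf
    za = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    z2b = z(power)
    n0 = (za + z2b) ** 2 * (s1 ** 2 + s2 ** 2) / d ** 2
    f = math.ceil(n0)
    return math.ceil(n0 * (f + 3) / (f + 1))

n = n_two_means(75, 75, d=50, alpha=0.05, power=0.90)  # 50 per sample
```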
2.3.2 Stratified Random Sampling

What sample size is necessary to estimate
the average number of acres per farm that
are under conservation tillage when there is
a wide variety of farm sizes?

The key reason for selecting a stratified
random sampling strategy over simple random
sampling is to divide a heterogeneous
population into more homogeneous groups. If
populations are grouped based on size (e.g.,
farm size) when there is a large number of
small units and a few larger units, a large gain
in precision can be expected (Snedecor and
Cochran, 1980). Stratifying also allows the
investigator to efficiently allocate sampling
resources based on cost. The stratum mean,
x̄h, is computed using the standard approach
for estimating the mean. The overall mean,
x̄st, is computed as

    x̄st = Σ(h=1 to L) Wh x̄h    (2-12)

where L is the number of strata and Wh is the
relative size of the hth stratum. Wh can be
computed as Nh/N, where Nh and N are the
number of population units in the hth stratum
and the total number of population units across
all strata, respectively. Assuming that simple
random sampling is used within each stratum,
the variance of x̄st is estimated as (Gilbert,
1987)

    s²(x̄st) = Σ(h=1 to L) (Nh/N)² (1 - nh/Nh) sh²/nh    (2-13)

where nh is the number of samples in the hth
stratum and sh² is computed as (Gilbert, 1987)

    sh² = [1/(nh-1)] Σ(i=1 to nh) (xhi - x̄h)²    (2-14)

There are several procedures for computing
sample sizes. The method described below
allocates samples based on stratum size,
variability, and unit sampling cost. If s²(x̄st)
is specified as V for a design goal, n can be
obtained from (Gilbert, 1987)

    n = [Σ(h=1 to L) Wh sh √ch][Σ(h=1 to L) Wh sh/√ch] / [V + (1/N) Σ(h=1 to L) Wh sh²]    (2-15)

where ch is the per-unit sampling cost in the hth
stratum and nh is estimated as (Gilbert, 1987)

    nh = n (Wh sh/√ch) / Σ(h=1 to L) (Wh sh/√ch)    (2-16)

In the discussion above, the goal is to estimate
an overall mean. To apply a stratified random
sampling approach to estimating proportions,
substitute ph, pst, phqh, and s²(pst) for x̄h, x̄st, sh²,
and s²(x̄st) in the above equations,
respectively.
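Equations 2-12 and 2-13 can be checked with a short routine. The two strata and their counts in the sketch below are hypothetical, chosen only to exercise the formulas; they are not the County X data.

```python
def stratified_mean_and_variance(Nh, means, variances, nh):
    """Equations 2-12 and 2-13 for a stratified random sample.

    Nh: population units per stratum; means/variances: stratum estimates;
    nh: samples taken per stratum."""
    N = sum(Nh)
    W = [x / N for x in Nh]                                    # relative sizes
    xst = sum(w * m for w, m in zip(W, means))                 # Equation 2-12
    var = sum(w * w * (1 - n_ / Nh_) * s2 / n_
              for w, Nh_, s2, n_ in zip(W, Nh, variances, nh)) # Equation 2-13
    return xst, var

# Hypothetical two-stratum illustration
xst, var = stratified_mean_and_variance(Nh=[430, 170], means=[100.0, 400.0],
                                        variances=[900.0, 3600.0], nh=[30, 15])
```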
To demonstrate the above approach, consider
the County X example again. In addition to
the 430 farms that are less than 500 acres,
there are 100 farms that range in size from 501
to 1,000 acres, 50 farms that range in size
from 1,001 to 2,000 acres, and 20 farms that
range in size from 2,001 to 4,500 acres. Table
2-7 presents three basic scenarios for
estimating sample size. In the first scenario, sh
and ch are assumed equal among all strata.
Using a design goal of V equal to 20 and
applying Equation 2-15 yields a total sample
size of 41.9, or 42. Since sh and ch are uniform,
these samples are allocated proportionally to
Wh, which is referred to as proportional
allocation. This allocation can be verified by
comparing the percent sample allocation to
Wh. Due to rounding up within each stratum,
a total of 44 samples are allocated.
Under the second scenario, referred to as the
Neyman allocation, the variability between
strata changes, but unit sample cost is
constant. In this example, sh increases by 15
between strata. Because of the increased
variability in the last three strata, a total of
Table 2-7. Allocation of samples.

A) Proportional allocation (sh and ch are constant)

Farm Size      Number of   Relative   Standard        Unit Sample   Sample Allocation
(acres)        Farms (Nh)  Size (Wh)  Deviation (sh)  Cost (ch)     Number      %
20-500            430       0.7167        30              1            31      70.5
501-1,000         100       0.1667        30              1             7      15.9
1,001-2,000        50       0.0833        30              1             4       9.1
2,001-4,500        20       0.0333        30              1             2       4.5

Using Equation 2-15, n is equal to 41.9. Applying Equation 2-16 to each stratum yields a total of
44 samples after rounding up to the next integer.

B) Neyman allocation (ch is constant)

Farm Size      Number of   Relative   Standard        Unit Sample   Sample Allocation
(acres)        Farms (Nh)  Size (Wh)  Deviation (sh)  Cost (ch)     Number      %
20-500            430       0.7167        30              1            35      56.5
501-1,000         100       0.1667        45              1            13      21.0
1,001-2,000        50       0.0833        60              1             9      14.5
2,001-4,500        20       0.0333        75              1             5       8.1

Using Equation 2-15, n is equal to 59.3. Applying Equation 2-16 to each stratum yields a total of
62 samples after rounding up to the next integer.

C) Allocation where sh and ch are not constant

Farm Size      Number of   Relative   Standard        Unit Sample   Sample Allocation
(acres)        Farms (Nh)  Size (Wh)  Deviation (sh)  Cost (ch)     Number      %
20-500            430       0.7167        30             1.00          38      61.3
501-1,000         100       0.1667        45             1.25          12      19.4
1,001-2,000        50       0.0833        60             1.50           8      12.9
2,001-4,500        20       0.0333        75             2.00           4       6.5

Using Equation 2-15, n is equal to 60.0. Applying Equation 2-16 to each stratum yields a total of
62 samples after rounding up to the next integer.
59.3, or 62 samples after rounding up within
each stratum, are needed to meet the
same design goal. So while more samples are
taken in every stratum, proportionally fewer
samples are needed in the smaller farm size
group. For example, using proportional
allocation nearly 70 percent of the samples are
taken in the 20- to 500-acre farm size stratum,
whereas approximately 56 percent of the
samples are taken in the same stratum using
the Neyman allocation.
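Equations 2-15 and 2-16 are easy to misapply by hand. The sketch below (function name is illustrative) reproduces scenario B of Table 2-7, the Neyman allocation with a design goal V of 20.

```python
import math

def allocate(W, s, c, V, N):
    """Equations 2-15 and 2-16: total sample size and per-stratum allocation.

    W: relative stratum sizes; s: stratum standard deviations;
    c: per-unit sampling costs; V: design goal for s2(xst); N: total units."""
    a = sum(w * sh * math.sqrt(ch) for w, sh, ch in zip(W, s, c))
    b = sum(w * sh / math.sqrt(ch) for w, sh, ch in zip(W, s, c))
    n = a * b / (V + sum(w * sh ** 2 for w, sh in zip(W, s)) / N)  # Eq 2-15
    weights = [w * sh / math.sqrt(ch) for w, sh, ch in zip(W, s, c)]
    nh = [math.ceil(n * wt / sum(weights)) for wt in weights]      # Eq 2-16
    return n, nh

# Scenario B of Table 2-7 (Neyman allocation, equal unit costs)
W = [0.7167, 0.1667, 0.0833, 0.0333]
n, nh = allocate(W, s=[30, 45, 60, 75], c=[1, 1, 1, 1], V=20, N=600)
# n is about 59.3 and nh is [35, 13, 9, 5], 62 samples in total
```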
Finally, introducing sample cost variation will
also affect sample allocation. In the last
scenario it was assumed that it is twice as
expensive to evaluate a farm from the largest
farm size stratum than to evaluate a farm from
the smallest farm size stratum. In this
example, roughly the same total number of
samples are needed to meet the design goal,
yet more samples are taken in the smaller size
stratum.
2.3.3 Cluster Sampling
Cluster sampling is commonly used when
there is a choice between the size of the
sampling unit (e.g., fields versus farms). In
general, it is cheaper to sample larger units
than smaller units, but these results tend to be
less accurate (Snedecor and Cochran, 1980).
Thus, if there is not a unit sampling cost
advantage to cluster sampling, it is probably
better to use simple random sampling. To
decide whether to perform a cluster sample, it
will probably be necessary to perform a
special investigation to quantify sampling
errors and costs using the two approaches.
Perhaps the best approach to explaining the
difference between simple random sampling
and cluster sampling is to consider an example
set of results. In this example, the investigator
did a field evaluation of BMP implementation
along a stream to evaluate whether
recommended BMPs had been implemented
and maintained. Since the watershed was
quite large, the investigator elected to inspect
10 farms at each site. Table 2-8 presents the
number of farms at each site that had
implemented and maintained recommended
BMPs. The overall mean is 5.6; a little more
than one-half of the farms have implemented
recommended BMPs. However, note that
since the population unit corresponds to the 10
farms collectively, there are only 30 samples
and the standard error for the proportion of
farmers using recommended BMPs is 0.035.
Had the investigator incorrectly calculated the
standard error using the random sampling
equations, he or she would have computed
0.0287, nearly a 20 percent error.
Since the standard error from the cluster
sampling example is 0.035, it is possible to
estimate the corresponding simple random
sample size needed to achieve the same
precision using

    n = p q / [s(p)]²
      = (0.56)(0.44) / 0.035²    (2-17)
      = 201

Is collecting 300 samples using a cluster
sampling approach cheaper than collecting
about 200 simple random samples? If so,
cluster sampling should be used; otherwise
simple random sampling should be used.
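The cluster calculations in this example can be verified as follows. The 30 per-site counts are those of Table 2-8; note that the script carries more digits than the rounded 0.035 used in the text, so the equivalent simple random sample size comes out near 200 rather than exactly 201.

```python
import math
from statistics import mean, stdev

# Per-site counts (out of 10 farms) from Table 2-8
counts = [3, 9, 5, 7, 6, 4, 6, 3, 5, 5,
          5, 7, 7, 4, 7, 5, 3, 8, 4, 6,
          8, 4, 7, 4, 5, 3, 3, 9, 9, 7]

p = mean(counts) / 10                                # 0.56
s_p = (stdev(counts) / 10) / math.sqrt(len(counts))  # cluster standard error
# Equation 2-17: equivalent simple random sample size for the same precision
n_equiv = p * (1 - p) / s_p ** 2
```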
2.3.4 Systematic Sampling
It might be necessary to obtain a baseline
estimate of the proportion of farms where
nutrient management practices have been
implemented using a mailed questionnaire or
phone survey. Assuming a record of farms in
the state is available in a sequence unrelated to
the manner in which nutrient management
plans are implemented by individual farms
(e.g., in alphabetical order by the farm
owner's name), a systematic sample can be
obtained in the following manner (Casley and
Table 2-8. Number of farms (out of 10) implementing recommended BMPs.

  3  9  5  7  6  4  6  3  5  5
  5  7  7  4  7  5  3  8  4  6
  8  4  7  4  5  3  3  9  9  7

Grand Total = 168
x̄ = 5.6        p = 5.6/10 = 0.560
s = 1.923      s/10 = 1.923/10 = 0.1923

Standard error using cluster sampling: s(p) = 0.1923/(30)^0.5 = 0.035
Standard error if the simple random sampling assumption had been incorrectly used:
s(p) = ((0.56)(1-0.56)/300)^0.5 = 0.0287
Lury, 1982):

1. Select a random number r between 1 and
   N/n, where n is the number required in the
   sample.

2. The sampling units are then r, r + (N/n),
   r + (2N/n), ..., r + (n-1)(N/n), where N is the
   total number of available records.

If the population units are in random order
(e.g., no trends, no natural strata,
uncorrelated), systematic sampling is, on
average, equivalent to simple random
sampling.
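The two selection steps can be sketched as follows, assuming for simplicity that N is a multiple of n.

```python
import random

def systematic_sample(N, n):
    """Step 1: pick a random start r in [1, N/n].
    Step 2: take every (N/n)-th record thereafter.
    Assumes N is a multiple of n."""
    step = N // n
    r = random.randint(1, step)
    return [r + k * step for k in range(n)]

units = systematic_sample(N=1000, n=50)  # 50 record numbers, evenly spaced
```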
Once the sampling units (in this case, specific
farms) have been selected, a questionnaire can
be mailed to the farm owner or a telephone
inquiry made about nutrient management
practices being followed by the farm owner.
In another example, the Conservation
Technology Information Center (CTIC), with
the assistance of the Natural Resource
Conservation Service (NRCS, formerly the
Soil Conservation Service), randomly selects
approximately 3,100 sites for its annual
National Crop Residue Management Survey
(CTIC, 1994). A method for randomly
selecting sites to fit local data needs was
recently developed for assessing
implementation of conservation tillage
practices (CTIC, 1995). This method, the
County Transect Survey, involves establishing
a driving route that passes through all regions
heavily used for crop production. Large
urbanized areas and heavily traveled federal
and state highways are avoided where
possible. The direction of the route is not
significant. In a recent application of the
method in Illinois, the route was 110 miles
long and included 456 cropland observation
sites. Data were collected at predetermined
intervals. Data on rainfall, slope, soil
erodibility, soil loss tolerance (T),
contouring, ephemeral erosion, and crop
rotation/tillage system employed were also
collected. Figure 2-6 presents the type of
random route used in the survey. The county
transect survey method has also been used
successfully in Minnesota, Ohio, and Indiana
(CTIC, 1995), and is being considered for use
in Pennsylvania.
Figure 2-6. Example route for a county transect survey (CTIC, 1995).
CHAPTER 3. METHODS FOR EVALUATING DATA
3.1 INTRODUCTION
Once data have been collected, it is necessary
to statistically summarize and analyze the data.
EPA recommends that the data analysis
methods be selected before collecting the first
sample. Many statistical methods have been
computerized in easy-to-use software that is
available for use on personal computers.
Inclusion or exclusion in this section does not
imply an endorsement or lack thereof by the
U.S. Environmental Protection Agency.
Commercial-off-the-shelf software that covers
a wide range of statistical and graphical
support includes SAS, Statistica, Statgraphics,
Systat, Data Desk (Macintosh only), BMDP,
and JMP. Numerous spreadsheets, database
management packages, and other graphics
software can also be used to perform many of
the needed analyses. In addition, the
following programs, written specifically for
environmental analyses, are also available:
• SCOUT: A Data Analysis Program, EPA,
NTIS Order Number PB93-505303.
• WQHYDRO (WATER
QUALITY/HYDROLOGY
GRAPHICS/ANALYSIS SYSTEM), Eric
R. Aroner, Environmental Engineer, P.O.
Box 18149, Portland, OR 97218.
WQSTAT, Jim C. Loftis, Department of
Chemical and Bioresource Engineering,
Colorado State University, Fort Collins,
CO 80524.
Computing the proportion of sites
implementing a certain BMP or the average
number of acres that are under a certain BMP
follows directly from the equations presented
in Section 2.3 and is not repeated. The
remainder of this section is focused on
evaluating changes in BMP implementation.
The methods provided in this section provide
only a cursory overview of the type of
analyses that might be of interest. For a more
thorough discussion on these methods, the
reader is referred to Gilbert (1987), Snedecor
and Cochran (1980), and Helsel and Hirsch
(1995). Typically, the data collected for
evaluating changes will come as two
or more sets of random samples. In this case,
the analyst will test for a shift or step change.
Depending on the objective, it is appropriate to
select a one- or two-sided test. For example, if
the analyst knows that BMP implementation
will only go up as a result of a cost-share
program, a one-sided test could be formulated.
Alternatively, if the analyst does not know
whether implementation will go up or down, a
two-sided test is necessary. To simply
compare two random samples to decide if they
are significantly different, a two-sided test is
used. Typical null hypotheses (H0) and
alternative hypotheses (Ha) for one- and two-
sided tests are provided below:
One-sided test
H0: BMP Implementation(Post cost share) <
BMP Implementation(Pre cost share)
Ha: BMP Implementation(Post cost share) >
BMP Implementation(Pre cost share)
Two-sided test
H0: BMP Implementation(Post cost share) =
BMP Implementation(Pre cost share)
Ha: BMP Implementation(Post cost share) ≠
BMP Implementation(Pre cost share)
Selecting a one-sided test instead of a two-
sided test results in an increased power for
the same significance level (Winer, 1971).
That is, if the conditions are appropriate, a
corresponding one-sided test is more
desirable than a two-sided test given the same
a and sample size. The manager and analyst
should take great care in selecting one- or
two-sided tests.
3.2 COMPARING THE MEANS FROM Two
INDEPENDENT RANDOM SAMPLES
The Student's t test for two samples and the
Mann-Whitney test are the most appropriate
tests for these types of data. Assuming the
data meet the assumptions of the t test, the
two-sample t statistic with n1+n2-2 degrees of
freedom is (Remington and Schork, 1970)

    t = (x̄1 - x̄2 - Δ0) / [sp (1/n1 + 1/n2)^0.5]    (3-1)

where n1 and n2 are the sample sizes of the first
and second data sets and x̄1 and x̄2 are the
estimated means from the first and second data
sets, respectively. The pooled standard
deviation, sp, is defined by

    sp = {[(n1-1)s1² + (n2-1)s2²] / (n1+n2-2)}^0.5    (3-2)

where s1² and s2² correspond to the estimated
variances of the first and second data sets,
Tests for Two Independent Random Samples

Test*           Key Assumptions
Two-sample t    • Both data sets must be normally distributed
                • Data sets should have equal variances†
Mann-Whitney    • None

* The standard forms of these tests require independent random samples.
† The variance homogeneity assumption can be relaxed.
respectively. The difference quantity (Δ0) can
be any value, but here it is set to zero. Δ0 can
be set to a non-zero value to test whether the
difference between the two data sets is greater
than a selected value. If the variances are not
equal, refer to Snedecor and Cochran (1980)
for methods for computing the t statistic. In a
two-sided test, the value from Equation 3-1 is
compared to the t value from Table A2 with
α/2 and n1+n2-2 degrees of freedom.
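The two-sample t statistic of Equations 3-1 and 3-2 can be computed without statistical software. The data in the sketch below are hypothetical, chosen only to illustrate the arithmetic.

```python
import math

def two_sample_t(x1, x2, delta0=0.0):
    """Equations 3-1 and 3-2: pooled two-sample t statistic."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = sum(x1) / n1, sum(x2) / n2
    s1_sq = sum((v - m1) ** 2 for v in x1) / (n1 - 1)  # sample variances
    s2_sq = sum((v - m2) ** 2 for v in x2) / (n2 - 1)
    sp = math.sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2))
    return (m1 - m2 - delta0) / (sp * math.sqrt(1 / n1 + 1 / n2))

t = two_sample_t([1, 2, 3], [2, 4, 6])  # hypothetical data; about -1.55
```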
The Mann-Whitney test can also be used to
compare two independent random samples.
This test is very flexible since there are no
assumptions about the distribution of either
sample or whether the distributions have to be
the same (Helsel and Hirsch, 1995). Wilcoxon
(1945) first introduced this test for equal-sized
samples. Mann and Whitney (1947) modified
the original Wilcoxon's test to apply it to
different sample sizes. Here, it is determined
whether one data set tends to have larger
observations than the other.
If the distributions of the two samples are
similar except for location (i.e., similar spread
and skew), Ha can be refined to imply that the
median concentration from one sample is
"greater than," "less than," or "not equal to"
the median concentration from the second
sample. To achieve this greater detail in Ha,
transformations such as logs can be used.
Tables of Mann-Whitney test statistics (e.g.,
Conover, 1980) may be consulted to determine
whether to reject H0 for small sample sizes. If
n1 and n2 are greater than or equal to 10
observations, the test statistic can be computed
from the following equation (Conover, 1980):

    T1 = [T - n(N+1)/2] / {[nm/(N(N-1))] Σ(i=1 to N) Ri² - nm(N+1)²/[4(N-1)]}^0.5    (3-3)

where

    n  = number of observations in the sample with
         fewer observations,
    m  = number of observations in the sample with
         more observations,
    N  = n + m,
    T  = sum of the ranks for the sample with fewer
         observations, and
    Ri = rank of the ith ordered observation from
         both samples combined.

T1 is normally distributed and Table A1 can be
used to determine the appropriate quantile.
Helsel and Hirsch (1995) and USEPA (1997)
provide detailed examples for both of these
tests.
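Equation 3-3 can be implemented directly; midranks are used so that tied observations are handled. The data below are hypothetical, chosen so the result can be checked against the usual no-ties normal approximation.

```python
import math

def mann_whitney_T1(x_small, x_large):
    """Equation 3-3 (Conover, 1980): normal-approximation Mann-Whitney
    statistic. x_small is the sample with fewer observations."""
    n, m = len(x_small), len(x_large)
    N = n + m
    pooled = sorted(x_small + x_large)
    # Midranks: tied observations share the average of their ranks
    rank = {v: (1 + pooled.index(v) + N - pooled[::-1].index(v)) / 2
            for v in set(pooled)}
    T = sum(rank[v] for v in x_small)        # rank sum for the smaller sample
    sum_r2 = sum(rank[v] ** 2 for v in pooled)
    denom = math.sqrt(n * m / (N * (N - 1)) * sum_r2
                      - n * m * (N + 1) ** 2 / (4 * (N - 1)))
    return (T - n * (N + 1) / 2) / denom

T1 = mann_whitney_T1([1, 2, 3], [4, 5, 6])  # hypothetical data; about -1.96
```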
3.3 COMPARING THE PROPORTIONS FROM
Two INDEPENDENT SAMPLES
Consider the example in which the proportion
of highly erodible land under conservation
tillage has been estimated during two time
periods to be/>; andp2 using sample sizes of n,
and n2, respectively. Assuming a normal
approximation is valid, the test statistic under
a null hypothesis of equivalent proportions (no
change) is
z = (p1 - p2) / sqrt[ p(1 - p)(1/n1 + 1/n2) ]     (3-4)
where p is a pooled estimate of proportion and
is equal to (x1+x2)/(n1+n2), and x1 and x2 are
the number of successes during the two time
periods. An estimator for the difference in
proportions is simply p1 - p2.
In an earlier example, it was determined that
129 observations in each sample were needed
to detect a difference in proportions of 0.20
with a two-sided test, α equal to 0.05, and
1-β equal to 0.90. Assuming that 130 samples
were taken and p1 and p2 were estimated from
the data as 0.6 and 0.4, the test statistic would
be estimated as
z = (0.6 - 0.4) / sqrt[ 0.5(0.5)(1/130 + 1/130) ] = 3.22     (3-5)
Comparing this value to the t value from Table
A2 (α/2 = 0.025, df = 258) of 1.96,
H0 is rejected.
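The calculation in Equations 3-4 and 3-5 can be verified with a few lines of code; the sketch below reproduces the conservation tillage worked example from the text.

```python
# Reproduce the two-proportion test of Equations 3-4 and 3-5 using the
# conservation tillage figures from the text.
import math

n1, n2 = 130, 130    # sample sizes in the two time periods
p1, p2 = 0.6, 0.4    # estimated proportions under conservation tillage

# Pooled estimate of proportion: (x1 + x2)/(n1 + n2)
x1, x2 = p1 * n1, p2 * n2
p_pool = (x1 + x2) / (n1 + n2)    # 0.5

# Test statistic under the null hypothesis of equal proportions
z = (p1 - p2) / math.sqrt(p_pool * (1 - p_pool) * (1/n1 + 1/n2))

print(round(z, 2))  # 3.22, which exceeds 1.96, so H0 is rejected
```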
3.4 COMPARING MORE THAN TWO
INDEPENDENT RANDOM SAMPLES
The analysis of variance (ANOVA) and
Kruskal-Wallis are extensions of the
two-sample t and Mann-Whitney tests,
respectively, and can be used for analyzing
more than two independent random samples
when the data are continuous (e.g., mean
acreage). Unlike the t test described earlier,
the ANOVA can have more than one factor or
explanatory variable. The Kruskal-Wallis test
accommodates only one factor, whereas the
Friedman test can be used for two factors. In
addition to applying one of the above tests to
determine if one of the samples is significantly
different from the others, it is also necessary to
do postevaluations to determine which of the
samples is different. This section recommends
Tukey's method to analyze the raw or rank-
transformed data only if one of the previous
tests (ANOVA, rank-transformed ANOVA,
Kruskal-Wallis, Friedman) indicates a
significant difference between groups.
Tukey's method can be used for equal or
unequal sample sizes (Helsel and Hirsch,
1995). The reader is cautioned, when
performing an ANOVA using standard
software, to be sure that the ANOVA test used
matches the data. See USEPA (1997) for a
more detailed discussion on comparing more
than two independent random samples.
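A minimal sketch of how such comparisons might look in practice, using SciPy's one-way ANOVA and Kruskal-Wallis routines; the group values are invented for illustration.

```python
# Hypothetical comparison of a continuous variable (e.g., acreage under
# a BMP) across three independent random samples.
from scipy import stats

group_a = [42, 55, 48, 61, 50, 53]
group_b = [38, 41, 35, 44, 40, 37]
group_c = [58, 63, 60, 66, 59, 62]

# Parametric one-way ANOVA and its nonparametric counterpart
f_stat, p_anova = stats.f_oneway(group_a, group_b, group_c)
h_stat, p_kw = stats.kruskal(group_a, group_b, group_c)

# A significant result from either test indicates only that at least
# one group differs; a follow-up procedure such as Tukey's method is
# then needed to determine which groups differ.
print(f"ANOVA p = {p_anova:.4f}, Kruskal-Wallis p = {p_kw:.4f}")
```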
3.5 COMPARING CATEGORICAL DATA
In comparing categorical data, it is important
to distinguish between nominal categories
(e.g., land ownership, county location, type of
BMP) and ordinal categories (e.g., BMP
implementation rankings, low-medium-high
scales).
The starting point for all evaluations is the
development of a contingency table. In Table
3-1, the preference of three BMPs is compared
to operator type in a contingency table. In this
case both categorical variables are nominal. In
this example, 45 of the 102 operators that own
the land they till used BMP1. There were a
total of 174 observations.
To test for independence, the sum of the
squared differences between the expected (Eij)
and the observed (Oij) counts, summed over all
cells, is computed as (Helsel and Hirsch, 1995)

χ² = Σ over all cells (Oij - Eij)² / Eij     (3-6)

where Eij is equal to AiCj/N. χ² is compared
to the 1-α quantile of the χ² distribution with
(m-1)(k-1) degrees of freedom (see Table A3).
In the example presented in Table 3-1, the
symbols listed in the parentheses correspond to
the above equation. Note that k corresponds to
the three types of BMPs and m corresponds to
the three different types of
Table 3-1. Contingency table of observed operator type and implemented BMP.

Operator Type        BMP1         BMP2         BMP3         Row Total, Ai
Rent                 10 (O11)     30 (O12)     17 (O13)     57 (A1)
Own                  45 (O21)     32 (O22)     25 (O23)     102 (A2)
Combination          8 (O31)      3 (O32)      4 (O33)      15 (A3)
Column Total, Cj     63 (C1)      65 (C2)      46 (C3)      174 (N)

Key to Symbols:
Oij = number of observations for the ith operator type and jth BMP type
Ai = row total for the ith operator type (total number of observations for a given operator type)
Cj = column total for the jth BMP type (total number of observations for a given BMP type)
N = total number of observations
operators. Table 3-2 shows computed values
of Eij and (Oij - Eij)²/Eij in parentheses for the
example data. χ² is equal to 14.60. From
Table A3, the 0.95 quantile of the χ²
distribution with 4 degrees of freedom is
9.488. H0 is rejected; the selection of BMP is
not random among the different operator
types. The largest values in the parentheses in
Table 3-2 give an idea as to which
combinations of operator type and BMP are
noteworthy. In this example, it appears that
BMP2 is preferred to BMP1 for those operators
that rent the land they till.
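The computation of Equation 3-6 for Table 3-1 can be reproduced directly from the observed counts; the following sketch does so in plain Python.

```python
# Reproduce the chi-square test of independence for Table 3-1
# (operator type vs. implemented BMP) using Equation 3-6.
observed = [
    [10, 30, 17],   # Rent
    [45, 32, 25],   # Own
    [8, 3, 4],      # Combination
]

row_totals = [sum(row) for row in observed]           # A_i
col_totals = [sum(col) for col in zip(*observed)]     # C_j
N = sum(row_totals)                                   # 174

chi_sq = 0.0
for i, row in enumerate(observed):
    for j, o_ij in enumerate(row):
        e_ij = row_totals[i] * col_totals[j] / N      # E_ij = A_i C_j / N
        chi_sq += (o_ij - e_ij) ** 2 / e_ij

print(f"{chi_sq:.2f}")  # 14.60, exceeding the 9.488 critical value
```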
Now consider that in addition to evaluating
information regarding the operator and BMP
type, we also recorded a value from 1 to 5
indicating how well the BMP was installed
and maintained, with 5 indicating the best
results. In this case, the BMP implementation
rating is ordinal. Using the same notation as
before, the average rank of the observations in
row i, Ri, is equal to (Helsel and Hirsch,
1995)

Ri = Σ(h=1 to i-1) Ah + (Ai + 1)/2     (3-7)

where Ai corresponds to the row total. The
average rank of the observations in column j,
Dj, is equal to

Dj = [ Σ(i=1 to m) Oij Ri ] / Cj     (3-8)

where Cj corresponds to the column total. The
Kruskal-Wallis test statistic is then computed
as

K = (N - 1) [ Σ(j=1 to k) Cj Dj² - N((N+1)/2)² ]
           / [ Σ(i=1 to m) Ai Ri² - N((N+1)/2)² ]     (3-9)
where K is compared to the χ² distribution
with k-1 degrees of freedom. This is the most
general form of the Kruskal-Wallis test since it
is a comparison of distribution shifts
Table 3-2. Contingency table of expected operator type and implemented BMP. (Values in
parentheses correspond to (Oij - Eij)²/Eij.)

Operator Type        BMP1            BMP2            BMP3            Row Total
Rent                 20.64 (5.48)    21.29 (3.56)    15.07 (0.25)    57
Own                  36.93 (1.76)    38.10 (0.98)    26.97 (0.14)    102
Combination          5.43 (1.22)     5.60 (1.21)     3.97 (0.00)     15
Column Total         63              65              46              174
rather than shifts in the median (Helsel and
Hirsch, 1995).
Table 3-3 is a continuation of the previous
example indicating the BMP implementation
rating for each BMP type. For example, 29 of
the 70 observations that were given a rating of
4 are associated with BMP2. The terms inside
the parentheses of Table 3-3 correspond to the
terms used in Equations 3-7 to 3-9. Note that
k corresponds to the three types of BMPs and
m corresponds to the five different levels of
BMP implementation. Using Equation 3-9 for
the data in Table 3-3, K is equal to 14.86.
Comparing this value to 5.991, the 0.95
quantile of the χ² distribution with 2 degrees
of freedom obtained from Table A3, there is a
significant difference in the quality of
implementation among the three BMPs.
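Equations 3-7 through 3-9 can be applied to the counts in Table 3-3 to reproduce the value of K reported above; the following sketch does so in plain Python.

```python
# Reproduce the Kruskal-Wallis statistic for Table 3-3 (BMP type vs.
# implementation rating) using Equations 3-7 through 3-9.
observed = [
    [1, 2, 2],     # rating 1 (columns: BMP1, BMP2, BMP3)
    [7, 3, 5],     # rating 2
    [15, 16, 26],  # rating 3
    [32, 29, 9],   # rating 4
    [8, 15, 4],    # rating 5
]

A = [sum(row) for row in observed]          # row totals
C = [sum(col) for col in zip(*observed)]    # column totals
N = sum(A)

# Equation 3-7: average rank of the observations in row i
R, cum = [], 0
for a_i in A:
    R.append(cum + (a_i + 1) / 2)
    cum += a_i

# Equation 3-8: average rank of the observations in column j
D = [sum(observed[i][j] * R[i] for i in range(len(A))) / C[j]
     for j in range(len(C))]

# Equation 3-9: Kruskal-Wallis statistic K
correction = N * ((N + 1) / 2) ** 2
K = ((N - 1)
     * (sum(C[j] * D[j] ** 2 for j in range(len(C))) - correction)
     / (sum(A[i] * R[i] ** 2 for i in range(len(A))) - correction))

print(round(K, 2))  # 14.86, exceeding 5.991, so H0 is rejected
```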
The last type of categorical data evaluation
considered in this chapter is when both
variables are ordinal. Kendall's τb for tied
data can be used for this analysis. The statistic
τb is calculated as (Helsel and Hirsch, 1995)

τb = S / sqrt(SSa SSc)     (3-10)

where S, SSa, and SSc are computed as

S = Σ(i=1 to m) Σ(j=1 to k) Oij [ Σ(h>i) Σ(l>j) Ohl - Σ(h>i) Σ(l<j) Ohl ]     (3-11)

SSa = (N² - Σ(i=1 to m) Ai²) / 2     (3-12)

SSc = (N² - Σ(j=1 to k) Cj²) / 2     (3-13)

To determine whether τb is significant, S is
modified to a normal statistic using

ZS = (S - 1)/σS     if S > 0
ZS = (S + 1)/σS     if S < 0     (3-14)
Table 3-3. Contingency table of implemented BMP and rating of installation and
maintenance.

BMP Implementation
Rating               BMP1         BMP2         BMP3         Row Total, Ai
1                    1 (O11)      2 (O12)      2 (O13)      5 (A1)
2                    7 (O21)      3 (O22)      5 (O23)      15 (A2)
3                    15 (O31)     16 (O32)     26 (O33)     57 (A3)
4                    32 (O41)     29 (O42)     9 (O43)      70 (A4)
5                    8 (O51)      15 (O52)     4 (O53)      27 (A5)
Column Total, Cj     63 (C1)      65 (C2)      46 (C3)      174 (N)

Key to Symbols:
Oij = number of observations for the ith BMP implementation rating and jth BMP type
Ai = row total for the ith BMP implementation rating (total number of observations for a given BMP implementation rating)
Cj = column total for the jth BMP type (total number of observations for a given BMP type)
N = total number of observations
where

σS = sqrt[ (N³/9)(1 - Σ ai³)(1 - Σ cj³) ]     (3-15)

and ZS is zero if S is zero. The values of ai
and cj are computed as Ai/N and Cj/N,
respectively.
Table 3-4 presents the BMP implementation
ratings that were taken in three separate years.
For example, 15 of the 57 observations that
were given a rating of 3 are associated with
Year 2. Using Equations
3-11 and 3-15, S and σS are equal to 2,509 and
679.75, respectively. Therefore, ZS is equal to
(2,509 - 1)/679.75, or 3.69. Comparing this
value to the value of 1.96 obtained from Table
A1 (α/2 = 0.025) indicates that BMP
implementation is improving with time.
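The Table 3-4 trend test can likewise be reproduced from the counts; the following sketch computes S, σS, and ZS from Equations 3-11, 3-14, and 3-15 in plain Python.

```python
# Reproduce the trend test for Table 3-4 (implementation rating vs.
# sample year) using Equations 3-11, 3-14, and 3-15.
import math

observed = [
    [2, 1, 2],     # rating 1 (columns: Year 1, Year 2, Year 3)
    [5, 7, 3],     # rating 2
    [26, 15, 16],  # rating 3
    [9, 32, 29],   # rating 4
    [4, 8, 15],    # rating 5
]
m, k = len(observed), len(observed[0])
A = [sum(row) for row in observed]          # row totals A_i
C = [sum(col) for col in zip(*observed)]    # column totals C_j
N = sum(A)

# Equation 3-11: S = concordant minus discordant pair count
S = 0
for i in range(m):
    for j in range(k):
        concordant = sum(observed[h][l] for h in range(i + 1, m)
                         for l in range(j + 1, k))
        discordant = sum(observed[h][l] for h in range(i + 1, m)
                         for l in range(j))
        S += observed[i][j] * (concordant - discordant)

# Equation 3-15: sigma_S with a_i = A_i/N and c_j = C_j/N
sigma_s = math.sqrt((N ** 3 / 9)
                    * (1 - sum((a / N) ** 3 for a in A))
                    * (1 - sum((c / N) ** 3 for c in C)))

# Equation 3-14: continuity-corrected normal statistic
Z = (S - 1) / sigma_s if S > 0 else (S + 1) / sigma_s if S < 0 else 0.0

print(S, round(Z, 2))  # S = 2509 and Z = 3.69, which exceeds 1.96
```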
Table 3-4. Contingency table of implemented BMP and sample year.

BMP Implementation
Rating               Year 1       Year 2       Year 3       Row Total, Ai    ai
1                    2 (O11)      1 (O12)      2 (O13)      5 (A1)           0.029
2                    5 (O21)      7 (O22)      3 (O23)      15 (A2)          0.086
3                    26 (O31)     15 (O32)     16 (O33)     57 (A3)          0.328
4                    9 (O41)      32 (O42)     29 (O43)     70 (A4)          0.402
5                    4 (O51)      8 (O52)      15 (O53)     27 (A5)          0.155
Column Total, Cj     46 (C1)      63 (C2)      65 (C3)      174 (N)
cj                   0.264        0.362        0.374

Key to Symbols:
Oij = number of observations for the ith BMP implementation rating and jth year
Ai = row total for the ith BMP implementation rating (total number of observations for a given BMP implementation rating)
Cj = column total for the jth year (total number of observations for a given year)
N = total number of observations
ai = Ai/N
cj = Cj/N
CHAPTER 4. CONDUCTING THE EVALUATION
4.1 INTRODUCTION
This chapter addresses the process of
determining whether agricultural MMs or
BMPs are being implemented and whether
they are being implemented according to
approved standards or specifications.
Guidance is provided on what should be
measured to assess MM and BMP
implementation, as well as methods for
collecting the information, including physical
farm or field evaluations, mail- and/or
telephone-based surveys, personal interviews,
and aerial reconnaissance and photography.
Designing survey instruments to avoid error
and rating MM and BMP implementation are
also discussed.
Evaluation methods are separated into two
types: Expert evaluations and self-
evaluations. Expert evaluations are those in
which actual field investigations are conducted
by trained personnel to gather information on
MM or BMP implementation. Self-
evaluations are those in which answers to a
predesigned questionnaire or survey are
provided by the person being surveyed,
usually a farm owner or manager. The
answers provided are used as survey results.
Self-evaluations might also include
examination of materials related to a farm,
such as applications for cost-share programs or
crop histories. Extreme caution should be
exercised when using data from self-
evaluations as the basis for assessing MM or
BMP compliance since they are not typically
reliable for this purpose. Each of these
evaluation methods has advantages and
disadvantages that should be considered prior
to deciding which one to use or in what
combination to use them. Aerial
reconnaissance and photography can be used
to support either evaluation method.
Self-evaluations are useful for collecting
information on the level of awareness that
farm owners or managers have of MMs or
BMPs, dates of planting or harvest, field or
crop conditions, which MMs or BMPs were
implemented, and whether the assistance of a
state or county agriculture professional was
used. However, the type or level of detail of
information that can be obtained from self-
evaluations might be inadequate to satisfy the
objectives of a MM or BMP implementation
survey. If this is the case, expert evaluations
might be called for. Expert evaluations are
necessary if the required information on MM
or BMP implementation must be more
detailed or more reliable than that which can
be obtained with self-evaluations. Examples of
information that would be obtained reliably
only through an expert evaluation include an
objective assessment of the adequacy of MM
or BMP implementation, the degree to which
site-specific factors (e.g., type of crop, soil
type, or presence of a water body) influenced
MM or BMP implementation, or the need for
changes in standards and specifications for
MM or BMP implementation. Sections 4.3
and 4.4 discuss expert evaluations and self-
evaluations, respectively, in more detail.
Other important factors to consider when
choosing variables include the time of year
when the BMP compliance survey will be
conducted and when BMPs were installed.
Some agricultural BMPs, or aspects of their
implementation that can be analyzed, vary
with the time of year and phase of farming operations.
Variables that are appropriate to these factors
should be chosen. The nutrient management
and pesticide management MMs in particular
might not lend themselves to direct on-site
analysis except at specific times of year, such
as during or soon after fertilizer and pesticide
applications, respectively. Concerning BMPs
that have been in place for some time, the
adequacy of implementation might be of less
interest than the adequacy of the operation and
maintenance of the BMP. For example, it
might be of interest to examine fences along
streams for structural integrity (i.e., holes that
would allow cattle to pass through) rather than
to calculate the miles of stream along which
the fences were installed. Similarly, waste
storage structures might be inspected for the
amount of freeboard when operating at
capacity rather than analyzing their
construction for adherence to construction
specifications. If numerous BMPs are being
analyzed during a single farm visit, variables
that relate to different aspects of BMP
installation, operation, and maintenance might
be chosen separately for each BMP to be
inspected.
Aerial reconnaissance and photography are
another means available for collecting
information on farming practices, though some
of the MMs and BMPs employed for
agriculture might be difficult if not impossible
to identify on aerial photographs. Aerial
reconnaissance and photography are discussed
in detail in Section 4.5.
The general types of information obtainable
with self-evaluations are listed in Table 4-1.
Regardless of the approach(es) used, proper
and thorough preparation for the evaluation is
the key to success.
4.2 CHOICE OF VARIABLES
Once the objectives of a BMP implementation
or compliance survey have been clearly
defined, the most important factor in the
assessment of MM or BMP implementation is
the determination of which variable(s) to
measure. A good variable provides a direct
measure of how well a BMP was or is being
implemented. Individual variables should
provide measures of different factors related to
BMP implementation. The best variables are
those which are measures of the adequacy of
MM or BMP implementation and are based on
quantifiable expressions of conformance with
state standards and specifications. As the
variables that are used become less directly
related to actual MM or BMP implementation,
their accuracy as measures of BMP
implementation decreases.
Examples of useful variables include the tons
and percentage per day of animal manure
captured and treated by wastewater facilities
associated with confined animal facilities and
the cattle-hours per day during which livestock
are excluded from stream banks, both of which
would be expressed in terms of conformance
with applicable state standards and
specifications. Less useful variables measure
factors that are related to MM and BMP
implementation but do not necessarily provide
an accurate measure of their implementation.
Examples of these types of variables are the
number of manure storage facilities
constructed in a year and the number of farms
with approved pesticide management plans.
Other poor variables would be the passage of
legislation requiring MM or BMP application
on farms, development of an information and
education program for nutrient management,
or the number of requests for information on
nutrient management. Although these
Table 4-1. General types of information obtainable with self-evaluations and expert
evaluations.
Information Obtainable from Self Evaluations
Background Information
• Type of facility installed (e.g., confined animal facility, wastewater storage and/or treatment
facility)
• Capacity of facility
• Square feet of facilities
• Type and number of animals and/or crops on farm
• Cropping history
• Yield data and estimates
• Field limitations
• Pest problems on farm
• Soil test results
• Map
Management Measures/Best Management Practices
• Management measures used on farm
• BMPs installed
• Dates of MM/BMP installation
• Design specifications of BMPs
• Type of waterbody or area protected
• Previous management measures used
Management Plans
• Preparation of management plans (e.g., nutrient, grazing, pesticide, irrigation water)
• Dates of plan preparation and revisions
• Date of initial plan implementation
• Total acreage under management
Equipment
• Types of equipment used on farm
• Dates of equipment calibration
• Application rates
• Timing of applications
• Substances applied (e.g., pesticides, fertilizers)
• Ambient conditions during applications
• Location of mixing, loading, and storage areas
Information Requiring Expert Evaluations
• Design sufficiency
• Installation sufficiency
• Adequacy of operation/management
• Confirmation of information from self evaluation
variables relate to MM or BMP
implementation, they provide no real
information on whether MMs or BMPs are
actually being implemented or whether they
are being implemented properly.
Variables generally will not directly relate to
MM implementation, as most agriculture MMs
are combinations of several BMPs. Measures
of MM implementation, therefore, usually will
be based on separate assessments of two or
more BMPs, and the implementation of each
BMP will be based on a unique set of
variables. Some examples of BMPs related to
the EPA's Grazing Management Measure,
variables for assessing compliance with the
BMPs, and related standards and specifications
that might be required by state agriculture
authorities are presented in Figure 4-1.
Because farm owners and managers choose to
implement or not implement MMs or BMPs
based on site-specific conditions, it is also
appropriate to apply varying weights to the
variables chosen to assess MM and BMP
implementation to correspond to site-specific
conditions. For example, variables related to
animal waste disposal practices might be de-
emphasized—and other, more applicable
variables emphasized more—on farms with
relatively few animals. Similarly, on a farm
with a water body, variables related to
livestock access to the water body, sediment
runoff, and chemical deposition (pesticide use,
fertilizer use) might be emphasized over other
variables to arrive at a site-specific rating of
the adequacy of MM or BMP implementation.
The purpose for which the information
collected during a MM or BMP
implementation survey will be used is another
important consideration when selecting
variables. An implementation survey can
serve many purposes beyond the primary
purpose of assessing MM and BMP
implementation. For instance, variables might
be selected to assess compliance with each
category of BMP that is of interest and to
assess overall compliance with BMP
specification and standards. In addition, other
variables might be selected to assess the effect
that each has on the ability or willingness of
farm owners or managers to comply with
BMP implementation standards or
specifications. The information obtained from
evaluations using the latter type of variable
could be useful for modifying MM or BMP
implementation standards and specifications
for application to particular farm types or
conditions.
Table 4-2 provides examples of good and poor
variables for the assessment of MM or BMP
implementation of the agricultural MMs
developed by EPA (USEPA, 1993a). The
variables listed in the table are only examples,
and local or regional conditions should
ultimately dictate what variables should be
used.
GRAZING MANAGEMENT MEASURE

Protect range, pasture, and other grazing lands:

(1) By implementing one or more of the following to protect sensitive areas (such as
    stream banks, wetlands, estuaries, ponds, lake shores, and riparian zones):
    (a) Exclude livestock,
    (b) Provide stream crossings or hardened watering access for drinking,
    (c) Provide alternative drinking water locations,
    (d) Locate salt and additional shade, if needed, away from sensitive areas, or
    (e) Use improved grazing management (e.g., herding)
    to reduce the physical disturbance and reduce direct loading of animal waste and
    sediment caused by livestock; and

(2) By achieving either of the following on all range, pasture, and other grazing lands not
    addressed under (1):
    (a) Implement the range and pasture components of a Conservation Management
        System (CMS) as defined in the Field Office Technical Guide of the USDA-NRCS
        by applying the progressive planning approach of the USDA-NRCS to reduce
        erosion, or
    (b) Maintain range, pasture, and other grazing lands in accordance with activity plans
        established by either the Bureau of Land Management of the U.S. Department of
        the Interior or the Forest Service of USDA.

Related BMPs, measurement variables, and standards and specifications:

Management Measure Practice: Postpone grazing or rest grazing land for a prescribed period
  Potential Measurement Variables: Percent ground cover; stubble height
  Example Related Standards and Specifications: Recommended percent ground cover for
  grazing; recommended stubble height for grazing

Management Measure Practice: Alternate water source installed to convey water away from
riparian areas
  Potential Measurement Variables: Presence of alternative water source; distance from
  water body of water provided to livestock
  Example Related Standards and Specifications: Guidelines for provision of alternative
  sources of water on farms with water bodies

Management Measure Practice: Livestock excluded from an area not intended for grazing
  Potential Measurement Variables: Cattle-hours per day of exclusion of livestock from
  water bodies
  Example Related Standards and Specifications: Guidelines for protection of water quality
  for specific types of water bodies

Management Measure Practice: Range seeded to establish adapted plants on native grazing
land
  Potential Measurement Variables: Percent ground cover; plant species
  Example Related Standards and Specifications: Recommended amount of ground cover for
  grazing; acceptable plant species for the region
Figure 4-1. Potential variables and examples of implementation standards and specifications
that might be useful for evaluating compliance with the Grazing Management Measure.
Table 4-2. Example variables for management measure implementation analysis.

Erosion and Sediment Control
  Useful variables:
  • Area on which reduced tillage or terrace systems are installed
  • Area of runoff diversion systems or filter strips per acre of cropland
  • Area of highly erodible cropland converted to permanent cover
  Less useful variables:
  • Number of approved farm soil and erosion management plans
  • Number of grassed waterways, grade stabilization structures, or filter strips installed
  Appropriate sampling units: field; acre; facility

Wastewater and Runoff from Confined Animal Facilities
  Useful variables:
  • Quantity and percentage of total facility wastewater and runoff that is collected by a
    waste storage or treatment system
  Less useful variables:
  • Number of manure storage facilities
  Appropriate sampling units: confined animal facility; animal unit

Nutrient Management
  Useful variables:
  • Number of farms following and acreage covered by approved nutrient management plans
  • Percent of farmers keeping records and applying nutrients at rates consistent with
    management recommendations
  • Quantity and percent reduction in fertilizer applied
  • Amount of fertilizer and manure spread between spreader calibrations
  Less useful variables:
  • Number of farms with approved nutrient management plans
  Appropriate sampling units: farm; field; application

Pesticide Management
  Useful variables:
  • Number of farms with complete records of field surveys and pesticide applications and
    following approved pest management plans
  • Number of pest field surveys performed on a weekly (or other time frame) basis
  • Quantity and percent reduction in pesticide use
  Less useful variables:
  • Number of farms with approved pesticide management plans
  Appropriate sampling units: field; farm; application

Grazing Management
  Useful variables:
  • Number of cattle-hours of access to riparian areas per day
  • Miles of stream from which grazing animals are excluded
  Less useful variables:
  • Miles of fence installed
  Appropriate sampling units: stream mile; animal unit
4.3 EXPERT EVALUATIONS
4.3.1 Site Evaluations
Expert evaluations are the best way to collect
reliable information on MM and BMP
implementation. They involve a person or
team of people visiting individual farms and
speaking with farm owners and/or managers to
obtain information on MM and BMP
implementation. For many of the MMs,
assessing and verifying compliance will
require a farm visit and evaluation. The
following should be considered before expert
evaluations are conducted:
• Obtaining permission of the farm owner or
manager. Without proper authorization to
visit a farm from a farm owner or
manager, the relationship between farmers
and the agriculture agency, and any future
regulatory or compliance action could be
jeopardized.
• The type(s) of expertise needed to assess
proper implementation. For some MMs, a
team of trained personnel might be
required to determine whether MMs have
been implemented properly.
• The activities that should occur during an
expert evaluation. This information is
necessary for proper and complete
preparation for the farm visit, so that it can
be completed in a single visit and at the
proper time.
• The method of rating the MMs and BMPs.
MM and BMP rating systems are discussed
below.
• Consistency among evaluation teams and
between farm evaluations. Proper training
and preparation of expert evaluation team
members are crucial to ensure accuracy
and consistency.
• The collection of information while at a
farm. Information collection should be
facilitated with preparation of data
collection forms that include any necessary
MM and BMP rating information needed
by the evaluation team members.
• The content and format of post-evaluation
discussions. Site evaluation team members
should bear in mind the value of
postevaluation discussion among team
members. Notes can be taken during the
evaluation concerning any items that
would benefit from group discussion.
Evaluators might range from a single person
suitably trained in agricultural expert
evaluation to a group of professionals with
varied expertise. The composition of
evaluation teams will depend on the types of
MMs or BMPs being evaluated. Potential
team members could include:
• Agricultural engineer
• Agriculture extension agent
• Agronomist
• Hydrologist
• Pesticide specialist
• Soil scientist
• Water quality expert
The composition of evaluation teams can vary
depending on the purpose of the evaluation,
available staff and other resources, and the
geographic area being covered. All team
members should be familiar with the required
MMs and BMPs, and each team should have a
member who has previously participated in an
expert evaluation. This will ensure familiarity
with the technical aspects of the MMs and
BMPs that will be rated during the evaluation
and the expert evaluation process.
Training might be necessary to bring all team
members to the level of proficiency needed to
conduct the expert evaluations. State or
county agricultural personnel should be
familiar with agriculture regulations, state
BMP standards and specifications, and proper
BMP implementation, and therefore are
generally well qualified to teach these topics to
evaluation team members who are less
familiar with them. Agricultural agents or
other specialists who have participated in BMP
implementation surveys might be enlisted to
train evaluation team members about the
actual conduct of expert evaluations. This
training should include identification of BMPs
particularly critical to water quality protection,
analysis of erosion potential, and other aspects
of BMP implementation that require
professional judgement, as well as any
standard methods for measurements to judge
BMP implementation against state standards
and specifications.
Alternatively, if only one or two individuals
will be conducting expert evaluations, their
training in the various specialties necessary to
evaluate the quality of MM and BMP
implementation, such as those listed above,
could be provided by a team of specialists who
are familiar with agricultural practices and
nonpoint source pollution.
In the interest of consistency among the
evaluations and among team members, it is
advisable that one or more mock evaluations
take place prior to visiting selected sample
farms. These "practice sessions" provide team
members with an opportunity to become
familiar with MMs and BMPs as they should
be implemented under different farm
conditions, gain familiarity with the evaluation
forms and meanings of the terms and questions
on them, and learn from other team members
with different expertise. Mock evaluations are
valuable for ensuring that all evaluators have a
similar understanding of the intent of the
questions, especially for questions whose
responses involve a degree of subjectivity on
the part of the evaluator.
Where expert evaluation teams are composed
of more than two or three people, it might be
helpful to divide up the various responsibilities
for conducting the expert evaluations among
team members ahead of time to avoid
confusion at the farm and to be certain that all
tasks are completed but not duplicated. Having
a spokesperson for the group who will be
responsible for communicating with the farm
owner or manager—prior to the expert
evaluation, at the expert evaluation if they are
present, and afterward—might also be helpful.
A county agriculture representative is
generally a good choice as spokesperson
because he/she can represent the state and
county agriculture authorities. Newly-formed
evaluation teams might benefit most from a
division of labor and selection of a team leader
or team coordinator with experience in expert
evaluations who will be responsible for the
quality of the expert evaluations. Smaller
teams might find that a division of
responsibilities is not necessary, as might
larger teams that have experience working
together. If responsibilities are to be assigned,
mock evaluations can be a good time to work
out these details.
4.3.2 Rating Implementation of
Management Measures and Best
Management Practices
Many factors influence the implementation of
MMs and BMPs, so it is sometimes necessary
to use best professional judgment (BPJ) to rate
their implementation, and BPJ will almost
always be necessary when rating overall BMP
compliance at a farm. Site-specific factors
such as soil type, crop rotation history,
topography, tillage, and harvesting methods
affect the implementation of erosion and
sediment control BMPs, for instance, and must
be taken into account by evaluators when
rating MM or BMP implementation.
Implementation of MMs will often be based
on implementation of more than one BMP,
and this makes rating MM implementation
similar to rating overall BMP implementation
at a farm or ranch. Determining an overall
rating involves grouping the ratings of
implementation of individual BMPs into a
single rating, which introduces more
subjectivity than rating the implementation of
individual BMPs based on standards and
specifications. Choice of a rating system and
rating terms, which are aspects of proper
evaluation design, is therefore important in
minimizing the level of subjectivity associated
with overall BMP compliance and MM
implementation ratings. When creating
overall ratings, it is still important to record
the detailed ratings of individual BMPs as
supporting information.
Individual BMPs, overall BMP compliance,
and MMs can be rated using a binary approach
(e.g., pass/fail, compliant/noncompliant, or
yes/no) or on a scale with more than two
choices, such as 1 to 5 or 1 to 10 (where 1 is
the worst—see Example). The simplest
method of rating MM and BMP
implementation is the use of a binary
approach. Using a binary approach, either an
entire farm or individual MMs or BMPs are
rated as being in compliance or not in
compliance with respect to specified criteria.
Scale systems can take the form of ratings
from poor to excellent, inadequate to adequate,
low to high, 1 to 3, 1 to 5, and so forth.
Whatever form of scale is used, the factors
that would individually or collectively qualify
a farm, MM, or BMP for one of the rankings
should be clearly stated. The more choices a scale contains, the smaller the differences between them become, and each choice must therefore be defined more specifically and accurately. This is especially
important if different teams or individuals rate
farms separately. Consistency among the
ratings then depends on each team or
individual evaluator knowing precisely what
the criteria for each rating option mean. Clear
and precise explanations of the rating scale can
also help avoid or reduce disagreements
among team members. This applies equally to a binary approach. The factors, individually or collectively, that would cause a farm, MM, or BMP to be rated as not being in compliance with design specifications should be clearly stated on the evaluation form or in supporting documentation.

Example of a rating scale (adapted from Rossman and Phillips, 1992). A possible rating scale from 1 to 5 might be:

5 = Implementation exceeds requirements
4 = Implementation meets requirements
3 = Implementation has a minor departure from requirements
2 = Implementation has a major departure from requirements
1 = Implementation is in gross neglect of requirements

where minor departure is defined as "small in magnitude or localized," major departure is defined as "significant in magnitude or where the BMPs are consistently neglected," and gross neglect is defined as "potential risk to water resources is significant and there is no evidence that any attempt is made to implement the BMP."
Rating farms or MMs and BMPs on a scale
requires a greater degree of analysis by the
evaluation team than does using a binary
approach. Each higher number represents a
better level of MM or BMP implementation.
In effect, a binary rating approach is a scale
with two choices; a scale of low, medium, and
high (compliance) is a scale with three
choices. Use of a scale system with more than
two rating choices can provide more
information to program managers than a
binary rating approach, and this benefit must be weighed against the greater complexity
involved in using one. For instance, a survey
that uses a scale of 1 to 5 might result in one
MM with a ranking of 1, five with a ranking
of 2, six with a ranking of 3, eight with a
ranking of 4, and five with a ranking of 5.
Precise criteria would have to be developed to
be able to ensure consistency within and
between survey teams in rating the MMs, but
the information that only one MM was
implemented poorly, 11 were implemented
below standards, 13 met or were above
standards, and 5 were implemented very well
might be more valuable than the information
that 13 MMs were found to be in compliance with design specifications (counting ratings of 4 or 5 as compliant), which is the only information that would be obtained with a binary rating approach.
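The bookkeeping in this example can be sketched in Python. The ratings list below is invented to match the counts in the text, and treating a rating of 4 or 5 as "compliant" is an assumption drawn from the example scale, not a requirement of the method.

```python
from collections import Counter

# Hypothetical ratings for 25 MMs matching the example in the text:
# one rating of 1, five 2s, six 3s, eight 4s, and five 5s.
ratings = [1] + [2] * 5 + [3] * 6 + [4] * 8 + [5] * 5

# The full distribution preserves the detail a scale survey provides.
distribution = Counter(ratings)

# Collapsing to a binary summary (assuming a rating of 4, "meets
# requirements," is the compliance threshold) discards that detail.
compliant = sum(1 for r in ratings if r >= 4)
```

A program manager sees only the single `compliant` count under a binary approach, while `distribution` retains how far below or above standards each MM fell.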
If a rating system with more than two ratings
is used to collect data, the data can be
analyzed either by using the original rating
data or by first transforming the data into a
binomial (i.e., two-choice rating) system. For
instance, ratings of 1 through 5 could be
reduced to two ratings by grouping the 1s, 2s,
and 3s together into one group (e.g.,
inadequate) and the 4s and 5s into a separate
group (e.g., adequate). If this approach is
used, it is best to retain the original rating data
for the detailed information they contain and
to reduce the data to a binomial system only
for the purpose of statistical analysis. Chapter
3, Section 3.5, contains information on the
analysis of categorical data.
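The 1-3/4-5 grouping described above can be sketched as follows; the sample ratings are invented, and the threshold of 4 is an assumption that should follow the survey's own rating criteria.

```python
def to_binomial(rating, adequate_threshold=4):
    """Collapse a 1-5 rating into the two-choice system described in
    the text: 1-3 -> "inadequate", 4-5 -> "adequate". The threshold
    is an assumption tied to the survey's rating criteria."""
    return "adequate" if rating >= adequate_threshold else "inadequate"

# Retain the original ratings for the detail they contain; use the
# collapsed form only for statistical analysis (see Section 3.5).
original = [3, 5, 2, 4, 4, 1]
collapsed = [to_binomial(r) for r in original]
```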
4.3.3 Rating Terms
The choice of rating terms used on the
evaluation forms is an important factor in
ensuring consistency and reducing bias, and
the terms used to describe and define the
rating options should be as objective as
possible. For a rating system with a large number of options, the meaning of each option should be clearly defined. It is
suggested to avoid using terms such as
"major" and "minor" when describing erosion
or pollution effects or deviations from
prescribed MM or BMP implementation
criteria because they might have different
connotations for different evaluation team
members. It is easier for an evaluation team to
agree upon meaning if options are described in
terms of measurable criteria and examples are
provided to clarify the intended meaning. It is
also suggested not to use terms that carry
negative connotations. Evaluators might be
disinclined to rate a MM or BMP as having a
"major deviation" from an implementation
criterion, even if justified, because of the
negative connotation carried by the term.
Rather than using such a term, observable
conditions or effects of the quality of
implementation can be listed and specific
ratings (e.g., 1-5 or compliant/noncompliant
for the criterion) can be associated with the
conditions or effects. For example, instead of
rating an animal waste management facility as
having a "major deficiency," a specific
deficiency could be described and ascribed an
associated rating (e.g., "Waste storage
structure is designed for no more than 70% of
the confined animals = noncompliant").
Evaluation team members will often have to
take specific notes on farms, MMs, or BMPs
during the evaluation, either to justify the
ratings they have ascribed to variables or for
discussion with other team members after the
survey. When recording notes about the farms,
MMs, or BMPs, evaluation team members
should be as specific as the criteria for the
ratings. A rating recorded as "MM deviates
highly from implementation criteria" is highly
subjective and loses specific meaning when
read by anyone other than the person who
wrote the note. Notes should therefore be as
objective and specific as possible.
An overall farm rating is useful for
summarizing information in reports,
identifying the level of implementation of
MMs and BMPs, indicating the likelihood that
environmental protection is being achieved,
identifying additional training or education needs, and conveying information to program
managers, who are often not familiar with
MMs or BMPs. For the purposes of
preserving the valuable information contained
in the original ratings of farms, MMs, or
BMPs, however, overall ratings should
summarize, not replace, the original data.
Analysis of year-to-year variations in MM or
BMP implementation, the factors involved in
MM or BMP program implementation, and
factors that could improve MM or BMP
implementation and MM or BMP program
success are only possible if the original,
detailed farm, MM, or BMP data are used.
Approaches commonly used for determining
final BMP implementation ratings include
calculating a percentage based on individual
BMP ratings, consensus, compilation of
aggregate scores by an objective party, voting,
and voting only where consensus on a farm or
MM or BMP rating cannot be reached. Not all
systems for arriving at final ratings are
applicable to all circumstances.
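The first of these approaches, calculating a percentage from individual BMP ratings, might look like the following sketch; the BMP names and their pass/fail values are hypothetical.

```python
def overall_compliance(bmp_ratings):
    """Overall farm rating computed as the percentage of individual
    BMPs rated compliant. The detailed ratings should be retained
    alongside this summary, not replaced by it."""
    if not bmp_ratings:
        raise ValueError("no BMP ratings supplied")
    passed = sum(1 for compliant in bmp_ratings.values() if compliant)
    return 100.0 * passed / len(bmp_ratings)

# Hypothetical farm with four BMPs rated on a binary basis.
farm = {"conservation tillage": True, "filter strip": True,
        "waste storage structure": False, "nutrient management": True}
```

Calling `overall_compliance(farm)` here would report 75 percent compliance while the dictionary itself records which BMP fell short.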
4.3.4 Consistency Issues
Consistency among evaluators and between
evaluations is important, and because of the
potential for subjectivity to play a role in
expert evaluations, consistency should be
thoroughly addressed in the quality assurance
and quality control (QA/QC) aspects of
planning and conducting an implementation
survey. Consistency arises as a QA/QC
concern in the planning phase of an
implementation survey in the choice of
evaluators, the selection of the size of
evaluation teams, and in evaluator training. It
arises as a QA/QC concern while conducting
an implementation survey in whether
evaluations are conducted by individuals or
teams, how MM and BMP implementation on
individual fields or farms is documented, how
evaluation team discussions of issues are
conducted, how problems are resolved, and
how individual MMs and BMPs or whole
farms are rated.
Consistency is likely to be best if only one to
two evaluators conduct the expert evaluations
and the same individuals conduct all of the
evaluations. If, for statistical purposes, many
farms (e.g., 100 or more) need to be evaluated,
use of only one to two evaluators might also
be the most efficient approach. In this case,
having a team of evaluators revisit a
subsample of the farms that were originally
evaluated by one to two individuals might be
useful for quality control purposes.
If teams of evaluators conduct the evaluations,
consistency can be achieved by keeping the
membership of the teams constant.
Differences of opinion, which are likely to
arise among team members, can be settled
through discussions held during evaluations,
and the experience of team members who have
done past evaluations can help guide
decisions. Pre-evaluation training sessions,
such as the mock evaluations discussed above,
will help ensure that the first few expert
evaluations are not "learning" experiences to
such an extent that those farms must be
revisited to ensure that they receive the same
level of scrutiny as farms evaluated later.
If different farms are visited by different teams
of evaluators or if individual evaluators are
assigned to different farms, it is especially
important that consistency be established
before the evaluations are conducted. For best
results, discussions among evaluators should
be held periodically during the evaluations to
discuss any potential problems. For instance,
evaluators could visit some farms together at
the beginning of the evaluations to promote
consistency in ratings, followed by expert
evaluations conducted by individual
evaluators. Then, after a few farm or MM
evaluations, evaluators could gather again to
discuss results and to share any knowledge
gained to ensure continued consistency.
As mentioned above, consistency can be
established during mock evaluations held
before the actual evaluations begin. These
mock evaluations are excellent opportunities
for evaluators to discuss the meaning of terms
on rating forms, differences between rating
criteria, and differences of opinion about
proper MM or BMP implementation. A
member of the evaluation team should be able
to represent the state's position on the
definition of terms and clarify areas of
confusion.
Descriptions of MMs and BMPs should be
detailed enough to support any ratings given to
individual features and to the MM or BMP
overall. Sketching a diagram of the MM or
BMP helps identify design problems,
promotes careful evaluation of all features,
and provides a record of the MM or BMP for
future reference. A diagram is also valuable
when discussing the MM or BMP with the
farm owner or identifying features in need of
improvement or alteration. Farm owners or
managers can also use a copy of the diagram
and evaluation when discussing their
operations with state or county agriculture
personnel. Photographs of MM or BMP
features are a valuable reference material and
should be used whenever an evaluator feels
that a written description or a diagram could
be inadequate. Photographs of what constitutes
both good and poor MM or BMP
implementation are valuable for explanatory
and educational purposes; for example, for
presentations to managers and the public.
4.3.5 Postevaluation Onsite Activities
It is important to complete all pertinent tasks
as soon as possible after the completion of an
expert evaluation to avoid extra work later and
to reduce the chances of introducing error
attributable to inaccurate or incomplete
memory or confusion. All evaluation forms
for each farm should be filled out completely
before leaving the farm. Information not filled
in at the beginning of the evaluation can be
obtained from the farm owner or manager if
necessary. Any questions that evaluators had about the MMs and BMPs during the evaluation can be discussed, and notes written during the evaluation can be shared and used
to help clarify details of the evaluation process
and ratings. The opportunity to revisit the
farm will still exist if there are points that
cannot be agreed upon among evaluation team
members.
Also, while the evaluation team is still on the
farm, the farm owner or manager should be
informed about what will follow; for instance,
whether he/she will receive a copy of the
report, when to expect it, what the results
mean, and his/her responsibility in light of the
evaluation, if any. Immediately following the
evaluation is also an excellent time to discuss
the findings with the farm owner or manager if
he/she was not present during the evaluation.
4.4 SELF-EVALUATIONS
4.4.1 Methods
Self-evaluations, while often not a reliable
source of MM or BMP implementation data,
can be used to augment data collected through
expert evaluations or in place of expert
evaluations where the latter cannot be
conducted. In some cases, state agriculture
authority staff might have been involved
directly with BMP selection and
implementation and will be a source of useful
information even if an expert evaluation is not
conducted. Self-evaluations are an appropriate
survey method for obtaining background
information from farmers or persons
associated with farming operations, such as
county extension agents.
Mail, telephone, and mail with telephone
follow-up are common self-evaluation
methods. Mail and telephone surveys are
useful for collecting general information, such
as the management measures that specific
agricultural operations should be
implementing. County extension agents or
other state or local agricultural agents can be
interviewed or sent a questionnaire that
requests very specific information. Recent
advances in and increasing access to electronic
means of communication (i.e., e-mail and the
Internet) might make these viable survey
instruments in the future.
Mail surveys with a telephone follow-up
and/or farm visit are an efficient method of
collecting information. The USDA National
Agricultural Statistics Service has found that
10 to 20 percent of farm owners or managers
will respond to crop production questionnaires
that are mailed. Approximately two-thirds of
the questionnaires that are not returned are
completed by telephone and the remainder are
completed by personal visits to farms (USDA,
undated). The entire NASS survey effort,
from designing the questionnaire to reporting
the results, takes approximately 6 months.
The level of response obtained by NASS is
probably higher than would be obtained for
MM or BMP implementation monitoring
because NASS has developed a high level of
trust with farmers through years of
cooperation. In addition, NASS is prohibited
by law from releasing information on
individual farm operations, a fact of which
most farmers are aware.
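For planning purposes, the NASS figures cited above can be turned into a rough estimate of how many questionnaires would be completed by each method; the 15 percent mail response used below is an assumed midpoint of the 10 to 20 percent range.

```python
def expected_contacts(n_farms, mail_rate=0.15, phone_share=2 / 3):
    """Rough planning estimate based on the NASS experience cited in
    the text: 10-20 percent respond by mail, about two-thirds of the
    remainder are completed by telephone, and the rest by farm visit.
    The 15 percent mail rate is an assumed midpoint."""
    by_mail = round(n_farms * mail_rate)
    remaining = n_farms - by_mail
    by_phone = round(remaining * phone_share)
    by_visit = remaining - by_phone
    return by_mail, by_phone, by_visit
```

For a 1,000-farm survey this sketch anticipates roughly 150 mail responses, 567 telephone completions, and 283 farm visits.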
To ensure comparability of results,
information that is collected as part of a self-
evaluation—whether collected through the
mail, over the phone, or during farm
visits—should be collected in a manner that
does not favor one method over the others.
Ideally, telephone follow-up and on-site
interviews should consist of no more than
reading the questions on the questionnaire,
without providing any additional explanation
or information that would not have been
available to those who responded through the
mail. This approach eliminates as much as
possible any bias associated with the different
means of collecting the information. Figure 4-2 presents an example of an animal waste management survey questionnaire modeled after a NASS crop production questionnaire.
The questionnaire design is discussed in
Section 4.4.3.
It is important that the accuracy of information
received through mail and phone surveys be
checked. Inaccurate or incomplete responses
to questions on mail and/or telephone surveys
commonly result from survey respondents
misinterpreting questions and thus providing
misleading information, not including all
relevant information in their responses, not
wanting to provide some types of information,
or deliberately providing some inaccurate
responses. Therefore, the accuracy of
information received through mail and phone
surveys should be checked by selecting a
subsample of the farmers surveyed and
conducting follow-up farm visits.
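Selecting that subsample can be as simple as a seeded random draw; the 10 percent fraction below is an assumption, and the appropriate size depends on the survey's data quality objectives.

```python
import random

def accuracy_check_sample(surveyed_farms, fraction=0.10, seed=None):
    """Select a simple random subsample of surveyed farms for
    follow-up visits to verify mail or telephone responses. The
    10 percent default fraction is an illustrative assumption."""
    k = max(1, round(len(surveyed_farms) * fraction))
    rng = random.Random(seed)  # seed for a reproducible selection
    return rng.sample(surveyed_farms, k)
```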
4.4.2 Cost
Cost can be an important consideration when
selecting an evaluation method. Farm visits
can cost several hundred dollars per farm
visited, depending on the type of farming
involved, the information to be collected, and
the number of evaluators used. Mail and/or
telephone surveys can be an inexpensive
means of collecting information, but their cost
must be balanced with the type and accuracy
of information that can be collected through
them. Other costs also need to be figured into
the overall cost of mail and/or telephone
surveys, including follow-up phone calls and
farm visits to make up for a poor response to
mailings and for accuracy checks. NASS has
found that a mail survey with a telephone
follow-up costs $6 to $10 per farm. Farm
visits can cost several hundred dollars per farm
depending on the complexity of the operation
and the desired information. Additionally, the
cost of questionnaire design must be
considered, as a well-designed questionnaire is
extremely important to the success of self-
evaluations. Questionnaire design is discussed
in the next section.
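A rough cost comparison using the figures cited above might be sketched as follows; the $8 per-farm mail/telephone cost and $300 per-visit cost are assumed midpoints ("$6 to $10" and "several hundred dollars" in the text), and the 10 percent visit fraction is an assumption.

```python
def mail_survey_cost(n_farms, mail_phone_cost=8.0, visit_cost=300.0,
                     visit_fraction=0.10):
    """Estimate the total cost of a mail survey with telephone
    follow-up, plus follow-up farm visits to a fraction of farms
    for accuracy checks. All unit costs are assumed midpoints."""
    visits = round(n_farms * visit_fraction)
    return n_farms * mail_phone_cost + visits * visit_cost
```

For 200 farms this sketch yields $1,600 for the mail/telephone effort plus $6,000 for 20 accuracy-check visits, illustrating how follow-up visits can dominate the budget.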
The number of evaluators used for farm visits
has an obvious impact on the cost of a MM or
BMP implementation survey. Survey costs
can be minimized by having one or two
evaluators visit farms instead of having
multiple-person teams visit each farm. If the
expertise of many specialists is desired, it
might be cost-effective to have multiple-
person teams check the quality of evaluations
conducted by one or two evaluators. This can
usually be done at a subsample of farms after
they have been surveyed.
An important factor to consider when
determining the number of evaluators to
include on farm visitation teams, and how to
balance the use of one to two evaluators versus
multiple-person teams, is the objectives of the
survey. Cost notwithstanding, the teams
conducting the expert evaluations must be
sufficient to meet the objectives of the survey,
and if the required teams would be too costly, then the objectives of the survey might need to be modified.

Animal Waste Management Survey

Purpose of survey: To determine conformity with the following criteria for the control of runoff from confined animal facilities:

States would put their standards here.

Limit the discharge from the confined animal facility to surface waters by:

(1) Storing both the facility wastewater and the runoff from confined animal facilities that are caused by storms up to and including a 25-year, 24-hour frequency storm. Storage structures should:
    (a) Have an earthen lining or plastic membrane lining, or
    (b) Be constructed with concrete, or
    (c) Be a storage tank.

(2) Managing stored runoff and accumulated solids from the facility through an appropriate waste utilization system.

Population of interest: Farms in the coastal zone with new or existing confined animal facilities that contain the following number of animals or more:

Animal Type         Number    Animal Units
Beef Cattle         300       300
Stables (horses)    200       400
Dairy Cattle        70        98
Layers              15,000    150(1) or 495(2)
Broilers            15,000    150(1) or 495(2)
Turkeys             13,750    247.5
Swine               200       80

(1) If facility has a liquid manure system.
(2) If facility has continuous overflow watering.

Facilities that have been required by federal regulation 40 CFR 122.23 to obtain an NPDES discharge permit are excluded.

Level of analysis: States should determine the level of analysis necessary.

Items of interest: These may vary depending on the type of facilities found within a state and the state's program for addressing this issue.

Land Use and Ownership
Total acres operated                                               nnn
Land owned                                                         nnn
Land rented                                                        nnn

Demographic Characteristics of Farm Operators
Years farming                                                      nnn
Years farming this operation                                       nnn
Years of formal education                                          nnn
Age                                                                nnn

Peak Number of Livestock
Beef cattle                                                        nnn
Horses                                                             nnn
Dairy cattle                                                       nnn
Layers (in facility with liquid manure system)                     nnn
Layers (in facility with continuous overflow watering)             nnn
Broilers (in facility with liquid manure system)                   nnn
Broilers (in facility with continuous overflow watering)           nnn
Turkeys                                                            nnn
Swine                                                              nnn

Animal Waste Management Practices
Do you have a facility for wastewater and runoff from your animal operation?  y/n
Did an engineer, extension agent, or other professional assist in the design of the facility?  y/n/na
Was the facility designed to accommodate the peak amount of waste entering it?  y/n/na
Does the facility store both the wastewater and runoff caused by a 25-year, 24-hour frequency storm?  y/n/na/unknown
Does the facility have an earthen lining or plastic membrane?  y/n/na
Is the facility constructed with concrete?  y/n/na
Is the facility a storage tank?  y/n/na
Are the stored runoff and accumulated solids used as fertilizer?  y/n/na
If yes, what type of system is used?  nnnnnnnnnnnnnn

Figure 4-2. Sample draft survey for confined animal facility management evaluation.
Another factor that contributes to the cost of a
MM or BMP implementation survey is the
number of farms to be surveyed. Once again,
a balance must be reached between cost, the
objectives of the survey, and the number of
farms to be evaluated. Generally, once the
objectives of the study have been specified,
the number of farms to be evaluated is
determined statistically to meet required data
quality objectives. If the number of farms that
is determined in this way would be too costly,
then it would be necessary to modify the study
objectives or the data quality objectives.
Statistical determination of the number of
farms to evaluate is discussed in Section 2.3.
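As a sketch of the kind of calculation described in Section 2.3, the simple-random-sampling size for estimating a proportion, with a finite population correction, can be computed as follows. The defaults (p = 0.5 as the conservative choice, 10 percent allowable error, 95 percent confidence) are illustrative assumptions, not prescribed values.

```python
import math

def farms_to_evaluate(population, p=0.5, d=0.10, z=1.96):
    """Sample size for estimating a proportion under simple random
    sampling, with a finite population correction. p is the expected
    proportion (0.5 is conservative), d the allowable error, and z
    the normal deviate for the chosen confidence level (1.96 = 95%)."""
    n0 = z ** 2 * p * (1 - p) / d ** 2       # infinite-population size
    n = n0 / (1 + (n0 - 1) / population)     # finite population correction
    return math.ceil(n)
```

Under these assumptions, a population of 500 farms would call for evaluating 81 of them; if that number is too costly, the allowable error or confidence level would have to be relaxed.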
4.4.3 Questionnaire Design
Many books have been written on the design
of data collection forms and questionnaires
(e.g., Churchill, 1983; Ferber et al., 1964; Tull
and Hawkins, 1990), and these can provide
good advice for the creation of simple
questionnaires that will be used for a single
survey. However, for complex questionnaires
or ones that will be used for initial surveys as
part of a series of surveys (i.e., trend analysis),
it is strongly advised that a professional in
questionnaire design be consulted. This is
because while it might seem that designing a
questionnaire is a simple task, small details
such as the order of questions, the selection of
one word or phrase over a similar one, and the
tone of the questions can significantly affect
survey results. A professionally-designed
questionnaire can yield information beyond
that contained in the responses to the questions
themselves, while a poorly-designed
questionnaire can invalidate the results.
The objective of a questionnaire, which should
be closely related to the objectives of the
survey, should be extremely well thought out
prior to its being designed. Questionnaires
should also be designed at the same time as the
information to be collected is selected to
ensure that the questions address the objectives
as precisely as possible. Conducting these
activities simultaneously also provides
immediate feedback on the attainability of the
objectives and the detail of information that
can be collected. For example, an investigator
might want information on the extent of
grazing in riparian areas, but might discover
while designing the questionnaire that the
desired information could not be obtained
through the use of a questionnaire, or that the
information that could be collected would be
insufficient to fully address the chosen
objectives. In such a situation the investigator
could revise the objectives and questions
before going further with questionnaire design.
Tull and Hawkins (1990) identified seven
major elements of questionnaire construction:
1. Preliminary decisions
2. Question content
3. Question wording
4. Response format
5. Question sequence
6. Physical characteristics of the
questionnaire
7. Pretest and revision.
Preliminary decisions include determining
exactly what type of information is required,
determining the target audience, and selecting
the method of communication (e.g., mail,
telephone, farm visit). These subjects are
addressed in other sections of this guidance.
The second step is to determine the content of
the questions. Each question should generate
one or more of the information requirements
identified in the preliminary decisions. The
ability of the question to elicit the necessary
data needs to be assessed. "Double-barreled"
questions, in which two or more questions are
asked as one, should be avoided. Questions
that require the respondent to aggregate
several sources of information should be
subdivided into several specific questions or
parts. The ability of the respondent to answer
accurately should also be considered when
preparing questions. Some respondents might
be unfamiliar with the type of information
requested or the terminology used. Or a
respondent might have forgotten some of the
information of interest, or might be unable to
verbalize an answer. Consideration should be
given to the willingness of respondents to
answer the questions accurately. If a
respondent feels that a particular answer might
be embarrassing or personally harmful (e.g.,
might lead to fines or increased regulation), he
or she might refuse to answer the question or
might deliberately provide inaccurate
information. For this reason, answers to
questions that might lead to such responses
should be checked for accuracy whenever
possible.
The next step is the specific phrasing of the
questions. Simple, easily understood language
is preferred. The wording should not bias the
answer or be too subjective. For instance, a
question should not ask if grazing in riparian
areas is a problem on the farm. Instead, a
series of questions could ask if cattle are kept
on the farm, if the farm has any riparian areas
(which should be defined), if any means are
provided along the riparian areas to exclude
grazing animals, and what those means are.
These questions all request factual information
of which a farmer should be knowledgeable
and they progress from simple to more
complex. All alternatives and assumptions
should be clearly stated on the questionnaire,
and the respondent's frame of reference should
be considered.
Fourth, the type of response format should be
selected. Various types of information can
best be obtained using open-ended, multiple-
choice, or dichotomous questions. An open-
ended question allows respondents to answer
in any way they feel is appropriate. Multiple-
choice questions tend to reduce some types of
bias and are easier to tabulate and analyze;
however, good multiple-choice questions can
be more difficult to formulate. Dichotomous
questions allow only two responses, such as
"yes-no" or "agree-disagree." Dichotomous
questions are suitable for determining points
of fact, but must be very precisely stated and
unequivocally solicit only a single piece of
information.
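The three response formats can be represented in a simple data structure for tabulation and validation; the questions below are hypothetical, loosely modeled on the Figure 4-2 draft survey.

```python
# One entry per question: open-ended answers are free text, while
# multiple-choice and dichotomous questions carry a fixed choice list.
QUESTIONS = [
    {"text": "What type of waste utilization system is used?",
     "format": "open-ended", "choices": None},
    {"text": "Is the storage structure (a) earthen- or membrane-lined, "
             "(b) concrete, or (c) a storage tank?",
     "format": "multiple-choice", "choices": ["a", "b", "c"]},
    {"text": "Do you have a facility for wastewater and runoff?",
     "format": "dichotomous", "choices": ["yes", "no"]},
]

def is_valid(question, answer):
    """Accept any open-ended answer; choice formats must match one
    of the listed options, which simplifies tabulation later."""
    return question["choices"] is None or answer in question["choices"]
```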
The fifth step in questionnaire design is the
ordering of the questions. The first questions
should be simple to answer, objective, and
interesting in order to relax the respondent.
The questionnaire should move from topic to
topic in a logical manner without confusing
the respondent. Early questions that could
bias the respondent should be avoided. There
is evidence that response quality declines near
the end of a long questionnaire (Tull and
Hawkins, 1990). Therefore, more important
information should be solicited early. Before
presenting the questions, the questionnaire
should explain how long (on average) it will
take to complete and the types of information
that will be solicited. The questionnaire
should not present the respondent with any
surprises.
The layout of the questionnaire should make it
easy to use and should minimize recording
mistakes. The layout should clearly show the
respondent all possible answers. For mail
surveys, a pleasant appearance is important for
securing cooperation.
The final step in the design of a questionnaire
is the pretest and possible revision. A
questionnaire should always be pretested with
members of the target audience. This will
preclude expending large amounts of effort
and then discovering that the questionnaire
produces biased or incomplete information.
4.5 AERIAL RECONNAISSANCE AND
PHOTOGRAPHY
Aerial reconnaissance and photography can be
useful tools for gathering physical farm
information quickly and comparatively
inexpensively, and they are used in
conservation for a variety of purposes. Aerial
photography has been proven to be helpful for
agricultural conservation practice
identification (Pelletier and Griffin, 1988);
rangeland monitoring (BLM, 1991); terrain
stratification, inventory site identification,
planning, and monitoring in mountainous
regions (Hetzel, 1988; Born and Van Hooser,
1988); as well as for forest regeneration assessment (Hall and Aldred, 1992) and forest
inventory and analysis (Hackett, 1988).
Factors such as the characteristics of what is
being monitored, scale, and camera format
determine how useful aerial photography can
be for a particular purpose.
Pelletier and Griffin (1988) investigated the
use of aerial photography for the identification
of agriculture conservation practices. They
found that practices that occupy a large area
and have an identifiable pattern, such as
contour cropping, strip cropping, terraces, and
windbreaks, were readily identified even at a
small scale (1:80,000) but that smaller, single-
unit practices, such as sediment basins and
sediment diversions, were difficult to identify
at a small scale. They estimated that 29
percent of practices could be identified at a
scale of 1:80,000, 45 percent could be
identified at 1:30,000, 70 percent could be
identified at 1:15,000, and over 90 percent
could be identified at a scale of 1:10,000.
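Those identification rates can be used as a simple lookup when choosing a photographic scale. The sketch below treats "over 90 percent" as 0.90 and assumes that a larger scale denominator (a smaller photo scale) means cheaper coverage, which is the usual trade-off.

```python
# Identification rates reported by Pelletier and Griffin (1988),
# keyed by the photographic scale denominator (1:denominator).
IDENTIFIABLE = {80000: 0.29, 30000: 0.45, 15000: 0.70, 10000: 0.90}

def smallest_adequate_scale(target_rate):
    """Return the largest scale denominator (smallest, and usually
    cheapest, photo scale) whose reported identification rate meets
    the target, or None if no listed scale is adequate."""
    adequate = [denom for denom, rate in
                sorted(IDENTIFIABLE.items(), reverse=True)
                if rate >= target_rate]
    return adequate[0] if adequate else None
```

For example, a program willing to identify only 45 percent of practices could fly at 1:30,000, while a 90 percent target forces 1:10,000 photography.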
Photographic scale and resolution must be
taken into consideration when deciding
whether to use aerial photography, and a
photographic scale that produces good
resolution of the items of importance to the
monitoring effort must be chosen. The Bureau
of Land Management (BLM) uses low-level,
large-scale (1:1,000 to 1:1,500) aerial
photography to monitor rangeland vegetation
(BLM, 1991). The agency reports that scales
smaller than 1:1,500 (e.g., 1:10,000, 1:30,000)
are too small to monitor the classes of land
cover (shrubs, grasses and forbs, bare soil,
rock) on rangeland. Born and Van Hooser
(1988) found that a scale of 1:58,000 was
marginal for use in forestry resource
inventorying and monitoring.
Camera format is a factor that also must be
considered. Large-format cameras are
generally preferred over small-format cameras
(e.g., 35 mm), but are more costly to purchase
and operate. The large negative size (23 cm x 23 cm, or 9 in. x 9 in.) produced using a large-format camera
provides the resolution and detail necessary
for accurate photo interpretation. Large-
format cameras can be used from higher
altitudes than small-format cameras, and the
image area covered by a large-format image at
a given scale (e.g., 1:1,500) is much larger
than the image area captured by a small-format camera at the same scale. Small-format cameras (e.g., 35 mm) can be used for
identifications that involve large-scale
features, such as mining areas, the extent of
burning, and large animals in censuses, and
they are less costly to purchase and use than
large-format cameras, but they are limited in
the altitude that the photographs can be taken
from and the resolution that they provide when
enlarged (Owens, 1988).
BLM recommends the use of a large-format camera because the images provide the photo interpreter with more geographical reference points, allow flexibility to increase sample plot size, and permit modest navigational errors during overflight (BLM, 1991). Also, most photo contractors hired to take the photographs will have large-format equipment for the purpose.
A drawback to the use of aerial photography is
that conservation practices that do not meet
implementation or operational standards are
indistinguishable in an aerial photograph from
similar practices that do meet them (Pelletier
and Griffin, 1988). Also, practices that are
defined by managerial concepts rather than
physical criteria, such as irrigation water
management or nutrient management, cannot
be detected with aerial photographs.
Regardless of scale, format, or item being
monitored, photo interpreters should receive
2-3 days of training in the fundamentals of
photo interpretation, and they should be
thoroughly familiar with the vegetation and
landforms in the areas where the photographs
that they will be interpreting were taken
(BLM, 1991). A visit to the farms
in the photographs is recommended to
improve correlation between the interpretation
and actual farm characteristics. Generally,
after a few visits and interpretations of
photographs of those farms, photo interpreters
will be familiar with the photographic
characteristics of the vegetation in the area and
the farm visits can be reserved for verification
of items in doubt. A change in type of
vegetation or physiography in photographs
normally requires new visits until photo
interpreters are familiar with the
characteristics of the new vegetation in the
photographs.
Information on obtaining aerial photographs is
available from the Farm Service Agency and
the Natural Resources Conservation Service.
Contact the Farm Service Agency at: USDA
FSA Aerial Photography Field Office, P.O.
Box 30010, Salt Lake City, UT, 84130-0010,
(801) 975-3503. The Farm Service Agency's
Internet address is http://www.fsa.usda.gov.
Contact the Natural Resources Conservation
Service at: NRCS National Cartography and
Geospatial Center, Fort Worth Federal Center,
Bldg 23, Room 60, P.O. Box 6567, Fort
Worth, TX 76115-0567; 1-800-672-5559.
NRCS's Internet address is
http://www.ncg.nrcs.usda.gov.
CHAPTER 5. PRESENTATION OF EVALUATION RESULTS
5.1 INTRODUCTION
The preceding chapters of this guidance
presented techniques for the collection of
information. Data analysis and interpretation
are addressed in detail in Chapter 4 of EPA's
Monitoring Guidance for Determining the
Effectiveness of Nonpoint Source Controls
(USEPA, 1997). This chapter provides ideas
for the presentation of results.
The presentation of MM or BMP compliance
survey results, whether written or oral, is an
integral part of a successful monitoring study.
The quality of the presentation of results is an
indication of the quality of the compliance
survey, and if the presentation fails to convey
important information from the compliance
survey to those who need the information, the
compliance survey itself might be considered a
failure.
The quality of the presentation of results is
dependent on at least four criteria—it must be
complete, accurate, clear, and concise
(Churchill, 1983). Completeness means that
the presentation provides all necessary
information to the audience in the language
that it understands; accuracy is determined by
how well an investigator handles the data,
phrases findings, and reasons; clarity is the
result of clear and logical thinking and a
precision of expression; and conciseness is the
result of selecting for inclusion only that
which is necessary.
Throughout the process of preparing the
results of a MM or BMP compliance survey
for presentation, it must be kept in mind that
the study was initially undertaken to provide
information for management
purposes—specifically, to help make a
decision (Tull and Hawkins, 1990). The
presentation of results should be built around
the decision that the compliance survey was
undertaken to support. The message of the
presentation must also be tailored to that
decision. It must be realized that there will be
a time lag between the compliance survey and
the presentation of the results, and the results
should be presented in light of their
applicability to the management decision to be
made based on them. The length of the time
lag is a key factor in determining this
applicability. If the time lag is significant, it
should be made clear during the presentation
that the situation might have changed since the
survey was conducted. If reliable trend data
are available, the person making the
presentation might be able to provide a sense
of the likely magnitude of any change in the
situation. If the change in status is thought to
be insignificant, evidence should be presented
to support this claim. For example, state that
"At the time that the compliance survey was
conducted, farmers were using BMPs with
increasing frequency, and the lack of any
changes in program implementation coupled
with continued interaction with farmers
provides no reason to believe that this trend
has changed since that time." It would be
misleading to state "The monitoring study
indicates that farmers are using BMPs with
increasing frequency." The validity and force
of the message will be enhanced further
through use of the active voice (we believe)
rather than the passive voice (it is believed).
Three major factors must be considered when
presenting the results of MM and BMP
implementation studies: identifying the target
audience, selecting the appropriate medium
(printed word, speech, pictures, etc.), and
selecting the most appropriate format to meet
the needs of the audience.
5.2 AUDIENCE IDENTIFICATION
Identification of the audience(s) to which the
results of the MM and BMP compliance
survey will be presented determines the
content and format of the presentation. For
results of compliance survey studies, there are
typically six potential audiences:
• Interested/concerned citizens
• Farm owners and managers
• Media/general public
• Policy makers
• Resource managers
• Scientists
These audiences have different information
needs, interests, and abilities to understand
complex data. It is the job of the person(s)
preparing the presentation to analyze these
factors prior to preparing a presentation. The
four criteria for presentation quality apply
regardless of the audience. Other elements of
a comprehensive presentation, such as
discussion of the objectives and limitations of
the study and necessary details of the method,
must be part of the presentation and must be
tailored to the audience. For instance, details
of the sampling plan, why the plan was chosen
over others, and the statistical methods used
for analysis might be of interest to other
investigators planning a similar study. Such
details should be recorded, even if they are
not part of any presentation of results,
because of their value for future reference
when the monitoring is repeated or similar
studies are undertaken; they are best not
included, however, in a presentation to
management.
5.3 PRESENTATION FORMAT
Regardless of whether the results of a
compliance survey are presented in writing,
orally, or both, the information being
presented must be understandable to the
presented must be understandable to the
audience. Consideration of who the audience
is will help ensure that the presentation is
particularly suited to its needs, and choice of
the correct format for the presentation will
ensure that the information is conveyed in a
manner that is easy to comprehend.
Most reports will have to be presented in both
written and oral form. Written reports are
valuable for peer review, public information
dissemination, and for future reference. Oral
presentations are often necessary for
managers, who usually do not have time to
read an entire report, only have need for the
results of the study, and are usually not
interested in the finer details of the study.
Different versions of a report might well have
to be written—for the public, scientists, and
managers (i.e., an executive summary)—and
separate oral presentations for different
audiences—the public, farmers, managers, and
scientists at a conference—might have to be
prepared.
Most information can be presented most
effectively in the form of tables, charts, and
diagrams (Tull and Hawkins, 1990). These
graphic forms of data presentation help
simplify the message, making it easier for an
audience to comprehend than an exhaustive
explanation in words. Words are important for pointing
out significant ideas or findings, and for
interpreting the results where appropriate.
Words should not be used to repeat what is
already adequately explained in graphics, and
slides or transparencies that are composed
largely of words should contain only a few
essential ideas each. Presentation of too much
written information on a single slide or
transparency only confuses the audience.
Written slides or transparencies should also be
free of graphics, such as clever logos or
background highlights—unless the pictures are
essential to understanding the information
presented—since they only make the slides or
transparencies more difficult to read.
Examples of graphics and written slides are
presented in Figures 5-1 through 5-3.
Different types of graphics have different uses
as well. Information presented in a tabular
format can be difficult to interpret because the
reader has to spend some time with the
information to extract the essential points from
it. The same information presented in a pie
chart or bar graph can convey essential
information immediately and avoid the
inclusion of background data that are not
essential to the point. When preparing
information for a report, an investigator should
organize the information in various ways and
choose that which conveys only the
information essential for the audience in the
least complicated manner.
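The advice above, to arrange the same information in several forms and pick the clearest, can be illustrated with a small sketch. The BMP implementation figures and function names below are invented for illustration, not taken from any survey in this guidance.

```python
# Hypothetical BMP implementation rates (percent of surveyed farms);
# illustrative data only.
data = {
    "Conservation tillage": 62,
    "Nutrient management": 48,
    "Contour farming": 35,
    "Filter strips": 21,
}

def as_table(rows):
    """Render the data as a plain-text table."""
    width = max(len(name) for name in rows)
    return "\n".join(f"{name:<{width}}  {pct:>3d}%" for name, pct in rows.items())

def as_bar_chart(rows, scale=10):
    """Render the same data as a horizontal bar chart (one '#' per `scale` percent)."""
    width = max(len(name) for name in rows)
    return "\n".join(
        f"{name:<{width}}  {'#' * round(pct / scale)} {pct}%"
        for name, pct in rows.items()
    )

print(as_table(data))
print()
print(as_bar_chart(data))
```

The bar-chart form conveys the ranking of practices at a glance, while the table form is better suited to readers who need the exact percentages.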
5.3.1 Written Presentations
The following criteria should be considered
when preparing written material:
• Reading level or level of education of the
target audience.
• Level of detail necessary to make the
results understandable to the target
audience. Different audiences require
various levels of background information
to fully understand the study's results.
• Layout. The integration of text, graphics,
color, white space, columns, sidebars, and
other design elements is important in the
production of material that the target
audience will find readable and visually
appealing.
• Graphics. Photos, drawings, charts, tables,
maps, and other graphic elements can be
used to effectively present information that
the reader might otherwise not understand.
5.3.2 Oral Presentations
An effective oral presentation requires special
preparation. Tull and Hawkins (1990)
recommend three steps:
1. Analyze the audience, as explained above;
2. Prepare an outline of the presentation, and
preferably a written script;
3. Rehearse it. Several dry runs of the
presentation should be made, and if
possible it should be taped on a VCR and
the presentation analyzed.

Five Leading Sources of Water Quality Impairment in Various Types of Water Bodies

RANK  RIVERS                LAKES                 ESTUARIES
1     Agriculture           Agriculture           Urban Runoff
2     STPs                  STPs                  STPs
3     Habitat Modification  Urban Runoff          Agriculture
4     Urban Runoff          Other NPS             Industry Point Sources
5     Resource Extraction   Habitat Modification  Petroleum Activities

Figure 5-1. Example of presentation of information in a written slide. (Source: USEPA, 1995)
These steps are extremely important if an oral
presentation is to be effective. Remember that
oral presentations of ½ to 1 hour are often all
that is available for the presentation of the
results of months of research to managers who
are poised to make decisions based on the
presentation. Adequate preparation is essential
if the oral presentation is to accomplish its
purpose.
5.4 FOR FURTHER INFORMATION
The provision of specific examples of
effective and ineffective presentation graphics,
writing styles, and organizations is beyond the
scope of this document. A number of
resources that contain suggestions for how
study results should be presented are available,
however, and should be consulted. A listing
of some references is provided below.
• The New York Public Library Writer's
Guide to Style and Usage (NYPL, 1994)
has information on design, layout, and
presentation in addition to guidance on
grammar and style.
• Good Style: Writing for Science and
Technology (Kirkman, 1992) provides
techniques for presenting technical material in
a coherent, readable style.

Figure 5-2. Example of representation of data using a combination of a pie chart and a
horizontal bar chart: percent of river miles impacted by agriculture in general, by source
(nonirrigated crop production, irrigated crop production, rangeland, feedlots, pastureland,
and animal holding areas). (Source: USEPA, 1995)
• The Modern Researcher (Barzun and
Graff, 1992) explains how to turn research
into readable, well organized writing.
Figure 5-3. Example of representation of data in the form of a pie chart: leading sources
of pollution, by relative quantity of lake acres affected by source (e.g., municipal point
sources).
• Writing with Precision: How to Write So
That You Cannot Possibly Be
Misunderstood, 6th ed. (Bates, 1993)
addresses communication problems of the
1990s.
• Designer's Guide to Creating Charts &
Diagrams (Holmes, 1991) gives tips for
combining graphics with statistical
information.
• The Elements of Graph Design (Kosslyn,
1993) shows how to create effective
displays of quantitative data.
REFERENCES
Academic Press. 1992. Dictionary of Science and Technology. Academic Press, Inc., San
Diego, California.
Barzun, J., and H.F. Graff. 1992. The Modern Researcher. 5th ed. Houghton Mifflin.
Bates, J. 1993. Writing with Precision: How to Write So That You Cannot Possibly Be
Misunderstood. 6th ed. Acropolis.
Blalock, H.M., Jr. 1979. Social Statistics. Rev. 2nd ed. McGraw-Hill Book Company, New
York, NY.
BLM. 1991. Inventory and Monitoring Coordination: Guidelines for the Use of Aerial
Photography in Monitoring. Technical Report TR 1734-1. Department of the Interior, Bureau
of Land Management.
Born, J.D., and D.D. Van Hooser. 1988. Intermountain Research Station remote sensing use
for resource inventory, planning, and monitoring. In Remote Sensing for Resource Inventory,
Planning, and Monitoring. Proceedings of the Second Forest Service Remote Sensing
Applications Conference, Slidell, Louisiana, and NSTL, Mississippi, April 11-15, 1988.
Casley, D.J., and D.A. Lury. 1982. Monitoring and Evaluation of Agriculture and Rural
Development Projects. The Johns Hopkins University Press, Baltimore, MD.
Churchill, G. A., Jr. 1983. Marketing Research: Methodological Foundations, 3rd ed. The
Dryden Press, New York, New York.
Cochran, W.G. 1977. Sampling techniques. 3rd ed. John Wiley and Sons, New York, New
York.
Cross-Smiecinski, A., and L.D. Stetzenback. 1994. Quality planning for the life science
researcher: Meeting quality assurance requirements. CRC Press, Boca Raton, Florida.
CTIC. 1994. 1994 National Crop Residue Management Survey. Conservation Technology
Information Center, West Lafayette, IN.
CTIC. 1995. Conservation IMPACT, vol. 13, no. 4, April 1995. Conservation Technology
Information Center, West Lafayette, IN.
Ferber, R., D.F. Blankertz, and S. Hollander. 1964. Marketing Research. The Ronald Press
Company, New York, NY.
Freund, J.E. 1973. Modern elementary statistics. Prentice-Hall, Englewood Cliffs, New Jersey.
Gaugush, R.F. 1987. Sampling Design for Reservoir Water Quality Investigations. Instruction
Report E-87-1. Department of the Army, US Army Corps of Engineers, Washington, DC.
Gilbert, R.O. 1987. Statistical Methods for Environmental Pollution Monitoring. Van Nostrand
Reinhold, New York, NY.
Hackett, R.L. 1988. Remote sensing at the North Central Forest Experiment Station. In Remote
Sensing for Resource Inventory, Planning, and Monitoring. Proceedings of the Second Forest
Service Remote Sensing Applications Conference, Slidell, Louisiana, and NSTL, Mississippi,
April 11-15, 1988.
Hall, R.J., and A.H. Aldred. 1992. Forest regeneration appraisal with large-scale aerial
photographs. The Forestry Chronicle 68(1): 142-150.
Helsel, D.R., and R.M. Hirsch. 1995. Statistical Methods in Water Resources. Elsevier,
Amsterdam.
Hetzel, G.E. 1988. Remote sensing applications and monitoring in the Rocky Mountain region.
In Remote Sensing for Resource Inventory, Planning, and Monitoring. Proceedings of the
Second Forest Service Remote Sensing Applications Conference, Slidell, Louisiana, and NSTL,
Mississippi, April 11-15, 1988.
Holmes, N. 1991. Designer's Guide to Creating Charts & Diagrams. Watson-Guptill.
Hook, D., W. McKee, T. Williams, B. Baker, L. Lundquist, R. Martin, and J. Mills. 1991. A
Survey of Voluntary Compliance of Forestry BMPs. South Carolina Forestry Commission,
Columbia, SC.
IDDHW. 1993. Forest Practices Water Quality Audit 1992. Idaho Department of Health and
Welfare, Division of Environmental Quality, Boise, ID.
Kirkman, J. 1992. Good Style: Writing for Science and Technology. Chapman and Hall.
Kosslyn, S.M. 1993. The Elements of Graph Design. W.H. Freeman.
Kupper, L.L., and K.B. Hafner. 1989. How appropriate are popular sample size formulas? Am.
Stat. 43:101-105.
MacDonald, L.H., A.W. Smart, and R.C. Wissmar. 1991. Monitoring Guidelines to Evaluate
the Effects of Forestry Activities on Streams in the Pacific Northwest and Alaska. EPA/910/9-91-
001. U.S. Environmental Protection Agency Region 10, Seattle, WA.
Mann, H.B., and D.R. Whitney. 1947. On a test of whether one of two random variables is
stochastically larger than the other. Annals of Mathematical Statistics 18:50-60.
McNew, R.W. 1990. Sampling and Estimating Compliance with BMPs. In Workshop on
Implementation Monitoring of Forestry Best Management Practices, Southern Group of State
Foresters, USDA Forest Service, Southern Region, Atlanta, GA, January 23-25, 1990, pp. 86-
105.
Meals, D.W. 1988. Laplatte River Watershed Water Quality Monitoring & Analysis Program.
Program Report No. 10. Vermont Water Resources Research Center, School of Natural
Resources, University of Vermont, Burlington, VT.
NYPL. 1994. The New York Public Library Writer's Guide to Style and Usage. A Stonesong
Press book. HarperCollins Publishers, New York, NY.
Owens, T. 1988. Using 35mm photographs in resource inventories. In Remote Sensing for
Resource Inventory, Planning, and Monitoring. Proceedings of the Second Forest Service
Remote Sensing Applications Conference, Slidell, Louisiana, and NSTL, Mississippi, April 11-
15, 1988.
Pelletier, R.E., and R.H. Griffin. 1988. An evaluation of photographic scale in aerial
photography for identification of conservation practices. J. Soil Water Conserv. 43(4):333-337.
Rashin, E., C. Clishe, and A. Loch. 1994. Effectiveness of forest road and timber harvest best
management practices with respect to sediment-related water quality impacts. Interim Report
No. 2. Washington State Department of Ecology, Environmental Investigations and Laboratory
Services program, Watershed Assessments Section. Ecology Publication No. 94-67. Olympia,
Washington.
Remington, R.D., and M.A. Schork. 1970. Statistics with applications to the biological and
health sciences. Prentice-Hall, Englewood Cliffs, New Jersey.
Rossman, R., and M.J. Phillips. 1991. Minnesota forestry best management practices
implementation monitoring. 1991 forestry field audit. Minnesota Department of Natural
Resources, Division of Forestry.
Schultz, B. 1992. Montana Forestry Best Management Practices Implementation Monitoring.
The 1992 Forestry BMP Audits Final Report. Montana Department of State Lands, Forestry
Division, Missoula, MT.
Snedecor, G.W. and W.G. Cochran. 1980. Statistical methods. 7th ed. The Iowa State
University Press, Ames, Iowa.
Tull, D.S., and D.I. Hawkins. 1990. Marketing Research. Measurement and Method. Fifth
edition. Macmillan Publishing Company, New York, New York.
USDA. 1994a. 1992 National Resources Inventory. U.S. Department of Agriculture, Natural
Resource Conservation Service, Resources Inventory and Geographical Information Systems
Division, Washington, DC.
USDA. 1994b. Agricultural Resources and Environmental Indicators. Agricultural Handbook
No. 705. U.S. Department of Agriculture, Economic Research Service, Natural Resources and
Environmental Division, Herndon, VA.
USDA. Undated. Preparing Statistics for Agriculture. U.S. Department of Agriculture,
National Agricultural Statistics Service, Washington, DC.
USDOC. 1994. 1992 Census of Agriculture. U.S. Department of Commerce, Bureau of the
Census. U.S. Government Printing Office, Washington, DC.
USEPA. 1993a. Guidance Specifying Management Measures for Sources of Nonpoint
Pollution in Coastal Waters. EPA 840-B-92-002. U.S. Environmental Protection Agency,
Office of Water, Washington, DC.
USEPA. 1993b. Evaluation of the Experimental Rural Clean Water Program. EPA 841-R-93-
005. U.S. Environmental Protection Agency, Office of Water, Washington, DC.
USEPA. 1995. National water quality inventory 1994 Report to Congress. EPA 841-R-95-005.
U.S. Environmental Protection Agency, Office of Water, Washington, DC.
USEPA. 1997. Monitoring Guidance for Determining the Effectiveness of Nonpoint Source
Controls. EPA 841-B-96-004. U.S. Environmental Protection Agency, Office of Water,
Washington, DC. August.
USGS. 1990. Land Use and Land Cover Digital Data from 1:250,000- and 1:100,000-Scale
Maps: Data Users Guide. National Mapping Program Technical Instructions Data Users Guide
4. U.S. Department of the Interior, U.S. Geological Survey, Reston, VA.
Wilcoxon, F. 1945. Individual comparisons by ranking methods. Biometrics Bulletin 1:80-83.
Winer, B.J. 1971. Statistical principles in experimental design. McGraw-Hill Book Company,
New York.
GLOSSARY
accuracy: the extent to which a measurement approaches the true value of the measured
quantity
aerial photography: the practice of taking photographs from an airplane, helicopter, or other
aviation device while it is airborne
allocation, Neyman: stratified sampling in which sample sizes are allocated in proportion to
both the size and the variability of each stratum, with sampling costs assumed equal across strata
allocation, proportional: stratified sampling in which sample sizes are allocated in proportion
to the size of each stratum, with variability and sampling cost assumed equal across strata
allowable error: the level of error acceptable for the purposes of a study
ANOVA: see analysis of variance
analysis of variance: a statistical test used to determine whether two or more sample means
could have been obtained from populations with the same parametric mean
assumptions: characteristics of a population or a sampling method taken to be true without proof
bar graph: a representation of data wherein data is grouped and represented as vertical or
horizontal bars over an axis
best professional judgement: an informed opinion made by a professional in the appropriate
field of study or expertise
best management practice: a practice or combination of practices that are determined to be the
most effective and practicable means of controlling point and nonpoint pollutants at levels
compatible with environmental quality goals
bias: a characteristic of samples such that when taken from a population with a known
parameter, their average does not give the parametric value
binomial: an algebraic expression that is the sum or difference of two terms
camera format: the size of the negative produced by a camera; 35 mm is a small camera
format
chi-square distribution: a scaled quantity whose distribution provides the distribution of the
sample variance
coefficient of variation: a statistical measure used to compare the relative amounts of variation
in populations having different means
confidence interval: a range of values about a measured value in which the true value is
presumed to lie
conservation tillage: a method of conservation in which plant material is left on the ground after
harvest to control erosion
consistency: conforming to a regular method or style; an approach that keeps all factors of
measurement similar from one measurement to the next to the extent possible
contour farming: a farming method in which fields are tilled along the topographic contours of
the land
cumulative effects: the total influences attributable to numerous individual influences
degrees of freedom: the number of residuals (the difference between a measured value and the
sample average) required to completely determine the others
design, balanced: a sampling design wherein separate sets of data to be used are similar in
quantity and type
distribution: the allocation or spread of values of a given parameter among its possible values
e-mail: an electronic system for correspondence
erosion potential: a measure of the ease with which soil can be carried away in storm water
runoff or irrigation runoff
error: the fluctuation that occurs from one repetition to another; also experimental error
estimate, baseline: an estimate of baseline, or actual conditions
estimate, pooled: a single estimate obtained from grouping individual estimates and using the
latter to obtain a single value
finite population correction term: a correction term used when the sample size is a substantial
fraction of the population size
Friedman test: a nonparametric test that can be used for analysis when two variables are
involved
hydrologic modification: the alteration of the natural circulation or distribution of water by the
placement of structures or other activities
hypothesis, alternative: the hypothesis which is contrary to the null hypothesis
hypothesis, null: the hypothesis or conclusion assumed to be true prior to any analysis
Internet: an electronic data transmission system
Kruskal-Wallis test: a nonparametric test recommended for the general case with a samples and
n_i variates per sample
management measure: an economically achievable measure for the control of the addition of
pollutants from existing and new categories and classes of nonpoint sources of pollution, which
reflect the greatest degree of pollutant reduction achievable through the application of the best
available nonpoint pollution control practices, technologies, processes, siting criteria, operating
methods, or other alternatives
Mann-Whitney test: a nonparametric test for use when a test is only between two samples
mean, estimated: a value of population mean arrived at through sampling
mean, overall: the measured average of a population
mean, stratum: the measured average within a sample subgroup or stratum
measurement bias: a consistent under- or overestimation of the true value of something being
measured, often due to the method of measurement
measurement error: the deviation of a measurement from the true value of that which is being
measured
median: the value of the middle term when data are arranged in order of size; a measure of
central tendency
monitoring, baseline: monitoring conducted to establish initial knowledge about the actual state
of a population
monitoring, compliance: monitoring conducted to determine if those who must implement
programs, best management practices, or management measures, or who must conduct
operations according to standards or specifications are doing so
monitoring, project: monitoring conducted to determine the impact of a project, activity, or
program
monitoring, validation: monitoring conducted to determine how well a model accurately reflects
reality
navigational error: errors in determining the actual location (altitude or latitude/longitude) of an
airplane or other aviation device due to instrumentation or the operator
nominal: referred to by name; variables that cannot be measured but must be expressed
qualitatively
nonparametric method: distribution-free method; any of various inferential procedures whose
conclusions do not rely on assumptions about the distribution of the population of interest
normal approximation: an assumption that a population has the characteristics of a normally-
distributed population
normal deviate: deviation from the mean expressed in units of the standard deviation (σ)
nutrient management plan: a plan for managing the quantity of nutrients applied to crops to
achieve maximum plant nutrition and minimum nutrient waste
ordinal: ordered such that the position of an element in a series is specified
parametric method: any statistical method whose conclusions rely on assumptions about the
distribution of the population of interest
physiography: a description of the surface features of the Earth; a description of landforms
pie chart: a representation of data wherein data is grouped and represented as more or less
triangular sections of a circle and the total is the entire circle
population, sample: the members of a population that are actually sampled or measured
population, target: the population about which inferences are made; the group of interest, from
which samples are taken
population unit: an individual member of a target population that can be measured
independently of other members
power: the probability of correctly rejecting the null hypothesis when the alternative hypothesis
is true.
precision: a measure of the similarity of individual measurements of the same population
question, dichotomous: a question that allows for only two responses, such as "yes" and "no"
question, double-barreled: two questions asked as a single question
question, multiple-choice: a question with two or more predetermined responses
question, open-ended: a question format that requires a response beyond "yes" or "no"
remote sensing: methods of obtaining data from a location distant from the object being
measured, such as from an airplane or satellite
resolution: the sharpness of a photograph
sample size: the number of population units measured
sampling, cluster: sampling in which small groups of population units are selected for sampling
and each unit in each selected group is measured
sampling, simple random: sampling in which each unit of the target population has an equal
chance of being selected
sampling, stratified random: sampling in which the target population is divided into separate
subgroups, each of which is more internally similar than the overall population is, prior to
sample selection
sampling, systematic: sampling in which population units are chosen in accordance with a
predetermined sample selection system
sampling error: error attributable to actual variability in population units not accounted for by
the sampling method
scale (aerial photography): the proportion of the image size of an object (such as a land area) to
its actual size, e.g., 1:3000. The smaller the second number, the larger the scale
scale system: a system for ranking measurements or members of a population on a scale, such as
1 to 5
significance level: the probability of committing a Type I error, often expressed as a percentage
standard deviation: a measure of spread; the positive square root of the variance
standard error: an estimate of the standard deviation of means that would be expected if a
collection of means based on equal-sized samples of n items from the same population were
obtained
statistical inference: conclusions drawn about a population using statistics
statistics, descriptive: measurements of population characteristics designed to summarize
important features of a data set
stratification: the process of dividing a population into internally similar subgroups
stratum: one of the subgroups created prior to sampling in stratified random sampling
streamside management area: a designated area that consists of a waterbody (e.g., stream) and
an adjacent area of varying width where management practices that might affect water quality,
fish, or other aquatic resources are modified to protect the waterbody and its adjacent resources
and to reduce the pollution effect of an activity on the waterbody
Student's t test: a statistical test used to test for significant differences between means when only
two samples are involved
subjectivity: a characteristic of analysis that requires personal judgement on the part of the
person doing the analysis
target audience: the population that a monitoring effort is intended to measure
tillage: the operation of implements through the soil to prepare seedbeds and rootbeds, control
weeds and brush, aerate the soil, and cause faster breakdown of organic matter and minerals to
release plant foods
total maximum daily load: a total allowable addition of pollutants from all affecting sources to
an individual waterbody over a 24-hour period
transformation, data: manipulation of data such that it will meet the assumptions required for
analysis
Tukey's test: a test to ascertain whether the interaction found in a given set of data can be
explained in terms of multiplicative main effects
unit sampling cost: the cost attributable to sampling a single population unit
variance: a measure of the spread of data around the mean
-------
watershed assessment: an investigation of numerous characteristics of a watershed in order to
describe its actual condition
Wilcoxon's test: a nonparametric test for comparing two samples
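Several of the terms defined above (pooled standard deviation, standard error, Student's t test) come together in the two-sample comparison described in Chapter 3. As an illustrative sketch only, not part of the original guidance, the t statistic for two independent samples can be computed with nothing beyond the Python standard library (the BMP scores below are hypothetical):

```python
import math
from statistics import mean, variance

def pooled_t_statistic(x, y):
    """Two-sample Student's t statistic using a pooled standard deviation.

    Illustrative sketch of the test defined in the glossary; assumes the
    two samples have roughly equal variances.
    """
    nx, ny = len(x), len(y)
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    # Standard error of the difference between the two sample means
    se = math.sqrt(sp2 * (1 / nx + 1 / ny))
    t = (mean(x) - mean(y)) / se
    df = nx + ny - 2
    return t, df

# Hypothetical BMP implementation scores from two independent samples
before = [62, 55, 70, 58, 64, 60]
after = [71, 68, 75, 66, 73, 70]
t, df = pooled_t_statistic(after, before)
# Compare |t| against the critical value from Table A2 for df degrees of freedom
```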
-------
INDEX
accuracy 2-10, 4-14
allocation
Neyman 2-25, 2-27
proportional 2-25
analysis of variance 3-4
rank-transformed 3-4
best professional judgement 2-2
bias, see error
BMP
pass/fail rating system 4-9
scale rating system 4-9
BMP implementation assessments
site-specific 1-2
watershed 1-2
camera format 4-20
Census of Agriculture 2-13, 2-15, 2-16
Clean Water Act
Section 303(d) 1-2
Section 319(h) 1-2
Coastal Nonpoint Pollution Control
Program 1-1
Coastal Zone Act Reauthorization
Amendments of 1990 1-1
Section 6217(b) 1-2
Section 6217(d) 1-2
Section 6217(g) 1-2
complaint records 2-16
Computer-aided Management Practices
System 2-17
Conservation Technology Information
Center 2-28
consistency 4-8, 4-12
Cooperative Extension Service 2-16
cost of evaluations 4-17
County Transect Survey 2-28
County X example 2-22, 2-24, 2-25
data
accessibility 1-5,1-6
electronic storage 1-6
historical 2-18
life cycle 1-5
longevity 1-5
management 1-5
reliability 1-5
transformation 4-10
Economic Research Service, USDA 2-14
error 2-8
due to nonrespondents 2-10
measurement 2-8
reducing 2-10
sampling 2-10
Type I 2-12
Type II 2-12
estimate
point 2-11
pooled 3-3
estimation 2-11
evaluations
expert 4-1, 4-7
information obtainable from 4-1
mock 4-8
self 4-1, 4-13
site 4-7
teams 4-7
training for 4-8
variable selection 4-4
variables 4-2
farm numbers, USDA 2-16
Farm Service Agency 2-17, 4-21
Aerial Photography Field Office 4-21
Field Office Computing System 2-17
finite population correction term 2-18
Friedman test 3-4
hypothesis
alternative 2-12
null 2-12
hypothesis testing 2-12
implementation rating 4-9
interviews, personal 4-1
Kruskal-Wallis test 3-4
Land Maps, county 2-16
Land Use and Land Cover, USGS 2-16
Management measures 1-2
Mann-Whitney test 3-2
monitoring 1-3
and CNPCPs 1-3
baseline 1-4
compliance 1-4
effectiveness 1-4
implementation 1-3, 2-1
objectives 1-4, 2-1
project 1-4
trend 1-3
uses 1-4
validation 1-4
Monitoring Guidance for Determining the
Effectiveness of Nonpoint
Source Control Measures 1-4,
1-5, 2-1, 5-1
National Agriculture Statistics Service 2-16,
4-14
National Crop Residue Management Survey
2-28
National Oceanic and Atmospheric
Administration 1-1
National Resources Inventory 2-15
Natural Resources Conservation Service 4-21
nonpoint source pollution, sources 1-1
photographs 4-13
aerial 2-18
photography
aerial 4-1, 4-20
resolution 4-20
scale 4-20
population
assumptions about 2-8
sample, definition 2-2
target, definition 2-2
units, definition 2-2
variation 2-8
precision 2-10, 2-19
presentations 5-1
and time lag 5-1
audience 5-2
criteria 5-1
format 5-2
graphics 5-3
major factors 5-2
oral 5-2, 5-3
resources 5-4
written 5-2, 5-3
quality assurance and quality control 1-4,
4-12
quality assurance project plan 1-4
questionnaires
content 4-18
design 4-17
dichotomous 4-19
elements 4-18
layout 4-19
multiple-choice 4-19
objective 4-18
open-ended 4-19
ordering of questions 4-19
phrasing 4-19
pretest 4-19
response format 4-19
rating systems
binary 4-9
consistency 4-10
overall rating 4-11
scale 4-9
terms 4-10
sample size, estimation 2-18
sampling
cluster 2-5, 2-27
per unit cost 2-25
probabilistic 2-2
simple random 2-3, 2-20
strategy 2-13
stratified random 2-3, 2-24
systematic 2-8, 2-27
timing 2-13
scale, appropriate 1-3
standard deviation, pooled 3-2
statistical inference 2-2
statistics
confidence interval 2-11
descriptive 2-11
difference quantity 3-2
overall mean 2-25
parametric 2-19
relative error 2-20
significance level 2-12
software 3-1
stratum mean 2-25
Student's t test 2-21, 3-2
two-sample 3-2
Surveys
accuracy of information 4-14
mail 4-1, 4-14
telephone 4-1, 4-14
tests
one-sided, hypotheses 3-1
two-sided, hypotheses 3-1
Tukey's test 3-4
U.S. Environmental Protection Agency 1-1
Wilcoxon's test 3-3
[Note: Italicized page numbers indicate
location of definitions of terms.]
-------
APPENDIX A
Statistical Tables
-------
Appendix A
Table A1. Cumulative areas under the normal distribution (values of p corresponding to Zp)

Zp     0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
0.0  0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1  0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2  0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3  0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4  0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5  0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6  0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7  0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8  0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9  0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0  0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1  0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2  0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3  0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4  0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5  0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6  0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7  0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8  0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9  0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0  0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1  0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2  0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3  0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4  0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5  0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6  0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7  0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8  0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9  0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
3.0  0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
3.1  0.9990 0.9991 0.9991 0.9991 0.9992 0.9992 0.9992 0.9992 0.9993 0.9993
3.2  0.9993 0.9993 0.9994 0.9994 0.9994 0.9994 0.9994 0.9995 0.9995 0.9995
3.3  0.9995 0.9995 0.9995 0.9996 0.9996 0.9996 0.9996 0.9996 0.9996 0.9997
3.4  0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9998
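The entries of Table A1 are values of the standard normal cumulative distribution function, so when software is available they need not be looked up at all. As a sketch only, not part of the original guidance, each cell can be reproduced from the error function in the Python standard library:

```python
import math

def normal_cdf(z):
    """Cumulative area under the standard normal curve to the left of z.

    Reproduces the entries of Table A1 when rounded to four decimals,
    e.g. normal_cdf(1.96) rounds to 0.9750.
    """
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Rebuild one row of Table A1 (Zp = 1.9, columns 0.00 through 0.09)
row = [round(normal_cdf(1.9 + c / 100), 4) for c in range(10)]
```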
-------
Table A2. Percentiles of the t(α, df) distribution (values of t such that 100(1-α)% of the distribution is less than t)

df     α=0.40  α=0.30  α=0.20  α=0.10  α=0.05  α=0.025  α=0.010  α=0.005
1      0.3249  0.7265  1.3764  3.0777  6.3137  12.7062  31.8210  63.6559
2      0.2887  0.6172  1.0607  1.8856  2.9200   4.3027   6.9645   9.9250
3      0.2767  0.5844  0.9785  1.6377  2.3534   3.1824   4.5407   5.8408
4      0.2707  0.5686  0.9410  1.5332  2.1318   2.7765   3.7469   4.6041
5      0.2672  0.5594  0.9195  1.4759  2.0150   2.5706   3.3649   4.0321
6      0.2648  0.5534  0.9057  1.4398  1.9432   2.4469   3.1427   3.7074
7      0.2632  0.5491  0.8960  1.4149  1.8946   2.3646   2.9979   3.4995
8      0.2619  0.5459  0.8889  1.3968  1.8595   2.3060   2.8965   3.3554
9      0.2610  0.5435  0.8834  1.3830  1.8331   2.2622   2.8214   3.2498
10     0.2602  0.5415  0.8791  1.3722  1.8125   2.2281   2.7638   3.1693
11     0.2596  0.5399  0.8755  1.3634  1.7959   2.2010   2.7181   3.1058
12     0.2590  0.5386  0.8726  1.3562  1.7823   2.1788   2.6810   3.0545
13     0.2586  0.5375  0.8702  1.3502  1.7709   2.1604   2.6503   3.0123
14     0.2582  0.5366  0.8681  1.3450  1.7613   2.1448   2.6245   2.9768
15     0.2579  0.5357  0.8662  1.3406  1.7531   2.1315   2.6025   2.9467
16     0.2576  0.5350  0.8647  1.3368  1.7459   2.1199   2.5835   2.9208
17     0.2573  0.5344  0.8633  1.3334  1.7396   2.1098   2.5669   2.8982
18     0.2571  0.5338  0.8620  1.3304  1.7341   2.1009   2.5524   2.8784
19     0.2569  0.5333  0.8610  1.3277  1.7291   2.0930   2.5395   2.8609
20     0.2567  0.5329  0.8600  1.3253  1.7247   2.0860   2.5280   2.8453
21     0.2566  0.5325  0.8591  1.3232  1.7207   2.0796   2.5176   2.8314
22     0.2564  0.5321  0.8583  1.3212  1.7171   2.0739   2.5083   2.8188
23     0.2563  0.5317  0.8575  1.3195  1.7139   2.0687   2.4999   2.8073
24     0.2562  0.5314  0.8569  1.3178  1.7109   2.0639   2.4922   2.7970
25     0.2561  0.5312  0.8562  1.3163  1.7081   2.0595   2.4851   2.7874
26     0.2560  0.5309  0.8557  1.3150  1.7056   2.0555   2.4786   2.7787
27     0.2559  0.5306  0.8551  1.3137  1.7033   2.0518   2.4727   2.7707
28     0.2558  0.5304  0.8546  1.3125  1.7011   2.0484   2.4671   2.7633
29     0.2557  0.5302  0.8542  1.3114  1.6991   2.0452   2.4620   2.7564
30     0.2556  0.5300  0.8538  1.3104  1.6973   2.0423   2.4573   2.7500
35     0.2553  0.5292  0.8520  1.3062  1.6896   2.0301   2.4377   2.7238
40     0.2550  0.5286  0.8507  1.3031  1.6839   2.0211   2.4233   2.7045
50     0.2547  0.5278  0.8489  1.2987  1.6759   2.0086   2.4033   2.6778
60     0.2545  0.5272  0.8477  1.2958  1.6706   2.0003   2.3901   2.6603
80     0.2542  0.5265  0.8461  1.2922  1.6641   1.9901   2.3739   2.6387
100    0.2540  0.5261  0.8452  1.2901  1.6602   1.9840   2.3642   2.6259
150    0.2538  0.5255  0.8440  1.2872  1.6551   1.9759   2.3515   2.6090
200    0.2537  0.5252  0.8434  1.2858  1.6525   1.9719   2.3451   2.6006
inf.   0.2533  0.5244  0.8416  1.2816  1.6449   1.9600   2.3264   2.5758
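Chapter 2 uses critical values like those in Table A2 to size samples. As a minimal sketch, not part of the original guidance, the standard simple-random-sampling formula with a finite population correction can be coded directly (the farm counts and error bounds below are hypothetical):

```python
import math

def srs_sample_size(t_crit, s, d, N):
    """Sample size for estimating a mean from a simple random sample.

    t_crit : critical value from Table A2 (e.g., 1.9600 for infinite df
             at alpha = 0.025, i.e., a two-sided 95% confidence level)
    s      : estimated standard deviation of the variable of interest
    d      : allowable error (half-width of the desired confidence interval)
    N      : number of units in the target population
    """
    n0 = (t_crit * s / d) ** 2   # required size ignoring population finiteness
    n = n0 / (1 + n0 / N)        # finite population correction
    return math.ceil(n)

# Hypothetical example: 1,000 farms, s = 0.5, allowable error of 0.1
n = srs_sample_size(1.9600, 0.5, 0.1, 1000)   # -> 88
```

Because the correction term shrinks the required sample as the population gets smaller, the corrected size is always at most the uncorrected one.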
-------
Table A3. Upper and lower percentiles of the Chi-square distribution (values with cumulative probability p to the left; the area to the right is 1-p)

df    p=0.001 p=0.005 p=0.010 p=0.025 p=0.050 p=0.100 p=0.900 p=0.950 p=0.975 p=0.990 p=0.995 p=0.999
1       0.000   0.000   0.000   0.001   0.004   0.016   2.706   3.841   5.024   6.635   7.879  10.827
2       0.002   0.010   0.020   0.051   0.103   0.211   4.605   5.991   7.378   9.210  10.597  13.815
3       0.024   0.072   0.115   0.216   0.352   0.584   6.251   7.815   9.348  11.345  12.838  16.266
4       0.091   0.207   0.297   0.484   0.711   1.064   7.779   9.488  11.143  13.277  14.860  18.466
5       0.210   0.412   0.554   0.831   1.145   1.610   9.236  11.070  12.832  15.086  16.750  20.515
6       0.381   0.676   0.872   1.237   1.635   2.204  10.645  12.592  14.449  16.812  18.548  22.457
7       0.599   0.989   1.239   1.690   2.167   2.833  12.017  14.067  16.013  18.475  20.278  24.321
8       0.857   1.344   1.647   2.180   2.733   3.490  13.362  15.507  17.535  20.090  21.955  26.124
9       1.152   1.735   2.088   2.700   3.325   4.168  14.684  16.919  19.023  21.666  23.589  27.877
10      1.479   2.156   2.558   3.247   3.940   4.865  15.987  18.307  20.483  23.209  25.188  29.588
11      1.834   2.603   3.053   3.816   4.575   5.578  17.275  19.675  21.920  24.725  26.757  31.264
12      2.214   3.074   3.571   4.404   5.226   6.304  18.549  21.026  23.337  26.217  28.300  32.909
13      2.617   3.565   4.107   5.009   5.892   7.041  19.812  22.362  24.736  27.688  29.819  34.527
14      3.041   4.075   4.660   5.629   6.571   7.790  21.064  23.685  26.119  29.141  31.319  36.124
15      3.483   4.601   5.229   6.262   7.261   8.547  22.307  24.996  27.488  30.578  32.801  37.698
16      3.942   5.142   5.812   6.908   7.962   9.312  23.542  26.296  28.845  32.000  34.267  39.252
17      4.416   5.697   6.408   7.564   8.672  10.085  24.769  27.587  30.191  33.409  35.718  40.791
18      4.905   6.265   7.015   8.231   9.390  10.865  25.989  28.869  31.526  34.805  37.156  42.312
19      5.407   6.844   7.633   8.907  10.117  11.651  27.204  30.144  32.852  36.191  38.582  43.819
20      5.921   7.434   8.260   9.591  10.851  12.443  28.412  31.410  34.170  37.566  39.997  45.314
21      6.447   8.034   8.897  10.283  11.591  13.240  29.615  32.671  35.479  38.932  41.401  46.796
22      6.983   8.643   9.542  10.982  12.338  14.041  30.813  33.924  36.781  40.289  42.796  48.268
23      7.529   9.260  10.196  11.689  13.091  14.848  32.007  35.172  38.076  41.638  44.181  49.728
24      8.085   9.886  10.856  12.401  13.848  15.659  33.196  36.415  39.364  42.980  45.558  51.179
25      8.649  10.520  11.524  13.120  14.611  16.473  34.382  37.652  40.646  44.314  46.928  52.619
26      9.222  11.160  12.198  13.844  15.379  17.292  35.563  38.885  41.923  45.642  48.290  54.051
27      9.803  11.808  12.878  14.573  16.151  18.114  36.741  40.113  43.195  46.963  49.645  55.475
28     10.391  12.461  13.565  15.308  16.928  18.939  37.916  41.337  44.461  48.278  50.994  56.892
29     10.986  13.121  14.256  16.047  17.708  19.768  39.087  42.557  45.722  49.588  52.335  58.301
30     11.588  13.787  14.953  16.791  18.493  20.599  40.256  43.773  46.979  50.892  53.672  59.702
35     14.688  17.192  18.509  20.569  22.465  24.797  46.059  49.802  53.203  57.342  60.275  66.619
40     17.917  20.707  22.164  24.433  26.509  29.051  51.805  55.758  59.342  63.691  66.766  73.403
50     24.674  27.991  29.707  32.357  34.764  37.689  63.167  67.505  71.420  76.154  79.490  86.660
60     31.738  35.534  37.485  40.482  43.188  46.459  74.397  79.082  83.298  88.379  91.952  99.608
70     39.036  43.275  45.442  48.758  51.739  55.329  85.527  90.531  95.023  100.43  104.21  112.32
80     46.520  51.172  53.540  57.153  60.391  64.278  96.578  101.88  106.63  112.33  116.32  124.84
90     54.156  59.196  61.754  65.647  69.126  73.291  107.57  113.15  118.14  124.12  128.30  137.21
100    61.918  67.328  70.065  74.222  77.929  82.358  118.50  124.34  129.56  135.81  140.17  149.45
200    143.84  152.24  156.43  162.73  168.28  174.84  226.02  233.99  241.06  249.45  255.26  267.54
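The chi-square percentiles in Table A3 are what the categorical comparisons of Section 3.5 are tested against. As an illustrative sketch with hypothetical counts, not part of the original guidance, the statistic for a contingency table of observed counts can be computed directly:

```python
def chi_square_statistic(observed):
    """Chi-square statistic for a contingency table of observed counts.

    Degrees of freedom are (rows - 1) * (cols - 1); compare the statistic
    against the appropriate percentile in Table A3 (e.g., 3.841 for df = 1
    at the 0.950 level).
    """
    row_totals = [sum(r) for r in observed]
    col_totals = [sum(c) for c in zip(*observed)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count under the hypothesis of independence
            exp = row_totals[i] * col_totals[j] / total
            chi2 += (obs - exp) ** 2 / exp
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df

# Hypothetical 2x2 table: BMP implemented (yes/no) by farm type (A/B)
chi2, df = chi_square_statistic([[30, 10], [20, 40]])
# chi2 is about 16.67 with df = 1, well above the 3.841 critical value,
# so the implementation proportions for the two farm types differ
```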
-------