Guidance for Geospatial Data
Quality Assurance Project Plans

EPA QA/G-5G

Quality Staff
Office of Environmental Information
United States Environmental Protection Agency

Washington, DC 20460
PEER REVIEW DRAFT

February 28, 2002


-------
FOREWORD

The U.S. Environmental Protection Agency (EPA) has developed the Quality Assurance
(QA) Project Plan as a tool for project managers and planners to document the type and quality
of data and information needed for making environmental decisions. This document, Guidance
for Geospatial Data Quality Assurance Project Plans (EPA QA/G-5G), contains advice and
recommendations for developing a QA Project Plan for projects involving geospatial data,
including both newly collected data and data acquired from other sources.

This document was designed for internal use and provides guidance to EPA program
managers and planning teams. It does not impose legally binding requirements and may not
apply to a particular situation based on the circumstances. EPA retains the discretion to adopt
approaches on a case-by-case basis that differ from this guidance where appropriate. EPA may
periodically revise this guidance without public notice.

This document is one of the U.S. EPA Quality System Series documents. These
documents describe the EPA policies and procedures for planning, implementing, and assessing
the effectiveness of the Quality System. As required by EPA Order 5360 A1 (EPA, 2000a),
this document is valid for a period of up to five years from the official date of publication. After
five years, this document will be reissued without change, revised, or withdrawn from the U.S.
EPA Quality System Series. Questions regarding this document or other Quality System Series
documents should be directed to the Quality Staff at:

U.S. EPA
Quality Staff (2811R)
1200 Pennsylvania Ave., NW
Washington, DC 20460
Phone: (202) 564-6830
Fax: (202) 565-2441
E-mail: quality@epa.gov

Copies of EPA Quality System Series documents may be obtained from the Quality Staff
directly or by downloading them from its home page:

www.epa.gov/quality

EPA QA/G-5G

i

Peer Review Draft
February 2002


-------
ACKNOWLEDGMENTS

This document reflects efforts to adapt the QA Project Plan elements (EPA, 2001b) to
projects involving geospatial data collection and use. The contributions of the Geospatial Quality
Council members to the first draft of this document, and subsequent input from selected
geospatial data users and the Quality Staff to this peer review draft, are greatly appreciated.



-------
TABLE OF CONTENTS

Page

CHAPTER 1 INTRODUCTION	1

1.1	What is the Purpose of this Document?	1

1.2	Why is Planning for Geospatial Projects Important?	2

1.3	What is EPA's Quality System? 	3

1.4	What Questions will this Guidance Help to Address?	7

1.5	Who can Benefit from this Document?	7

1.6	The Graded Approach to QA Project Plans	8

1.7	How Does this Guidance Relate to Existing EPA Practices Using
Geospatial Data? 	10

1.8	How Is this Document Organized? 	11

CHAPTER 2 OVERVIEW TO CREATING A QA PROJECT PLAN	13

2.1	Introduction 	13

2.2	Related QA Project Plan Guidance and Documentation	14

2.3	QA Project Plan Responsibilities 	16

2.4	Secondary Use of Data	17

2.5	Revisions to QA Project Plans	17

2.6	Overview of the Components of a QA Project Plan 	18

CHAPTER 3 GEOSPATIAL DATA QA PROJECT PLAN GROUPS AND ELEMENTS	21

3.1	Introduction 	21

3.1.1	A1. Title and Approval Sheet 	21

3.1.2	A2. Table of Contents 	22

3.1.3	A3. Distribution List 	22

3.1.4	A4. Project/Task Organization 	24

3.1.5	A5. Problem Definition/Background	25

3.1.6	A6. Project/Task Description	25

3.1.7	A7. Quality Objectives and Criteria	26

3.1.8	A8. Special Training/Certification	28

3.1.9	A9. Documents and Records	28

3.2	Group B: Data Generation and Acquisition	29

3.2.1	B1.	Sampling Process Design	30

3.2.2	B2.	Sampling and Image Acquisition Methods	33

3.2.3	B3.	Sample Handling and Custody	34

3.2.4	B4.	Analytical Methods	35

3.2.5	B5.	Quality Control	36

3.2.6	B6. Instrument/Equipment Testing, Inspection, and Maintenance	37

3.2.7	B7. Instrument/Equipment Calibration and Frequency	38

3.2.8	B8. Inspection/Acceptance Requirements for Supplies and Consumables	39

3.2.9	B9. Data Acquisition Requirements (Nondirect Measurements)	40

3.2.10	B10. Data Management	43

3.3	Group C: Assessment/Oversight	53

3.3.1	C1. Assessments and Response Actions	54

3.3.2	C2. Reports to Management 	58

3.4	Group D: Data Validation and Usability 	59

3.4.1	D1. Data Review, Verification, and Validation	60

3.4.2	D2. Verification and Validation Methods	62

3.4.3	D3. Reconciliation with User Requirements	63

CHAPTER 4 GRADED APPROACH EXAMPLES	65

4.1	Minimum Documentation Example: Creating a Cartographic
Product from a Spreadsheet Containing Facility Latitude/Longitude
Coordinates 	65

4.1.1	Group A:	Project Management 	65

4.1.2	Group B:	Measurement/Data Acquisition	67

4.1.3	Group C:	Assessment/Oversight	68

4.1.4	Group D:	Data Validation and Usability	68

4.2	Medium Documentation Example: Routine Global Positioning Survey

Task to Produce a GIS Data Set	69

4.2.1	Group A: Project Management and Systematic Planning

to Define the Task 	69

4.2.2	Group B: Data Collection 	70

4.2.3	Group C: Assessment and Oversight	72

4.2.4	Group D: Data Validation and Usability	73

4.3	Complex Documentation Example: Developing Complex Data sets in

a GIS for Use in Risk Assessment Models 	73

4.3.1	Group A: Project Management 	74

4.3.2	Group B: Measurement/Data Acquisition	76

4.3.3	Group C: Assessment/Oversight	77

4.3.4	Group D: Data Validation and Usability	78

APPENDIX A: BIBLIOGRAPHY	A-1

APPENDIX B: GLOSSARY	B-1

APPENDIX C: PRINCIPAL DATA QUALITY INDICATORS FOR GEOSPATIAL DATA	C-1



-------
LIST OF FIGURES

Figure	Page

1.	The EPA Quality System Approach to Addressing Geospatial Data Applications	4

2.	Steps of the Systematic Planning Process 	6

3.	An Example Table of Contents and Distribution List 	23

4.	An Example Organizational Chart	24

5.	GIS Flow Diagram	44

LIST OF TABLES

Table	Page

1.	EPA QA Policy and Requirements Documents	5

2.	Types of Documents Published as Part of the EPA Quality System 	5

3.	Questions that this Guidance Will Help to Address	7

4.	Continuum of Geospatial Projects with Differing Intended Uses 	9

5.	EPA QA Guidance Documents 	15

6.	Summary of QA Groups and Elements 	19

7.	Typical Activities and Documentation Prepared Within the System Development
Life Cycle of a Geospatial Data Project to Be Considered When Establishing
the QA Program for the Hardware/Software Configuration	50

LIST OF ACRONYMS

DQO	Data quality objectives

EPA	U.S. Environmental Protection Agency

FGDC	Federal Geographic Data Committee

GIS	Geographic information system

GPS	Global positioning system

QA	Quality assurance

QC	Quality control

RMSE	Root mean square error

SSURGO	Soil Survey Geographic (data produced by the U.S. Natural Resources Conservation Service)

TIGER	Topologically Integrated Geographic Encoding and Referencing



-------
CHAPTER 1

INTRODUCTION

Quality Assurance Project Plan: "A document describing in comprehensive detail the
necessary QA, Quality Control (QC), and other technical activities that must be implemented
to ensure that the results of the work performed will satisfy the stated performance criteria"
[EPA Requirements for Quality Assurance Project Plans (QA/R-5) (EPA, 2001b, glossary)].

1.1 What is the Purpose of this Document?

The EPA Quality System defined in EPA Order 5360.1 A2, Policy and Program
Requirements for the Mandatory Agency-wide Quality System (EPA, 2000d), includes coverage
of environmental data or "any measurement or information that describe environmental
processes, location, or conditions; ecological or health effects and consequences; or the
performance of environmental technology. For EPA, environmental data includes information
collected directly from measurements, produced from models, and compiled from other sources
such as databases or literature." The EPA Quality System is based on an American National
Standard, ANSI/ASQC E4-1994.

Consistent with the national standard, E4-1994, Section 6.a.(7) of EPA Order 5360.1 A2
states that EPA organizations will develop a Quality System that includes "approved Quality
Assurance (QA) Project Plans, or equivalent documents defined by the Quality Management
Plan, for all applicable projects and tasks involving environmental data with review and approval
having been made by the EPA QA Manager (or authorized representative defined in the Quality
Management Plan)." More information on EPA's policies for QA Project Plans is provided in
Chapter 5 of the EPA Manual 5360 A1, EPA Quality Manual for Environmental Programs
(EPA, 2000a) and Requirements for Quality Assurance Project Plans (QA/R-5) (EPA, 2001b).
This guidance helps to implement the policies defined in Order 5360.1 A2. It is intended to help
geospatial professionals who are unfamiliar with the requirements of QA Project Plans develop a
document that meets EPA standards.

This guidance document describes the type of information that would be included in a
QA Project Plan for a geospatial data project. Using this guidance, anyone from a geographic
information system (GIS) technician at an EPA extramural supplier (e.g., contractor, university,
or other organization) to an EPA Project Manager, Work Assignment Manager, or other EPA
staff member, will know what information is needed in a QA Project Plan for projects involving
geospatial data.

After reviewing this guidance document, the reader will have a clearer understanding of
how to comply with these policies for geospatial projects. Not all elements of a QA Project Plan
[as described in EPA's Guidance for Quality Assurance Project Plans (QA/G-5) (EPA, 1998a)]
are applicable to all geospatial projects. Therefore, this guidance is provided to assist in the
development of a QA Project Plan that is appropriate for the project. The elements, as described
in the general EPA guidance on QA Project Plans (EPA, 1998a), are written with a focus on
environmental data collection. This guidance helps the reader interpret those requirements for a
geospatial project.

This document is just one of many documents that support EPA's Quality System.
Quality Management Plans and other EPA Quality System documents are not discussed in detail
in this guidance, but are also relevant and applicable to the use of geospatial data for or by EPA.
Several other related documents may also serve as useful references during the course of a
project, especially when other types of environmental data are acquired or used. This geospatial
guidance supplements the Guidance for Quality Assurance Project Plans (QA/G-5) (EPA,
1998a). Table 1 in Section 1.3 of this document lists other documents that provide additional
information about quality requirements at EPA.

1.2 Why is Planning for Geospatial Projects Important?

Planning is important in geospatial projects because it allows the project team to identify
potential problems that may be encountered on a project and develop ways to work around or
solve those problems before they become critical to timelines, budgets, or final product quality.
Many examples exist of how a lack of planning impacts quality in geospatial projects. Lack of
planning and detailed knowledge about data needs can cost a project a great deal of time and
effort.

Example: Importance of Planning

Consider the case in which planning was not conducted on a project that required existing
geospatial soils data. The project team needed a good quality source of soils data in a
geospatial format. They decided to use the Soil Survey Geographic (SSURGO) data
produced by the U.S. Natural Resources Conservation Service. SSURGO provides highly
detailed soils data and had the content they required. They began their project by
downloading SSURGO data for a single pilot project area and developed a series of
applications programs over several weeks to correctly analyze and process the SSURGO
data. When they had completed the pilot successfully, they began downloading the
SSURGO data for the remainder of the study areas throughout the country. Only then did
they discover that SSURGO data were only available in certain parts of the country and did
not cover two-thirds of their project sites. The project team had to choose a different soils
database and reengineer their entire project to make use of this different geospatial data set.
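
The SSURGO example points to a simple check that planning could have surfaced: confirm that a candidate data set covers every project site before building tools around it. The sketch below is illustrative only. The site identifiers, the area names, and the function name `coverage_report` are hypothetical and do not come from SSURGO or from any EPA tool.

```python
def coverage_report(site_areas, available_areas):
    """Summarize how many project sites fall in areas the data set covers.

    site_areas: dict mapping a site ID to the survey area it lies in
    available_areas: set of survey areas for which data are published
    """
    covered = sorted(s for s, area in site_areas.items() if area in available_areas)
    missing = sorted(s for s, area in site_areas.items() if area not in available_areas)
    fraction = len(covered) / len(site_areas) if site_areas else 0.0
    return {"covered": covered, "missing": missing, "fraction_covered": fraction}


# Hypothetical project: three sites, only one in an area with published data.
sites = {"site-1": "area-A", "site-2": "area-B", "site-3": "area-C"}
report = coverage_report(sites, available_areas={"area-A"})
print(report["missing"])
```

Running a check of this kind against the full list of study areas during planning, rather than against a single pilot area, would flag the coverage gap before any processing programs are written.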



-------
A good QA Project Plan is valuable to a geospatial project in the following ways:

It can be used to guide project personnel through the development process, helping
ensure that choices are consistent with the established objectives and requirements for
the project.

Because the document fully describes the plans for the project, it will lead to a project
with more transparency, better communication among the project team members, and
better results for the decision maker.

Using a QA Project Plan reduces the risk of schedule and budget overruns.

If the QA Project Plan is properly followed, the project will lead to a more defensible
outcome than a project without proper planning documentation.

It will document the criteria and assumptions in one place for easy review and
referral by anyone interested in the process.

It uses a consistent format, making it easy for others to review the procedures and
ensuring that individual steps are not overlooked in the planning phase.

In addition to these benefits, a project with a well-defined QA Project Plan often takes
less time and effort to complete than a project without a planning document. Projects without
planning documents are more likely to require additional cost and time to correct or redo
collection, analysis, or processing of environmental data. The savings resulting from good
planning typically outweigh the time and effort spent to develop the QA Project Plan. Poor
quality planning often results in poor decisions. The costs of decision-making mistakes can be
enormous and far outweigh the costs of proper planning for quality.

What are the characteristics of a scientifically sound geospatial data project plan? A
scientifically sound, quality-based geospatial QA Project Plan

provides documentation of the outcome of the systematic planning process

is developed using a process designed to minimize errors

documents the standard operating procedures that will be followed

documents the data sources, format, and status of the existing data to be used in the
project [including topological status, accuracy, completeness, and other required
Federal Geographic Data Committee (FGDC) metadata]

is frequently updated as new information becomes available or as changes in
methodology are requested

provides for the documentation of any changes from the original plan.

1.3 What is EPA's Quality System?

EPA has developed comprehensive requirements and procedures to include QC and QA
in the planning stage of every project involving the use of environmental data. The EPA Quality
System is described in EPA Order 5360.1 A2 (EPA, 2000d), which contains policy and program
requirements for the mandatory, Agency-wide quality system. Emphasis is placed on planning
for quality in projects before they begin, rather than performing quality assurance and
quality control planning while a project is under way or after it has been completed.

Figure 1 illustrates
the role of a QA Project Plan
for geospatial data projects
within the context of the
EPA Quality System. This
guidance document describes
all essential quality assurance
information needed for a
geospatial project. The
figure shows the flow of data
through data collection; data
processing and analysis; and
data validation, review, and
assessment.

The EPA Quality
System is a management
system that provides the
elements necessary to plan,
implement, document, and assess the effectiveness of QA and QC activities applied to
environmental programs conducted by or for EPA. The EPA Quality System encompasses the
collection, evaluation, and use of environmental data by or for EPA and the design, construction,
and operation of environmental technology by or for EPA. EPA's Quality System has been built
to ensure that environmental programs are supported by the type, quality, and quantity of data
needed for their intended use. The EPA Quality System integrates policy and procedures,
organizational responsibilities, and individual accountability. Table 1 lists the documents that
constitute the EPA Quality System.

Many of these EPA documents were developed and designed specifically to address
environmental samples such as soil, surface water, or groundwater and the subsequent chemical
analyses. However, the Quality System also provides guidance for planning for quality when
using many other types and sources of data, including geospatial data.

Table 2 provides an overview of the types of documents available from EPA that
describe different components of the Quality System. In particular, internal policy documents
form the basis for the quality system. Guidance documents provide instructions and advice to
meet the requirements in different types of EPA-sponsored projects and tasks. Web-based
access to these documents is available at http://www.epa.gov/quality.

Figure 1. The EPA Quality System Approach to Addressing Geospatial Data Applications



-------
Table 1. EPA QA Policy and Requirements Documents

Title/Number	Type	Description

Policy and Program Requirements for the Mandatory Agency-wide Quality System (Order 5360.1 A2), May 2000 (EPA, 2000d)	Internal Policy	Quality requirements for EPA organizations that produce environmental data

EPA Quality Manual for Environmental Programs (Order 5360 A1), May 2000 (EPA, 2000a)	Internal Policy	Specifications for satisfying the mandatory Quality System defined in EPA Order 5360.1

EPA Requirements for Quality Management Plans (QA/R-2) (EPA, 2001c)	Requirement	Requirements for Quality Management Plans for organizations that receive funding from EPA

EPA Requirements for Quality Assurance Project Plans (QA/R-5) (EPA, 2001b)	Requirement	Requirements for QA Project Plans prepared for activities conducted or funded by EPA

ANSI/ASQC E4-1994, Specifications and Guidelines for Quality Systems for Environmental Data Collection and Environmental Technology Programs (ANSI/ASQC, 1995)	National Standard	Basic guidelines by which a quality program for environmental data collection and environmental technology can be planned, implemented, and assessed

Table 2. Types of Documents Published as Part of the EPA Quality System

Document Type	Contents/Purpose

Policy/EPA Orders	EPA policies and minimum requirements for the Agency-wide Quality System

Requirements Documents (those beginning with an "R") (e.g., EPA Requirements for Quality Management Plans, QA/R-2)	Specific requirements necessary to fulfill policies

Guidance Documents (those beginning with a "G") (e.g., Guidance for Quality Assurance Project Plans, QA/G-5)	Documents developed to help EPA and non-EPA organizations meet requirements. These documents are developed with specific types of environmental data or procedures in mind.



-------
How does systematic planning relate to a QA Project Plan? Systematic planning
identifies the expected outcome of the project; its technical goals, cost, and schedule; and the
criteria for determining whether the inputs and outputs of the various intermediate stages of the
project, as well as the project's final product, are acceptable. The goal is to ensure that the
project will produce the right type, quality, and quantity of data to meet the user's needs. EPA
Order 5360.1 A2 (EPA, 2000d) requires projects for EPA environmental programs to use a
systematic planning process to develop acceptance or performance criteria when collecting,
evaluating, or using environmental data.

The systematic planning process
can be applied to any type of data-
generating project. The seven basic steps
of the systematic planning process are
illustrated in Figure 2. The first three
steps can be considered preliminary
aspects of scoping and defining the
geospatial data collection or processing
effort, while the last four steps relate
closely to the establishment of
performance criteria or acceptance criteria
that will help ensure the quality of the
project's outputs and conclusions.

Performance and acceptance criteria are
measures of data quality established for
specific data quality indicators and used
to assess the sufficiency of collected
information. Performance criteria apply
to information that is collected for the
project. These criteria apply to new data.

Acceptance criteria apply to the adequacy
of existing information proposed for
inclusion in the project. These criteria
apply to data drawn from existing
sources. Generally, performance criteria
are used when data quality is under the
project's control, while acceptance
criteria focus on whether data generated
outside the project are acceptable for their
intended use on the project (e.g., as input
to GIS processing software).
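
The distinction between the two kinds of criteria can be sketched in a few lines. This is a minimal illustration under stated assumptions, not an EPA procedure: the `DataSet` class, the `horizontal_error_m` indicator, and the numeric limits are all hypothetical, and a real project would set its own indicators and limits during systematic planning.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str
    collected_by_project: bool  # True for new data, False for existing data
    horizontal_error_m: float   # hypothetical data quality indicator


def criteria_type(data_set):
    """New data are judged by performance criteria, existing data by acceptance criteria."""
    return "performance" if data_set.collected_by_project else "acceptance"


def meets_criterion(data_set, limits_m):
    """Compare the indicator with the limit set for the applicable criteria type."""
    return data_set.horizontal_error_m <= limits_m[criteria_type(data_set)]


# Illustrative limits only; a real project sets these during systematic planning.
limits = {"performance": 5.0, "acceptance": 15.0}
gps_points = DataSet("project GPS survey", True, 3.2)
state_roads = DataSet("state road layer", False, 20.0)
print(criteria_type(gps_points), meets_criterion(gps_points, limits))
print(criteria_type(state_roads), meets_criterion(state_roads, limits))
```

Here the project's own GPS survey is checked against a performance limit, while the externally produced road layer is checked against an acceptance limit and fails it.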

Figure 2. Steps of the Systematic Planning Process

Systematic planning is based on a common-sense, graded approach. This means that the
extent of systematic planning and the approach to be taken match the general importance of the
project and the intended use of the data. For example, when geospatial data processing is used to
help generate data either for decision making (i.e., hypothesis testing) or for determining
compliance with a standard, EPA recommends that the systematic planning process take the
form of the Data Quality Objectives (DQO) Process that is explained in detail within Guidance
for the Data Quality Objectives Process (QA/G-4) (EPA, 2000c).

1.4 What Questions will this Guidance Help to Address?

For quick reference to the information in this document, Table 3 provides a summary of
the main questions addressed, indicating the chapter and sections containing this information.

Table 3. Questions that this Guidance Will Help to Address

Questions	Relevant Sections

How should the results of the planning phase for a geospatial data project be documented in a QA Project Plan?	3.1.7, 3.2.9

What quality assurance documentation is needed?	3.1.9

How do I document the acceptable level of uncertainty?	3.1.7, 3.2.9

What are some of the important metrics of quality for evaluating geospatial data (e.g., sensitivity analysis for GIS) and how can this information be used?	Appendix C

How do I conduct and document the data evaluation process?	3.3, 3.4

How do I assess the quality of geospatial data obtained from other sources (i.e., secondary use of existing data)?	3.2.9

What is needed to plan for data management (the process) and hardware/software configuration?	3.2.10

How do I document changes from the planned process described in the QA Project Plan?	Chapter 2

1.5 Who can Benefit from this Document?

Anyone developing geospatial projects or using geospatial data for EPA will benefit from
this document. This document will help in the creation of a QA Project Plan that specifically
addresses the issues and concerns related to the quality of geospatial data, processing, and
analysis. This document will help anyone who is

creating geospatial data from maps, aerial photos, or other sources

generating or acquiring the aerial photos

using existing data sources in their geospatial projects

generating new geospatial data from Global Positioning System (GPS) receivers

developing complex analysis programs that manipulate geospatial data

overseeing applications programming or software development projects (to
understand how planning is related to developing software programs that use
geospatial data)

reviewing QA Project Plans for geospatial data (to understand the steps and details
behind the planning)

serving as a QA Officer for a group that creates or uses geospatial data.

1.6 The Graded Approach to QA Project Plans

The "graded" approach to developing QA Project Plans means that QA Project Plan
development is commensurate with the scope, magnitude, or importance of the project itself.
This means that for geospatial projects that are narrow in scope, that will not result in decisions
that have far-reaching impacts, or that are not complex, a simple QA Project Plan would be
adequate. For complex, broad-scope projects that might lead to regulatory decisions, a more
comprehensive and detailed QA Project Plan may be required. Major factors in determining the
level of detail needed in the QA Project Plan include the importance of the data, the cost, and the
organizational complexity of the project.

The Graded Approach: The scope and complexity of the project drive the scope
and complexity of the QA Project Plan.

Geospatial projects usually have a critical
software development component as well as the
locational data component. The quality issues
surrounding software development are also to be
taken into account [see Information Resources
Management Policy Manual (Directive 2100) for
more information].

Complex Projects: Many complex geospatial projects require the development of
sophisticated applications or software programs. EPA Directive 2100
(http://www.epa.gov/irm jpolman), The Information Resources Management
Policy Manual (Chapter 17, System Life Cycle Management), categorizes software
development projects based on size and complexity.

Two aspects of a geospatial project are important for defining the level of QA effort
required: intended use of the project output and the project scope and magnitude. The intended
use of the geospatial data determines the potential consequences or impacts that might occur
because of quality problems. Table 4 shows
examples of project data uses frequently encountered in geospatial projects and the
corresponding QA issues to address. It is important to attempt to determine the use of the
geospatial data or analysis product in the decision-making process to ensure that the data
produced are of sufficient accuracy and are of the appropriate type and content to support the
decision for which they were created or gathered. Table 4 lists the example projects in
decreasing order of the rigor of quality assurance. Final word on the level and degree of rigor
for the acceptable level of quality assurance of a specific project lies with the QA Officer.



-------
Table 4. Continuum of Geospatial Projects with Differing Intended Uses

Purpose of Project	Typical Quality Assurance Issues	Level of QA

Regulatory compliance; Litigation; Congressional testimony	Legal defensibility of data sources; Compliance with laws and regulatory mandates applicable to data gathering; Legal defensibility of methodology	Highest

Regulatory development; Spatial data development (Agency infrastructure development)	Compliance with regulatory guidelines; Existing data obtained under suitable QA program; Audits and data reviews	Decreasing

Trends monitoring (nonregulatory); Reporting guidelines (e.g., Clean Water Act); "Proof of principle"	Use of accepted data-gathering methods; Use of accepted models/analysis techniques; Use of standardized geospatial data models; Compliance with reporting guidelines	Decreasing

Screening analyses; Hypothesis testing; Data display	QA planning and documentation as appropriate; Use of accepted data sources; Peer review of products	Lowest

As shown in Table 4, projects with a high potential for being involved in litigation (either
causing new litigation or being evaluated in ongoing litigation) will generally require a higher
level of effort and quality standards in a corresponding QA Project Plan. More modest levels of
defensibility and rigor are required for data used for technology assessment or "proof of
principle," where no litigation or regulatory action is expected. Still lower levels of
defensibility would be needed for basic exploratory research requiring extremely fast turn-
around or high flexibility and adaptability. In such cases, work may have to be replicated under
tighter controls or the results carefully reviewed prior to publication. By analyzing the end-use
needs, appropriate QA criteria can be established to guide the program or project.
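
One way to make this end-use analysis repeatable is a simple lookup built from Table 4. The sketch below is only a starting point for discussion. The tier numbers (1 for the most rigorous row of Table 4, 4 for the least) and the name `starting_tier` are assumptions introduced here, and the final decision on the level of QA remains with the QA Officer.

```python
# Tiers follow the row order of Table 4: 1 is the most rigorous, 4 the least.
TIER_BY_PURPOSE = {
    "regulatory compliance": 1,
    "litigation": 1,
    "congressional testimony": 1,
    "regulatory development": 2,
    "spatial data development": 2,
    "trends monitoring": 3,
    "reporting guidelines": 3,
    "proof of principle": 3,
    "screening analyses": 4,
    "hypothesis testing": 4,
    "data display": 4,
}


def starting_tier(purposes):
    """Return the most rigorous tier among a project's intended uses."""
    tiers = [TIER_BY_PURPOSE[p.strip().lower()] for p in purposes]
    if not tiers:
        raise ValueError("at least one intended use is needed")
    return min(tiers)


print(starting_tier(["Data display", "Litigation"]))  # prints 1
```

Taking the most rigorous tier reflects the point made above: a product that might end up in litigation needs the higher level of defensibility even if it is also used for simple display.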

Other aspects of the QA effort can be established by considering the scope and
magnitude of the project. The scope of the geospatial project determines the complexity of the
QA Project Plan; more complex applications require more QA effort. The magnitude of the
project determines the resources at risk if quality problems lead to rework and delays. Data
processing projects with nationwide scope that will produce new Agency-wide data sources (for
example, development of the National Hydrography Dataset) would call for sophisticated quality
assurance and quality control procedures and extensive QA planning and implementation (and
documentation to support evaluation in the secondary use of existing data). Other projects may
involve simply acquiring existing digital, geospatial data to create a map in support of manage-
ment meetings or internal communications. Projects with different scopes are likely to require
different levels of QA planning. The level of detail for any particular project is decided by the
project's EPA QA Officer. In the case of extramural research, the project's QA Officer will
discuss the QA category with the EPA QA Officer so there are no misunderstandings, and any
questions will ideally be resolved before work on the QA Project Plan begins.

Specific examples of how the considerations described above can be used to define the
scope of a project's QA effort are provided in Chapter 4 of this document.

1.7 How Does this Guidance Relate to Existing EPA Practices Using Geospatial Data?

Geospatial data technologies have been used in EPA research since the early 1970s.
From 1986 to 1990, GIS was implemented in all ten Regional Offices, under the direction of the
Office of Information Resource Management, which offered hardware and software to regions
that assembled a multidisciplinary support team. Today, geospatial data technology is used in all
ten of the Regional Offices, most of the 12 national Program Offices (e.g., Office of Water), and
several of the 13 Administrator's Offices (EPA, 2001a) (see
http://intranet.epa.gov/geosinfo/baseline.htm).

Although early usage was dominated by remote-sensing research activities in EPA
laboratories, today GIS is the dominant geospatial technology used within the Agency. Other
technologies (e.g., remote sensing, visualization, and GPS) are mostly used in conjunction with
GIS. The use of geospatial data technologies is highly varied, ranging from mapping and
dissemination of information to complex modeling and tool development. GIS is applied to a
wide range of environmental concerns, driven by Agency mandates.

Geospatial data needs within EPA are highly varied, as are data sources. An estimated
80 to 85 percent of EPA applications using geospatial data constitute secondary use of existing
data acquired from external sources, primarily other federal agencies and states. While EPA
generates relatively little geospatial map data, it does generate two types: (1) data used in
regulatory, enforcement, or compliance activities and (2) geospatial data produced as a result of
program analyses that use geospatial technologies. The utility of EPA-generated geospatial data
is often compromised by missing or inaccurate spatial information.
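
A simple screening step illustrates the kind of check a QA Project Plan might call for when
spatial information may be missing or inaccurate. The Python sketch below is illustrative only:
the record layout (SiteRecord), the function name (screen_location), and the sample values are
assumptions made for this example and are not drawn from any EPA system.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SiteRecord:
    """One facility or sampling location (hypothetical record layout)."""
    site_id: str
    latitude: Optional[float]   # decimal degrees, north positive
    longitude: Optional[float]  # decimal degrees, east positive


def screen_location(record: SiteRecord) -> List[str]:
    """Return a list of problems found in the record's coordinates."""
    if record.latitude is None or record.longitude is None:
        return ["missing coordinate"]
    problems = []
    if not -90.0 <= record.latitude <= 90.0:
        problems.append("latitude out of range")
    if not -180.0 <= record.longitude <= 180.0:
        problems.append("longitude out of range")
    if record.latitude == 0.0 and record.longitude == 0.0:
        problems.append("coordinates are 0,0 (possible placeholder)")
    return problems


# Sample records invented for illustration.
records = [
    SiteRecord("A-001", 38.8951, -77.0364),
    SiteRecord("A-002", None, -77.0),
    SiteRecord("A-003", 138.9, -77.0),
]

# Keep only the records that have at least one problem.
flagged = {}
for rec in records:
    found = screen_location(rec)
    if found:
        flagged[rec.site_id] = found

for site_id, found in flagged.items():
    print(site_id, found)
```

A check of this kind catches only gross errors. It does not confirm that a valid-looking
coordinate actually falls on the intended feature.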

The use of geospatial technologies by EPA staff will only continue to grow, and
geospatial data needs will only increase. Concerns about locational accuracy and data
completeness need to be addressed by the development of QA Project Plans for geospatial data
projects involving both data developed in-house at EPA and the use of existing data acquired
from external sources. A QA Project Plan would help ensure that geospatial data were suitable
for informed analysis and decision making, via a three-phased project approach: planning,
implementation, and assessment.

1.8 How Is this Document Organized?

Chapter 1 contains background information about EPA's quality systems planning,
concepts, and definitions.

Chapter 2 describes the components of a QA Project Plan, identifies the point in the
project life cycle at which the QA Project Plan is developed, and describes how a QA Project
Plan fits into the overall schedule and performance of a geospatial project. The roles and
responsibilities of various staff members on the project are also described as well as how and
when to make revisions to QA Project Plans.

In Chapter 3, QA Project Plan content guidelines are presented and the organizational
structure of a QA Project Plan is defined in terms of "groups" and "elements." Guidance for the
types of information that are included for each group and element is presented, noting their
relevance in geospatial projects.

Chapter 4 contains examples of QA Project Plans for geospatial projects to help the
reader understand what this type of QA Project Plan contains and to illustrate the "graded
approach," in which content changes in response to differing scope and complexity.

The following appendices are also included in this document:

Appendix A: Bibliography
Appendix B: Glossary

Appendix C: Principal Data Quality Indicators for Geospatial Data.

CHAPTER 2

OVERVIEW TO CREATING A QA PROJECT PLAN
2.1 Introduction

As explained in Chapter 1, QA Project
Plans are necessary for all work performed by or
for EPA that involves the acquisition of
environmental data generated from direct
measurement activities, collected from other
sources, or compiled from computerized
databases. This chapter provides more information
on the source and intent of these policies and
provides information on other related guidance
and requirements documents, roles and
responsibilities in creating QA Project Plans, and
information on how and when to update QA
Project Plans.

What is the purpose of a QA Project Plan? The QA Project Plan documents the
systematic planning process for any data collection or use activity, including how QA and
QC activities will be planned and implemented. To be complete, the QA Project Plan will meet
certain guidelines for detail and coverage (see EPA Requirements for Quality Assurance Project
Plans (QA/R-5) (EPA, 2001b)), but the extent of detail depends on the type of project, the
data to be acquired and processed, the questions to be answered, and the decisions to be made.
Overall, the QA Project Plan is to provide sufficient detail to demonstrate that

• the project's technical and quality objectives are identified and agreed upon
• the intended data acquisition and data processing methods are appropriate for
achieving project objectives
• the assessment procedures are sufficient for confirming that output data and products
of the type and quality needed are obtained
• any limitations on the use of the output data and products can be identified and
documented.

EPA allows for flexibility in the organization and content of a QA Project Plan to meet
the unique needs of each project or program. Although most QA Project Plans will describe
project- or task-specific activities, there may be occasions when a generic QA Project Plan may
be more appropriate. A generic QA Project Plan addresses the general, common activities of a
program that are to be conducted at multiple locations or over a long period of time; for
example, a large monitoring program that uses the same methodology at different locations. A
generic QA Project Plan describes, in a single document, the information that is not site- or
time-specific but applies throughout the program. Application-specific information is then added to
the approved QA Project Plan as that information becomes known or completely defined. A
generic QA Project Plan is reviewed periodically to ensure that its content continues to be valid
and applicable to the program over time (EPA Requirements for Quality Assurance Project
Plans (QA/R-5) (EPA, 2001b)).

The QA Project Plan is the critical planning document for any environmental data collection
operation because it documents how QA and QC activities will be implemented during the life
cycle of a program, project, or task. The QA Project Plan is the blueprint for identifying how
the quality system of the organization performing the work is reflected in a particular project
and in associated technical goals (EPA, 1998a).

2.2 Related QA Project Plan Guidance and Documentation

Complex, broad-scope projects involving environmental data and geospatial databases
may involve developing QA Project Plans that cross over many boundaries. For example, a
multiyear, human health risk assessment project may involve taking and analyzing air samples
from industrial sites, developing sophisticated software models, developing complex GIS
procedures to process and analyze existing data from sources external to the project for use in
the models, creation of new geospatial data, use of aerial photographs for ground-truthing,1 and
perhaps creating land-cover layers from new satellite imagery. Projects such as these may have
more than one QA Project Plan. For example, there may be an overall QA Project Plan that
establishes quality procedures, policies, and techniques for the project as a whole. Then for each
subtask that contains a substantial amount of work or contains activities that in themselves
require QA Project Plans, additional QA Project Plans may be required. In the example
mentioned above, the following QA Project Plans would be needed:

• overall QA Project Plan that describes the quality system to be used on the project
• QA Project Plan for the geospatial data aspects of the data collection and analysis
• QA Project Plan for collection and analysis of air samples.

Each of these QA Project Plans may have similar information regarding overall project
scope, purpose, management structure, and so on. But within the other QA groups—namely,
Measurement and Data Acquisition (Group B), Assessment/Oversight (Group C), and Data
Validation and Usability (Group D)—each QA Project Plan would contain specific and detailed
information and procedures concerning the activities to be carried out for that specific project, be
it environmental sampling, modeling development, or geospatial data use.

Table 5 lists additional guidance documents that may be related to projects in which
geospatial data are used. If the project involves diverse activities, the additional relevant
documents listed in Table 5 offer guidance. For the most updated list of guidance documents,
see http://www.epa.gov/quality.

1 The use of a ground survey to confirm the findings of an aerial survey or to calibrate quantitative aerial or
satellite observations.

Table 5. EPA QA Guidance Documents

Title/Number | Description
Guidance for the Data Quality Objectives Process (QA/G-4) | Guidance on the DQO Process, a systematic planning process for environmental data collection
Guidance on Systematic Planning for Data Collection (QA/G-4A) | Guidance on systematic planning of environmental data collection in general, with emphasis on activities beyond those supporting hypothesis testing
Decision Error Feasibility Trials Software (QA/G-4D) | PC-based software for determining the feasibility of data quality objectives defined using the DQO Process
Guidance for the Data Quality Objectives Process for Hazardous Waste Sites (G-HW) | Guidance on applying the DQO Process to hazardous waste site investigations
Guidance on Quality Assurance Project Plans (QA/G-5) | Guidance on developing QA Project Plans that meet EPA guidelines
Guidance on Data Quality Indicators (QA/G-5I) | Guidance on the principal data quality indicators of precision, accuracy, representativeness, completeness, comparability, and sensitivity
Guidance for Choosing a Sampling Design for Environmental Data Collection (QA/G-5S) | Guidance on developing a data collection strategy to meet planning objectives
Guidance for the Preparation of Standard Operating Procedures (QA/G-6) | Guidance on the development and documentation of standard operating procedures
Guidance on Technical Audits and Related Assessments (QA/G-7) | Guidance to help organizations plan, conduct, evaluate, and document technical assessments
Guidance on Environmental Data Verification and Validation (QA/G-8) | Guidance on environmental data verification, validation, and integrity
Guidance for Data Quality Assessment: Practical Methods for Data Analysis (QA/G-9) | Guidance for statistically based methods to evaluate the extent to which data satisfy the user's needs
Data Quality Assessment Statistical Toolbox DataQUEST (QA/G-9D) | PC-based software for implementing the statistical methods described in the Guidance for Data Quality Assessment
Guidance for Developing a Training Program for Quality Systems (QA/G-10) | Guidance on developing program-specific, quality systems training programs for all levels of management and staff
Overview of the EPA Quality System | Brief summary of the quality management guidelines of the EPA Quality System

2.3 QA Project Plan Responsibilities

Who is responsible for creating a QA Project Plan? The QA Project Plan may be
prepared by an in-house EPA organization (such as the GIS group), a contractor, an assistance
agreement holder, or another federal agency under an interagency agreement. Most likely, the
QA Project Plan will be a cooperative endeavor involving product users (e.g., EPA program
managers funding the project), project managers responsible for the successful completion of the
project, QA professionals, and technical staff responsible for carrying out the work.

For projects having limited scope, the QA Project Plan can be developed by a small team
consisting of the product user, the EPA Project Manager, the project leader, and the technical
staff. The plan is a guide to ensure that the quality of final products and resulting decisions meets
criteria specified at the origination of the project.

Except where specifically delegated, all QA Project Plans prepared by non-EPA
organizations are to be approved by EPA before they are implemented. It is Agency policy that
the QA Project Plan be reviewed and approved by an authorized EPA reviewer to ensure that the
document contains the appropriate content and level of detail. This may be the EPA Project
Manager with the assistance and approval of the EPA QA Manager (EPA, 2001a, Sec. 2.5). The
project leader and QA officer are to evaluate any changes to technical procedures before
submitting new information to EPA.

All QA Project Plans are to be implemented as approved for the intended work. The
organization performing the work is responsible for implementing the approved QA Project Plan
and ensuring that all personnel involved in the work have copies of the approved QA Project
Plan and all other necessary planning documents. These personnel are to understand the quality
guidelines prior to the start of data generation activities (EPA, 2001a, Sec. 2.6).

Personnel developing and reviewing a geospatial data QA Project Plan are to have the
proper experience and educational credentials to understand the relevant issues. The QA Project
Plan is to be prepared such that external reviewers can understand the technical and quality
issues associated with the project.

Discussions between the work managers and the technical staff are essential to creating a
useful QA Project Plan. Management alone may not have an in-depth understanding of the
complexity of geospatial data and its potential pitfalls. Geoprocessors may understand the data
well but may not have enough background and scope information from management to
determine the type, quantity, and quality of data required to meet the intended use. Only through
an open quality planning process where all responsible parties meet to discuss quality goals and
criteria can a useful QA Project Plan be developed.

2.4	Secondary Use of Data

In geospatial projects, use of existing
data from a source external to the project is
almost always required. When designing a
project and, in turn, developing a QA Project
Plan, the question of which GIS data sources
to use is important. For example, in a project
where elevation data are required, criteria for
selecting appropriate elevation data are
needed. Determining which source of digital
elevation model data (e.g., based on guidelines for scale, quality, and level of detail) is most
appropriate for a project would require a dialog with management and technical staff to address
the differences between available data sources in order to determine which source could produce
a product adequate for its intended use. This decision-making process and the outcomes of the
decisions are to be included in the QA Project Plan.
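
The selection process described above can be sketched as a comparison of candidate sources
against project criteria, with the reasons for each outcome kept so they can be recorded in the
QA Project Plan. This Python sketch is illustrative only: the class and function names, the
criteria chosen (cell size and reported vertical error), and all numeric values are assumptions
made for this example, not EPA requirements or properties of any real data product.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CandidateSource:
    """An elevation data source under consideration (hypothetical fields)."""
    name: str
    cell_size_m: float        # horizontal resolution, meters
    vertical_rmse_m: float    # vertical error reported by the supplier, meters
    covers_study_area: bool


@dataclass
class SelectionCriteria:
    """Limits agreed on by management and technical staff (assumed values)."""
    max_cell_size_m: float
    max_vertical_rmse_m: float


def evaluate(source: CandidateSource,
             criteria: SelectionCriteria) -> Tuple[bool, List[str]]:
    """Return (acceptable, reasons for rejection)."""
    reasons = []
    if not source.covers_study_area:
        reasons.append("does not cover the study area")
    if source.cell_size_m > criteria.max_cell_size_m:
        reasons.append(
            f"cell size {source.cell_size_m} m exceeds {criteria.max_cell_size_m} m")
    if source.vertical_rmse_m > criteria.max_vertical_rmse_m:
        reasons.append(
            f"vertical error {source.vertical_rmse_m} m exceeds "
            f"{criteria.max_vertical_rmse_m} m")
    return (not reasons, reasons)


criteria = SelectionCriteria(max_cell_size_m=30.0, max_vertical_rmse_m=7.0)
candidates = [
    CandidateSource("Source A", 30.0, 7.0, True),
    CandidateSource("Source B", 90.0, 15.0, True),
]
results = {c.name: evaluate(c, criteria) for c in candidates}

for name, (acceptable, reasons) in results.items():
    print(name, "acceptable" if acceptable else "rejected", reasons)
```

Keeping the rejection reasons alongside the decision gives the project team a ready record of
both the decision-making process and its outcome.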

2.5	Revisions to QA Project Plans

Because of the complex and diverse nature of environmental data operations, changes to
project plans, methods, and objectives are often required. When a substantive change is
warranted, the QA Project Plan is to be modified to reflect the change and is to be submitted for
approval.

According to EPA policy, a revised QA Project Plan is to be reviewed and approved by
the same authorities that performed the original review. Changed procedures may be
implemented only after the revision has been approved. Changes to the technical procedures are
to be evaluated by the EPA QA Manager and Project Manager to determine if they significantly
affect the technical and quality objectives of the geospatial data project. If the procedural
changes are determined to have significant effects, the QA Project Plan is to be revised and
reapproved, and a revised copy is to be sent to all the persons on the distribution list. Only after
the revision has been received and approved (at least verbally with written follow-up) by project
personnel is the change to be implemented.

For programs or projects of longer duration, QA Project Plans need at least annual review
to conform to EPA policy.

Refer to Guidance for Quality Assurance Project Plans (QA/G-5) (EPA, 1998a) and EPA
Requirements for Quality Assurance Project Plans (QA/R-5) (EPA, 2001b)
(http://www.epa.gov/quality/documents) for additional information on how to handle QA Project
Plan revisions.

Secondary Use of Data is the use of
environmental data collected for other
purposes or from other sources, including
literature, industry surveys, compilations
from computerized databases and information
systems, and results from computerized
or mathematical models of environmental
processes and conditions.

2.6 Overview of the Components of a QA Project Plan

This section provides a list of the components of a QA Project Plan included in EPA
Requirements for Quality Assurance Project Plans (QA/R-5) (EPA, 2001b). The components of
a QA Project Plan are categorized into "groups" according to their function and "elements"
within each group that define particular components of each group and form the organizational
structure of the QA Project Plan. QA groups are lettered and QA elements are numbered.

The four groups are:

Group A. Project Management—The elements in this group address the basic area
of project management, including the project history and objectives, roles and
responsibilities of the participants, etc. These elements ensure that the project has a
defined goal, that the participants understand the goal and the approach to be used,
and that the planning outputs have been documented.

Group B. Data Generation and Acquisition—The elements in this group address
all aspects of project design and implementation. Implementation of these elements
ensures that appropriate methods for sampling, measurement and analysis, data
collection or generation, data handling, and QC activities are employed and are
properly documented.

Group C. Assessment and Oversight—The elements in this group address the
activities for assessing the effectiveness of project implementation and associated QA
and QC activities. The purpose of assessment is to ensure that the QA Project Plan is
implemented as prescribed.

Group D. Data Validation and Usability—The elements in this group address the
QA activities that occur after the data collection or generation phase of the project is
completed. Implementation of these elements ensures that the data conform to the
specified criteria, thus achieving the project objectives.

Table 6 is a complete list of the QA Project Plan groups and elements. Subsequent
chapters of this document provide detailed information about the guidelines for sections of
specific relevance to geospatial data projects. Some titles of the QA Project Plan elements, listed
in Table 6, are slightly different in subsequent chapters to emphasize the application to
geospatial data.

Table 6. Summary of QA Groups and Elements

Group | Element | Title
A | 1 | Title and Approval Sheet
A | 2 | Table of Contents
A | 3 | Distribution List
A | 4 | Project/Task Organization
A | 5 | Problem Definition/Background
A | 6 | Project/Task Description
A | 7 | Quality Objectives and Criteria
A | 8 | Special Training/Certification
A | 9 | Documents and Records
B | 1 | Sampling Process Design
B | 2 | Sampling and Image Acquisition Methods
B | 3 | Sample Handling and Custody
B | 4 | Analytical Methods
B | 5 | Quality Control
B | 6 | Instrument/Equipment Testing, Inspection, and Maintenance
B | 7 | Instrument/Equipment Calibration and Frequency
B | 8 | Inspection/Acceptance Requirements for Supplies and Consumables
B | 9 | Data Acquisition Requirements (Nondirect Measurements)
B | 10 | Data Management
C | 1 | Assessments and Response Actions
C | 2 | Reports to Management
D | 1 | Data Review, Verification, and Validation
D | 2 | Verification and Validation Methods
D | 3 | Reconciliation with User Requirements
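
The group and element structure in Table 6 lends itself to a simple checklist. The Python sketch
below is illustrative only: it stores the Table 6 titles in a dictionary keyed by group letter
and element number and reports which elements a draft plan has not yet addressed. The function
name (unaddressed) and the draft-plan layout are assumptions made for this example. Under the
graded approach, a short statement that an element does not apply would count as addressing it.

```python
# Element codes and titles as listed in Table 6.
QA_ELEMENTS = {
    "A1": "Title and Approval Sheet",
    "A2": "Table of Contents",
    "A3": "Distribution List",
    "A4": "Project/Task Organization",
    "A5": "Problem Definition/Background",
    "A6": "Project/Task Description",
    "A7": "Quality Objectives and Criteria",
    "A8": "Special Training/Certification",
    "A9": "Documents and Records",
    "B1": "Sampling Process Design",
    "B2": "Sampling and Image Acquisition Methods",
    "B3": "Sample Handling and Custody",
    "B4": "Analytical Methods",
    "B5": "Quality Control",
    "B6": "Instrument/Equipment Testing, Inspection, and Maintenance",
    "B7": "Instrument/Equipment Calibration and Frequency",
    "B8": "Inspection/Acceptance Requirements for Supplies and Consumables",
    "B9": "Data Acquisition Requirements (Nondirect Measurements)",
    "B10": "Data Management",
    "C1": "Assessments and Response Actions",
    "C2": "Reports to Management",
    "D1": "Data Review, Verification, and Validation",
    "D2": "Verification and Validation Methods",
    "D3": "Reconciliation with User Requirements",
}


def unaddressed(plan_sections):
    """Return element codes with no text in the draft plan, in Table 6 order.

    plan_sections maps an element code to the draft text for that element
    (hypothetical layout). Empty or missing text counts as unaddressed.
    """
    return [code for code in QA_ELEMENTS
            if not plan_sections.get(code, "").strip()]


# A made-up draft in which two elements still need attention.
draft = {code: "draft text" for code in QA_ELEMENTS}
draft["B9"] = ""
del draft["D3"]

for code in unaddressed(draft):
    print(code, QA_ELEMENTS[code])
```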

CHAPTER 3

GEOSPATIAL DATA QA PROJECT PLAN GROUPS AND ELEMENTS

3.1 Introduction

The EPA Requirements for Quality Assurance Project Plans (QA/R-5) (EPA, 2001b)
describes the elements EPA has specified for QA Project Plans. This guidance document
provides specifics on how to develop these components for geospatial data projects, including
suggested items to be included for each element. Each of the QA Project Plan elements
specified in EPA (2001b) is listed below and described for application to a geospatial
data project.

3.1.1 A1. Title and Approval Sheet

What is the purpose of this element? The
purpose of the approval sheet is to enable
officials to ensure that the quality planning
process has been completed before significant
amounts of work have been completed on the
project and to document their approval of the QA
Project Plan.

What type of information should be included in this element? The title sheet clearly
denotes the title of the project, the project sponsor, and the name of the organization preparing
the QA Project Plan. The title sheet also includes any additional information that is necessary
for the project (e.g., project number, contract number, additional organizations involved).

The approval sheet (which may or may not be a separate page) lists the names and
signatures of the officials who are responsible for approving the QA Project Plan. The
approving officials typically include the organization's technical Project Manager, the
organization's QA Officer or Manager, the EPA (or other funding agency) Technical Project
Manager/Project Officer, the EPA (or other funding agency) Quality Assurance Officer or
Manager, and other key staff, such as the task manager(s) and QA Officer(s) of the data to be
used or collected for the project.

Suggested Content:

•	Title of plan

•	Name of organization

•	Names, titles, and signatures of
appropriate officials

•	Approval dates.

3.1.2 A2. Table of Contents

What is the purpose of this element? The
table of contents provides an overall list of the
contents of the document and enables the reader
to quickly find specific information in the
document.

Suggested Content:

• Table of contents
• List of tables, figures, references, and appendices
• Document control format when required by EPA Project Manager.

What type of information should be
included in this element? The table of contents lists all sections, tables, figures, references, and
appendices contained in the QA Project Plan. The major headings for most QA Project Plans
closely follow the list of required elements; an example is shown in Figure 3. While the exact
format of the QA Project Plan does not have to follow the sequence given here, it is generally
more convenient to do so, and it provides a standard format for the QA Project Plan reviewer.

The table of contents of the QA Project Plan may include a document control component
when required by the EPA Project Manager or QA Manager. This information would appear in
the upper right-hand corner of each page of the QA Project Plan when the document control
format is desired. The document control component, together with the distribution list (as
described in Element A3), facilitates control of the document to help ensure that the most current
version or draft of the QA Project Plan is in use by all project participants. Each revision of the
QA Project Plan would have a different revision number and date.
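
The Python sketch below is illustrative only. It shows one way a document control component and a
distribution list could be used together to find holders of outdated copies. The header fields
beyond revision number and date (plan title and page count), the class and function names, and
the sample holders are assumptions made for this example; the actual document control format is
whatever the EPA Project Manager or QA Manager specifies.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass(frozen=True)
class DocumentControl:
    """Document control information for one revision (assumed fields)."""
    plan_title: str
    revision: int
    revision_date: date

    def header(self, page: int, total_pages: int) -> str:
        """Text block intended for the upper right-hand corner of a page."""
        return (f"{self.plan_title}\n"
                f"Revision No. {self.revision}\n"
                f"Date: {self.revision_date.isoformat()}\n"
                f"Page {page} of {total_pages}")


def holders_needing_update(current: DocumentControl,
                           copies_held: Dict[str, int]) -> List[str]:
    """Names on the distribution list whose copy predates the current revision."""
    return sorted(name for name, revision in copies_held.items()
                  if revision < current.revision)


# Sample values invented for illustration.
current = DocumentControl("Example Geospatial QA Project Plan", 2, date(2002, 2, 28))
copies_held = {"Project Manager": 2, "QA Officer": 2, "GIS Analyst": 1}

print(current.header(page=3, total_pages=40))
print(holders_needing_update(current, copies_held))
```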

3.1.3 A3. Distribution List

What is the purpose of this element? This
element is used to ensure that all individuals who
are to have copies of or provide input to the QA
Project Plan receive a copy of the document.

Suggested Content:

• Individuals and organizations to receive approved QA Project Plan
• Individuals and organizations responsible for implementation
• Individuals and organizations who will receive updates.

What type of information should be included in this element? All the persons
designated to receive copies of the QA Project Plan, and any planned future revisions,
would be listed in the QA Project Plan. This list,
together with the document control information, will help the Project Manager ensure that all
key personnel in the implementation of the QA Project Plan have up-to-date copies of the plan.
Note that the approved QA Project Plan can be delivered electronically.

CONTENTS

Section

List of Tables	 iv

List of Figures 	v

A Project Management	 1

1	Project/Task Organization 	 1

2	Problem Definition/Background	3

3	Project/Task Description 	4

4	Data Quality Objectives	7

4.1	Project Quality Objectives	7

4.2	Measurement Performance Criteria	8

5	Documentation and Records	 10

B Measurement Data Acquisition	 11

6	Sampling Process Design	 11

7	Analytical Methods Requirements 	 13

7.1	Organics	 13

7.2	Inorganics 	 14

7.3	Process Control Monitoring	 15

8	Quality Control Requirements 	 16

8.1	Field QC Requirements	 16

8.2	Laboratory QC Requirements 	 17

9	Instrument Calibration and Frequency 	 19

10	Data Acquisition Requirements 	20

11	Data Management	22

C Assessment/Oversight	23

12	Assessment and Response Actions	23

12.1	Technical Systems Audits 	23

12.2	Performance Evaluation Audits	23

13	Reports to Management 	24

D Data Validation and Usability	24

14	Data Review, Validation, and Verification Requirements 	24

15	Reconciliation with Data Quality Objectives	26

15.1	Assessment of Measurement Performance	26

15.2	Data Quality Assessment	27

Distribution List

N. Watson, EPA/ORD (Work Assignment Manager)*
B. Walker, EPA/ORD (QA Manager)
J. Warburg, State University (Principal Investigator)
T. Downs, State University (QA Officer)
G. Johnston, State University (Field Activities)
F. Haller, State University (Laboratory Activities)
B. O'Donnell, State University (Data Management)
E. Reynolds, ABC Laboratories (Subcontractor Laboratory)
P. Lafferton, ABC Laboratories (QA Manager, Subcontractor Laboratory)

* indicates approving authority

Figure 3. An Example Table of Contents and Distribution List

3.1.4 A4. Project/Task Organization

Suggested Content:

• Identified roles and responsibilities
• Documentation of the QA Manager's independence of the unit generating the data
• The individual responsible for maintaining the official QA Project Plan is identified
• Organization chart showing lines of responsibility and communication
• List of external organizations and subcontractors in the organization chart.

What is the purpose of this element? The
purpose of this element is to provide EPA and
other involved parties with a clear understanding
of the role that each party plays in the
investigation or study and to provide the lines of
authority and reporting for the project.

What type of information should be included in this element? The specific roles,
activities, and responsibilities of participants, as well as the internal lines of authority and
communication within and between organizations, would be detailed. The position of the QA
Manager or QA Officer would be described. The principal data users, decision maker, Project
Manager, QA Manager, and all persons responsible for implementation of the QA Project Plan
would be included. An example is data management personnel who maintain documentation of the
initiation and completion of data searches, inquiries, orders, and order receipts, as well as of
problems (e.g., incorrect or partial orders received, unacceptable overflights or film
processing) and corrective actions that allow project managers to verify data acquisition
progress. Also included would be the person responsible for maintaining the QA Project Plan and
any individual approving deliverables other than the project manager. A concise chart showing
the project organization, the lines of responsibility, and the lines of communication would be
presented; an example is provided in Figure 4. For complex projects, it may be useful to include
more than one chart: one for the overall project and others for each major subtask.

Figure 4. An Example Organizational Chart

In geospatial projects for which GIS analysts acquire or collect geospatial data from
external sources, the project organization element would describe how communications about
these data (quality, completeness, problems acquiring, etc.) would be handled between the
analyst and the project managers. The Project/Task Organization (A4) element designates
individuals to whom staff can bring issues regarding project status and data quality.
Additionally, it helps project managers know which technical staff will be responsible for
performing each part of the project, better enabling management to obtain adequate status and
quality information whenever necessary.
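
The acquisition records described in this element (orders, receipts, problems, and corrective
actions) can be pictured as a small log that a project manager queries for status. The Python
sketch below is illustrative only: the record layout, the rule for deciding that an item is
still open, and the sample entries are assumptions made for this example.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class AcquisitionRecord:
    """One requested data set and its status (hypothetical layout)."""
    dataset: str
    ordered_on: date
    received_on: Optional[date] = None
    problems: List[str] = field(default_factory=list)
    corrective_actions: List[str] = field(default_factory=list)

    def is_open(self) -> bool:
        """Open: not yet received, or more problems logged than corrective actions."""
        return (self.received_on is None
                or len(self.problems) > len(self.corrective_actions))


def status_report(log: List[AcquisitionRecord]) -> List[str]:
    """Names of data sets that still need follow-up, in log order."""
    return [record.dataset for record in log if record.is_open()]


# Sample log entries invented for illustration.
log = [
    AcquisitionRecord("land cover", date(2002, 1, 7),
                      received_on=date(2002, 1, 21)),
    AcquisitionRecord("aerial photographs", date(2002, 1, 7),
                      received_on=date(2002, 2, 1),
                      problems=["partial order received"]),
    AcquisitionRecord("elevation", date(2002, 1, 14)),
]

print(status_report(log))
```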

3.1.5 A5. Problem Definition/Background

What is the purpose of this element?
The purpose of this element is to describe the
background and context driving the project
and to identify and describe the problem to be
solved or analyzed.

What type of information should be included in this element? The following types
of information may be included:

• a description of the underlying purpose of the project
• a description of the goals and objectives of the project
• a description of the driving need for this project (e.g., regulation, legal directives,
research, outreach)
• other projects, programs, or initiatives this project may be supporting
• a description of the ultimate use of the final data or analysis
• a description of the general overview of ideas to be considered and approaches to be
taken on a particular project
• the decision makers and/or those who will use the information obtained from the
project.

Suggested Content:

• Description of the project's purpose, goals, and objectives
• Identification of programs this project supports
• Description of the intended use of the data to be gathered.

3.1.6 A6. Project/Task Description

What is the purpose of this element? The purpose of this element is to provide the
participants with a background understanding of the project tasks and the types of activities
to be conducted. It includes a brief description of the data to be acquired and the
associated quality goals, procedures, and timetables for project and task completion.

Suggested Content:

• The specific problem to be solved or decision to be made
• Sufficient background for a historical and scientific perspective
• Schedule and cost.

What type of information should be included in this element? Detailed descriptions of
processing tasks will be created in Group B elements. Summaries and bulleted lists are adequate
for most types of information to be included here. Items to consider including are

• a description of the location of the study area and the processes and techniques that
will be used to acquire necessary geospatial data
• a description of any special personnel or equipment required for the specific type of
work being planned
• information on how data processing and management will be performed and by
whom
• identification and description of project milestones and the schedule associated with
achieving these milestones
• deliverables, the schedule associated with generating and submitting them, and the
format to which these deliverables are to adhere
• a work breakdown structure associated with the project, detailing the individual work
components associated with the milestones and deliverables, whose progress will be
tracked throughout the duration of the project.

3.1.7 A7. Quality Objectives and Criteria

What is the purpose of this element? The purpose of this element is to document the
quality objectives of the project and to detail, through the systematic planning process, the
performance and acceptance criteria that will be employed in generating the data. Performance
and acceptance criteria can take many forms. The overall goal in setting the criteria is to
ensure that the project will produce the right type, quality, and quantity of data to meet the
user's needs.

Where does the information for this element come from? This information comes from
the systematic planning process, which is a means of ensuring that data of the appropriate
quality and quantity are obtained, and that appropriate processing is performed, to produce
products adequate for their intended use. Systematic planning is required even when the project
or task will not result in a definable decision. During systematic planning, performance criteria
are to be specified so that, during quality assessment, there is a known benchmark against which
quality can be gauged. The criteria for quality are to be set at a level commensurate with the
project-specific requirements. In other words, performance and acceptance criteria specify the
level of quality that would be acceptable for the final data or product. They are not to be set
higher or lower than what is required to meet the needs of that particular project.

Suggested Content:

•	The quality objectives for the project

•	The performance and acceptance
criteria used to evaluate quality. (Use
the systematic planning process to
develop quality objectives and
performance criteria [see EPA Quality
Manual for Environmental Programs,
Section 3.3.8.1 (EPA, 2000a), for
more information].)

How are quality objectives and criteria determined? They are determined through the
systematic planning process as the planning team reviews and discusses what is needed for the
basic questions to be answered or the decision to be made with the project results (see
Section 1.3). For example, if a regulatory decision is the ultimate product of the task, then the
Agency strongly recommends using the DQO Process. Data quality objectives are qualitative
and quantitative statements that

•	clarify the intended use of the data

•	define the type of data needed to support the decision

•	identify the conditions under which the data are to be collected

•	specify tolerable limits on the probability of making a decision error due to
uncertainty in the data.

For decision-making programs in which systematic planning takes the form of the DQO
Process, these criteria are represented within data quality objectives (EPA, 2000b) that express
data quality requirements to achieve desired levels of confidence in making decisions based on
the data.

What are some of the forms that performance or acceptance criteria might take in a
geospatial data project? Examples may include

•	a description of the resolution and accuracy required in input data sources

•	statements regarding the speed of applications programs written to perform data
processing (e.g., "the programs must be able to make 10,000 Monte Carlo simulation
runs within 8 hours")

•	criteria for choosing among several existing data sources for a particular geospatial
theme (e.g., land use); geospatial data needs are often expressed in terms of using the
"best available" data, but different criteria (such as scale, content, time period
represented, quality, and format) may need to be assessed to decide which are the
"best available" (when more than one is available) to use on the project

•	specifications regarding the accuracy needs of coordinates collected from GPS
receivers

•	requirements for aerial photography or satellite imagery geo-referencing quality, such
as specifications as to how closely these data sources need to match spatially with
ground-based reference points or coordinates

•	criteria to be met in ground-truthing classified satellite imagery.

If address geo-coding is to be performed, indicate the criteria for minimum overall match
rate and any tolerances to be used in address matching procedures, including whether or not
spatial offsets are to be supplied in the resulting coordinates and, if so, what the offset factor is to
be. If the project is to build new geospatial data sets through a map digitizing process, indicate
requirements for topology, label errors, attribute accuracy, overlaps and gaps, and other
processing quality indicators.
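
As an illustration of how a minimum overall match rate criterion could be checked once
geo-coding results are available, the following Python sketch computes the match rate and
compares it with a threshold. The 90 percent default and the function names are assumptions
made for this example only; the actual criterion is whatever the project's systematic
planning process specifies.

```python
def match_rate(matched: int, total: int) -> float:
    """Return the percentage of input addresses that were successfully geo-coded."""
    if total <= 0:
        raise ValueError("total must be a positive number of addresses")
    return 100.0 * matched / total


def meets_match_criterion(matched: int, total: int, minimum_pct: float = 90.0) -> bool:
    """Compare the observed match rate with the minimum (90 percent is assumed here)."""
    return match_rate(matched, total) >= minimum_pct
```

A project team might call `meets_match_criterion(matched, total, minimum_pct)` with the
counts reported by the address matching vendor and the threshold recorded in this element.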

Appendix C, Principal Data Quality Indicators for Geospatial Data, provides additional
information regarding data quality indicators that could be reflected in quality criteria to be
specified in this element.

3.1.8	A8. Special Training/Certification

What is the purpose of this element? The
purpose of this element is to document any
specialized training requirements necessary to
complete the project. This element is a good
place to discuss how these requirements will be met and how to verify that they have been met.

What type of information should be included in this element? Requirements for
specialized training for field-sampling techniques such as global positioning technology, photo
interpretation, and data processing would be specified. Depending on the nature of the project,
the QA Project Plan may address compliance with specifically mandated training requirements
(e.g., software contractors needing company certification or employees needing software
training). This element of the QA Project Plan would show that the management and project
teams are aware of specified health and safety needs as well as any other organizational safety
plans. Training and certification for necessary personnel would be planned well in advance of
the implementation of the project. All
certificates or documentation representing
completion of specialized training would be
maintained in personnel files.

3.1.9	A9. Documents and Records

What is the purpose of this element? This
element defines which documents and records are
critical to the project. It provides guidance to
ensure that important documentation is collected,
maintained, and managed so that others can
properly evaluate project procedures and
methods.

What type of information should be
included in this element? This element could be
used to provide guidelines for clearly documenting
software programs (including revisions)
and models, field operation records (for GPS
activities), and metadata guidelines.

Suggested Content (A8):

•	Any special training or certification
requirements for the project

•	Plans for meeting these requirements.

Suggested Content (A9):

•	Description of the mechanism for
distributing the QA Project Plan to
project staff

•	List of the information to be included
with final products, including
metadata records, calibration and test
results (for GPS or remote sensing
tasks), processing descriptions
provided by data vendors (e.g.,
address matching, success rate reports
from address matching vendors)

•	List of any other documents applicable
to the project, such as hard-copy
map source material, metadata
provided with data from secondary
data sources, interim reports, final
reports

•	All applicable requirements for the
final disposition of records and
documents, including location and
length of retention period.

Metadata are required in geospatial data created on federal government contracts, and this
element is a good place to indicate metadata requirements. Detailed metadata indicating the
source, scale, resolution, accuracy, and completeness are needed to assess the adequacy of
existing data for use (EPA, 2000d). The Federal Geographic Data Committee
(http://www.fgdc.gov) has developed a metadata standard for geospatial data generated for and by
all federal agencies. If an external source of existing data does not supply metadata (preferably,
Federal Geographic Data Committee-compliant metadata including quality data elements), or
additional information from the external source cannot be obtained, then the quality of these data
for this project cannot be evaluated. The data would be of unknown quality and unsuitable for
producing a product adequate for its intended use.
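
A minimal sketch of how the metadata supplied with an existing data set could be screened
for the quality-related items named above (source, scale, resolution, accuracy, and
completeness). The dictionary keys are hypothetical labels chosen for this example; they
are not element names taken from the Federal Geographic Data Committee standard.

```python
# Hypothetical field labels, following the items listed in the text above.
REQUIRED_FIELDS = ("source", "scale", "resolution", "accuracy", "completeness")


def missing_metadata(record: dict) -> list:
    """Return the required fields that are absent or blank in a metadata record."""
    missing = []
    for field_name in REQUIRED_FIELDS:
        value = record.get(field_name)
        if value is None or str(value).strip() == "":
            missing.append(field_name)
    return missing
```

An empty list would indicate that every screened item is present; a non-empty list
identifies what to request from the external source before the data are used.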

Other types of documentation and records that would be described in the Documents and
Records (A9) element include field operation records, analysis records, and data handling
records. This element would be used to describe the generation of these records (where and by
whom they are generated, and the format in which they will be stored and reported). This element would discuss how
these various components will be assembled to represent a concise and accurate record of all
activities affecting data quality.

In some environmental sampling projects, records and documentation that refer to
geospatial data collection may be included in the environmental sample planning portion of a
general QA Project Plan, rather than in a geospatial QA Project Plan. In these cases, the GPS
records are associated with the environmental sampling in general, not with the geospatial data
records and documentation. The Documents and Records (A9) element of a geospatial QA
Project Plan could then reference the GPS records requirements that are described in the
environmental sampling QA Project Plan.

3.2 Group B: Data Generation and Acquisition

Geospatial projects may involve the creation of new geospatial data from field measurements
(e.g., from GPS measurement, aerial photography, or satellite imagery) or may involve the
acquisition and use of existing geospatial data originally created for some other use. The
Group B elements of the QA Project Plan are used to

•	describe the quality assurance and quality control of the instruments, procedures, and
methods used to create new geospatial data (the first eight elements)

•	describe the methods of acquiring, assessing, and managing data from existing
sources for the project [Data Acquisition Requirements (Nondirect Measurements)
(B9) and Data Management (B10) elements].

While the first eight elements are often associated with the creation of new data from
measurements, the Quality Control (B5) element may be used to outline and document quality
control procedures used on certain existing data sources. For example, it could be used to
document quality control procedures when map digitizing will be performed or when classified
satellite imagery is to be assessed for quality via ground-truthing procedures.

Data Acquisition Requirements (Nondirect Measurements) (B9) and Data Management
(B10) elements are often the most significant parts of the Group B elements in geospatial
projects. This is because geospatial projects almost always involve the use of existing data
sources from outside organizations (e.g., existing geospatial data products like Topologically
Integrated Geographic Encoding and Referencing data, Digital Line Graph data, National Land
Cover Data, and Digital Elevation Model data). In addition, geospatial projects inherently
involve data management. The Data Management (B10) element will therefore require
extensive inputs to the QA Project Plan, since it is used to describe the data management
procedures used to ensure that data are processed and handled in ways that meet the accuracy
and quality required on the project.

Whereas the methods described in the Group B elements are summarized in the Project/Task
Description (A6) element, the purpose here is to provide detailed information on the data
collection procedures and methods.

3.2.1 B1. Sampling Process Design

What is the purpose of this element? This
element describes all the relevant components of
the data collection or image acquisition design,
defines the key parameters to be estimated, and
indicates the number and type of samples or
images expected. It also describes where, when,
and how samples or images are to be taken. The
information is to be sufficiently detailed to enable
a person knowledgeable in this area to understand
how and why the samples or images will be collected. Most of this information may be
available as outputs from the final steps of the systematic planning process.

What type of information should be included in this element? This element would be
used to describe how the project will acquire the "right" data for the project. For example, if the
project will be using satellite imagery, it is important to consider the type, quality, and resolution
of imagery.

Use the Sampling Process Design (B1) element to describe the geographic extent of
locational data to be acquired. Describe the size, shape, and location of the project's study area.
Document whether it is feasible to collect new geospatial data for the project, and why. If the
project involves a number of discrete study areas (for example, a set of regulated industrial
sites), data of differing dates, quality, resolution, or scale may be available. Determine whether
different resolutions of data may be used in different parts of the project. This issue arises when
very accurate data exist for some portions of a study, but not for others. An example issue to
address in this element would be whether a single, uniform data source would be acceptable even
though in some areas it does not contain the most recent data or its resolution is not as high as
that of other data sources.

Suggested Content:

•	Description of the data acquisition
design

•	For existing data from other sources,
how data will be evaluated for use

•	For geospatial data to be collected, the
design for acquisition (e.g., how and
where locational data will be
acquired).

When acquiring locational information using GPS equipment, this element would be used
to describe the locations to be used and the rationale for this design. In many cases, GPS will be
used to gather information at specific, known locations. For example, this element may specify
that GPS data will be collected at each spotted owl nest found or at each outfall encountered
along a body of water. For other projects, a sampling design may be implemented to collect data
using sophisticated sampling techniques. For example, when collecting soil samples to be
analyzed for contamination, sampling techniques may be used to determine the number of
samples to be taken and the method for determining the locations (e.g., based on a systematic
grid of predefined size or by using judgmental sampling procedures). The Sampling Process
Design (B1) element would be used to describe the sampling design as it relates to the locations.
The sampling design might take into account procedures for dealing with locally interfering
objects such as tree canopies, towers, buildings, or high-relief terrain that could impact or eclipse
the GPS signal. Within the description of the sampling design, this element would also describe
the frequency of locational sampling or image acquisition. When decisions are made on the
number and location of observations or images to be taken, the QA Project Plan would describe
how these decisions were derived to meet the requirements of the planned interpretation (e.g.,
accuracy and precision requirements) or analysis.
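
For illustration, candidate sample locations on a systematic grid of predefined size might
be generated as sketched below. The sketch assumes projected coordinates in consistent
linear units (for example, meters), a rectangular study area, and a grid anchored at the
lower-left corner of that rectangle; the function name and parameters are hypothetical and
are not drawn from any EPA method.

```python
def systematic_grid(xmin, ymin, xmax, ymax, spacing):
    """Return (x, y) sample locations on a regular grid covering a bounding box."""
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    if xmax < xmin or ymax < ymin:
        raise ValueError("bounding box is inverted")
    # Number of grid nodes that fit along each axis, including the origin node.
    columns = int((xmax - xmin) // spacing) + 1
    rows = int((ymax - ymin) // spacing) + 1
    points = []
    for row in range(rows):
        for column in range(columns):
            points.append((xmin + column * spacing, ymin + row * spacing))
    return points
```

Recording the bounding box and spacing in the QA Project Plan would let a reviewer
regenerate the same locations and confirm the number of planned observations.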

Finally, the objectives for collecting the identified geospatial data are to be formulated in
the planning stage of the project. This element would explain why these data are being acquired
and how they will be used on the project.

What are some examples of issues that would be addressed in this element? Acquiring
locational data with GPS frequently involves a certain amount of uncertainty regarding the exact
location to be captured. This uncertainty can occur when collecting data for use in regulatory
analyses. Some examples of the types of questions that could be addressed here include the
following:

•	When collecting industrial site information, what, precisely, is to be collected: the
location of the facility main gate or the main office front door? the location of
holding ponds or other waste units?

•	Is it necessary to collect all waste unit locations or just the location of the general
center of all the waste units?

•	How important is the accuracy of these particular locations?

The Sampling Process Design (B1) element might also describe the frequencies and
logistics involved in the GPS or imagery acquisition tasks. For example, information in this
element would provide answers to questions such as,

•	When do the data need to be collected, processed, and ready to be used on the
project?

•	Are there any constraints due to seasonality? For example, is imagery to be acquired
with "leaf off" or "leaf on"? Can GPS acquisition be done on weekends?

•	When performing work with plants and animals, what seasonal factors will affect the
ability to find or track these species?

•	What logistical activities can be planned to facilitate GPS data collection?

•	Are special vehicles required?

•	Will the sampling take place on water? If so, what provisions for water
transportation are necessary?

To address some of these issues, the use of bar charts showing time frames of various QA
Project Plan activities is recommended to identify both potential bottlenecks and the need for
concurrent activities. The most appropriate plan for a particular direct measurement or remote
sensing task will depend on the practicality and feasibility of the plan, the key characteristic to
be estimated, and the resources needed for implementation (e.g., the costs of direct measurement
or remote sensing and interpretation).

The Sampling Process Design (B1) element is the place to discuss the need for base
station data, if applicable. In addition, for projects involving digitizing source maps directly into
GIS format, issues related to evaluating source materials might be discussed.

What might be included in this element for projects involving acquisition of new aerial
photography? This element would include issues related to precision, seasonality, resolution
(pixel size), geo-registration techniques and quality, delivery medium (analog photos or digital
orthophotography), and types and levels of vendor processing. An imagery acquisition plan
could be used to identify the types of data required, spatial resolution, overpass date(s)/time(s),
and supporting data required. Consider the following specific issues:

•	What final surface characteristic(s) does the project require (e.g., vegetation type,
canopy cover, soil type, or vegetation stress)? This derived parameter or analysis will
determine what type of imagery is needed.

•	For film-product aerial photography, are black-and-white, true-color, or false-color
products needed?

•	Is a particular time of year appropriate for imagery acquisition?

•	What time of day are aerial photos or satellite images to be captured (usually not an
option for satellite imagery, but may be for aerial photography)?

•	What documentation is needed on climatic factors, such as maximum allowable cloud
cover and snow cover?

3.2.2 B2. Sampling and Image Acquisition Methods

What is the purpose of this element? This
element would be used to document procedures
and methods for collecting samples. As with all
other considerations involving geospatial
sampling or image acquisition, methods are to be
chosen with respect to the intended application of
the data. Different sampling or imagery
acquisition methods have different operational
characteristics (such as cost, difficulty, and
necessary equipment) that affect the representativeness,
comparability, accuracy, and precision of the final result.

What type of information should be included in this element? Consider systematic
planning requirements when choosing the methods to ensure that (1) the measurements,
observations, or images accurately represent the portion of the environment to be characterized;
(2) the locational coordinates sampled are of sufficient accuracy to support the planned data
analysis; and (3) the locational coordinates sampled meet completeness requirements. Be sure
that data collected via GPS will meet the requirements for the intended use. Use standard
operating procedures to ensure that acquisition procedures are consistent across multiple staff
members and that Agency standards are used when available.
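
The accuracy and completeness considerations in items (2) and (3) above could be checked
together once field data are in hand. The sketch below is one possible approach under
stated assumptions: each collected point carries an estimated horizontal error in meters
under a hypothetical "h_error_m" key, and the accuracy limit and completeness threshold
passed in are placeholders for the project's own criteria.

```python
def assess_gps_points(points, planned_count, max_error_m, min_completeness_pct):
    """Summarize how many collected points meet the accuracy criterion.

    points: list of dicts, each with a hypothetical "h_error_m" key holding the
    estimated horizontal error (meters) reported for that point.
    """
    if planned_count <= 0:
        raise ValueError("planned_count must be positive")
    usable = [p for p in points if p["h_error_m"] <= max_error_m]
    # Completeness is measured against the number of locations planned,
    # not the number actually visited.
    completeness_pct = 100.0 * len(usable) / planned_count
    return {
        "usable_points": len(usable),
        "completeness_pct": completeness_pct,
        "meets_criterion": completeness_pct >= min_completeness_pct,
    }
```

Measuring completeness against the planned count means that locations that were never
visited lower the result in the same way as points rejected for poor accuracy.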

Identify the type of direct measurement, observation, or image to be acquired and the
appropriate sampling methods to be used from applicable methods approved by EPA. Each
direct measurement, observation, or image has its own characteristics that define the method
performance and the required sampling to represent the environment. Address the following:

•	actual sampling locations

•	choice of measurement or remote-sensing method

•	delineation of a proper measurement, observation, or image entity

•	inclusion of all entities within the abstract universe sampled (Appendix C addresses
the need for completeness indicators).

This element would address the issues of responsibility for the quality of the data, the
methods for making changes and corrections, the criteria for deciding on a new sample location,
and documentation of these changes. It would describe appropriate corrective actions to take if
there are serious flaws in the implementation of the sampling methodology. For example, if part
of the complete set of GPS measurements or imagery samples to be acquired is found to be
inadequate for its intended use, describe how replacements will be obtained and how these new
samples will be integrated into the total set of data.

Suggested Content:

•	Description of data collection
procedures

•	Methods and equipment to be used

•	Description of GPS equipment
preparation requirements

•	Description of performance
requirements

•	Description of corrective actions to be
taken if problems arise.

3.2.3 B3. Sample Handling and Custody

What is the purpose of this element? This element is used to define the project-specific
requirements for handling samples and, perhaps, hard-copy aerial photographs or other source
documents such as maps. These project-specific requirements may be necessary to prove that
source materials and samples have been properly handled and managed during the course of the
project.

Suggested Content:

•	Description of requirements for handling
and transfer of hard-copy imagery or
other hard-copy data inputs.

What type of information should be included in this element? Aerial photography
delivered in hard-copy format may need to go through a chain-of-custody procedure. However,
GPS coordinates, satellite imagery, and digital orthophotography are usually delivered and
processed in electronic form. Therefore, the Sample Handling and Custody (B3) element has
limited applicability on geospatial projects. The procedures for handling, maintaining, and
processing electronic data are described in the Data Acquisition Requirements (Nondirect
Measurements) (B9) and Data Management (B10) elements.

Hard-copy aerial photography, original source maps, and hard copies of satellite imagery
can sometimes be of great importance in geospatial projects. They may provide the only source
of concrete information regarding industrial facilities and their surroundings, especially when
historical aerial photos are available for particular areas. Therefore, these documents need to
undergo careful and deliberate chain-of-custody procedures to ensure that they are not lost,
misplaced, altered, or destroyed. This element is used to document chain-of-custody procedures
and, for geospatial projects, may only be applicable for the QA Project Plan if hard-copy
documents such as aerial photos are acquired and used. However, chain-of-custody procedures
for environmental media samples (air, water, soil) would be developed and documented in QA
Project Plans for the environmental sampling portions of the project.

For aerial photographs, source maps, and other hard-copy documents, this element is
used to ensure that the documents are

•	transferred, stored, and analyzed by authorized personnel

•	not physically degraded through handling

•	properly recorded and tracked to ensure that their whereabouts are known at all times
in case they need to be used by different researchers.

The QA Project Plan discusses the source material or imagery handling and custody
procedure at a level commensurate with the intended use of the data. This discussion might
include the following:

•	a list of names and responsibilities of those who will be handling the documents

•	a description and example of the document numbering system

•	procedures that will be used to maintain the chain of custody and documentation of
handling procedures within the organizations using these documents

•	the physical location and filing system to be used to store and manage the documents.

Few geospatial projects will need to fully develop a chain-of-custody process for source
documents. However, for projects that do acquire and use rare, original, or irreplaceable source
documents (aerial photos, printed maps, archival satellite imagery), it is a good idea to design
and document chain-of-custody procedures.
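
Where a project does track source documents electronically, a chain-of-custody log can be
as simple as an append-only list of transfer events. The sketch below is illustrative
only: the class names, field names, and document numbering format are hypothetical and do
not represent an EPA form or system.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class CustodyEvent:
    """One transfer of a hard-copy source document between two people."""
    document_id: str
    released_by: str
    received_by: str
    location: str
    timestamp: datetime


@dataclass
class CustodyLog:
    """Append-only record of transfers, kept in the order they occurred."""
    events: list = field(default_factory=list)

    def transfer(self, document_id, released_by, received_by, location, timestamp):
        self.events.append(
            CustodyEvent(document_id, released_by, received_by, location, timestamp)
        )

    def history(self, document_id):
        """Return every recorded transfer for one document."""
        return [e for e in self.events if e.document_id == document_id]

    def current_holder(self, document_id):
        """Return (person, location) from the latest transfer, or None if untracked."""
        records = self.history(document_id)
        if not records:
            return None
        return records[-1].received_by, records[-1].location
```

Because events are only appended, the log preserves the full handling history of each
document as well as its current whereabouts.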

The forms and procedures used to track the chain of custody of source documents could
be described in the Documents and Records (A9) element. In this way, the documentation to be
maintained would be described in Documents and Records (A9) element and the procedures
themselves would be described in the Sample Handling and Custody (B3) element.

3.2.4 B4. Analytical Methods

What is the purpose of this element? When GPS coordinates, aerial photos, or satellite
imagery are to be processed or interpreted, the
Analytical Methods (B4) element would be used
to document these interpretation or processing
methods. For remote sensing data sets,
requirements may need to be developed for the image analysis or processing to produce new data
sets. Image analysis may range from manual interpretation/characterization to the application of
algorithms and/or models.

What type of information should be included in this element? This element would
document algorithms/models to ensure they are applied correctly and consistently. For example,
when using remote sensing data sets, some requirements may need to be developed for the image
analysis or data processing that produces new data sets. Examples of new data sets derived from
remote sensing are

•	plant biomass indices that convert visible and near infrared to a scalar value
representing the relative amount of green vegetation

•	land-cover classifications that segment an image into classes (pavement, water,
vegetation) based on reflectance and/or thermal radiance of each pixel.

Suggested Content:

•	Image processing and/or photo-interpretation
methods to be used

•	List of method performance
standards, if applicable.

This element would address methods to be used, and in particular, whether the selected
methods differ from standard procedures. For example, most biomass estimators such as the
Normalized Difference Vegetation Index were developed to be applied to surface reflectance,
not digital numbers or radiance values. If a conversion to reflectance is not performed, some
justification would be noted. Statistics-based clustering (classification) of an individual image
can be performed on the digital number values; however, if the classification is to be performed
on multiple images, some type of image normalization would need to be performed. This
element of the report would describe the approach used.
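
The Normalized Difference Vegetation Index is conventionally computed for each pixel as
(NIR - Red) / (NIR + Red). The sketch below applies that formula to a single pair of band
values, which are assumed to have already been converted to surface reflectance as
discussed above. The function name and the choice to return None when both bands are zero
are assumptions of this example.

```python
def ndvi(nir_reflectance: float, red_reflectance: float):
    """Return (NIR - Red) / (NIR + Red), or None when the denominator is zero."""
    total = nir_reflectance + red_reflectance
    if total == 0:
        # No signal in either band; the index is undefined for this pixel.
        return None
    return (nir_reflectance - red_reflectance) / total
```

Documenting the formula and the input units in the Analytical Methods element makes it
possible to confirm later that the index was applied to reflectance rather than to raw
digital numbers.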

Similarly, for aerial photo interpretation tasks, the methods used to interpret the photos
would be documented in this element. Existing standard operating procedures could be cited or
included to describe the interpretation methods and relate them to the desired products to be
generated from the interpretation.

3.2.5 B5. Quality Control

What is the purpose of this element? Quality control is the "overall system of technical
activities that measures the attributes and
performance of a process, item, or service against
defined standards to verify that they meet the
stated requirements established by the customer;
operational techniques and activities that are
used to fulfill requirements for quality" (EPA, 2001b).
The Quality Control (B5) element
documents any QC checks not defined in other QA Project Plan elements and would reference
other elements that contain this information, where possible. This element relies on performance
criteria described in the Quality Objectives and Criteria (A7) element. In other words, use the
Quality Objectives and Criteria (A7) element to describe acceptable performance criteria and use
the Quality Control (B5) element to describe the procedures to be used to assess the
performance.

What type of information would be included in this element? The Quality Control (B5)
element is primarily applicable when generating new data, such as using GPS to collect
coordinates, using a digitizing procedure to convert source maps into GIS formats, or using
ground-truthing procedures to assess the accuracy of classified satellite imagery.

QC checklists are often a means of ensuring that proper procedures are used at each step
in data collection, or of checking and assessing the quality of map digitizing or satellite
ground-truthing results. QC checklists could be developed and described in the Quality Control (B5)
element to facilitate efficient and accurate fieldwork when using GPS receivers. QC checklists
could help analysts and management ensure that equipment has been checked and is operating
properly before fieldwork begins each day, and to ensure that proper procedures are used when
collecting calibration points (first-order control points) as well as the coordinates themselves.

Suggested Content:

•	QC activities needed for GPS
measurements, field observations,
map digitization, image acquisition,
image processing, or image analysis

•	The frequency of each check and
corrective action required when limits
are exceeded.

Including QC procedures to be used in map digitizing in the Quality Control (B5)
element is important to ensure that digitizing staff convert the correct map features in a way that
meets accuracy requirements. For example, describe checklists to be used by the digitizer to
confirm that georegistration of the map-to-ground coordinates is within tolerances and that each
required feature from the map is digitized and added to the appropriate GIS layer or feature
class.
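
One way a georegistration tolerance check could be carried out is to compute the
root-mean-square error (RMSE) between digitized control point coordinates and their
reference coordinates. The sketch below assumes paired points expressed in the same
projected coordinate system and units; the function names are hypothetical, and the
tolerance is a value the project would supply from its own quality criteria.

```python
import math


def horizontal_rmse(digitized, reference):
    """Return the RMSE of horizontal offsets between paired (x, y) points."""
    if len(digitized) != len(reference) or not digitized:
        raise ValueError("point lists must be non-empty and of equal length")
    squared_offsets = [
        (dx - rx) ** 2 + (dy - ry) ** 2
        for (dx, dy), (rx, ry) in zip(digitized, reference)
    ]
    return math.sqrt(sum(squared_offsets) / len(squared_offsets))


def within_tolerance(digitized, reference, tolerance):
    """True when the RMSE does not exceed the project-supplied tolerance."""
    return horizontal_rmse(digitized, reference) <= tolerance
```

A checklist entry could then record the RMSE value, the tolerance it was compared with,
and the corrective action taken if the tolerance was exceeded.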

Quality control of classified satellite imagery would involve some ground-truthing
procedures. These QC procedures may be documented in the Quality Control (B5) element and
checklists to be completed by the responsible staff may be described.
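
As a hedged illustration of how ground-truthing results might be summarized, the sketch
below tallies classified labels against labels observed in the field and reports the
percentage that agree. The input format (a list of label pairs) and the function names
are assumptions of this example; a project may require additional per-class measures.

```python
from collections import Counter


def confusion_counts(pairs):
    """Count occurrences of each (classified_label, observed_label) pair."""
    return Counter(pairs)


def overall_accuracy(pairs):
    """Return the percentage of ground-truth sites where the two labels agree."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one ground-truth site is required")
    correct = sum(1 for classified, observed in pairs if classified == observed)
    return 100.0 * correct / len(pairs)
```

The pair counts show which classes are being confused with one another, which is often
more useful for corrective action than the single overall percentage.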

What assessments would be done to verify that the criteria have been met? The
assessment process includes verifying the data set (or product) specifications. The evaluations
planned provide a basis for logical decisions on the applicability of the data or images to the
current project. Examples include

•	ensuring that the requested spectral bands have been delivered

•	checking against independent data sets such as other images or vector products

•	examining the cloud coverage of images to ensure that cloud coverage extent does
not impede use of the data

•	ensuring that the view angle of imagery is as specified.

Although the project-specific requirements listed above may seem rather simple, many
geospatial projects have a large extent and variety of geospatial data. The directions in this
element of the QA Project Plan ensure that all these data are evaluated systematically and
completely.

The Quality Control (B5) element would also be used to document the actions to be taken
if QC checks identify errors or failures in quality of data capture procedures.

3.2.6 B6. Instrument/Equipment Testing, Inspection, and Maintenance

What is the purpose of this element? The
purpose of this element is to discuss the
procedures used to verify that all instruments and
equipment are maintained in sound operating
condition and are capable of operating at
acceptable performance levels. This element
provides a mechanism for ensuring that
equipment used in geospatial projects is
operating to specifications. If the project does
not involve the use of any measurement
equipment, state in the QA Project Plan that
this element is not applicable.

Suggested Content:

•	Description of how inspections and
acceptance testing of instruments,
equipment, and their components
affecting quality will be performed
and documented

•	Description of how deficiencies will
be resolved

•	Description of (or reference to) how
periodic preventive and corrective
maintenance of measurement or test
equipment will be performed.

What type of information would be included in this element? Standard operating
procedures may be referenced or included in the Instrument/Equipment Testing, Inspection, and
Maintenance Requirements (B6) element to document the required procedures for equipment
testing and inspection (e.g., for GPS equipment). Descriptions of procedures may include

•	estimates of the possible impact of equipment failure on overall data quality,
including timely delivery of project results
•	any relevant site-specific effects (e.g., environmental conditions)
•	steps for assessing the equipment status.

This element would address the scheduling of routine calibration and maintenance
activities, the steps that will be taken to minimize instrument downtime, and the prescribed
corrective actions for addressing unacceptable inspection or assessment results. This element
would also include periodic maintenance procedures. Supply the reader with sufficient
information to review the adequacy of the instrument/equipment management program.

Before a GPS survey is undertaken, it is recommended that equipment be tested to
ensure that it works properly. Check the unit to confirm critical settings, because settings from
a previous session remain in memory when the receiver is turned off; failure to do so could
result in inaccurate data.

Routine preventive maintenance schedules need to be established and records maintained
on all instruments, equipment, and computer hardware and software systems used for data
acquisition, analysis of photographs, and graphics functions. Make the personnel who use
instruments and equipment requiring routine maintenance responsible for ensuring that
maintenance is performed in accordance with relevant standard operating procedures or
equipment instructions, and that maintenance is properly documented. This will help ensure
that maintenance records are available on request.

When aerial photography is needed in a geospatial project, inform the data producer of
the requirement to provide documentation of the equipment used, as well as its maintenance and
testing records, to assure project-specific requirements for their task are met.

3.2.7 B7. Instrument/Equipment Calibration and Frequency

What is the purpose of this element? The
purpose of this element is to identify the
equipment to be calibrated and to document the
calibration method and frequency of each
instrument.

Suggested Content:

•	Instruments used for data collection
whose accuracy and operation need to
be maintained within specified limits

•	Description of (or reference to) how
calibration will be conducted

•	How calibration records will be
maintained and traced to the
instrument.

What type of information might be included in this element? Identify any equipment
or instrument that requires calibration or standardization to maintain acceptable performance.
Include or reference standard operating procedures that document how calibration of the
equipment (e.g., for GPS receiver units) would be accomplished. Generally, this will involve
collecting locations with the GPS unit and comparing them to known, high-quality reference
points.
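
The comparison against reference points can be summarized numerically. The sketch below is one possible approach, not a prescribed method: it computes the horizontal distance between each GPS reading and its reference point and reports the root mean square error. It assumes coordinates in decimal degrees and uses a spherical-Earth (haversine) approximation; the function names and the mean Earth radius constant are choices made for this example only.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical approximation (assumption)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def horizontal_rmse_m(observed, reference):
    """RMSE in meters between paired (lat, lon) readings and reference points."""
    if len(observed) != len(reference) or not observed:
        raise ValueError("observed and reference must be non-empty and equal length")
    squared = [haversine_m(o[0], o[1], r[0], r[1]) ** 2
               for o, r in zip(observed, reference)]
    return math.sqrt(sum(squared) / len(squared))
```

The resulting value could then be compared with the accuracy limit stated for the instrument in the QA Project Plan. For survey-grade work, a geodesic calculation on the appropriate ellipsoid would likely be preferred over this approximation.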

Identify and describe the calibration or standardization method for each instrument in
enough detail for someone else to duplicate the method. Reference external documents such as
standard operating procedures, provided these documents can be easily obtained. Fully
document and justify nonstandard methods.

If very high accuracy is required for locational data, geospatial data collectors can turn to
reference calibration data supplied by the National Institute of Standards and Technology, which
compares the frequency standard of each satellite to its own frequency standard. (See
http://www.boulder.nist.gov/timefreq/service/gpstrace.htm.)

Aerial photography firms might be requested to supply calibration documentation for the
equipment used to capture any aerial photographs on the project. In addition, any film
processing equipment calibration documentation (if receiving hard-copy photographs rather than
electronic versions) would be included in this element.

3.2.8 B8. Inspection/Acceptance Requirements for Supplies and Consumables

What is the purpose of this element? The purpose of this element is to establish and
document a system for inspecting and accepting all supplies and consumables that may directly
or indirectly affect the quality of the project or task. If these requirements have been included
under another section, it is sufficient to provide a reference.

What type of information should be included in this element? Geospatial projects may
require the use of supplies and consumables such as film, photography paper, or batteries that
need to be checked to assure they meet requirements. Clearly identify such supplies or
consumables to be used on the project. Document the acceptance criteria by which the supplies
or consumables will be judged, the procedures used to test the materials and consumables, and
the frequency of these tests. Finally, document the corrective actions to be taken in case supplies
or consumables do not meet acceptance criteria.

If a geospatial component of a larger environmental sampling project exists, consumables
and supplies used during sample collection would be included in the QA Project Plan for the
environmental sampling portion of the project.

Suggested Content:

•	Description of how and by whom
supplies and consumables will be
inspected and accepted.

3.2.9 B9. Data Acquisition Requirements (Nondirect Measurements)

What is the purpose of this element? Quality assurance includes not only the collection
of new data, but also an evaluation of any existing data used. The secondary use of existing
data (or "nondirect measurements") is an important component of many geospatial data
projects. These data are to be evaluated to determine that they are of adequate quality for the
project's needs. This element documents the sources of data and the criteria used to evaluate
the quality of these data.

How is "secondary use of existing data" defined and what are some examples for
geospatial data projects? Almost every geospatial project makes use of existing data, because
data collection is resource intensive and time consuming. Collecting new geospatial data can be
avoided by using existing sources of geospatial data developed by local, state, and federal
agencies, as well as commercial data vendors. The most common types of commercially
available geospatial data are up-to-date street centerline files (with accurate address ranges) and
satellite imagery from commercial vendors. Various federal agencies generate and supply large
quantities of geospatial data that are used throughout the country; examples include Digital Line
Graphs, Digital Elevation Models, the National Land Cover Database, and the National
Hydrography Dataset.

What is the purpose of the acceptance criteria for secondary use of existing data, and
what are some specific criteria to consider? Criteria would be developed to assure that existing
data from other sources are of the type, quantity, and quality needed to meet the project's
product objectives. These criteria would be documented in the Data Acquisition Requirements
(Nondirect Measurements) (B9) element. Examples of these criteria include

•	project-specific requirements for content and accuracy of data to be acquired
•	standards for metadata needed for the planned data quality assessments
•	acceptable coordinate systems
	-	projection
	-	units
	-	datum
	-	spheroid
•	acceptable data formats (One way of documenting this is to indicate that any format
supported as a transfer format by the GIS software system is acceptable, particularly
if the best source of data for the project is from a computer-aided design package,
because extensive editing and manipulation could be required to convert the data into
an acceptable format.)
•	acceptability criteria of non-GIS sources (ZIP code lists, latitude/longitude lists) from
spreadsheets or database files
•	acceptable levels of data loss if any data conversion is to be done
•	the geographic coverage requirements (e.g., Does the external data to be assessed
cover the study area? This is especially relevant for projects with study areas in AK,
HI, Guam, or other U.S. Territories.)
•	how limitations of these data are to be documented.

Suggested Content:

•	Description of secondary data used
•	Description of the intended use of the
data
•	Acceptance criteria for using the data
in the project and any limitations on
that use.
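
One way to apply acceptance criteria such as those in the list above consistently is to record them in a machine-readable form and compare each candidate data set's metadata against them. The following Python sketch illustrates the idea under simplifying assumptions: the dictionary keys (`datum`, `format`, `bbox`, and so on), the bounding-box coverage test, and the `evaluate_candidate` function are hypothetical and are not drawn from any particular metadata standard.

```python
def evaluate_candidate(meta, criteria):
    """Compare a data set's metadata (dict) with acceptance criteria (dict).

    Returns the names of the criteria that failed; an empty list means the
    candidate meets all criteria checked here. Keys are hypothetical.
    """
    failures = []
    if meta.get("datum") not in criteria["datums"]:
        failures.append("datum")
    if meta.get("format") not in criteria["formats"]:
        failures.append("format")

    # Bounding boxes are (min_lon, min_lat, max_lon, max_lat) in decimal degrees.
    data_box = meta.get("bbox")
    study_box = criteria["study_area_bbox"]
    covers = (
        data_box is not None
        and data_box[0] <= study_box[0]
        and data_box[1] <= study_box[1]
        and data_box[2] >= study_box[2]
        and data_box[3] >= study_box[3]
    )
    if not covers:
        failures.append("coverage")
    return failures
```

A bounding-box comparison is only a coarse coverage check; a project with an irregular study area might instead compare the actual study-area polygon with the data set's footprint.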

Additional items to consider when writing this element include the following:

•	To the extent that they are known, "gray" areas in the use of the data in the project
would be documented here. For example, if the only available data source is at a
scale or accuracy that is questionable for its intended use, make sure these concerns/
limitations are documented and the potential effects on the final data are known. If
this analysis has not yet been completed when the QA Project Plan is being
developed, this element would contain directions for documenting this information.
•	If an outside service (such as commercially available geo-coding companies) is to be
used to produce geographic coordinates from addresses, define the acceptable limits
for completeness and accuracy of matching and document their data processing
procedures.
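
As a simple illustration of a completeness limit for geo-coding, the sketch below computes the fraction of addresses that the service matched and compares it with a minimum acceptable rate. The record layout (a `matched` flag per address) and the function names are assumptions made for this example; an actual service would return its own result codes, and accuracy of the matched coordinates would need a separate check.

```python
def match_rate(results):
    """Fraction of geo-coded records flagged as matched (0.0 to 1.0)."""
    if not results:
        raise ValueError("no geo-coding results supplied")
    matched = sum(1 for record in results if record["matched"])
    return matched / len(results)


def meets_completeness_limit(results, min_rate):
    """True if the match rate is at least the minimum stated in the plan."""
    return match_rate(results) >= min_rate
```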

For remote-sensing data sets, similar criteria and assessments would be provided in this
element. In addition, the level of processing (and the product) would be identified and
documented in the task for the commercial vendor.

The Data Acquisition Requirements (Nondirect Measurements) (B9) element of the QA
Project Plan would clearly identify the intended sources of previously collected geospatial data
or imagery to be used in the project. The care and skepticism applied to the generation of new
data are also appropriate to the use of previously compiled data. For example, EPA risk
assessment and risk management analyses use spatial interrelationships of natural resources,
human populations, and pollution sources by processing existing geospatial data within GIS. If
data are inappropriate due to scale, accuracy, resolution, or content, this may lead to
inappropriate products and decision errors. The quality of the outputs is dependent upon the
quality of the input data, as well as the project's data management and processing
software/hardware configuration, including documentation and metadata.

The Data Acquisition Requirements (Nondirect Measurements) (B9) element would also
include a discussion of limitations on the use of the data and the nature of the uncertainty of the
data. For many of the most commonly used geospatial data (such as U.S. Geological Survey
Digital Line Graph layers, Digital Elevation Model data, or National Land Cover data), the
existing metadata are the end user's only source of information about the accuracy, content,
usefulness, and completeness of the data. The user will evaluate these existing data sources
against the requirements of the project using the supplied metadata. Evaluation criteria are set to
determine the minimum acceptable quality of data that can be used. The Data Acquisition
Requirements (Nondirect Measurements) (B9) element would contain instructions for
documenting any effects of compromises made in order to use the data.

How should quality issues be documented when using, combining, or analyzing data
from different sources? This element of the QA Project Plan would contain guidance on
combining different data sources from widely different scales. For example, if the project is to
identify the parcels in a city that are within a floodplain boundary, two types of data might be
used: geospatial parcel data and floodplain boundaries. Geospatial parcel data are usually of very
high accuracy and precision because they represent legal property boundaries. Floodplain
boundaries are frequently less accurate by their very nature. A floodplain boundary is usually
defined as the point to which the water will rise given a rainfall episode that is likely to occur
once in 50 years or once in 100 years. The floodplain boundary does not represent any actual
physical or environmental boundary—it only represents the probable location of a boundary
based on statistical analysis of historical rainfall data. The uncertainty resulting from combining
these data sets would be documented so that users of the resulting analysis (geographic overlay
of parcels and flood zones) will understand how to evaluate any decisions made.
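
One possible way to make this uncertainty visible to users is to flag parcels that lie so close to the floodplain boundary that the overlay result could change within the stated positional uncertainty. The sketch below is illustrative only. It assumes that the positional errors of the two data sets are independent, so that they can be combined as a root sum of squares, and that a signed distance from each parcel to the boundary (negative inside the floodplain) has already been computed in the GIS. Neither assumption comes from this guidance, and the function names are hypothetical.

```python
import math


def combined_uncertainty_m(parcel_error_m, floodplain_error_m):
    """Root-sum-of-squares combination; assumes the two errors are independent."""
    return math.hypot(parcel_error_m, floodplain_error_m)


def classify_parcel(signed_distance_m, uncertainty_m):
    """Classify a parcel by its signed distance to the floodplain boundary.

    Negative distances are inside the floodplain. Parcels closer to the
    boundary than the combined uncertainty are reported as 'uncertain'.
    """
    if abs(signed_distance_m) <= uncertainty_m:
        return "uncertain"
    return "inside" if signed_distance_m < 0 else "outside"
```

Reporting the "uncertain" class alongside the overlay results would help users see which parcel determinations depend on the less accurate data set.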

How is metadata used in quality assurance? As mentioned above, metadata are virtually
the only source of information about the quality and accuracy of existing data. Candidate
geospatial data sets may not have metadata if they were created prior to the development of the
1995 FGDC standards. External data sources may need to be contacted to determine data
availability, condition, and constraints on their use. If only partial documentation is obtained,
the risk to project objectives of using data of unknown quality would need to be considered. An
independent quality assessment, or caveats that accompany the data and any resulting product,
may reduce that risk to an acceptable level.

What other issues might be described in this element? The Data Acquisition
Requirements (Nondirect Measurements) (B9) element could also be used to document and
evaluate the ability of the hardware/software configuration to handle existing data sources
chosen for use in the project. The data structure, media storage form, and platform requirements
can be critical to data processing and, therefore, the analyses to be performed in the project. For
example, some older data sets were created using formats that are not easily transformed into
those useable by the Agency's standard spatial analysis software. It is also important to consider
whether the acquired data are current and what the prospects are for continued updating to assure
future usefulness.

Logical consistency of acquired data is particularly important because it can affect data
processing and project results. Logical consistency rests on thematic correlations, which provide
the basis for internal validity of a spatial data set. The types of errors encountered can usually
be characterized as systematic (i.e., bias), random, or simple blunders (Veregin, 1992). Incompleteness of
attribute data and loss of data integrity can result in inconsistency of the relationships among the
encoded features.

Logical consistency of multivariate data sets of environmental attributes can be screened
by statistical tests to evaluate characteristics such as the amount and distribution of missing data;
statistical parameters (e.g., sample mean, standard deviation, and coefficient of variation) and
data distributions; out-of-range values for the measurement scales; and correlations (see EPA,
2000b).
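
A screening pass of this kind can be sketched with Python's standard `statistics` module. The example below summarizes one attribute column: the fraction of missing values, the sample mean, the sample standard deviation, the coefficient of variation, and any values outside a valid range. The function name, the use of `None` to mark missing values, and the range limits are assumptions made for illustration.

```python
import statistics


def screen_attribute(values, valid_min, valid_max):
    """Summarize one attribute column; None marks a missing value.

    Requires at least two non-missing values (needed for the sample
    standard deviation).
    """
    present = [v for v in values if v is not None]
    mean = statistics.mean(present)
    stdev = statistics.stdev(present)  # sample standard deviation (n - 1)
    return {
        "missing_fraction": (len(values) - len(present)) / len(values),
        "mean": mean,
        "stdev": stdev,
        "cv": stdev / mean if mean != 0 else None,
        "out_of_range": [v for v in present if not valid_min <= v <= valid_max],
    }
```

For example, screening a column of pH-like readings against a valid range of 0 to 14 would list any reading above 14 under `out_of_range` for follow-up with the data producer.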

Logical consistency checks can be performed within a geospatial database (e.g., ensure
that no parcels in a parcel database have a "development status" code of "undeveloped" along
with a "number of buildings" attribute greater than 0, because this is logically inconsistent).
Logical consistency checks can also be performed between geospatial databases (e.g., given a set
of latitude/longitude coordinates of industrial stacks, ensure that none of them are located in a
water feature when overlaid on a land use or hydrography layer).
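
The within-database check described above (no "undeveloped" parcel with buildings) is straightforward to automate. The sketch below assumes parcel records are available as dictionaries; the field names `parcel_id`, `development_status`, and `number_of_buildings` are hypothetical stand-ins for whatever the parcel database actually uses.

```python
def inconsistent_parcels(parcels):
    """Return IDs of parcels coded 'undeveloped' that also report buildings."""
    return [
        p["parcel_id"]
        for p in parcels
        if p["development_status"] == "undeveloped" and p["number_of_buildings"] > 0
    ]
```

The returned identifiers would then be reviewed and either corrected or documented as known limitations of the acquired data.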

The Data Acquisition Requirements (Nondirect Measurements) (B9) element would be
used to document checks performed on the existing data by the data producers, or, in the absence
of such information from the data producer, this element can be used to develop descriptions of
the most important checks to perform on the data to ensure that they are usable in the project.

How does one assess the accuracy of geospatial data sets—especially vector data sets?
For example, what is the accuracy of the U.S. Census Bureau's Topologically Integrated
Geographic Encoding and Referencing (TIGER) data? This is a difficult question to answer; it
would be answered by reviewing available metadata and processing information and applying
professional judgment to assess the accuracy based on this information.

3.2.10 B10. Data Management

What is the purpose of this element? This
element presents an overview of the operations,
calculations, transformations, or analyses
performed on geospatial data or their attributes
throughout the project. Diagrams and graphics
illustrating the sources of each data set, the steps
through which each one will be processed
(including combinations to create new data sets),
the names and characteristics of interim data sets,
and the naming conventions used at each step can
be used to illustrate the processing methodology.
The Data Management (B10) element would
document operations performed on the data at
each step of the process (see Figure 5).

Suggested Content:

•	Description of the project management
or activities

•	Flow charts of data usage and
processing

•	Description of how data will be
managed to reduce processing errors

•	Description of the mechanism for
detecting and correcting errors in data
processing

•	Examples of checklists or forms to be
used

•	Description of the hardware/software
configuration to be used on the
project

•	Description of the procedures that will
be followed to demonstrate acceptability
of the process

•	Description of the data analysis or
statistical techniques to be used.

Figure 5. GIS Flow Diagram

What type of information might be included in this element? The Data Management
(B10) element includes a discussion and description of records kept throughout the project.
This is similar to what would be included in the Documents and Records (A9) element, but
includes more detailed descriptions of data set names and processing methods. The Data
Management (B10) element might also discuss the requirements for internal program
documentation (that is, programmers' comments included with programs). Describe how
analysts and others such as software developers will document their work and the steps they take
during the course of the project to acquire, analyze, and manage the geospatial data or develop
needed software. Describe the function of these notes at the end of the project. For example,
when final reports are created to document the overall project and its conclusions, processing
notes created by the analysts and managers can provide the actual data processing steps,
preserving them to the level of detail required to fully understand the project's technical details
or to recreate the product.

The documentation in the Data Management (B10) element might start by describing the
process of data management for newly collected geospatial data sets that will undergo data
processing in the project. Describe the activities that generate new geospatial data sets through
data processing, the use of digitizing tables to render GIS layers from hard-copy map sources, or
the synthesis of new data sets from existing data and newly collected data.

What would be covered in this element for geospatial data sets newly collected by GPS?

1.	Define and create data dictionary. The Data Management (B10) element documents
the data dictionary itself. The data dictionary defines the acceptable attributes and codes to be
collected during fieldwork. For example, if the project involves collecting information on the
location and type of outfall pipes, the data dictionary might include a description of fields used
to store pipe material, pipe size, pipe status, and so on. For each of those data fields, coded
values would be defined in the data dictionary to restrict the data collector to specific,
predetermined, valid codes. This would reduce post-processing and cleanup when the data are
uploaded to the GIS and would help ensure that the correct information is collected in the field.
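
The idea of coded values can be illustrated with a small validation routine. In the sketch below, the data dictionary is represented as a mapping from field name to the set of valid codes, and each field record is checked against it. The field names and codes are invented for this outfall-pipe example and are not taken from any EPA standard.

```python
# Hypothetical data dictionary: field name -> set of valid coded values.
DATA_DICTIONARY = {
    "pipe_material": {"concrete", "steel", "pvc", "other"},
    "pipe_status": {"active", "inactive", "unknown"},
}


def invalid_fields(record, dictionary=DATA_DICTIONARY):
    """Return the sorted names of fields whose value is missing or not a valid code."""
    return sorted(
        field
        for field, allowed in dictionary.items()
        if record.get(field) not in allowed
    )
```

Running such a check when data are uploaded to the GIS would catch free-text entries or missing attributes before they reach later processing steps.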

2.	Transfer the data dictionary to the GPS units. On many modern GPS units, the
electronic data dictionary can be transferred so that the acceptable coding values are accessible
in the field. The process by which the data dictionary will be transferred and checked once
transferred would be described in the Data Management (B10) element.

3.	Collect and transcribe field notes. Field notes from data collectors are to be collected
and transcribed for use during the data processing and data quality control process. The Data
Management (B10) element would document how the notes will be collected, who will collect
them, who will input the notes in a form for use by others, and what format and software will be
used to store the notes. In addition, the steps and procedures for using the field notes to check
data discrepancies and for noting questions during the data transfer and processing steps would
be described.

4.	Download the GPS data into the GIS. Use the Data Management (B10) element to
describe the process by which GPS data will be downloaded on the GIS processing computers,
and list steps for backing up the raw data and ensuring that it was transferred completely and
successfully. The description would also include the procedures for converting the coordinate
data into GIS databases, for converting the attribute data into database files, and for reintegrating
these data with the coordinate data.
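
One common way to confirm that raw data were backed up completely is to compare a checksum of the original file with a checksum of the copy. The sketch below uses Python's standard `hashlib`, `shutil`, and `pathlib` modules; the function names and the choice of SHA-256 are assumptions made for this example rather than requirements of this guidance.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(raw_file, backup_dir):
    """Copy a raw data file into backup_dir and confirm the copy is identical."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    destination = backup_dir / Path(raw_file).name
    shutil.copy2(raw_file, destination)
    if sha256_of(raw_file) != sha256_of(destination):
        raise IOError(f"backup of {raw_file} does not match the original")
    return destination
```

Recording the digest in the processing notes would also allow the raw file to be re-verified later in the project.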

5.	Correct the GPS coordinates (if necessary). Describe the process to be used to
perform the differential corrections on the raw GPS coordinates. If a base station or other GPS
unit was used to collect the appropriate reference information, describe the details of the process.
Describe any procedures used to check for outliers or other problems created when averaging
multiple data locations into a single aggregated location. These types of checks might include
calculating the standard deviation of each set of points to be averaged and then checking the
standard deviations to make sure none are greater than the specified accuracy criteria.
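
The standard deviation check described above might look like the following sketch. It assumes that the repeated fixes for one location are available as (easting, northing) pairs in a projected coordinate system with units of meters, and that the accuracy criterion is expressed as a maximum standard deviation in meters; the function and key names are hypothetical.

```python
import statistics


def average_fix(points, max_stdev_m):
    """Average repeated (easting, northing) fixes and test their spread.

    Requires at least two fixes. 'within_criteria' is True when the sample
    standard deviation in both directions is at or below max_stdev_m.
    """
    eastings = [p[0] for p in points]
    northings = [p[1] for p in points]
    stdev_e = statistics.stdev(eastings)
    stdev_n = statistics.stdev(northings)
    return {
        "easting": statistics.mean(eastings),
        "northing": statistics.mean(northings),
        "stdev_e": stdev_e,
        "stdev_n": stdev_n,
        "within_criteria": max(stdev_e, stdev_n) <= max_stdev_m,
    }
```

Locations that fail the check could be flagged for review against the field notes before the averaged coordinate is accepted.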

6.	Document the method, accuracy, and description data for the GPS coordinates. The
method, accuracy, and description data would be integrated into the metadata for the processed,
final GPS data sets.¹

¹ Note that the EPA Locational Data Policy is being reviewed in light of the FGDC metadata guidelines and
Executive Order 12906. As the EPA Locational Data Policy is updated, the Latitude/Longitude Data Standard may
also be revised to add enough new codes to achieve minimum compliance with FGDC guidelines. See
http://oaspub.epa.gov/edr/EPASTD$.STARTUP for status. Extramural (non-EPA) organizations may need to
request this document from their EPA work assignment manager.

What would be covered in the Data Management (B10) element for a map digitizing
project?

•	Descriptions of how the maps will be prepared for digitizing (e.g., Will Mylar
overlays be used to extract the appropriate linework from maps? If so, what will the
procedure be?)
•	A description of which lines or other information will be extracted from the maps
•	The procedure for assigning identifiers to the features to be digitized
•	A description of the georeferencing identifiers (tics) that might be used to transform
the digitized data into geographic coordinate systems
•	Procedures to check the completeness and accuracy of the digitizing effort (see
Section 3.3)
•	The tolerances to be used on the digitizing transformations. For example, when
re-registering maps to a digitizing table, what is the acceptable root mean square
value to determine whether or not the registration was accurate enough? The root
mean square value would also be indicated in the Quality Objectives and Criteria
(A7) element as a quality criterion.
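
To make the root mean square tolerance concrete, the sketch below computes the RMS of the residuals between transformed tic locations and their known control coordinates and compares it with a tolerance. It assumes that both sets of coordinates are already in the same planar units; the function names and the form of the tolerance test are illustrative choices, and GIS packages typically report an equivalent value during registration.

```python
import math


def registration_rms(digitized, control):
    """RMS of point-to-point residuals between digitized tics and control tics.

    Both arguments are equal-length lists of (x, y) pairs in the same units.
    """
    if len(digitized) != len(control) or not digitized:
        raise ValueError("tic lists must be non-empty and equal length")
    squared = [
        (dx - cx) ** 2 + (dy - cy) ** 2
        for (dx, dy), (cx, cy) in zip(digitized, control)
    ]
    return math.sqrt(sum(squared) / len(squared))


def registration_acceptable(digitized, control, tolerance):
    """True if the registration RMS is within the tolerance stated in the plan."""
    return registration_rms(digitized, control) <= tolerance
```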

By documenting and specifying these types of procedures and tolerances, the digitizing
process will go more smoothly and will result in data that require less correction and editing.
Similar descriptions explaining how the nonspatial data (attributes) will be collected from the
maps, entered into a database, and linked up with the spatial data would be included in the Data
Management (B10) element.

Group C elements (see Section 3.3) would be used to describe how these data (both
spatial and nonspatial) are to be checked and corrected. The Data Management (B10) element
would be used to document processing and data management methodologies.

When existing data (acquired from an external source) are to be used on the project, what
might be included in the Data Management (B10) element to describe how these data will be
managed during the course of the project?

•	The procedures to be used to back up the raw data
•	The procedures to be used to construct the GIS database from these data sources (For
example, if multiple geographic data sets are required to cover the study area,
describe how each data set will be projected and/or transformed into a common
coordinate system, how the data sets will be appended together to create a single
seamless layer, and what will be done with the resulting layers during the course of
the project.) Descriptions of how quality of these processes will be assessed and
problems corrected will be addressed in Group C elements.
•	The procedures to be used to process and analyze these data (for example, detailed
flow charts indicating the procedures to be used at each step of the process and
explicitly defining the input and output data for each step)
•	Definitions of naming conventions for geospatial data sets. During the course of the
project many interim data products may be created; by defining and using a system of
naming conventions, data management is improved.
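
A naming convention is easiest to enforce when it can be checked automatically. The sketch below assumes a purely hypothetical convention of the form `project_theme_sNN_YYYYMMDD` (for example, `flood_parcels_s02_20020228`, where `s02` is the processing step) and uses a regular expression to accept or reject data set names. A real project would define its own pattern in the QA Project Plan.

```python
import re

# Hypothetical convention: <project>_<theme>_s<two-digit step>_<YYYYMMDD>
NAME_PATTERN = re.compile(
    r"(?P<project>[a-z0-9]+)_(?P<theme>[a-z0-9]+)_(?P<step>s\d{2})_(?P<date>\d{8})"
)


def parse_name(name):
    """Return the name's components as a dict, or None if it breaks the convention."""
    match = NAME_PATTERN.fullmatch(name)
    return match.groupdict() if match else None
```

Names that return `None` could be listed in a periodic report so that interim data sets are renamed before they propagate through later processing steps.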

What would be included in the Data Management (B10) element to discuss the
development and creation of project-specific applications programs or subprograms? For
projects involving the development of applications programs that combine underlying GIS
commands or operations, document the name, purpose, and functions of each program.
Documentation of these programs ("macros") provides additional information about specific
operations to be performed during the project. Many of these procedural programs are
developed during the course of the project, not before. The Data Management (B10) element
creates a placeholder for descriptions of these macros. Because macros are a prime operational
tool in geospatial projects, they are to be developed, documented, and checked carefully. Many
of the quality errors that crop up unexpectedly at the end of geospatial projects are due to errors
in macro programs that are not caught and corrected early in the process. Use the Group C
elements (see Section 3.3) to describe how macro programs will be evaluated to ensure that they
produce results of the quality indicated in the Quality Objectives and Criteria (A7) element.

The Data Management (B10) element provides guidance to GIS analysts and technicians
for properly testing informal macro programs. The Data Management (B10) element could also
be used to describe the process whereby macro programs will be checked by senior analysts or
QA Officers to ensure that they are working correctly. The Data Management (B10) element
might also be used to specify where data are stored and managed on computers, including path
names to project files.

Security

Security is an important aspect of data management and quality assurance in general,
because security problems may affect the data quality and data usability. The Data Management
(B10) element may be used to describe procedures and issues related to the following:

•	Internet Security: Internet security is an important issue in geospatial
projects that use the Internet to acquire or transmit data. Describe potential
problems with acquiring or transmitting data caused by Internet firewalls. For
example, if acquiring existing data from EPA, will access to data within
EPA's firewall be a problem?

•	Confidential Business Information: Highly detailed and legally binding
procedures are required when working with data designated as Confidential
Business Information. If geospatial data (or related attribute data) have been
labeled as Confidential Business Information, the appropriate procedures are
to be followed. In addition, the Data Management (B10) element could be
used to document and describe how the application of Confidential Business
Information procedures will affect data access, and therefore, the project
timeline.

•	General Computer and Physical Plant Security: The Data Management
(B10) element could be used to describe any special considerations,
procedures, or characteristics of the computing environment or physical plant
that might affect the security of the data being processed on the project. For
example, if there are special considerations regarding user access rights to
particularly sensitive data, the Data Management (B10) element could be used
to document these issues.

Electronic Exchange Formats

When the results of a geospatial project are to be transmitted to other data users in the
organization or to external organizations, the Data Management (B10) element would be used to
document the formats to be used for the data exchange.

Hardware/Software Configuration

What might be the general structure of the discussion of the hardware/software
configuration presented in this portion of the QA Project Plan? The discussion of
hardware/software configurations will depend on the purpose of the subprograms to be
developed on the project. If the purpose of the overall project is to develop GIS or geospatial
software for a wider audience of users beyond the project team itself, then it would be helpful for
the QA Project Plan to take into account EPA policies regarding software development, life-
cycle planning, and other policies outlined in the Information Resources Management Policy
Manual (EPA, 1998b).

For projects where applications programs or processing programs are developed solely
for use as data processing enablers on the project, the Data Management (B10) element may be
used to describe the hardware and software configuration under which the project will be
performed. For example, discuss the computer hardware configuration for the project and
discuss GIS or other geospatial software required to perform the data processing.

What might be included in the QA Project Plan for geospatial software development
projects whose purpose is to develop a standardized software product for an audience beyond the
project team?

For these projects, the Data Management (B10) element would be used to discuss the
major design issues of the software. However, the Data Management (B10) element would
supplement, not replace, a formal software design and development methodology in which the
details of the software's design and operation would be documented.

This element may also address performance requirements (e.g., run times) and other
features that characterize or assess the hardware/software configuration. This discussion could
be incorporated within a general overview of the configuration's QA program. [Assessments
that target the GIS software itself and its ability to process geospatial data are addressed by the
Group C elements within the QA Project Plan (see Section 3.3).] The configuration's QA
program is jointly planned and implemented by the project management team and the software
developer's independent QA staff, generally as part of systematic planning [the Quality
Objectives and Criteria (A7) element]. It addresses the use of standards, test planning and
scheduling, level of documentation required, personnel assignments, and change control. It also
ensures that timely corrective action will be taken as necessary. Items within the systems
development life cycle that are relevant to the particular modeling project may also be
considered when establishing the configuration's QA program. Examples of such items, taken
from Chapter 4 of EPA's Information Resources Management Policy Manual (Directive 2100)
(EPA, 1998b) and the Information Technology Architecture Roadmap,² are provided in Table 7.

What important issues would the QA Project Plan address for the hardware/software
configuration's QA program?

It is important that the QA Project Plan specify the particular QA procedures that will be
implemented within the software development project to ensure that the data generated by the
product are defensible and appropriate for the planned final use. This section of the QA Project
Plan would address QA efforts performed as the data management and processing systems are
being developed. These efforts may include

•	identifying necessary requirements for the hardware/software configuration and
establishing quality criteria that address these requirements within the systematic
planning and needs analysis phase of the project [Quality Objectives and Criteria
(A7) element];

•	implementing an appropriate project management framework to ensure that the
requirements and quality criteria established for the hardware/software configuration
are achieved [as discussed in the Project Management Group (A4-A9) elements and
the Data Acquisition Requirements (Nondirect Measurements) (B9) element]; and

•	performing testing and other assessment procedures on the configuration to verify
that the requirements and quality criteria are being met [details on the assessment
procedures are addressed in the Assessment Methods and Response Actions (C1)
element].

The magnitude of these QA efforts will depend on the underlying complexity of the geospatial
data effort and the required hardware/software configuration. Therefore, EPA's graded approach
(Chapter 1) will direct the overall scope of these QA efforts.

² Published by EPA's Office of Technology Operations and Planning, formerly the Office of Information
Resources Management, Directive 2100 establishes a policy framework for managing information within EPA. It
can be accessed online at http://www.epa.gov/irmpoli8/polman/index.html. The Information Technology
Architecture Roadmap, which contains annual updates of this document, can be found at (internal EPA web site)
http://Basin.rtpnc.epa.gov:9876/etsd/ITARoadMap.nsf.

Table 7. Typical Activities and Documentation Prepared Within the System Development Life Cycle of a
Geospatial Data Project to Be Considered When Establishing the QA Program for the
Hardware/Software Configuration

Life Cycle Stage: Needs Assessment and General Requirements Definition

Typical Activities:
•	Assessment of needs and requirements interactions in systematic planning with users
and other experts

Documentation:
•	Needs assessment documentation (e.g., in the QA Project Plan, if applicable)
•	Requirements document

Life Cycle Stage: Detailed Requirements Analysis

Typical Activities:
•	Listing of all inputs, outputs, actions, computations, etc. that the geographic
information or modeling system is to perform
•	Listing of ancillary needs such as security and user interface requirements
•	Design team meetings

Documentation:
•	Detailed requirements document, including performance, security, user interface
requirements, etc.
•	System development standards

Life Cycle Stage: Framework Design

Typical Activities:
•	Translation of requirements into a design to be implemented

Documentation:
•	Design document(s), including technical framework design, software design
(algorithms, etc.)

Life Cycle Stage: Implementation Controls

Typical Activities:
•	Coding and configuration control
•	Design/implementation team meetings

Documentation:
•	In-line comments
•	Change control documentation

Life Cycle Stage: Testing, Verification, and Evaluation

Typical Activities:
•	Verification that the software code, including algorithms and supporting information
system, meets requirements
•	Verification that the design has been correctly implemented
•	Beta testing (users outside QA team)
•	Acceptance testing (for final acceptance of a contracted product)
•	Implement necessary corrective actions

Documentation:
•	Test plan
•	Test result documentation
•	Corrective action documentation
•	Beta test comments
•	Acceptance test results

Life Cycle Stage: Installation and Training

Typical Activities:
•	Installation of data management system and training of users

Documentation:
•	Installation documentation
•	User's guide

Life Cycle Stage: Operations, Maintenance, and User Support

Typical Activities:
•	Usage instructions and maintenance resources for geographic information or model
system and databases

Documentation:
•	User's guide
•	Maintenance manual or programmer's manual

Life Cycle Stage: System Retirement and Archival

Typical Activities:
•	Information on how data or software can be retrieved if needed

Documentation:
•	Project files
•	Final report


How are requirements and criteria placed on the hardware/software configuration
addressed in systematic planning? Elaborating further on the first bullet above, the systematic
planning phase of the study [Quality Objectives and Criteria (A7) element] defines requirements
and quality criteria for the data processing system to ensure that the project's end-use needs can
be adequately met. For example, criteria on errors propagated by data processing would be
established during systematic planning to ensure that uncertainty requirements for the model
outputs can be met. Such requirements and criteria, therefore, impact the project's hardware/
software configuration.
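
To illustrate how a criterion on propagated error might be checked, the sketch below combines several error contributions by root-sum-of-squares and compares the result with a planning criterion. It is a simplified illustration, not a prescribed method: it assumes the contributions are independent and expressed in the same units, and the component names, values, and the 15 m criterion are hypothetical.

```python
import math


def combined_error(components):
    """Combine independent error components (same units) by root-sum-of-squares."""
    return math.sqrt(sum(value ** 2 for value in components.values()))


# Hypothetical horizontal error contributions from each processing stage, in meters.
error_budget_m = {
    "source_data": 3.0,
    "coordinate_transformation": 4.0,
    "raster_resampling": 12.0,
}

# Hypothetical criterion established during systematic planning, in meters.
criterion_m = 15.0

total_m = combined_error(error_budget_m)
meets_criterion = total_m <= criterion_m
print(f"combined error = {total_m:.1f} m, criterion met: {meets_criterion}")
```

If the contributions are correlated, a simple root-sum-of-squares can understate the total, so the independence assumption would itself need to be justified during planning.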

In systematic planning, questions such as the following may be addressed when defining
these requirements and quality criteria:

•	What are the required levels of accuracy and uncertainty for numerical
approximations?

•	Are the selected mathematical features of the program (e.g., algorithms, equations,
statistical processes) appropriate for the program's end use?

•	Are the correct data elements being used in the calculations performed within the
program's algorithms?

•	What requirements regarding documentation and traceability are necessary for the
program's inputs, interim outputs, and final outputs?

Other items addressed during systematic planning that are likely to impact assessment of
the hardware/software configuration include security, communication, software installation, and
system performance (e.g., response time). These issues are addressed briefly below.

What kinds of documentation might the QA Project Plan address as part of hardware/
software configuration for a software development project? When documenting planning and
performance components of hardware/software configuration, project and QA Managers may
tailor the documentation to meet the specific needs of their project. Examples of different types
of documentation that can be generated for various tasks within the planning phase of the
system's life cycle include the following:

•	Requirements Documentation (IEEE, 1998): The general requirements document
gives an overview of the functions that the model framework will perform.

•	Design Documentation: Design documents plan and describe the structure of the
computer program. These are particularly important in multiprogrammer projects in
which modules written by different individuals interact. Even in small or single-
programmer projects, a formal design document can be useful for communication and
for later reference.

•	Coding Standards or Standard Operating Procedures: These may apply to a single
project or a cumulative model framework and need to be consistent across the
development team.

•	Testing Plans (FIPS 132³): Testing is to be planned in advance and is to address all
requirements and performance goals.

•	Data Dictionary. A data dictionary can be useful to developers, users, and
maintenance programmers who may need to modify the programs later. The data
dictionary is often developed before code is written as part of the design process.

•	User's Manual. The user's manual can often borrow heavily from the requirements
document, because all the software's functions would be specified there. The scope
of the user's manual would take into account such issues as the level and
sophistication of the intended user and the complexity of the interface. Online help
can also be used to serve this function.

•	Maintenance Manual. The maintenance manual's purpose is to explain a
framework's software logic and organization for the maintenance programmer.

•	Source Code: It is very important to store downloadable code securely and to archive
computer-readable copies of source code according to the policies of the relevant
regulatory program.

•	Configuration Management Plan (IEEE, 1998): The configuration management plan
provides procedures to control software/hardware configuration during development
of the original software and subsequent revisions.

Additional information and examples can be found in Chapter 17 of EPA's Information
Resources Management Policy Manual (Directive 2100) (EPA, 1998b). In general, it is best to
coordinate any discussion of documentation in the QA Project Plan with information presented
in the Documentation and Records (A9) element.

What kinds of standards do I include in the hardware/software configuration's QA
program to ensure that the configuration is compliant and acceptable? The configuration is to be
designed to comply with applicable EPA information resource management policies and data
standards, which can be found within EPA's Information Resources Management Policy Manual
(Directive 2100) (EPA, 1998b). Other standards may also be applicable and are to be cited, such
as the Federal Information Processing Standards, which govern the acquisition of U.S.
Government information processing systems. This element of the QA Project Plan is the place
to introduce these standards and discuss how the project will ensure that they will be addressed.

Sources for determining specific types of standards include the following:

•	EPA's Information Resources Management Policy Manual (Directive 2100) (EPA,
1998b) includes EPA hardware and software standards to promote consistency in use
of standard support tools such as computer-aided software engineering tools and
coding languages, as applicable, by contractors and EPA staff in GIS software
development and maintenance efforts.

³ Federal Information Processing Standards

•	Chapter 5 of EPA's Information Resources Management Policy Manual (Directive
2100) (EPA, 1998b) defines applicable EPA data standards.

•	EPA's Environmental Data Registry (http://www.epa.gov/edr) promotes data
standardization, which allows for greater ease of information sharing.

•	The EPA Information Technology Architecture Roadmap provides guidance for the
selection and deployment of computing platforms, networks, systems software, and
related products that interconnect computing platforms and make them operate.

•	Publications on Federal Information Processing Standards govern the acquisition of
U.S. Government information processing systems.

Directives and standards such as these are frequently revised. Therefore, it is important that
these directives and standards be reviewed frequently to ensure that the latest versions are being
utilized. See http://oaspub.epa.gov/edr/EPASTD$.STARTUP for standard status. Extramural
organizations may check with their EPA work assignment manager for current status. The QA
Project Plan is to specify how the configuration will be verified or demonstrated according to
these and other standards.

3.3 Group C: Assessment/Oversight

Group C elements are used to document the process of evaluating and validating the data
collection and data processing activities on the project. In other words, Group C includes
descriptions of the quality assessments and evaluations, and describes the reports and actions to
be taken, based on assessments.

Whereas Group B elements describe the methods of collecting geospatial data types and
methods of choosing and managing geospatial data sources, Group C elements focus on the
quality assessments that will be performed during the data processing of the project. In addition,
Group C is used to describe the procedure for addressing quality problems.

There is some overlap between discussions in the Data Management (B10) element and
those in Group C. This is because data management and the programs used to manage and
process geospatial data are the root of many of the quality problems. However, Group C is to be
used to augment the Data Management (B10) element when using existing data and to describe
the steps taken to ensure that assessments in the Data Management (B10) element and other parts
of the QA Project Plan are implemented.

3.3.1 C1. Assessments and Response Actions

Suggested Content:

•	Description of each assessment

•	Information expected and success criteria

•	Assessments to be done within the project team and those to be done outside the
project team

•	The scope of authority of assessors

•	Discussion of how response actions to assessment findings are to be addressed

•	Description of how corrective actions will be carried out.

What is the purpose of this element? This element describes the internal and external
checks necessary to ensure that

•	all elements of the QA Project Plan are correctly implemented as prescribed

•	the quality of the data and product generated by implementation of the QA Project
Plan is adequate

•	corrective actions, when needed, are implemented in a timely manner and their
effectiveness is confirmed.

What type of information might be included in this element? Based on the project's
quality needs, scope, and limitations on uncertainty, different levels of assessments and response
actions may be appropriate. For each of the assessments described in the Assessment and
Response Actions (C1) element, include a description of activities that will be used to correct
problems or errors, as applicable.

The following types of assessments would be documented in the Assessment and
Response Actions (C1) element as a means of ensuring that secondary data being evaluated meet
the specifications noted in the Quality Objectives and Criteria (A7) and Data Acquisition
Requirements (Nondirect Measurements) (B9) elements:

•	Check locations of features in existing data against locations of these features in other
data sources. For example, describe how digital elevation model elevations will be
spot-checked against topographical maps, to ensure that the accuracy of the digital
elevation models is within its accuracy specifications.

•	Check attribute data to ensure that it is of acceptable quality, based on the criteria
specified in the Quality Objectives and Criteria (A7) element (see Appendix C for
more information).

•	Describe how senior-level scientists/GIS analysts will review processing procedures
during methodology development. Identify potential processing problems, issues,
and work-arounds.

•	Describe the requirements for reviewing data at the end of each processing step. Are
data consistent? Are data values correct given the processing manipulation
performed? Are the locations of geographic entities within expected norms based on
processing techniques employed? If macros or other data processing programs are
run, describe how data inputs and outputs will be tested to ensure that their
characteristics are as expected and that the programs performed the functions defined
for them.

•	Describe the methods used to compare, evaluate, and assess the data produced in each
step of the project to ensure that they have been processed correctly. When macros
are used to automate a multistep process, code the macro in such a way that the
results of each step can be independently examined so that, if problems are found in
the final output data set, the error can be found by reviewing data at each prior step in
the process.

•	Use the Assessment and Response Actions (C1) element to describe tests that
compare processed geospatial data to the original or source data sets throughout
production. Describe expected changes in the data and unexpected or erroneous
changes. For example, when converting from raster to vector data formats, compare
the vectorized data to the original raster data to ensure that the appropriate cell size
was used and that no transformations or inappropriate aggregations occurred. When
converting from vector to raster, describe how the raster data set's cells would be
coded when original vector lines divide the raster cells. Will the vector polygon
having the greatest area be used for the cell code, or will the cell be coded using an
average of the values in the coincident polygons?

•	Describe how the assessments will ensure that no geographic features or data were
lost, deleted, or removed unexpectedly. Loss of geographic features can be an issue
when tolerances are inappropriately applied, resulting in coalescence of geographic
features. Identify methods of ensuring that the right number of features are present at
each step of the process; by doing so, problems with feature loss due to inappropriate
tolerances can be determined.

•	Even in projects having limited scope or complexity, it may be appropriate to
describe the procedures used to design, develop, and test macro programs during the
course of the project. Use the Assessment and Response Actions (C1) element to
document that procedure, especially in light of how the programs will be assessed for
proper operation.

•	For all assessments, identify who will conduct the assessment, indicating their
position within the project's organization.

•	Describe how and to whom the assessment information will be reported.

•	Define the scope of authority of the assessors, including stop-work orders and when
assessors are authorized to act.
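
The feature-count check described above might be sketched as follows. This is a minimal illustration under stated assumptions: feature counts are assumed to have been recorded after each processing step, and the step names, counts, and the function find_unexpected_count_changes are hypothetical. Any change in count that was not planned for a step is reported so that the step can be reviewed.

```python
def find_unexpected_count_changes(step_counts, planned_change=None):
    """Return (step, previous_count, count) for steps whose count changed unexpectedly.

    step_counts: ordered list of (step_name, feature_count) tuples.
    planned_change: optional dict mapping a step name to the change in feature
    count that the step is designed to produce (for example, a clip operation).
    """
    planned_change = planned_change or {}
    problems = []
    # Compare each step with the step immediately before it.
    for (_, previous_count), (step, count) in zip(step_counts, step_counts[1:]):
        expected = previous_count + planned_change.get(step, 0)
        if count != expected:
            problems.append((step, previous_count, count))
    return problems


# Hypothetical counts recorded after each processing step.
counts = [
    ("import", 1250),
    ("reproject", 1250),
    ("clean_topology", 1243),
    ("clip_to_study_area", 980),
]
# The clip step is planned to remove 263 features outside the study area.
planned = {"clip_to_study_area": -263}

problems = find_unexpected_count_changes(counts, planned)
print(problems)
```

In this made-up example the topology cleaning step is flagged because seven features disappeared without a planned reason, which is the kind of loss that an inappropriate tolerance could cause.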

The following is a description of various types of assessment activities available to
managers of geospatial projects for evaluating the effectiveness of project implementation.

A. Readiness review is a technical check to determine if all components of the project
are in place so that work can commence on a specific phase.

These reviews can help avoid redoing expensive field work by assuring that
equipment is in proper working order (e.g., charged battery pack, adequate
performance of GPS receiver units) and that adequate logistical preparations, such as
acquiring supporting materials and property access, are performed before a survey.

B.	Technical Systems Audit is a thorough and systematic, on-site, qualitative audit in
which facilities, equipment, personnel, training, procedures, and record keeping are
examined for conformance to the QA Project Plan. The technical systems audit is a
powerful audit tool with broad coverage that may reveal weaknesses in the
management structure, policy, practices, or procedures. It is ideally conducted after
work has commenced (such as during image acquisition) but before it has progressed
very far. The technical systems audit provides opportunity for corrective action.

For example, technical systems audits are conducted for remote sensing operations by
the QA staff of an EPA contractor, or by the Agency itself, to compare observed
operations with a set of approved standard operating procedures and QA protocols
defined in the QA Project Plan for the work assignment. These audits are facilitated
by use of an audit questionnaire designed to systematically guide the auditor through
various remote-sensing processes. The questionnaire ensures that all pertinent
operations are thoroughly evaluated during the audit. Findings are recorded on a
project-specific checklist. Audit reports document appropriateness of operations,
note problems and obstacles, and recommend corrective actions to the project
manager, who notifies EPA management via a memorandum.

C.	Performance Evaluation is a type of audit in which the quantitative data generated by
a measurement system such as GPS are obtained independently and compared with
routinely obtained data to evaluate the proficiency of the sample collector. The QA
Project Plan lists the performance evaluations that are planned, identifying

•	the sample to be taken

•	the target location to be covered

•	the timing/schedule sample duplication

•	the aspect to be assessed (e.g., precision, bias).

On a project where new aerial photography is being acquired, for example, the
project lead, upon receipt from the photo laboratory, would screen the original film
(or contact prints, and/or enlargements) for such parameters as exposure, length of
the leader/trailer, and appropriate camera mounting; verify the acceptability of
overflight products [i.e., scale (correctness), coverage (completeness), resolution
(detection limit)] for photo analysis requirements; and document findings to ensure
overall image acceptability.
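
For the GPS case mentioned under Performance Evaluation, the comparison of routinely obtained and independently obtained values could be sketched as below. This is an illustration only: it assumes paired measurements of the same points in the same units, takes bias as the mean difference and precision as the sample standard deviation of the differences, and uses hypothetical easting values.

```python
import statistics


def bias_and_precision(routine, reference):
    """Return (bias, precision) for paired coordinate values in the same units.

    bias is the mean of the differences (routine minus reference);
    precision is the sample standard deviation of those differences.
    """
    differences = [value - ref for value, ref in zip(routine, reference)]
    return statistics.mean(differences), statistics.stdev(differences)


# Hypothetical eastings (meters) for four check points.
routine_eastings = [500002.0, 500011.0, 500019.0, 500032.0]
reference_eastings = [500000.0, 500010.0, 500020.0, 500030.0]

bias, precision = bias_and_precision(routine_eastings, reference_eastings)
print(f"bias = {bias:.2f} m, precision = {precision:.2f} m")
```

The QA Project Plan would still need to state which statistic is used for each aspect and what values are acceptable; the definitions chosen here are one common convention rather than a requirement.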

D.	Surveillance is the continual or frequent monitoring of the status of a project and the
analysis of records to ensure that specified requirements are being fulfilled. It can
occur at various steps in the project and be a self-assessment or an independent
assessment.

For example, the production of output from the photo laboratory (and/or digital
scanning) subcontractor would be monitored to ensure they are able to meet the

EPA QA/G-5G

56

Peer Review Draft
February 2002


-------
deliverable date and provide photos enlarged to common scale. Under an umbrella
QA Project Plan covering many routine tasks, processes and products could be
inspected internally using standardized QA checklists (e.g., film and photography
screening, photo analysis reports) documented in monthly reports assessing the
progress, performance, and quality of activities.

E.	Audit of Data Quality reveals how the data were handled, what judgments were
made, and whether uncorrected mistakes were made. Performed prior to producing a
project's final report, audits of data quality can often identify the means to correct
systematic data reduction errors.

For example (or at the minimum), a formalized procedure would be described for
quality assessment during implementation of a project processing geospatial data
(whether collected or acquired) on a GIS to prepare a product. Describe assessment
and response activities to ensure the quality of the product, including review of the
acquired data or images assessment reports [Data Acquisition Requirements
(Nondirect Measurements) (B9) element] to ensure that the lineage is traceable and
defensible for the type of information required. If inadequacies are identified, the
data analyst would contact the project's data producer to correct any identified
problems, or if the data were acquired from an outside source, a different data set
may need to be acquired for processing. Any problems identified and corrective
actions taken would be documented to ensure that the project requirements are
satisfied. Reviews of the interim steps in data reduction or transformations by an
independent analyst are also needed prior to the product's completion to confirm
adequacy of reductions and transformations and to confirm that topology is
established properly for the data set. Any problems identified in the data set
produced by the project or omissions in documentation identified by these reviews
need to be corrected before the product is completed.

F.	Peer review is primarily an external scientific review. Reviewers are chosen who
have technical expertise comparable to the project's performers but who are
independent of the project. Peer reviews ensure that the project activities

•	were technically adequate

•	were competently performed

•	were properly documented

•	satisfied established technical requirements

•	satisfied established quality assurance requirements.

In addition, peer reviews assess the assumptions, calculations, extrapolations,
alternative interpretations, methods, acceptance criteria, and conclusions documented
in the project's report. The names, titles, and positions of the peer reviewers, if
known, are to be included in the QA Project Plan and their planned findings report(s).
Responsibilities for reports documenting responses to peer-review comments and
completed corrective actions would be specified.

For example, project team members review photo interpretations made by the project
analyst and the technical supervisor in order to assess and validate the reasonableness
and soundness of interpretations.

G. Data Quality Assessment involves the application of statistical tools to determine
whether the data meet the assumptions under which the data quality objectives and
data collection design were developed and whether the total error in the data is
tolerable. Guidance for Data Quality Assessment: Practical Methods for Data
Analysis (QA/G-9) (EPA, 2000b) provides guidance for planning, implementing, and
evaluating data quality assessments.

For example, a geospatial data set could be reviewed by an independent analyst to
check data quality (e.g., univariate descriptive statistics and outlier tests), logical
consistency (e.g., thematic correlations) for internal validity of multivariate data sets,
proper topology, and traceable and defensible lineage.
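
The univariate descriptive statistics and outlier tests mentioned in the Data Quality Assessment example could be sketched as follows. The interquartile-range rule used here is only one of many possible outlier screens and is chosen for brevity, not because this guidance specifies it; the elevation values and function names are hypothetical.

```python
import statistics


def summarize(values):
    """Return basic univariate descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


def iqr_outliers(values, k=1.5):
    """Return values lying more than k interquartile ranges outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [value for value in values if value < low or value > high]


# Hypothetical elevation attribute values (meters); one value looks suspect.
elevations = [101.2, 99.8, 100.5, 102.1, 100.9, 98.7, 101.6, 250.0]

summary = summarize(elevations)
outliers = iqr_outliers(elevations)
print(summary["n"], summary["median"], outliers)
```

A flagged value is a prompt for review against the source data, not an automatic rejection; the response to such findings would be described in the QA Project Plan.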

How might the assessments be documented? The number, frequency, and types of
assessments would be included in this element. Depending on the nature of the project, there
may be more than one assessment. The QA Project Plan would specify the individuals, or at
least the specific organizational units, who will perform the assessments. Independent
assessments are performed by personnel from organizations not connected with the project but
who are technically qualified and who understand the QA requirements of the project.

Audits, peer reviews, and other assessments often reveal findings of practice or
procedure that do not conform to the written QA Project Plan. Because these issues need to be
addressed in a timely manner, the protocol for resolving them is outlined in this element together
with proposed corrective actions to ensure that such actions are performed effectively. The
person to whom the concerns are to be addressed, the decision-making hierarchy, the schedule
and format for oral and written reports, and the responsibility for corrective action are all
discussed in this element. This element also explicitly defines the unsatisfactory conditions
upon which the assessors are authorized to act and lists the project personnel who are to receive
assessment reports.

3.3.2 C2. Reports to Management

Suggested Content:

•	Frequency and distribution of reports issued to management that document
assessments, problems, and progress

•	Individuals or organizations responsible for preparing the reports and actions
recipients would take upon receipt of the reports.

What is the purpose of this element? This element provides a place to document the
frequency, type, distribution, and content of reports that will record the status of the project
and, specifically, data assessments made in the Assessment and Response Actions (C1) element.

What type of information might be included in this element? The graded approach to QA
Project Plans implies that, for projects of very limited scope, quality requirements, or size, a
simple description of the use of weekly or monthly status e-mails may be appropriate. For more
complex projects with many processing steps, data sources, and complex processing methods,
more formal reports may be required and documented in the Reports to Management (C2)
element.

Effective communication among all personnel is an integral part of a quality system.
Planned reports provide a structure for apprising management of the project schedule, deviations
from approved QA and test plans, the impact of these deviations on data quality, and potential
uncertainties in decisions based on the data. Verbal communication regarding deviations from
QA plans would be noted in summary form in the Data Review, Verification, and Validation
(D1) element.

No matter how informal or formal the reports may be, it is appropriate to describe the
content, frequency, and distribution of these reports in the Reports to Management (C2) element.
This element would also identify the individual or organization responsible for preparing the
reports and action recommendations that might be included in the reports. An important benefit
of the status reports is the opportunity to alert management to data quality problems, propose
viable solutions, and procure additional resources. If the project is not assessed continually
(including evaluation of the technical systems, measurement of performance, and assessment of
data), the integrity of the data generated in the project may not meet quality requirements.
Submitted in a timely manner, these assessment reports will provide an opportunity to
implement corrective action when most appropriate.

At the end of a project, a report documenting the data quality assessment findings is
submitted to management.

3.4 Group D: Data Validation and Usability

Group D elements describe final data validation and usability procedures used to ensure
that the final product meets quality and completeness criteria. Because geospatial projects
involve a great deal of data processing, frequent manipulations of geospatial data, and sometimes
extensive software development, many assessments may be carried out during the course of the
project. These types of assessments would be documented in the Data Management (B10)
element and in the Assessment and Response Actions (C1) element. Group D elements facilitate
examination of the final data product or cartographic product to ensure that it is of acceptable
quality and can be used for its intended purpose.

The process of data verification requires confirmation by examining or providing
objective evidence that the requirements of these specified QC acceptance criteria are met. In
design and development, verification concerns the process of examining the result of a given
activity to determine conformance to the stated requirements for that activity. The process of
data or imagery verification effectively ensures the accuracy of data, using specified methods
and protocols, and is often based on comparison with reference or control points and base data.

The process of data validation requires confirmation by examination and provision of
objective evidence that the particular requirements for a specific intended use have been
fulfilled. Validation, usually performed by someone external to the data generator, is the process
of examining a geospatial product or result to determine conformance to user needs.

3.4.1 D1. Data Review, Verification, and Validation

What is the purpose of this element? This
element would be used to describe the criteria that
will be used in accepting or rejecting the final product. Many of these criteria may be gleaned
from assessments and checks identified in other portions of the QA Project Plan. However, in
the Data Review, Verification, and Validation (D1) element, pay close attention to those criteria
that would make the data inappropriate for their intended use. When producing a final product in a
geospatial project, many quality checks and assessments are carried out during production [as
described in the Data Management (B10) and Assessments and Response Actions (C1)
elements], but the final product itself would also undergo final checks to ensure that it meets the
objectives for usability and quality.

What type of information might be included in this element? For data collection
involving GPS surveys or aerial photography, note how closely the coordinates or imagery
represent the actual surface feature and whether or not that difference is within acceptable
tolerances. By noting deviations in sufficient detail, subsequent data users will be able to
determine the data's usability under scenarios different from those included in project planning.
The strength of conclusions that can be drawn from data [see Guidance Document for Data
Quality Assessment: Practical Methods for Data Analysis (QA/G-9) (EPA, 2000b)] has a direct
connection to the sampling design and deviations from that design. Where auxiliary variables
are included in the overall data collection effort (for example, groundwater or ozone data), they
would be included in this evaluation. [Environmental data are covered in Guidance for Quality
Assurance Project Plans (QA/G-5) (EPA, 1998a).]

How would sample collection and handling procedures or deviations be handled? Details
about the acquisition of geospatial samples and imagery are important for properly interpreting
the results. The Sampling and Image Acquisition Methods (B2) element provides these details,
which include sampling or imagery acquisition procedures and equipment (e.g., camera and film
type, control points). Acceptable departures (for example, alternate GPS sampling sites) from
the QA Project Plan, and the action to be taken if the requirements cannot be satisfied, are to be
specified for each critical aspect. Validation activities would note potentially unacceptable
departures from the QA Project Plan. Comments from field surveillance on deviations from
written field survey or flight plans would also be noted.

Suggested Content:

• The criteria to be used to validate and
verify the final product.

What type of quality control steps would be performed in this element? The Quality
Control (B5) element of the QA Project Plan specifies the QC checks that are to be performed
during sample collection, handling, and analysis. These include analyses of reference data or
control points and calibration standards that provide indications of the quality of data being
produced by specified components of the measurement process. For each specified QC check,
the procedure, acceptance criteria, and corrective action (and changes) would be specified. Data
validation would document the corrective actions that were taken, samples or images affected,
and the potential effect of the actions on the validity of the data.

When data or materials are acquired from other sources, verify that the materials are
received as originally ordered and that the order is complete. For example, for samples taken by
GPS technology, the standard deviation of the field data can be checked during the
postprocessing data assessment. For imagery, the contents of each photo data package or digital
file can be checked for coverage and quality upon receipt. If new photographs were
acquired, accuracy of elevations and positions would be checked against targets placed on the
ground to mark control points in advance of the aerial survey/photography.

Scientists and contractors performing photogrammetric analysis tasks would be expected
to adhere to standards such as the National Map Accuracy Standards and other standard
operating procedures for data analysis and product generation (e.g., comparison of index point
coordinates from the end of a measurement session with those taken at the beginning to see if the
discrepancy exceeds digitizer control limits). Positional accuracy of points and associated area
perimeters, as well as the methods used to establish them, would be reported in ground control
reports as part of a draft photogrammetry report. The latter would be reviewed in the product
accuracy assessment to determine if accuracy met project objectives established for data use.
Known but withheld coordinates would be used to evaluate the final compilation by comparison
to at least one test point established for each project area and carried through in the
photogrammetric process. If no targets were established, three or more discrete imaged features would be
used as controls and compared to field-survey ground coordinates or comparable features on
existing photographs or maps. The residuals or discrepancies between field-established
coordinates and the photogrammetric coordinates at two points can be used to indicate a
misidentification, with the residual (discrepancy) at the third point identifying any bad
(misidentified) point.
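
The comparison of withheld check-point coordinates with photogrammetric coordinates can be sketched as follows. This is a minimal illustration in Python, assuming planar coordinates in meters; the function names, the point identifiers, the coordinate values, and the 2.0 m tolerance are all hypothetical and are not taken from this guidance or from any accuracy standard.

```python
import math

def residuals(surveyed, compiled):
    """Return the horizontal residual at each check point (input units)."""
    out = {}
    for point_id, (x_ref, y_ref) in surveyed.items():
        x_map, y_map = compiled[point_id]
        out[point_id] = math.hypot(x_map - x_ref, y_map - y_ref)
    return out

def rmse(values):
    """Root-mean-square of a collection of residuals."""
    values = list(values)
    return math.sqrt(sum(v * v for v in values) / len(values))

# Illustrative values only: field-surveyed versus photogrammetric coordinates.
surveyed = {"CP1": (1000.0, 2000.0), "CP2": (1500.0, 2500.0), "CP3": (1800.0, 2100.0)}
compiled = {"CP1": (1000.3, 2000.4), "CP2": (1500.0, 2500.0), "CP3": (1806.0, 2108.0)}

TOLERANCE_M = 2.0  # hypothetical project tolerance, set in element A7

res = residuals(surveyed, compiled)
suspect = sorted(p for p, r in res.items() if r > TOLERANCE_M)
overall = rmse(res.values())
```

A point listed in `suspect` would be a candidate for review as a possible misidentification; the overall value could be reported alongside the per-point residuals in the product accuracy assessment.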

If instruments such as GPS receivers, digitizing tablets, or other measurement equipment
are used on the project, document the results of calibration activities in this element. Ensure that
the calibrations

• were performed within an acceptable time prior to generation of data or imagery

• were performed in the proper sequence

• included the proper number of calibration points.

When calibration problems are identified, any data or imagery produced between the
suspect calibration event and any subsequent recalibration would be flagged to alert data users.
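
One way to apply such a flag is sketched below. This is an illustrative Python example only; the record layout, the field name `calibration_flag`, and the dates are assumptions made for the sketch, and a project would substitute its own record structure.

```python
from datetime import datetime

def flag_suspect(records, suspect_calibration, recalibration):
    """Copy each record and mark those collected from the suspect
    calibration event up to (not including) the recalibration."""
    flagged = []
    for rec in records:
        is_suspect = suspect_calibration <= rec["collected"] < recalibration
        flagged.append({**rec, "calibration_flag": is_suspect})
    return flagged

# Hypothetical collection times for three GPS points.
records = [
    {"id": 1, "collected": datetime(2002, 2, 1, 9, 0)},
    {"id": 2, "collected": datetime(2002, 2, 3, 9, 0)},
    {"id": 3, "collected": datetime(2002, 2, 6, 9, 0)},
]

flagged = flag_suspect(records, datetime(2002, 2, 2), datetime(2002, 2, 5))
```

The original records are left unchanged, so the flag can be carried into the delivered data set without altering the source files.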

3.4.2 D2. Verification and Validation Methods

What is the purpose of this element? This
element is the appropriate place to describe how
the final products will be verified and validated.

Whereas the Data Review, Verification, and Validation (D1) element documents what final
checks will be performed, this element describes how these checks will be carried out.

As with the Data Review, Verification, and Validation (D1) element, a substantial amount of
the information relevant to this element may be found in other QA elements throughout the QA
Project Plan. This element would include many, if not all, of those procedures. However, because
Group D elements (including this element) concentrate on verifying and validating the final
products, this element addresses ways of modifying or adding to previous assessments to ensure
that the final product is acceptable.

What type of information might be included in this element? This type of validation and
verification might be necessary, for example, when the final product is a database that will be
distributed and used by others. Throughout the production or analysis process, a number of QA
checks and assessments are carried out to ensure that procedures are being followed correctly.
However, at the very end of the process, a series of final checks are to be implemented to make
sure the data will be usable by the intended audience. The amount of data validated is directly
related to the project data quality objectives. The percentage of data validated for the specific
project, together with its rationale, would be outlined or referenced. The QA Project Plan would
have a clear definition of what is implied by "verification" and "validation." The type of checks
(and their descriptions) might include

• verifying that each output data set falls into the correct geographic location and has
the specified coordinate system and precision

• verifying that the files to be delivered are of the specified format [For example, if the
project defines that the output format is to be compressed Spatial Data Transfer
Standard format, the staff member responsible for the Verification and Validation
Methods (D2) element would ensure that each of the output data sets is indeed in
Spatial Data Transfer Standard format.]

• verifying that each data set can be unpackaged, uncompressed, or otherwise
configured for use by end-users

• verifying that all of the required database tables and fields are present.
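
Two of the checks above (required fields present, and records falling in the expected geographic location) can be sketched as follows. This is a minimal Python illustration under stated assumptions: the field names, the bounding extent, and the sample rows are hypothetical, and coordinates are assumed to be decimal degrees.

```python
REQUIRED_FIELDS = {"site_id", "latitude", "longitude"}  # hypothetical field names
EXTENT = (24.0, -125.0, 50.0, -66.0)  # illustrative min_lat, min_lon, max_lat, max_lon

def check_deliverable(rows, required=REQUIRED_FIELDS, extent=EXTENT):
    """Return (row index, problem) pairs for rows that fail a final check."""
    min_lat, min_lon, max_lat, max_lon = extent
    problems = []
    for index, row in enumerate(rows):
        missing = sorted(required - set(row))
        if missing:
            problems.append((index, "missing fields: " + ", ".join(missing)))
            continue
        inside = (min_lat <= row["latitude"] <= max_lat
                  and min_lon <= row["longitude"] <= max_lon)
        if not inside:
            problems.append((index, "outside expected extent"))
    return problems

rows = [
    {"site_id": "S1", "latitude": 38.9, "longitude": -77.0},
    {"site_id": "S2", "latitude": 38.9, "longitude": 77.0},   # sign error
    {"site_id": "S3", "latitude": 40.0},                      # field absent
]

problems = check_deliverable(rows)
```

The list of problems could be attached to the validation documentation provided to product users.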

Suggested Contents:

•	Description of validation and
verification processes for the final
products

•	Discussion of issues related to
resolving problems detected and
identification of individuals or
authorities who will determine
corrective actions

•	Description of how the results of the
validation will be documented for the
product users

•	Definition of differences between
validation and verification issues.

If a map or cartographic product is to be the final deliverable, the Verification and
Validation Methods (D2) element would be used to describe how the content of the map will be
checked to ensure that it meets the criteria set out in Groups A and B. For example, do the
specified layers exist in the map? Is the title correct? Does the legend reflect each of the data
layers in the map? Does the map cover the correct geographic extent? Is the scale of the map
correct?

3.4.3 D3. Reconciliation with User Requirements

What is the purpose of this element? The purpose of this element is to outline and specify,
if possible, the acceptable methods for evaluating the results obtained from the project. This
element includes scientific and statistical evaluations of data to determine if the data are of the
right type, quantity, and quality to support their intended use.

In most geospatial projects, an
abbreviated form of systematic planning
addressing acceptance and/or performance
criteria rather than a formal DQO Process will be
followed. In environmental sampling projects that have a geospatial component, systematic
planning would be completed with respect to the media sampling design and analytical methods;
associated locational data also need established acceptance and performance criteria against
which they can be evaluated.

Data quality assessment follows data validation and verification. This process
determines how well validated data can support their intended use. If an approach other than
data quality assessment has been selected (e.g., product review), an outline of the proposed
activities would be included. For example, graphics products including draft, interim, and final
enlargements; scanned photographs; and associated overlays would be reviewed during the
internal and external report review process to ensure they meet established graphics standards.
The final site analysis report packages would be assessed for quality of site imagery, photo
annotations, accuracy of interpreted photographic features, and quality of the associated
descriptive text. The editorial quality and consistency of materials included in the report would
be evaluated and documented on a QA review checklist.

Data quality assessments of general-purpose databases produced during the course of the
project would compare the databases to the quality criteria specified in the Quality Objectives
and Criteria (A7), Data Acquisition Requirements (Nondirect Measurements) (B9), and Data
Management (B10) elements. For example, on projects where the goal is a database of
georeferenced water quality locations, the assessment phase would determine whether the final
data met the performance criteria (e.g., for accuracy and completeness). The Reconciliation with
User Requirements (D3) element would document this comparison and note any deviations that
would affect the final product.

Suggested Content:

•	Description of how the products or
results will be reconciled with
requirements defined by the data user
or decision maker

•	Description of how reconciliation
with user requirements will be
documented and how issues will be
resolved

•	Discussion of limitations on the use of
the final data product and how these
limitations will be documented for
data users or decision makers.

Assigning and communicating roles and responsibilities for product reviews
[documented in the Project/Task Organization (A4) element] is important. These reviews
would, in turn, be coordinated with external QA reviews performed by EPA personnel at the
draft and final stages of the report.

CHAPTER 4

GRADED APPROACH EXAMPLES

This chapter is designed to illustrate the structure and content of a geospatial QA Project
Plan, providing an example of the elements discussed in Chapter 3. This chapter is important for
two reasons: (1) implementation of a new process is always more understandable with
examples, and (2) these examples will provide the reader with some insight into the
implementation of the EPA graded approach.

In each example, the information provided under each relevant QA Project Plan element
is described to illustrate the application of the element to that example. These examples also
discuss the documentation appropriate for each project.

4.1 Minimum Documentation Example: Creating a Cartographic Product from a
Spreadsheet Containing Facility Latitude/Longitude Coordinates

In this example, the geospatial professional has been asked to generate a nationwide map
displaying the locations of certain kinds of industrial facilities based on the locations provided
by the requestor in an Excel spreadsheet. Only a subset of the facilities located in the
spreadsheet will be mapped. The locations are provided in latitude/longitude format. The subset
is identified by a specific code located in a column in the spreadsheet.

4.1.1 Group A: Project Management

Project/Task Organization (A4)—Element A4 would simply state the name, role, and
contact information for the geoprocessor performing the work, the person responsible for
checking project quality, and the requestor.

Problem Definition/Background (A5)—The geoprocessing professional may have to
seek more information from the requestor in order to complete this element. The critical types of
information for a limited scope project like this would be as follows:

• Identify the audience for the map.

• Identify and describe the purpose of the map.

• Describe the documentation needed to accompany the map, if any. For example, if
the data sources used on the map or the purpose of the map require explanation,
document this project-specific requirement.

• Describe the data requirements for the map, including contextual information (for
example, state or county boundaries, hydrography, labels) to be included on the map.

• Document any project-specific requirements regarding product disclosure or
sensitivity. Describe whether or not the map or the data shown are in any way
confidential.

Project/Task Description (A6)—Describe the steps to be taken to complete the project
and define, as much as possible, the product to be generated. Things to consider include the
following:

• How will the Excel spreadsheet be converted for use in the GIS?

• How will the data be checked for quality?

• Which records will be displayed (if not all)? What are the criteria for selecting specific
records to be used in the map?

• What will the map to be generated look like? Include the size, format, title, legend,
scale, use of color, and other data to be included (e.g., state boundaries, county
boundaries).

• How and when will draft maps be generated, reviewed, and revised?

Quality Objectives and Criteria (A7)—Describe the quality objectives for the project. In
a case like this one, example objectives may include the following:

•	The latitude/longitude coordinates in the spreadsheet are to reflect the actual
locations of the facilities. Developing a quality objective like this is important,
because the requestor may assume the locations are accurate or precise without
having examined them. By including this objective, the geoprocessing professional
sets a criterion that can be checked in the assessment phase to address obvious
inconsistencies in the latitude/longitude coordinates. For example, some coordinates
may only include a latitude/longitude to the closest degree, while others may include
latitude/longitude down to a decimal degree. Coordinates that are only precise to a
degree of latitude/longitude may be questioned as to their precise representation of an
actual facility location.

•	The original latitude/longitude coordinates are to be converted into a GIS format and
displayed on the map without loss of precision or accuracy.

•	The projection and datum used for the ancillary layers are to match those used for the
facility locations. For example, if the ancillary layers (states and counties) are in the North
American Datum of 1927, but the facility latitude/longitude coordinates are in the
North American Datum of 1983, there will be inaccuracy in the location of the
facilities as they relate to the boundaries. Facilities near state boundaries could appear
to be in the wrong state.

•	Only those facilities of interest in the spreadsheet are to be displayed on the map.

•	Facilities that are not in the contiguous United States (for example, those in Guam,
Hawaii, and Alaska) need to be considered. That is, make sure the requestor has specified
whether they are to be shown or not.

Appendix C may provide additional information that would be useful when deciding what types
of quality characteristics may be considered and documented in the Quality Objectives and
Criteria (A7) element.

Special Training/Certification (A8)—In this example, the geospatial professional has the
required background and experience to perform the work. However, if the map product were to
be used in an official EPA publication, requirements for cartographic training might need to be
specified here.

4.1.2 Group B: Measurement/Data Acquisition

The first eight elements addressing sampling and measurements are not required in this
project because no new data collection will take place. These elements may be included in the
QA Project Plan with the text "Not Applicable" next to each.

Data Acquisition Requirements (Nondirect Measurements) (B9)—Describe the sources
of each data set to be used in the map. For example, describe or document

• the name of the individual who provided the spreadsheet (if different than the
requestor)

• when the spreadsheet was delivered

• the format (program) of the spreadsheet

• the origin of the spreadsheet (it is very important to know where the requestor got the
facility locations. The requestor is presumably NOT the originator of the
latitude/longitudes but was provided them from some other source.)

• existing information about how the facility locations were derived

• the format of the latitude/longitude coordinates

• the date the locations were derived (does the date the locations were acquired affect
the purpose of the map? For example, if the locations were derived ten years ago, but
the map is to show the current set of facilities, there may be increased uncertainty as
to the accuracy of the data.)

• the contents or metadata for the other data layers on the map.

Data Management (B10)—Describe how the data will be managed once acquired from
the requestor. For a small project such as this, consider the following:

• Describe the application format to be used to store the converted spreadsheet data
file (e.g., dBase, Microsoft Access, INFO, other).

• Document any changes to field definitions necessary when converting the
spreadsheet.

• Document the computer path to the data file(s) along with the names of the original
input file and the names of any files created during the process of converting the data
to GIS format.

• Document the input and output projection parameters used to reproject the data into a
map-based coordinate system.

• Document and describe any custom subprograms used to process the data or to create
the map.

• Describe the GIS software programs and versions used to process the data.

4.1.3	Group C: Assessment/Oversight

Assessments and Response Actions (C1)—The primary assessments to be described for
this project would include

• the method of ensuring that all spreadsheet records were properly translated into GIS
records, including codes, numbers, and records (describe how the GIS data will be
assessed to ensure that data were transferred correctly)

• the method of ensuring that the resulting map accurately shows the locations of the
entities from the spreadsheet

• the method of ensuring that there are no errors (typos, missing elements) in the map
itself

• the method of correcting errors found during the assessment.
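
A record-by-record comparison of the spreadsheet with the converted GIS table is one possible method for the first assessment above. The Python sketch below is illustrative only; it assumes both tables have already been read into lists of dictionaries, and the key field `facility_id` and the sample values are hypothetical.

```python
def compare_transfer(source_rows, gis_rows, key="facility_id"):
    """Compare two tables by key and report records that were lost,
    added, or altered during conversion."""
    src = {row[key]: row for row in source_rows}
    dst = {row[key]: row for row in gis_rows}
    return {
        "missing_in_gis": sorted(src.keys() - dst.keys()),
        "unexpected_in_gis": sorted(dst.keys() - src.keys()),
        "changed": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
    }

# Hypothetical spreadsheet records and their converted counterparts.
source = [
    {"facility_id": "A1", "code": "X", "lat": 40.5},
    {"facility_id": "A2", "code": "Y", "lat": 41.25},
    {"facility_id": "A3", "code": "X", "lat": 39.0},
]
gis = [
    {"facility_id": "A1", "code": "X", "lat": 40.5},
    {"facility_id": "A2", "code": "Y", "lat": 41.0},   # precision lost in conversion
    {"facility_id": "A4", "code": "X", "lat": 39.0},
]

report = compare_transfer(source, gis)
```

An empty report would indicate that every record was transferred with its codes and values intact; any entries would be resolved using the error correction method described in this element.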

Reports to Management (C2)—For this project, reports to management may only be
required at the end of the task. In the Reports to Management (C2) element, discuss the content
and scope expected in the final report. The final report may simply be an e-mail or informal
memorandum describing the completion of the project, the map deliverables, and any problems
encountered and their resolution.

4.1.4	Group D: Data Validation and Usability

Data Review, Verification, and Validation (D1)—State the criteria used to review and
validate—that is, accept, reject, or qualify—data in an objective and consistent manner.

In a narrow scope project like this one, it may be difficult to objectively state criteria the
data need to meet. It may be more appropriate to explore the data quality and report to the map
requestor any omissions, problems, or concerns with the data.

Verification and Validation Methods (D2)—Describe the process for validating and
verifying the data. Describe how the results will be communicated. In a project like this, the
input data would be explored in an informal fashion to locate any problems. Some examples of
data exploration include the following:

• Does every facility contain a latitude/longitude coordinate? List those that do not.

• Are the latitude/longitude coordinates consistent in their precision? For example, do
some records contain data only to whole degrees while others contain more precise
latitude/longitudes? If so, is there a question about variability in the quality of the
data?

• Do the longitude coordinates contain leading minus signs indicating locations in
the Western Hemisphere? Are all of the records consistent with regard to the use of
minus signs for longitude?

• Do there appear to be any transpositions of latitude/longitude in the file? Create a
simple map of the latitude/longitude coordinates. Do any of them appear in strange
locations (for example, far outside the continental U.S.)?
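
Several of these exploratory questions can be screened automatically before a map is drawn. The Python sketch below is one informal way to do so; it assumes the coordinates arrive as text in decimal degrees, that the facilities are expected in the Western Hemisphere, and that the record layout (`id`, `lat`, `lon`) is as shown. All of these are assumptions made for illustration, and the sample records are invented.

```python
def decimal_places(text):
    """Count the digits after the decimal point in a coordinate string."""
    return len(text.split(".")[1]) if "." in text else 0

def explore(records):
    """Return (facility id, concern) pairs for records that merit a closer look."""
    findings = []
    for rec in records:
        fid = rec["id"]
        lat_text = (rec.get("lat") or "").strip()
        lon_text = (rec.get("lon") or "").strip()
        if not lat_text or not lon_text:
            findings.append((fid, "missing coordinate"))
            continue
        lat, lon = float(lat_text), float(lon_text)
        if decimal_places(lat_text) == 0 or decimal_places(lon_text) == 0:
            findings.append((fid, "whole-degree precision"))
        if abs(lat) > 90:
            findings.append((fid, "latitude out of range, possible transposition"))
        elif lon > 0:
            findings.append((fid, "positive longitude, check minus sign"))
    return findings

records = [
    {"id": "F1", "lat": "38.8951", "lon": "-77.0364"},
    {"id": "F2", "lat": "41", "lon": "-88"},
    {"id": "F3", "lat": "34.0522", "lon": "118.2437"},
    {"id": "F4", "lat": "-104.9903", "lon": "39.7392"},
    {"id": "F5", "lat": "", "lon": "-90.1"},
]

findings = explore(records)
```

These screens only raise questions for the requestor; a positive longitude, for example, may be legitimate for a facility in Guam.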

Reconciliation with User Requirements (D3)—A limited scope project like this one
has probably not undergone an extensive, systematic planning process. Therefore, this element
can be used to communicate any potential problems found with the data file when compared to
the performance criteria provided for the intended use.

After reviewing the input data set (as above), create a summary for the requestor
indicating the nature of any omissions, errors, questions, or concerns about the data and their
impact on the intended use. It is important to note that in a project like this, the requestor may
not have personally reviewed the data and, therefore, may not be aware of potential problems.
A summary report gives the requestor the option of modifying the map request,
seeking clarification from the data originator on questions, and/or withdrawing the request.

4.2 Medium Documentation Example: Routine Global Positioning Survey Task to
Produce a GIS Data Set

This example illustrates how elements B1 through B8 would be used when collecting
primary geospatial data. The other two graded-approach examples concentrate more on the Data
Acquisition Requirements (Nondirect Measurements) (B9) and Data Management (B10)
elements, that is, on issues related to the use of existing data rather than on the approaches
used for new data specifically collected for a particular project.

A QA Project Plan for this task would document task-specific objectives for the survey
and data evaluation criteria for the locational data to be collected. The task description and roles
and responsibilities would be related to standard operating procedures and reporting forms of a
single organization to avoid redundancy of documentation. Evaluation tasks would be specified
to produce reports needed for product acceptance (or rejection). If accepted, "truth in labeling"
information for the data set would be reported as standard metadata and entered into the GIS.

An adequate level of detail would be needed to clearly communicate agreed-upon survey
objectives, data quality indicator criteria, and assessment and reporting requirements.

4.2.1 Group A: Project Management and Systematic Planning to Define the Task

The project management elements would emphasize task roles and responsibilities for
planning and documenting the objectives of the task, evaluation criteria, and required
assessments. Requirements for metadata records would also be documented.

Distribution List (A3)—The distribution list for the QA Project Plan on a project like this
might include the EPA QA Officer, the EPA Task Leader, EPA Project Manager, GIS analysts,
GPS technicians, and field staff.

Project/Task Organization (A4)—This element might describe the roles and
responsibilities of each team member and provide an organization chart illustrating lines of
communication and chain-of-command responsibilities. The organization description would
clearly identify individuals with responsibility for developing, reviewing, and approving the QA
Project Plan. Roles and responsibilities would be defined for field data collection, data
management and processing, data quality assessment, reporting to the user, and records
management.

Problem Definition/Background (A5)—The problem definition and background statement
would describe the regulatory or decision-making context in which the project is operating.
For example, describe the driving force behind the data collection effort and describe how the
data will be used and by whom.

Project/Task Description (A6)—In this example project, the description would clearly
state that the project will collect precise latitude/longitude coordinates using GPS equipment and
that the results of the data collection process will be a complete and accurate GIS database of
these locations, along with descriptive attributes. The project involves fieldwork and the use of
GPS measurement equipment; therefore, the project/task description could discuss the basic
assumptions and environment in which the project will utilize these methods.

Quality Objectives and Criteria (A7)—The user would provide criteria for acceptable
data quality indicators such as accuracy (e.g., consistent with the EPA Locational Data Policy
and standards), equipment sensitivity, precision, comparability, and completeness. Language
from standard operating procedures could be used to describe the data quality requirements and
to specify the criteria by which the collected data would be assessed.

Special Training/Certification (A8)—Describe how the field staff will be trained on the
proper use of the GPS receiver, if necessary.

Documentation and Records (A9)—Requirements for task record keeping and/or
metadata specifications or standards (e.g., EPA Method Accuracy and Description Codes or
Federal Geographic Data Committee Standards) would be documented or included by reference.
A data dictionary might also be described to fully document the database column names, types,
widths, and contents, including any numeric coding schemes used to store nonlocational
(attribute) data.
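
A data dictionary of this kind can be kept in a machine-readable form so that collected records can be checked against it. The Python sketch below is illustrative only; the column names, types, widths, and the numeric coding scheme are hypothetical and would be replaced by the project's own definitions.

```python
# Hypothetical data dictionary: column name, type, maximum width, contents.
DATA_DICTIONARY = [
    {"name": "site_id", "type": str, "width": 8, "contents": "Unique site identifier"},
    {"name": "latitude", "type": float, "width": 10, "contents": "Decimal degrees, north positive"},
    {"name": "longitude", "type": float, "width": 11, "contents": "Decimal degrees, east positive"},
    {"name": "site_type", "type": int, "width": 2, "contents": "Coded value: 1 = well, 2 = outfall"},
]

def check_row(row, dictionary=DATA_DICTIONARY):
    """Return a list of problems found when a row is compared to the dictionary."""
    problems = []
    for column in dictionary:
        name = column["name"]
        if name not in row:
            problems.append(f"{name}: column missing")
            continue
        value = row[name]
        if not isinstance(value, column["type"]):
            problems.append(f"{name}: expected {column['type'].__name__}")
        elif len(str(value)) > column["width"]:
            problems.append(f"{name}: wider than {column['width']} characters")
    return problems

good = {"site_id": "W-001", "latitude": 39.7392, "longitude": -104.9903, "site_type": 1}
bad = {"site_id": "W-000000002", "latitude": "39.7", "site_type": 1}
```

Calling `check_row(good)` returns an empty list, while `check_row(bad)` reports the overlong identifier, the text-valued latitude, and the absent longitude column.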

4.2.2 Group B: Data Collection

The data collection elements would describe in detail the implementation of standard
operating procedures for field data collection (included by reference) during data management.

The hardware/software configuration would be briefly described to document planned
requirements and appropriate standard operating procedures to assure usefulness of the data set.

Sampling Process Design (B1)—To meet the task objectives and data quality indicator
criteria developed in systematic planning, a survey design would be developed describing the
sampling targets, sampling time, and frequency of data collection. Documentation of the design
would include the rationale for choosing the specific sites to be sampled.

Sampling and Image Acquisition Methods (B2)—This element would be used to
describe the actual procedures and methods used to collect the locational data using the GPS
devices. Existing standard operating procedures such as those developed in EPA Region 5 and
EPA Region 8 for GPS data collection could be cited or referenced, if those procedures will be
used on this project. Include any special considerations regarding property access,
transportation, or other logistical issues in this element.

Sample Handling and Custody (B3)—GPS data collection results in electronic files that
will be downloaded and processed using GIS software. Therefore, there is no physical sample
handling. This element might be used to describe how the electronic files from the GPS
receivers are to be transmitted to the processing computers and who will do the transmitting.

Quality Control (B5)—Describe the overall quality control methods used to ensure that
the locations for which latitude/longitude coordinates are collected meet the sampling design and
are of the quality as set forth in the Quality Objectives and Criteria (A7) element.

Identify QC activities and the method to be used to obtain measurements. Describe the
corrective action if the measurement is outside the performance limits.

Establish quality control methods for key entry, digitizing, or manually entering data to
ensure the data are correct. For example, provide a checklist to make sure field staff stand
over the correct locations for the specified amount of time for GPS measurements.

Measurements and observations can be compared to standard measurements and observations, or
assessed against tolerance limits, to determine whether the data collection equipment is
functioning within acceptable bounds or performance limits. Specify performance measures,
measurement methods used, and the acceptable performance limits.
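
As an illustration of assessing a measurement against a performance limit, the Python sketch below compares a GPS reading taken at a benchmark with the benchmark's published coordinates. It assumes decimal-degree coordinates and uses the haversine formula on a spherical Earth, which is an approximation chosen for simplicity; the function names, the benchmark location, and the 5 m limit are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical approximation

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def within_limit(measured, benchmark, limit_m):
    """True when the measured position is within limit_m of the benchmark."""
    return haversine_m(*measured, *benchmark) <= limit_m

BENCHMARK = (40.0, -105.0)   # hypothetical published benchmark coordinates
LIMIT_M = 5.0                # hypothetical performance limit from element A7

reading_ok = within_limit((40.00001, -105.0), BENCHMARK, LIMIT_M)
reading_bad = within_limit((40.001, -105.0), BENCHMARK, LIMIT_M)
```

A reading that falls outside the limit would trigger the corrective action described for this element, such as repeating the measurement or inspecting the receiver.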

Instrument/Equipment Testing, Inspection, and Maintenance Requirements (B6)—Describe
the procedures to be used to test, inspect, and maintain the GPS receivers. If standard
operating procedures will be followed, cite them rather than duplicating their content here.

Instrument Calibration and Frequency (B7)—Note when periodic calibration of GPS
equipment is to be performed. Describe the method of calibration and the frequency. Also, note
where the calibration results are to be documented so they can be assessed before each GPS
receiver is checked out for use. Cite—rather than reproducing—existing calibration procedures
already specified in existing GPS standard operating procedures.

Inspection/Acceptance Requirements for Supplies and Consumables (B8)—Include a
requirement to check batteries for the GPS receivers before commencing fieldwork. Discuss the
requirement that batteries for each GPS receiver be fully charged and that any backup batteries
also be charged and ready to go prior to fieldwork.

Data Management (B10)—For this project, data management activities would involve
the storage and conversion of the GPS coordinates and associated attributes into GIS format and
the subsequent data processing and manipulations of the coordinate and attribute data necessary
for the final database to meet requirements for content, accuracy, projection, and format.
Describe the procedures to be used during these processing steps in order to provide a complete
overview of data management and manipulation. Describe any file naming conventions to be
followed.
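A file naming convention is easier to enforce when it can be checked automatically. The sketch below assumes a purely hypothetical convention (site identifier, collection date, and receiver identifier, as in "site012_20020228_rx03.csv"); the pattern and function name are illustrative only.

```python
import re

# Hypothetical convention: site<3 digits>_<YYYYMMDD>_rx<2 digits>.csv
FILE_NAME_PATTERN = re.compile(r"site\d{3}_\d{8}_rx\d{2}\.csv")


def nonconforming_names(file_names):
    """Return the file names that do not follow the assumed convention."""
    return [name for name in file_names if not FILE_NAME_PATTERN.fullmatch(name)]
```

Any names returned by nonconforming_names would be corrected before the files are converted into GIS format.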

4.2.3 Group C: Assessment and Oversight

These elements would focus on the activities for assessing the effectiveness of project
implementation and associated QA and QC activities to ensure that the QA Project Plan and its
standard operating procedures are implemented as prescribed, including reports to project
management and their response actions.

Assessments and Response Actions (C1)—Performance evaluations subsequent to
training would document any GPS operator problems. Readiness reviews would include checks
on equipment function such as sensitivity of detection and precision, correct recording and
processing menus, base station availability, and survey logistics. The individuals or
organizational units who will perform the assessments would be designated (e.g., regional
coordinator, task manager). Standardized checklists can be used. During the survey, quality
control procedures would be performed such as checks for accuracy against benchmarks. Any
deviations from the task data collection design (e.g., lack of property access, interference)
would be noted during the daily verification of data collection and reported, along with field
observations, on designated forms to meet reporting requirements (EPA Method, Accuracy, and
Description code requirements).

Assessment and differential correction would be performed with the designated software
and base or reference station information before processing to produce the input file. Data
quality assessments would include checking final data point locations against the field map for
completeness and verifying that data quality indicator criteria were met, that metadata are
adequate, and that files were adequately transferred and backed up. Input files for the GIS would
be checked by an independent reviewer (e.g., regional coordinator or task manager) to ensure that
they are complete, are adequately documented to controls, and meet the criteria for data quality
indicators such as sensitivity, accuracy, precision, completeness, and, if appropriate,
comparability.
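The completeness check described above can be sketched as a comparison between the locations planned in the sampling design and the locations actually present in the input file. The identifiers and function name below are hypothetical; completeness is expressed as a percentage of the planned amount, following the glossary definition.

```python
def completeness_report(planned_ids, collected_ids):
    """Return (percent complete, missing ids, unexpected ids)."""
    planned = set(planned_ids)
    collected = set(collected_ids)
    missing = sorted(planned - collected)      # planned but never collected
    unexpected = sorted(collected - planned)   # collected but not in the design
    percent = 100.0 * len(planned & collected) / len(planned) if planned else 0.0
    return percent, missing, unexpected
```

Missing locations would be followed up with the field team, and unexpected locations would be checked against the field notes before the file is accepted.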

As manipulation of the coordinate data in the GIS occurs, continued assessments of the
quality and accuracy of the manipulations would take place to ensure no discrepancies were
introduced as a result of processing errors. Describe these checks and assessments and note
when they would be made during the process of generating the GIS data set.

Reports to Management (C2)—Describe appropriate feedback loops to project
management (e.g., Regional Coordinator) to assure prompt corrective action (e.g., GPS unit
repair).

4.2.4 Group D: Data Validation and Usability

Use Group D elements to describe how field notes, reports, and other documents would
be used to verify and validate the measured locations. These elements would also be used to
describe how the data will be verified and validated. These activities address the data quality
assessments that occur after data are collected and downloaded to a personal computer.

Data Review, Verification, and Validation (D1)—Once the final data set has been
created, it would be reviewed, verified, and validated to ensure that it satisfies the quality,
accuracy, and completeness required as defined in the Quality Objectives and Criteria (A7)
element. Describe this review process in the Data Review, Verification, and Validation (D1)
element. Describe what will be reviewed, verified, and validated.

Verification and Validation Methods (D2)—Describe how the final GIS data set will be
validated and reviewed. For example, describe how the final data set's attribute tables will be
compared to the data dictionary [as specified in the Documents and Records (A9) element] to
ensure that the format and content of the data files are correct. Also describe how the locations
of the final data set will be compared to both the original locations collected by the GPS
receivers and the actual, true locations of the features collected. Verify that the EPA Method,
Accuracy, and Description codes are present and accurately reflect the data collection process.
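Comparing attribute tables with the data dictionary can be sketched as follows. The dictionary contents, attribute names, and field widths are hypothetical and stand in for whatever is specified in the Documents and Records (A9) element.

```python
# Hypothetical data dictionary: attribute name -> (Python type, maximum field width)
DATA_DICTIONARY = {
    "site_id": (str, 10),
    "latitude": (float, 12),
    "longitude": (float, 12),
    "method_code": (str, 3),
}


def check_record(record, dictionary=DATA_DICTIONARY):
    """Return a list of problems found in one attribute record."""
    problems = [f"missing attribute: {name}" for name in dictionary if name not in record]
    for name, value in record.items():
        if name not in dictionary:
            problems.append(f"unexpected attribute: {name}")
            continue
        expected_type, max_width = dictionary[name]
        if not isinstance(value, expected_type):
            problems.append(f"{name}: expected {expected_type.__name__}")
        elif len(str(value)) > max_width:
            problems.append(f"{name}: wider than {max_width} characters")
    return problems
```

An empty list indicates that the record agrees with the dictionary; any other result would be documented and resolved before the data set is accepted.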

Reconciliation with User Requirements (D3)—This element would describe how the
results of the data assessments, validations, and verifications will be compared and reconciled
with criteria developed to ensure that the final deliverables (geospatial data or nongeospatial data
files) are of sufficient quality to satisfy project requirements. For this project, document whether
the final data meet, do not meet, or partially meet the quality objectives set out in the Quality
Objectives and Criteria (A7) element. This might include descriptions of the success in
capturing all the desired locations, noting whether postprocessing of the GPS coordinates
resulted in sufficient locational accuracy, as specified in the Quality Objectives and Criteria (A7)
element. If not, the impact on the intended use needs to be discussed.

4.3 Complex Documentation Example: Developing Complex Data Sets in a GIS for Use in Risk Assessment Models

This project is to produce GIS database products that will be integrated into a risk
assessment model. Risk assessment modelers and scientists would define the requirements for
the geospatial products for their model in iterations with geospatial professionals. This project
would involve digitizing spatial data sets from map sources, acquiring and converting existing
data, creating subprograms within commercial off-the-shelf software to generate data,
performing spatial analyses between GIS layers (for example, using spatial overlays to compare
land use and demographic data), creating GIS databases for use in risk assessment models, and
creating maps. The project would also involve interactions with risk assessment modelers and
scientists, who would describe the geospatial products required for their models.

4.3.1 Group A: Project Management

Title and Approval Sheet (A1)—The approval sheet would include individuals who will
define the GIS input data requirements for the models, accept the GIS data prior to inclusion in
the models, review and check the geospatial data against the acceptability requirements, and
check the subprograms created in the commercial off-the-shelf software to ensure they are
working correctly. The project manager approving the project for implementation and the
organization's QA manager would also be included.

Distribution List (A3)—Provide names and addresses of participating project managers,
QA managers, and representatives from each technical team working on the project (planners,
suppliers, and reviewers).

Project/Task Organization (A4)—Provide the participating project managers (client and
supplier), QA managers, and representatives from each technical team working on the project
(planners, suppliers, and reviewers), listing their roles and responsibilities. An overall QA
Project Plan created for the larger risk assessment modelling project might serve as a starting
point for this element. The project organization chart and task descriptions can be expanded
with information on the roles of those involved with the geospatial portion of the project.

Problem Definition/Background (A5)—Include a summary definition of the problem and
the background of the overall project, as well as the specific problem definition and
background of the geospatial portion of the project. One could summarize the Problem
Definition/Background (A5) element of the QA Project Plan for the risk assessment modelling
project as a whole, adding information relevant to the geospatial processing portion that is
the focus of this QA Project Plan.

Project/Task Description (A6)—Focus on the project description and tasks for the
geospatial processing project, integrating them with the schedule for the overall risk assessment
project. The project/task description for the geospatial processing portion might include general
descriptions of the data sources, processing steps, and data outputs to be created. Schedules
would be defined, quality assessment techniques would be outlined, and quality assessment
documentation and reports to the clients to be produced would be described.

Quality Objectives and Criteria (A7)—Establishing quality criteria for the information
product output and relating it to data quality indicators to be checked within implementation of
the data processing project is often difficult to do for geospatial projects of moderate to high
complexity. In general, the data quality problems have much more to do with processing
procedures (e.g., incorrect calculations, projections, programmatic manipulations, or procedural
oversights) than with the ultimate locations of geographic entities to be analyzed or with source
maps or data. Missteps in processing procedures often lead to nonsensical or incorrect data
being produced or manipulated in future steps. Specific geospatial locations may be correct, but
the attribute data produced for them may be incorrect.

If possible, state the requirements for positional accuracy. General qualitative statements
are often the only possible way of describing the quality objectives for geospatial processing
(e.g., "Fuzzy tolerances used during processing will be set to the smallest possible level in
order to ensure that data processing steps do not negatively affect existing locational
accuracy."). Other examples of narrative descriptions of quality objectives include the following:

•	Reprojections, transformations, and other procedures that modify locational
information must result in positional data that is accurate to the level of precision of
the geospatial software being used.

•	When digitizing data from map sources, be sure to document the acceptable root
mean square error. This number is a measure of how closely the digitizer was able to
match the source document to known geographic coordinates and, ultimately, is a
measure of the positional accuracy achieved in converting paper maps into digital
format.
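The root mean square error referred to above can be computed from the control points used to register a map. The sketch below assumes the digitized control point coordinates have already been transformed into map units; commercial packages may compute the value somewhat differently, and the function name is illustrative.

```python
import math


def registration_rmse(control_points):
    """Root mean square error of control point registration.

    control_points: iterable of ((x_digitized, y_digitized), (x_known, y_known)),
    all in the same map units.
    """
    squared = [
        (xd - xk) ** 2 + (yd - yk) ** 2
        for (xd, yd), (xk, yk) in control_points
    ]
    if not squared:
        raise ValueError("at least one control point is required")
    return math.sqrt(sum(squared) / len(squared))
```

The result would be compared with the acceptable root mean square error documented in this element; a map exceeding the limit would be registered again before digitizing begins.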

When performing attribute manipulations using database calculations, transformations, or
formulas, it is presumed that no error is acceptable. Equations should be checked to assure they
are coded correctly, and if they are, there are likely to be no errors in the resulting data. In other
words, it would not make sense to say 90% of the resulting data are to be within 1% of the
correct apportioned population.
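To illustrate checking that an equation is coded correctly, the sketch below apportions a population among zones and is verified against a manual calculation. Simple area weighting is assumed here only for the example; this guidance does not prescribe an apportionment method, and the names are hypothetical.

```python
def apportion_population(population, area_by_zone):
    """Split a population among zones in proportion to zone area (assumed method)."""
    total_area = sum(area_by_zone.values())
    if total_area <= 0:
        raise ValueError("total area must be positive")
    return {zone: population * area / total_area for zone, area in area_by_zone.items()}
```

For example, 1000 people apportioned over a residential zone of 3 area units and an industrial zone of 1 area unit should yield 750 and 250 by hand calculation; a coded result that differs indicates an error in the equation rather than an acceptable tolerance.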

Special Training/Certification (A8)—Any special training or experience in operating the
commercial off-the-shelf software would be noted here.

Documentation and Records (A9)—Describe the requirements for documentation on the
project. Policies for establishing metadata, especially a description of which FGDC-compliant
metadata will be captured and how the metadata will be stored and managed would be included.
Information on how the methodological procedures used on the project would be captured and
documented might be included. For example, in geospatial projects where many steps are taken
to configure, process, convert, transform, and manipulate the various data layers, taking careful
note of procedures as they are developed is advantageous. This element could be used to specify
how those notes will be entered into a document, at what level of detail, and how they will be
used later in the project.

When subprograms written in commercial off-the-shelf software environments are to be
developed, this element would be used to specify requirements for internal documentation of
subprograms (e.g., program header information, and requirements for in-line program
comments), and for external documentation of subprograms (e.g., summaries of the
subprogram's purpose, inputs, outputs, and functions).
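The following sketch shows one possible form of program header information and in-line comments. The subprogram, its header fields, and the layout are hypothetical; a project would substitute its own documentation requirements.

```python
"""
Program:  feet_to_meters (hypothetical example subprogram)
Purpose:  Convert buffer distances supplied in feet to meters before analysis.
Inputs:   distance_ft - distance in international feet (int or float, >= 0)
Outputs:  distance in meters (float)
Author:   (name of the analyst responsible)
Revised:  (date and short description of each change)
"""

METERS_PER_FOOT = 0.3048  # exact definition of the international foot


def feet_to_meters(distance_ft):
    """Return the distance in meters for a distance given in international feet."""
    # Reject negative input early so a sign error cannot propagate into buffers.
    if distance_ft < 0:
        raise ValueError("distance must not be negative")
    return distance_ft * METERS_PER_FOOT
```

The header records the information an external summary would also need (purpose, inputs, outputs), so the two forms of documentation can be kept consistent.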

4.3.2 Group B: Measurement/Data Acquisition

Sampling Process Design (B1)—In this project, all of the marked-up maps provided by
the survey respondents are to be digitized and entered into the GIS. Therefore, the Sampling
Process Design (B1) element would simply state this requirement.

Sampling and Image Acquisition Methods (B2)—Since 100% of the source maps will
be entered into the GIS, this element might simply state that this is a 100% sample.

Sample Handling and Custody (B3)—As part of this project, one or more maps will be
received from industrial sites, indicating the location of their facilities and related features of
interest (e.g., wells, property boundaries, and other information). These maps serve as source
material and are to be handled and managed very carefully. This element would be used to
describe any procedures for storing the maps, managing a check-in and check-out procedure so
that each map's whereabouts are known, and documenting how these source materials will be
handled so that none are lost or damaged.

Quality Control (B5)—Quality control procedures for the digitizing process would be
documented in this element. These include procedures that indicate exactly how each map will
be registered to the digitizing table, which features will be digitized, how features will be given
identifying codes, and, especially, how at the completion of digitizing each resulting GIS data set
will be checked against the original map to ensure that all required features have been digitized
correctly.

Instrument/Equipment Testing, Inspection, and Maintenance Requirements (B6)—Document
any inspection, testing, or maintenance recently performed or required to ensure that
the digitizing table (or tables) are operating within the vendor's specified tolerance.

Instrument Calibration or Standardization and Frequency (B7)—Occasionally,
digitizing tables will encounter calibration or operation problems causing incorrect or erroneous
coordinates to be captured. Describe any calibration procedures (usually obtained from the
manufacturer) that will be used to ensure that the precision of the digitizer is within
specifications provided by the vendor.

Data Acquisition Requirements (Nondirect Measurements) (B9)—Describe the sources
of each data set to be used in the project as follows:

•	Define the source of each data layer to be used. Include the metadata provided with
each layer. Some of the most important metadata elements include source citation,
source scale, date of production, completeness, and use restrictions.

•	Describe how each source will be used during the project.

•	Describe why each existing data source was chosen for use in the project. What are
the reasons these particular data sets are deemed to be superior to others (if more than
one option exists)?

•	Describe checks to be performed on the existing data to ensure that they were
generated correctly and have the predicted content, format, and projection.

•	For existing data received from unknown sources (e.g., spreadsheet data provided by
other team members), quality checks would be extensive. Describe these checks
(logical consistency, completeness, geospatial location accuracy, etc.).
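Checks on spreadsheet-style data of unknown origin might begin with simple content and logical consistency tests such as those sketched below. The column names are hypothetical, and the range test assumes coordinates are stored as decimal degrees.

```python
import csv
import io


def check_rows(csv_text, required=("site_id", "latitude", "longitude")):
    """Return a list of problems found in comma-separated text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing_columns = [c for c in required if c not in (reader.fieldnames or [])]
    if missing_columns:
        return [f"missing column: {c}" for c in missing_columns]
    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            lat, lon = float(row["latitude"]), float(row["longitude"])
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: coordinates are not numeric")
            continue
        # Decimal degrees must fall inside the valid geographic ranges.
        if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
            problems.append(f"line {line_no}: coordinates out of range")
    return problems
```

Checks of this kind catch gross errors only; locational accuracy would still need to be assessed against an independent source.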

Data Management (B10)—Describe how the data will be managed once acquired from
the requestor. For this complex task, the Data Management (B10) element will be quite
extensive, including information on the following topics:

•	path names to all data sources to be used on the project

•	methods to be employed to ensure that any informal subprograms are developed
and tested so that they operate as expected (e.g., produce accurate calculations)

•	a description of the formats of the data sources, any interim or temporary data sets to
be created, and the final data products

•	a data dictionary that describes, for each source database and the final product, the
content, type, name, and field width of each attribute

•	documentation of the development process, including the documents that resulted
from it, if a full requirements-design-development-testing process is to be carried out
for any programs to be written.

4.3.3 Group C: Assessment/Oversight

Assessments and Response Actions (C1)—At each processing step on this project,
quality assessments are to be performed to ensure that the data sources, interim products, and
final databases meet quality objectives. In the Assessments and Response Actions (C1) element,
include methods for ensuring that

•	all source maps were digitized

•	all source features were accurately digitized

•	each map source was registered to within specified tolerances on the digitizing tablet
(creating checklists to track these assessments might be helpful)

•	attribute codes and categorical data assigned to digitized features were complete and
accurate

•	each existing data source used was downloaded completely and without corruption of
coordinates or attributes

•	each existing data source has the correct input coordinate system information

•	any reprojections/transformations of input data sets were carried out correctly
(including datum shifts, if applicable)

•	each processing step or "macro" was performed correctly and was performed on the
correct input data

•	proper coordinate precision (e.g., single precision or double precision) was
maintained throughout each step of the process

•	there was no unacceptable loss of precision or rounding of coordinates throughout
processing due to raster-to-vector conversions, topological rebuilds, or other
procedures

•	calculations resulting in new data fields were performed correctly, any constants
used were entered correctly, and the resulting data are within expected ranges.

For each of the assessment methods above, describe the methods to be used to correct the
problem and reprocess any resulting data sets.
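The effect of coordinate precision can be demonstrated by passing a coordinate through a 32-bit (single precision) value and measuring what is lost. The sketch below uses only the Python standard library; the function names and the sample northing in the usage note are illustrative.

```python
import struct


def as_single_precision(value):
    """Round-trip a double precision float through a 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def precision_loss(value):
    """Absolute difference introduced by storing the value in single precision."""
    return abs(value - as_single_precision(value))
```

For a projected coordinate in the millions of meters, such as a hypothetical northing of 4306000.123, single precision can resolve only half-meter steps, so precision_loss returns roughly 0.123 meters. Whether a loss of that size is acceptable depends on the positional accuracy stated in the Quality Objectives and Criteria (A7) element.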

Reports to Management (C2)—Describe the interim reports to be submitted to
management throughout the project and note the frequency and content expected for each. For
this risk assessment project, reports to management might include

•	weekly or biweekly reports describing progress, problems, errors encountered, or
unexpected occurrences

•	monthly summary of processing status (Which data layers have been processed and
through which stage of the project? Include information about any sites that require
special processing. For example, if there are any sites outside the continental United
States, what special provisions for coordinate systems, projections, and precision
need to be made?)

•	final reports indicating overall processing results, identifying the products created,
and describing the assessment methods used to gauge accuracy [use information from
the Assessments and Response Actions (C1) element].

4.3.4 Group D: Data Validation and Usability

In most geospatial projects, the Group D elements will describe the process of checking
and validating the final data or maps to be delivered. If the activities in the Group C elements
are properly carried out during the course of the project, the Group D elements will uncover few
problems.

Data Review, Verification, and Validation (D1)—State the criteria used to review and
validate—that is, accept, reject, or qualify—data in an objective and consistent manner. For this
project, this element would include a description of the criteria used to assess whether the final
deliverables are correct. For this project, any errors, omissions, corrupted data files, incorrect
calculations, or missing information would result in rejection and reprocessing of the final files.
It is hoped that any errors detected in the final data files or coverages are the result of problems
in the last stages of processing. This assumes that the actions carried out in Group C have
identified errors and problems during early and middle stages of production.

Verification and Validation Methods (D2)—Describe the process for validating and
verifying the data and how the results will be communicated. In addition, for this element,
describe

•	the method for reviewing each final data set to be delivered, in general terms

•	specific methods for reviewing each data set [For example, if the data sets to be
delivered are a set of database files containing such things as the populations for each
land-use type within a certain distance of an industrial facility, this element would be
used to describe checks to ensure that the final data files contain the appropriate
numbers of records (e.g., all of the census block groups over the entire study area are
accounted for) and that the population aggregations or disaggregations have been
done correctly (e.g., there are no negative population counts, and spot checks using
manual calculations indicate that population summaries are correct).]

•	the method for ensuring that each data file has not been corrupted and can be
uncompressed (if compressed for delivery).
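One way to confirm that a delivered file is not corrupted and can be uncompressed is sketched below. It assumes, for illustration only, that deliverables are packaged as ZIP archives and that a SHA-256 checksum is recorded when each file is produced; neither is specified by this guidance.

```python
import hashlib
import zipfile


def sha256_of(path, chunk_size=65536):
    """Checksum of a file, read in chunks so large deliverables fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path):
    """True when the ZIP archive opens and every member passes its CRC check."""
    try:
        with zipfile.ZipFile(path) as archive:
            return archive.testzip() is None
    except zipfile.BadZipFile:
        return False
```

The checksum computed by the recipient would be compared with the value recorded by the producer; a mismatch or a failed archive test would result in the file being regenerated and delivered again.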

If the actions described in Group C are followed, any problems encountered at this stage
would be limited to the generation of the final deliverable files themselves—not to a serious flaw
in the methodology or steps performed earlier in the project.

Reconciliation with User Requirements (D3)—This element would describe how the
results of the data assessments, validations, and verifications will be compared and reconciled
with criteria developed to ensure that the final deliverables (geospatial or nongeospatial data
files) are of sufficient quality to satisfy project requirements. For this project, this element
would document whether each component of the final deliverables (i.e., each data file or spatial
data layer) meets, does not meet, or partially meets the quality objectives stated in the Quality
Objectives and Criteria (A7) element. For example, did all database calculations that created
new database fields produce correct results? When comparing the spatial locations of lines and
polygons in final output data sets to original data sets, was there any inappropriate movement of
those features? If there were problems, errors, or inconsistencies, the Reconciliation with User
Requirements (D3) element would include a description of how these problems will affect
usability of the final data sets.
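The question of whether features moved inappropriately can be examined by comparing vertex coordinates before and after processing. The sketch below assumes both versions of a feature hold the same vertices in the same order and in the same coordinate system; the function name is hypothetical.

```python
import math


def max_displacement(original, final):
    """Largest distance any vertex moved between two versions of a feature."""
    if len(original) != len(final):
        raise ValueError("vertex counts differ; features cannot be compared point by point")
    return max(
        (math.hypot(xf - xo, yf - yo) for (xo, yo), (xf, yf) in zip(original, final)),
        default=0.0,
    )
```

The largest displacement would be compared with the positional accuracy objective in the Quality Objectives and Criteria (A7) element, and any exceedance would be described along with its effect on usability.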

APPENDIX A
BIBLIOGRAPHY

American National Standards Institute/American Society for Quality Control (ANSI/ASQC).
(1995). Specifications and Guidelines for Quality Systems for Environmental Data
Collection and Environmental Technology Programs (E4-1994). American National
Standard.

Federal Geographic Data Committee. (1997). Content Standards for Digital Geospatial
Metadata, Federal Geographic Data Committee, Washington, DC.

Institute of Electrical and Electronics Engineers (IEEE). (1998). Standard 830: IEEE
Recommended Practice for Software Requirements Specifications. IEEE Standards
Collection: Software Engineering (Volume 4: Resource and Technique Standards).
Piscataway, NJ.

National Institute of Standards and Technology (NIST). (1994). Federal Information Processing
Standards Publication 173-1. Gaithersburg, MD. Available:
http://www.itl.nist.gov/fipspubs/.

U.S. Environmental Protection Agency. (1998a). EPA Guidance for Quality Assurance Project
Plans (QA/G-5) (EPA/600/R-98/018). Washington, DC: Office of Research and
Development.

U.S. Environmental Protection Agency. (1998b). Information Resources Management Policy
Manual (Directive 2100). Washington, DC.

U.S. Environmental Protection Agency. (2000a). EPA Quality Manual for Environmental
Programs (Order 5360 A1). Washington, DC.

U.S. Environmental Protection Agency. (2000b). Guidance for Data Quality Assessment:
Practical Methods for Data Analysis (QA/G-9) (EPA/600/R-96/084, QA00 Update).
Washington, DC: Office of Environmental Information.

U.S. Environmental Protection Agency. (2000c). Guidance for the Data Quality Objectives
Process (QA/G-4) (EPA/600/R-96/055). Washington, DC: Office of Environmental
Information.

U.S. Environmental Protection Agency. (2000d). Policy and Program Requirements for the
Mandatory Agency-wide Quality System (EPA Order 5360.1 A2). Washington, DC.

U.S. Environmental Protection Agency. (2001a). Geospatial Baseline Report, Office of
Environmental Information, Washington, DC.

U.S. Environmental Protection Agency. (2001b). EPA Requirements for Quality Assurance
Project Plans (QA/R-5) (EPA/240/B-01/003). Washington, DC: Office of Environmental
Information.

U.S. Environmental Protection Agency. (2001c). EPA Requirements for Quality Management
Plans (QA/R-2) (EPA/240/B-01/002). Washington, DC: Office of Environmental
Information.

Veregin, H. (1992). GIS Data Quality Assessment Tools. Internal research project report. Las
Vegas, NV: Environmental Monitoring Systems Laboratory, U.S. Environmental Protection
Agency.

APPENDIX B

GLOSSARY

Acceptance Criteria: Specific limits placed on an item, process, or service defined in
requirements documents. Acceptance criteria are acceptable thresholds or goals for data, usually
based on individual data quality indicators (precision, accuracy, representativeness,
comparability, completeness, and sensitivity).

Accuracy: The degree to which a calculation, measurement, or set of measurements agrees with
a true value or an accepted reference value. Accuracy includes a combination of random error
(precision) and systematic error (bias) components which are due to sampling and analytical
operations. A data quality indicator. EPA recommends that this term not be used and that
precision and bias be used to convey the information usually associated with accuracy.

Address Geocoding: Assigning x,y coordinates to tabular data such as street addresses.

Attribute: Any property, quality, or characteristic of a sampling unit. The indicators and other
measures used to characterize a sampling site or resource unit are representations of the attributes
of that unit or site. A characteristic of a map feature (point, line, or polygon) described by
numbers or text; for example, attributes of a tree represented by a point might include height and
species. (See related: Continuous)

Attribute Accuracy: The closeness of attribute values (characteristic of the location) to their
true value, which includes continuous attributes with measurement error (e.g., elevation) and
categorical accuracy resulting from misclassification (e.g., soil types on a soil map).

Band: One layer of a multispectral image that represents data values for a specific range of
reflected light or heat—such as ultraviolet, blue, green, red, infrared, or radar—or other values
derived by manipulating the original image bands.

Bias: In a sampling context, the difference between the conceptual, weighted average value of
an estimator over all possible samples and the true value of the quantity being estimated. An
estimator is said to be unbiased if that difference is zero. The systematic or persistent distortion
of a measurement process that deprives the result of representativeness (i.e., the expected sample
measurement is different than the sample's true value). A data quality indicator.

Cell Size: The area on the ground covered by a single pixel in an image, measured in map units.

Classification: The process of assigning a resource unit to one of a set of classes defined by
values of specified attributes. For example, forest sites will be classified into the designated
forest types, depending on the species composition of the forest. Systematic arrangement of
objects into groups or categories according to established criteria.

Comparability: The degree to which different methods, data sets, and/or decisions agree or can
be represented as similar.

Completeness: The amount of valid data obtained compared to the planned amount, usually
expressed as a percentage.

Computer-Aided Design Package: An automated system for the design, drafting, and display
of graphical information.

Continuous: A characteristic of an attribute that is conceptualized as a surface over some
region. Examples are certain attributes of a resource, such as chemical stressor indicators
measured in estuaries.

Coordinates: Linear and/or angular quantities that designate the position of a point in relation to
a given reference frame.

Data Quality Indicators: Quantitative and qualitative measures of principal quality attributes,
including precision, accuracy, representativeness, comparability, completeness, and sensitivity.

Data Quality Objectives: Qualitative and quantitative statements derived from the DQO
Process that clarify study objectives, define the appropriate type of data, and specify tolerable
levels of potential decision errors that will be used as the basis for establishing the quality and
quantity of data needed to support decisions.

Data Quality Objectives Process: A systematic tool to facilitate the planning of environmental
data collection activities. Data quality objectives are the qualitative and quantitative outputs
from the DQO Process.

Datum (plural Datums): In surveying, a reference system for computing or correlating the
results of surveys. There are two principal types of datums: vertical and horizontal. A vertical
datum is a level surface to which heights are referred. In the United States, the generally adopted
vertical datum for leveling operations is the National Geodetic Vertical Datum of 1929 (see
below). The horizontal datum is used as a reference for position. The North American Datum of
1927 (see below) is defined by the latitude and longitude of an initial point (Meade's Ranch in
Kansas), the direction of a line between this point and a specified second point, and two
dimensions that define the spheroid. The new North American Datum of 1983 (see below) is
based on a newly defined spheroid (GRS80); it is an Earth-centered datum having no initial point
or initial direction.

Digital Elevation Model: The representation of continuous elevation values over a topographic
surface by a regular array of z-values, referenced to a common datum. Typically used to
represent terrain relief.

Digital Line Graph: Digital data produced by the U.S. Geological Survey. These data include
digital information from the U.S. Geological Survey map base categories such as transportation,
hydrography, contours, and public land survey boundaries.

Digital Orthophotography: See Orthophotography

Digitizing Table: An electronic device consisting of a flat surface and a handheld cursor that
converts positions on the table to digital x,y coordinates.

Feature: An entity in a spatial data layer, such as a point, line, or polygon, that represents a
geographic object.

Federal Geographic Data Committee (FGDC): The Federal Geographic Data Committee
coordinates the development of the National Spatial Data Infrastructure (NSDI). The NSDI
encompasses policies, standards, and procedures for organizations to cooperatively produce and
share geographic data. The 17 federal agencies that make up the FGDC are developing the NSDI
in cooperation with organizations from state, local, and tribal governments, the academic
community, and the private sector.

Federal Information Processing Standard (FIPS): Standards approved by the Secretary of
Commerce under the Information Technology Management Reform Act (Public Law 104-106).
These standards and guidelines are issued by the National Institute of Standards and Technology
(NIST) as Federal Information Processing Standards (FIPS) for use government-wide. FIPS
coding standards include, for example, two-digit numeric codes used to identify each of the
50 U.S. states and three-digit numeric codes used to identify each U.S. county.

Geographic Feature: See Feature.

Geographic Information System (GIS): A collection of computer hardware, software, and
geographic data designed to capture, store, update, manipulate, analyze, and display
geographically referenced data.

Geospatial Data: The information that identifies the geographic location and characteristics of
natural or constructed features and boundaries on the earth. This information may be derived
from, among other things, remote-sensing, mapping, and surveying technologies.

Global Positioning System (GPS): A constellation of 24 satellites, developed by the U.S.
Department of Defense, that orbit the Earth at an altitude of 20,200 kilometers. These satellites
transmit signals that allow a GPS receiver anywhere on Earth to calculate its own location. The
Global Positioning System is used in navigation, mapping, surveying, and other applications
where precise positioning is necessary.

Graded Approach: The process of basing the level of managerial controls applied to an item or
work on the intended use of the results and the degree of confidence needed in the quality of
the results.

Grid: A data structure commonly used to represent map features. A cellular-based data
structure composed of cells or pixels arranged in rows and columns (also called a raster).

Ground-truthing: The use of a ground survey to confirm the findings of an aerial survey or to
calibrate quantitative aerial or satellite observations.

Imagery: Visible representation of objects and/or phenomena as sensed or detected by cameras,
infrared and multispectral scanners, radar, and photometers. Recording may be on photographic
emulsion (directly, as in a camera, or indirectly, after being first recorded on magnetic tape as an
electrical signal) or on magnetic tape for subsequent conversion and display on a cathode ray
tube.

Kriging: A weighted, moving-average estimation technique based on geostatistics that uses the
spatial correlation of point measurements to estimate values at adjacent, unmeasured points. A
sophisticated technique for filling in missing data values, kriging is named after a South African
engineer, D.G. Krige, who first developed the method. The kriging routine preserves known data
values, estimates missing data values, and estimates the variance at every missing data location.
After kriging, the filled matrix contains the best possible estimate of the missing data values, in
the sense that the variance has been minimized.

Landsat: A series of orbiting satellites used to acquire remotely sensed images of Earth's land
surface and surrounding coastal regions.

Leaf On/Leaf Off: The characteristic of deciduous vegetation based on seasonality. Refers to
whether deciduous trees have leaves during image acquisition.

Locational: Of or referring to the geographic position of a feature.

Map Digitization: Conversion of map data from graphic to digital form.

Map Projection: A mathematical formula or algorithm for translating the coordinates of
features on the surface of the Earth to a plane for representation on a flat map.

Map Resolution: The accuracy with which the location and shape of map features are depicted
for a given map scale.

Map Scale: A statement of a measure on the map and the equivalent measure on the Earth, often
expressed as a representative fraction of distance, such as 1:24,000.

Map, Thematic: Map designed to provide information on a single topic, such as geology,
rainfall, or population.

Metadata: Information about a data set. Metadata for geographical data may include the source
of the data; its creation date and format; its projection, scale, resolution, and accuracy; and its
reliability with regard to some standard.

Method, Accuracy, and Description Data: A coding scheme developed by EPA to promulgate
standards for describing the type and quality of spatial data. The coding scheme includes both
database field definitions and standardized codes.

Modeling: Development of a mathematical or physical representation of a system or theory that
accounts for all or some of its known properties. Models are often used to test the effect of
changes of components on the overall performance of the system.

National Geodetic Vertical Datum of 1929: Reference surface established by the U.S. Coast
and Geodetic Survey in 1929 as the datum to which relief features and elevation data are
referenced in the conterminous United States; formerly called "mean sea level 1929."

National Hydrography Dataset: A comprehensive set of digital spatial data that contains
information about surface water features such as lakes, ponds, streams, rivers, springs, and wells.

National Map Accuracy Standards: Specifications promulgated by the U.S. Office of
Management and Budget to govern accuracy of topographic and other maps produced by federal
agencies.

National Institute of Standards and Technology (NIST): A non-regulatory federal agency
within the U.S. Commerce Department's Technology Administration whose mission is to
develop and promote measurement, standards, and technology to enhance productivity, facilitate
trade, and improve the quality of life. NIST laboratories provide technical leadership for vital
components of the Nation's technology infrastructure needed by U.S. industry to continually
improve its products and services.

National Land Cover Data (NLCD): A nationally consistent land-cover data set developed by
the National Land Cover Characterization program.

National Spatial Data Infrastructure (NSDI): The technologies, policies, and people
necessary to promote sharing of geospatial data throughout all levels of government, the private
and nonprofit sectors, and the academic community. The NSDI was established in 1994 by
Executive Order 12906.

North American Datum of 1927: The primary local geodetic datum used to map the United
States during the middle part of the 20th century, referenced to the Clarke spheroid of 1866 and an
initial point at Meade's Ranch, Kansas. Features on U.S. Geological Survey topographic maps,
including the corners of 7.5-minute quadrangle maps, are referenced to this datum. It is
gradually being replaced by the North American Datum of 1983.

North American Datum of 1983: A geocentric datum based on the Geodetic Reference System
1980 ellipsoid (GRS80). Its measurements are obtained from both terrestrial and satellite data.

Orthophotography: Perspective aerial photography from which distortions owing to camera tilt
and ground relief have been removed. Orthophotography has the same scale throughout and can
be used as a map.

Performance Criteria: Measures of data quality that are used to judge the adequacy of collected
information that is new or original, otherwise known as "primary data."

Photogrammetry: Science or art of obtaining reliable measurements or information from
photographs or other sensing systems.

Positional Accuracy: The closeness of locational information to its true position.

Precision: (i) The degree to which replicate measurements of the same attribute agree; that is,
the degree to which a set of observations or measurements of the same property, usually
obtained under similar conditions, conform to themselves. A data quality indicator (see related:
Accuracy, Bias). (ii) The number of significant digits used to store floating point numbers
(e.g., coordinates) in a computer. Single precision denotes use of up to seven significant digits
to store floating point numbers. Double precision denotes use of up to 14 significant digits to
store floating point numbers.

Projection: A mathematical model that transforms the locations of features on the Earth's
surface to locations on a two-dimensional surface.

QA Project Plan: A document describing in detail the necessary quality assurance, quality
control, and other technical activities that should be implemented to ensure the results of the
work performed will satisfy the stated performance criteria.

Quality Assurance (QA): An integrated system of management activities involving planning,
implementation, documentation, assessment, reporting, and quality improvement to ensure that a
process, item, or service is of the type and quality needed and expected by the client.

Quality Control (QC): The overall system of technical activities that measure the attributes and
performance of a process, item, or service against defined standards to verify that they meet the
stated requirements established by the customer; also, operational techniques that are used to
fulfill requirements for quality.

Quality Management Plan: A document that describes a quality system in terms of the
organizational structure, policy and procedures, functional responsibilities of management and
staff, lines of authority, and required interfaces for those planning, implementing, documenting,
and assessing all activities conducted.

Raster Data (Raster Image): A spatial data model made of rows and columns of cells. Each
cell contains an attribute value and location coordinates; the coordinates are contained in the
order of the matrix, unlike a vector structure, which stores coordinates explicitly. Groups of cells
that share the same value represent geographic features.

Remote Sensing: Process of detecting and/or monitoring chemical or physical properties of an
area by measuring its reflected and emitted radiation.

Root Mean Square Error: The square root of the average of the set of squared differences
between dataset coordinate values and coordinate values from an independent source of higher
accuracy for identical points.

Representativeness: The degree to which data accurately and precisely represent the frequency
distribution of a specific variable in the population.

Scale: Relationship existing between a distance on a map, chart, or photograph and the
corresponding distance on the Earth.

Soil Survey Geographic (SSURGO) Data: A nationwide, geospatial, soils database created by
the Natural Resources Conservation Service from detailed soil survey maps, generally at scales
of 1:12,000 to 1:63,360.

Spheroid: An ellipsoid that approximates a sphere. Used to describe (approximately) the shape
of the earth.

SSURGO: See Soil Survey Geographic Data.

Tic: A point on a map representing a location whose coordinates are known in some system of
ground measurement such as latitude and longitude.

Topography: Configuration (relief) of the land surface; the graphic delineation or portrayal of
that configuration in map form, as by contour lines. In oceanography the term is applied to a
surface such as the sea bottom or surface of given characteristics within the water mass.

Topologically Integrated Geographic Encoding and Referencing (TIGER) System: The
data system developed by the U.S. Census Bureau to describe the boundaries of all census
geography (e.g., states, counties, census tracts) and to tie decennial census tabulations to census
boundaries.

Topology: The spatial relationships between connecting or adjacent features in a geographic
data layer. Topological relationships are used for spatial modeling operations that do not require
coordinate information.

Vector: A data structure used to represent linear geographic features. Features are made of
ordered lists of x,y coordinates and represented by points, lines, or polygons; points connect to
become lines, and lines connect to become polygons. Attributes are associated with each feature.

APPENDIX C

SPATIAL DATA QUALITY INDICATORS FOR GEOSPATIAL DATA

The Federal Information Processing Standard (FIPS) 173 (NIST, 1994) emphasized five
components of data quality that are basic to the Federal Geographic Data Committee metadata
[see Section 3.1.9, Records and Documentation (A9)]:

•	Positional accuracy

•	Attribute accuracy

•	Completeness

•	Logical consistency

•	Lineage

For geospatial data, as for other environmental data, accuracy is defined as the
closeness of results to "true" values (surveying or remote-sensing reference points). All spatial
data are inaccurate (have error) to some degree. Generally stated, error (r) is equivalent to the
difference between the estimated value and the true value. Because a certain amount of
inaccuracy is inherent in all locational measurements, the degree of inaccuracy must be assessed
and compared to the accuracy required for the final geospatial data product.

There are two kinds of geospatial data accuracy:

•	Positional Accuracy is the closeness of the locations of the geospatial features to
their true position.

•	Attribute Accuracy is the closeness of attribute values (characteristics at the
location) to their true values. This applies to accuracy of continuous attributes such as
elevation and accuracy of categorical attributes such as soil types.

Positional Accuracy

An example of the kinds of positional accuracy problems that may be encountered is
illustrated in the map of Condea Vista, in southeastern Oklahoma City.

The polygon on the map represents the boundary of a Resource Conservation and
Recovery Act site from a permit file map that was referenced to the U.S. Geological Survey
7.5-minute quad sheet and digitized. The points on the map are all estimates of the latitude/
longitude of the site derived by various methods. Note the distribution of the points. All are
valid, but some are not as accurate as others. Three points—ZIP code, PLSS, and an address
match—fall outside the facility boundaries. In systematic planning, requirements for the
project's positional accuracy need to be defined. Then, collected or acquired data are evaluated
against those requirements. Reporting requirements for data providers or data producers
document targets for accuracy (e.g., proof in labeling) and information for consumers to use in
determining fitness for use. Accuracy targets such as the FGDC's National Standard for Spatial
Data Accuracy Test Guidelines and EPA's Locational Reporting Standard of ± 25 meters might
be referenced.

Accuracy can be assessed by comparing geospatial data to a source map or data of higher
accuracy and determining statistical measures such as root mean square error and confidence
levels (e.g., error bars on kriging contours) to judge the amount of inaccuracy. A rule of thumb is
to use at least 20 points for comparison. For example:

•	Evaluation Data Set: Envirofacts Address Matching Points

•	Compared to higher accuracy source: Texas GPS border survey (20 points)

•	Projection: National Lambert Meters (North American Datum of 1983)

•	Geographic area: Brownsville, TX to Las Cruces, NM

•	Absolute difference range: x 8-669 m; y 8-1090 m

•	Root mean square error (RMSE) (x) = 187 m; RMSE (y) = 257 m

•	Accuracy = 2.4477*0.5*(RMSE(x) + RMSE(y)) = 544

•	Reporting: Tested 544 meters horizontal accuracy at 95% confidence level
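
The calculation in the example above can be sketched in Python. This is a minimal illustration
rather than an official implementation: the function names and the minimum-point check are
hypothetical, and the 2.4477 multiplier is applied only where the ratio of the smaller to the
larger RMSE lies between 0.6 and 1.0, the condition under which the FGDC National Standard for
Spatial Data Accuracy gives this approximation.

```python
import math

def rmse(errors):
    """Root mean square of a list of coordinate differences."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def nssda_horizontal_accuracy(rmse_x, rmse_y):
    """Approximate horizontal accuracy at the 95% confidence level.

    Uses 2.4477 * 0.5 * (RMSE_x + RMSE_y), an approximation intended for
    cases where RMSE_min / RMSE_max lies between 0.6 and 1.0.
    """
    largest = max(rmse_x, rmse_y)
    if largest == 0:
        return 0.0  # no measured error in either axis
    if min(rmse_x, rmse_y) / largest < 0.6:
        raise ValueError("approximation not applicable: RMSE ratio below 0.6")
    return 2.4477 * 0.5 * (rmse_x + rmse_y)

def assess(test_points, reference_points, min_points=20):
    """Compare tested (x, y) coordinates with higher-accuracy reference values."""
    if len(test_points) != len(reference_points):
        raise ValueError("each test point needs one reference point")
    if len(test_points) < min_points:
        raise ValueError("rule of thumb: use at least 20 comparison points")
    dx = [t[0] - r[0] for t, r in zip(test_points, reference_points)]
    dy = [t[1] - r[1] for t, r in zip(test_points, reference_points)]
    rx, ry = rmse(dx), rmse(dy)
    return rx, ry, nssda_horizontal_accuracy(rx, ry)

# Using the rounded RMSE values listed above (187 m and 257 m):
print(round(nssda_horizontal_accuracy(187, 257), 1))  # 543.4
```

With the rounded RMSE values, the formula yields about 543 m; the 544 m reported above
presumably reflects unrounded RMSE values.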

[Figure caption fragment: ... location result in different "answers." The method used to
determine a facility location is a data quality indicator.]

In systematic planning, it is important to set quality criteria for data or products being
produced or for those acquired from another source such as a map or spatial data set. Determine
the maximum error allowable in the product and see if it meets the project needs (e.g., EPA's
target for location information is ±25 meters by GPS). The data producer may provide or be
requested to provide statistics of accuracy for any acquired products. Identifying the steps used
to produce or create the data set would be helpful in order to document any transformations
between coordinate systems or reformatting that could impact accuracy. This could include
estimating the error in each transformation or conversion and checking on the propagation of
error between steps. For example, check a product map by comparing its projected coordinates
to known values and computing the root mean square error.
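
One common way to approximate how error accumulates across processing steps is to combine the
per-step errors as a root sum of squares. The sketch below assumes that the errors in the steps
are independent and random, which may not hold in practice. The step names and error values are
hypothetical and are not taken from this guidance.

```python
import math

def combined_error(step_errors):
    """Root sum of squares of per-step errors (same units), assuming independence."""
    return math.sqrt(sum(e * e for e in step_errors.values()))

# Hypothetical per-step positional errors, in meters.
steps = {
    "source map digitizing": 12.0,
    "datum transformation": 5.0,
    "reprojection": 4.0,
}

total = combined_error(steps)
target = 25.0  # locational accuracy target cited in this appendix, in meters
print(f"combined error: {total:.1f} m; within target: {total <= target}")
```

If the step errors are correlated or systematic, a simple sum or a full uncertainty model may be
more appropriate than this approximation.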

Attribute Accuracy

Attributes are facts tied to the Earth's surface. Attributes include qualitative facts like
soil classification for areas of the Earth's surface on a soil map and quantitative facts like slope
or population at a point on the Earth's surface. Attributes are linked to geographic features in a
geospatial database via database identifiers. Attribute errors can be introduced from direct
observation, remote-sensing interpretation, or interpolation and can affect the accuracy of the
facts. Data producers need to provide accuracy information as proof of product.

For quantitative attribute accuracy, assessments can be carried out that vary with the data
use and its complexity, such as

•	assessing standard error for quantitative data (e.g., 7-meter uncertainty in slope value
based upon known 1-meter standard deviation in elevation measurements)

•	assessing or documenting known measurement error (e.g., Landsat "striping," where
error exists in every 6th row in a scene and is removed by a simple arithmetic
operation)

•	development of uncertainty models and Monte Carlo analysis to determine
uncertainty for spatial models.
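
The Monte Carlo approach in the last bullet can be illustrated with a small simulation. This
sketch is an example under stated assumptions rather than a prescribed method: it computes
percent slope between two elevation points a fixed horizontal distance apart, assumes
independent, normally distributed elevation errors with a 1-meter standard deviation, and uses
hypothetical elevations and spacing.

```python
import random
import statistics

def simulate_slope_uncertainty(z1, z2, distance, sigma_z, trials=20000, seed=1):
    """Estimate the mean and standard deviation of percent slope under elevation error."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    slopes = []
    for _ in range(trials):
        # Perturb each elevation with an independent, normally distributed error.
        e1 = rng.gauss(z1, sigma_z)
        e2 = rng.gauss(z2, sigma_z)
        slopes.append(100.0 * (e2 - e1) / distance)
    return statistics.mean(slopes), statistics.stdev(slopes)

# Hypothetical inputs: elevations in meters, 30 m horizontal spacing.
mean_slope, sd_slope = simulate_slope_uncertainty(
    z1=100.0, z2=103.0, distance=30.0, sigma_z=1.0
)
print(f"slope = {mean_slope:.1f}% +/- {sd_slope:.1f}% (one standard deviation)")
```

For this simple two-point case the result can be checked analytically (the slope standard
deviation is the elevation standard deviation times the square root of 2, divided by the
distance); simulation becomes more useful when the spatial model has no closed-form solution.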

For qualitative attribute accuracy, assessments can be carried out for classification
errors in nominal data. A standard must be identified for comparison of the evaluated data to
"true" values, such as ground-level observations of land characteristics, and the results reported
for evaluation against an accuracy criterion, for example by using an error matrix. Such a
standard and evaluation can provide the percentage of cases that are correctly classified or a
Kappa Index, which adjusts for correct identification by chance. As part of the systematic
planning process, evaluation criteria (for example, accuracy or uncertainty criteria) need to be
developed and used in evaluation of the data for fitness for use.
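
The error matrix, percentage correctly classified, and Kappa Index described above can be
sketched as follows. The function names and the ten land-cover observations are hypothetical,
and the Kappa calculation shown is the standard Cohen's kappa, which is assumed here to be what
the text means by "Kappa Index."

```python
from collections import Counter

def error_matrix(reference, classified):
    """Count (reference class, classified class) pairs."""
    return Counter(zip(reference, classified))

def overall_accuracy_and_kappa(reference, classified):
    """Return the fraction correctly classified and Cohen's kappa."""
    n = len(reference)
    matrix = error_matrix(reference, classified)
    classes = set(reference) | set(classified)
    # Observed agreement: the diagonal of the error matrix.
    observed = sum(matrix[(c, c)] for c in classes) / n
    # Chance agreement: product of row and column totals for each class.
    ref_totals = Counter(reference)
    cls_totals = Counter(classified)
    expected = sum(ref_totals[c] * cls_totals[c] for c in classes) / (n * n)
    if expected == 1.0:
        return observed, float("nan")  # kappa is undefined with a single class
    return observed, (observed - expected) / (1.0 - expected)

# Hypothetical ground observations ("true" values) and map classifications.
reference = ["forest"] * 4 + ["water"] * 3 + ["urban"] * 3
classified = ["forest", "forest", "forest", "urban",
              "water", "water", "water",
              "urban", "urban", "forest"]

accuracy, kappa = overall_accuracy_and_kappa(reference, classified)
print(f"correctly classified: {accuracy:.0%}; kappa: {kappa:.2f}")
```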

Completeness

Completeness is defined as the degree to which the entity objects and their attributes in a
data set represent all entity instances of the abstract universe (defined by what is required for the
project's data use in systematic planning). Metadata should provide a good definition of the
abstract universe with defined criteria for selecting the features to include in the data set so the
data user can perform an independent evaluation. Missing data (incompleteness) can affect
logical consistency needed for correct processing of data by software.

Logical Consistency

A spatial data set is logically consistent when it complies with the structural
characteristics of the data model and is compatible with attribute constraints defined for the
system. In systematic planning, logical rules of structure (such as rules for topological
relationships) could be identified, as well as rules for attribute consistency needed for appropriate
data use. When acquiring data from another source or when creating new data, tests could be
planned to check spatial data against those defined requirements. For example:

•	In an electric utility application, a logical consistency rule may be in place indicating
that electrical transformers must always occur on power poles. If so, ensure that each
electrical transformer is assigned to a power pole. Those that are not are logically
inconsistent.

•	Are there valid attribute values for objects (e.g., for day-of-month attributes, the
range of values must fall between 1 and 31, inclusive)?
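
The two checks above can be expressed as simple automated tests. This is a minimal sketch: the
record layout, the field names (`pole_id`, `install_day`), and the sample records are
hypothetical.

```python
def find_inconsistencies(transformers, pole_ids):
    """Return (record id, problem) tuples for records that violate the rules."""
    problems = []
    for t in transformers:
        # Rule 1: every transformer must be assigned to a known power pole.
        if t.get("pole_id") not in pole_ids:
            problems.append((t["id"], "not assigned to a known power pole"))
        # Rule 2: the day-of-month attribute must be an integer from 1 to 31.
        day = t.get("install_day")
        if not isinstance(day, int) or not 1 <= day <= 31:
            problems.append((t["id"], "install_day outside 1-31"))
    return problems

# Hypothetical sample data.
poles = {"P-001", "P-002"}
transformers = [
    {"id": "T-1", "pole_id": "P-001", "install_day": 14},
    {"id": "T-2", "pole_id": None, "install_day": 7},
    {"id": "T-3", "pole_id": "P-002", "install_day": 32},
]

for record_id, problem in find_inconsistencies(transformers, poles):
    print(record_id, problem)
```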

Inconsistencies violate rules and constraints. Data should meet rules and constraints such
as attribute range, geometric and topological constraints, and rules for spatial relationships in
order to be used according to the project's requirements. Consistency is needed for control of
transactions in database and software operations. Without consistency, additional time and effort
must be expended to allow software to handle the inconsistencies in ways that do not propagate
or increase the errors. Evaluations need to be reported in displays or written reports to
characterize product quality.

Precision

Precision is a data quality indicator often used for environmental data that was,
unfortunately, not included in the FIPS 173 list. It is defined as the number of decimal places or
significant digits in a measurement (related to standard deviation around the mean of many
measurements and rounding off). Although GIS software transactions are often more precise
(carry more significant figures) than the data they process, errors can occur (e.g., conversion of
data from two significant figures to one, which displaces point locations, as shown in Figure 2).

When the coordinates used to represent the locations of geographic features have low precision
(that is, few significant digits), this might be an indicator of data quality that needs to be
assessed. If the precision of the coordinates in the data is not sufficient to represent the
geographic features to the degree required, this issue should be documented and a determination
made as to whether the data will accommodate their intended use.
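
The effect of coordinate precision on location can be estimated by rounding a coordinate pair
and measuring how far the point moves. The sketch below is illustrative only: the sample point
is hypothetical, and distances use the haversine formula on a sphere with a mean Earth radius of
6,371 km, which is an approximation.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical approximation

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def displacement_from_rounding(lat, lon, decimals):
    """Distance a point moves when its coordinates are rounded to `decimals` places."""
    return haversine_m(lat, lon, round(lat, decimals), round(lon, decimals))

# Hypothetical point in decimal degrees.
lat, lon = 35.42761, -97.47134
for decimals in (4, 2, 0):
    shift = displacement_from_rounding(lat, lon, decimals)
    print(f"{decimals} decimal places: point moves about {shift:,.0f} m")
```

For this point, rounding to the nearest whole degree moves the location by tens of kilometers,
which would far exceed a locational accuracy target of ±25 meters.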

Lineage

Data lineage is the description of the origin and processing history of a data set. It
includes the name of the organization that produced the data so that its policies, procedures, and
methods can be evaluated to see if they were biased in representing the surface of the Earth or its
features. For example, if lineage indicates that the U.S. Geological Survey is the originator of a
geospatial data set, then certain assumptions about their policies, procedures, and methods could
be made. For example, the U.S. Geological Survey requires that no more than 10 percent of
points tested on a map boundary be in error by more than 1/30 of an inch at a scale of 1 inch
to 20,000 feet. Lineage also provides references for data accuracy (for example, map accuracy
standards), how accuracy was determined, and corrections made in producing the source map
from which the data were derived. Lineage for general metadata provides spatial data quality
characteristics such as accuracy, precision, and scale for a series of products. Information as to
the coordinate systems used to reference locations (including necessary, unique projection
parameters that are required to fully document map projections) is also a component of lineage
information in metadata.

[Figure caption fragment: ... precision (for example, rounding up to the nearest degree of
latitude/longitude), may not precisely reflect actual locations. Precision is a data quality
indicator.]
