Options for the National Drinking Water Contaminant Occurrence Data Base / Background Document (working Draft) for the National Contaminant Occurrence Data Base Stakeholders Meeting, May 21-22, 1997, Washington, D.C


United States Environmental Protection Agency
Office of Water (4606)
EPA 815-D-97-001
May 1997
Options for
the National Drinking
Water Contaminant
Occurrence Data Base
Background Document
(Working Draft)
For the
National Contaminant Occurrence Data Base
Stakeholders Meeting
May 21-22, 1997
Washington, D.C.

-------
DISCLAIMER
This “Options” Paper represents an effort by EPA staff to consolidate into a single working
draft Background Document a number of suggestions and ideas generated in the course of
discussions by the Office of Ground Water and Drinking Water’s National Contaminant
Occurrence Data Base Team. This draft will be subject to extensive revision, development,
and qualification as the Agency proceeds through both the external public and internal EPA
deliberative processes. The information presented in this document is a discussion of
possible options available to the EPA and should not be interpreted as EPA policy.

-------
Table of Contents
Executive Summary
Part One: Overview
I. Introduction
II. Legislative Basis
III. Relation to Other Safe Drinking Water Act Activities
IV. Need for State, Public and Scientific Community Input
Part Two: Data Base Development and Use
V. What will the Data Base be used for?
VI. How should the EPA make the data in the Data Base readily accessible to the public?
VII. Which contaminants not currently regulated or on the unregulated contaminant list should be included in the Data Base?
VIII. What information is required for reliable, scientifically sound data to support the Administrator’s decisions to regulate a contaminant?
IX. What options should be considered for the design and structure of the Data Base?
X. What sources of data should the Data Base draw on and should historical data be included?
    A. Sources of Data
        1. Data to be Included in the Data Base
        2. Potential Data Sources
            a. Safe Drinking Water Information System (SDWIS)
            b. STOrage and RETrieval System (STORET)
            c. National Water Information System (NWIS) - U.S. Geological Survey
            d. Other Federal Data Bases
            e. State Data Bases
            f. Private Data Bases
    B. Historical Data
XI. What factors and methods may be important in selecting and interpreting the data and in its use by many users?
XII. What are the Next Steps beyond this Stakeholders Meeting?

-------
EXHIBITS
Statutory Requirements for Occurrence Data Base and Related Activities
Timeline: Requirements of the SDWA Amendments of 1996
NCOD Relation to the Contaminant Candidate List and Administrator’s Determinations
TEXT BOX 1 - National Drinking Water Advisory Council, Working Group on Occurrence and Contaminant Selection; Proposal for the First Contaminant Candidate List
TEXT BOX 2 - What quality assurance and quality control steps are important to consider in specifying data elements and selecting data for inclusion in the NCOD?
TEXT BOX 3 - Safe Drinking Water Information System (SDWIS) Unregulated Contaminant Monitoring Minimum Data Elements Reported to EPA
Concept Flow Chart - Option 2
TEXT BOX 4 - Analysis of Tetrachloroethylene and Trichloroethylene in Public Water Systems
APPENDICES
A - SDWAA Section 126, Occurrence Data Base
B - Unregulated Contaminant List
C - Priority List of Substances Which May Require Regulation Under The Safe Drinking Water Act
D - Safe Drinking Water Information System (SDWIS)
E - STOrage and RETrieval System (STORET)
F - National Water Information System (NWIS)
G - Envirofacts
H - Safe Drinking Water Information System (SDWIS) Sampling Business System Data Elements
I - Glossary
J - Data Tables for TEXT BOX 4

-------
EXECUTIVE SUMMARY
Background Document
Options for
The National Drinking Water Contaminant Occurrence Data Base
The Environmental Protection Agency (EPA) requests input from the public, States and the
scientific community on the design and structure, input parameters and requirements, and
use and interpretation of a National Drinking Water Contaminant Occurrence Data Base.
The Safe Drinking Water Act Amendments of 1996 (SDWA Amendments, Section 126)
require establishing a National Drinking Water Contaminant Occurrence Data Base to:
• Include both regulated and unregulated contaminants
• Identify contaminants that may be placed on the Contaminant Candidate List
• Support the Administrator’s determinations to regulate contaminants in the future
• Support the review of existing regulations every six years and of monitoring
requirements
• Make the data base available to the public in readily accessible form
• Be assembled by August 1999, and maintained thereafter.
Section 125 of the Safe Drinking Water Act Amendments of 1996 provides the EPA
Administrator the authority to require reporting of parametric data for both regulated and
unregulated contaminants to support regulatory decisions. The approach described below
relies principally on electronic data reporting.
Specific questions that EPA needs input on are:
(1) What will the National Drinking Water Contaminant Occurrence Data Base (NCOD or
“the Data Base”) be used for?
EPA decisions affected by the Data Base include the Administrator’s contaminant regulation
determination, the completion of the second Contaminant Candidate List, and the six-year
review of existing drinking water standards and monitoring requirements.
(2) How should the EPA make the data in the Data Base readily accessible to the public?
Several options exist: use of the Internet, preparation of a National Summary compiled from
the Data Base, and use of the Safe Drinking Water Hotline.
(3) Which contaminants that are not currently regulated or on the unregulated contaminant
list should be included in the Data Base?
The 1991 Drinking Water Priority List, the Superfund contaminant list and the Pesticide
registration list provide starting points for contaminants to be considered for inclusion in the
Data Base.
(4) What information is required for reliable, scientifically sound data to support the
Administrator’s decisions to regulate a contaminant?
Data quality will be the product of sampling and laboratory practices that are known and
documented. These practices should be documented in the data base to allow comparison
of data nationally across States and other data sources. Such documentation may include
an indicator of data quality or specifically identify such factors as analytical method used and
detection limit. Additional monitoring information necessary to support regulatory and other
uses of the Data Base needs to be discussed.
(5) What options should be considered for the design and structure of the Data Base?
Because resources are limited for a second drinking water data base, electronic connection
to data bases such as the Safe Drinking Water Information System (SDWIS), the STOrage
and RETrieval System (STORET) and the U.S. Geological Survey’s National Water
Information System (NWIS), as well as others, may be one useful option, with SDWIS being
considered for the “core” of the Data Base for public water system data. Other options are
a stand-alone data base or use of summary data in a scaled down data base.
(6) On what sources of data should the Data Base draw and should historical data be
included?
The law identifies parametric data for both regulated and unregulated contaminants. Data
sources may include SDWIS for contaminants in drinking water and STORET, NWIS and
State ambient water quality data bases for contaminants in source waters that are likely to
occur in public water systems. Other reliable public and private sources of data may be
used.
(7) What factors and methods may be important in selecting and interpreting the data and
in its use by many users?
The primary objective in the interpretation of the data is to support the Administrator’s
decision about whether to regulate a contaminant in the future. Reliability of the data is a
key factor identified in the law. Statistics and the methods of other relevant fields will be
applied to analyzing and interpreting the data. The data of the NCOD will need to be
related to geographic, environmental and health effects data.
Several decisions need to be made in the next two to three years: the Administrator’s decision
whether to regulate at least five contaminants by 2001; the second Contaminant Candidate
List by 2001, to be specifically supported by this Data Base; and the six-year review of
existing regulated contaminant standards by 2002. These activities indicate the need for the
Data Base to have reliable data for use as soon as possible. However, because of the
potential complexity of assembling the NCOD, a practical option may be establishing it in
“phases”, such as making it operational for public water system data first, then for EPA
ambient water quality data, followed by other Federal and State ambient water data.
The Stakeholders Meeting on May 21-22, 1997, in Washington, D.C. will examine these
questions with a wide range of participants representing the public, States, the drinking
water industry, and environmental organizations.

-------
Background Document
Options for
The National Drinking Water Contaminant Occurrence Data Base
Presented to the
Stakeholders Meeting
May 21-22, 1997
Ariel Rios Building, Room 6226
1200 Pennsylvania Avenue
Washington, D.C.
Part One: Overview
I. Introduction
The U.S. Environmental Protection Agency (EPA) requests input from the public, States and
the scientific community on the design and structure, input parameters and requirements,
and use and interpretation of a National Drinking Water Contaminant Occurrence Data
Base. The Safe Drinking Water Act Amendments of 1996 require the establishment of a
National Drinking Water Contaminant Occurrence Data Base (NCOD or “the Data Base”)
for both regulated and unregulated contaminants (SDWA Amendments, Section 126). This
data base is to provide the basis for identifying contaminants that may be placed on the
Contaminant Candidate List (SDWA Amendments, section 102(b)) and to support the
Administrator’s determinations to regulate contaminants in the future. The data base is also
expected to support the review of existing regulations every six years and of monitoring
requirements. The law indicates that information from the data base is to be available to the
public in readily accessible form. The data base must be assembled by August 1999, and
maintained thereafter. The approach described below relies principally on electronic data
reporting.
Specific questions that EPA needs input on are:
(1) What will the Data Base be used for?
(2) How should the EPA make the data in the Data Base readily accessible to the public?
(3) Which contaminants that are not currently regulated or on the unregulated contaminant
list should be included in the Data Base?
(4) What information is required for scientifically sound data to support the Administrator’s
decisions to regulate a contaminant?
(5) What options should be considered for the design and structure of the Data Base?
(6) On what sources of data should the Data Base draw and should historical data be
included?
(7) What factors and methods may be important in selecting and interpreting the data and
in its use by many users?
The purpose of this document is to provide the States, public water systems (PWS), the public
and other federal agencies with an understanding of the current perspective being applied
to the development of the National Contaminant Occurrence Data Base within the
Environmental Protection Agency. The document describes the legislative basis for the data
base, its relation to other drinking water program activities, options currently being
considered for its development, proposed data elements to ensure data reliability,
possibilities for its use and interpretation, current contaminant occurrence analysis,
descriptions of existing data bases that might be considered for inclusion in the NCOD, and
future steps to develop the data base.
II. Legislative Basis
By August 1999, the Environmental Protection Agency is to assemble and maintain a
National Drinking Water Contaminant Occurrence Data Base (SDWA Amendments,
Section 126; see Appendix A). To accomplish this assignment from Congress, EPA is to:
• Use information on occurrence of both regulated and unregulated contaminants in
PWS;
• Use reliable information from other public and private sources;
• Obtain input from interested parties and Science Advisory Board (SAB) on structure
and design, input parameters and requirements, and use and interpretation of the
data;
• Solicit recommendations of National Academy of Sciences (NAS) and States, and
any interested parties can provide recommendations, on contaminants to be
included, including additional unregulated contaminants;
• Make data available to public in readily accessible form;
• Include detection of regulated contaminants at a quantifiable level in a PWS;
• Include unregulated contaminant data for PWS above 10,000 population and for
representative sample of PWS serving 10,000 or fewer; and
• Use data from the data base in making determinations for which contaminants to
regulate in the future.
Under Section 125 of the SDWA Amendments (amending section 1445(a)(1)(A)),
the EPA Administrator has the authority to require submission of parametric data and other
information to establish regulations and for other purposes. Section 125 also sets out a
specific process for requiring the monitoring and reporting of unregulated contaminants. All
PWSs serving more than 10,000 people and only a representative sampling of systems
serving 10,000 or fewer people are required to monitor and report data for inclusion in the
NCOD.
III. Relation to Other Safe Drinking Water Act Activities
The National Contaminant Occurrence Data Base supports many other activities identified
in the SDWA Amendments. These activities are described in the table below and appear
in the Timeline of “Requirements of the SDWA Amendments of 1996”:
Statutory Requirements for Occurrence Data Base and Related Activities

Date                                    Activity
February 1998 and then every 5 years    Publish Drinking Water Contaminant Candidate List
August 1998                             Review monitoring requirements for not fewer than 12 regulated
                                        contaminants and promulgate any necessary modifications
August 1999                             Assemble National Drinking Water Contaminant Occurrence Data
                                        Base for regulated and unregulated contaminants
August 1999                             Issue Regulations for Unregulated Contaminant Monitoring
August 1999 and then every 5 years      Publish Unregulated Monitoring List for not more than 30 chemicals
August 2001 and then every 5 years      Publish Determinations of whether or not to issue regulations
                                        for at least 5 contaminants
August 2002 and then every 6 years      Review existing regulations and monitoring requirements and
                                        modify as appropriate
August 2003 and then every 5 years      Issue Proposed Maximum Contaminant Level Goals and Regulations
                                        for Selected Contaminants
February 2005 and then every 5 years    Issue Final Maximum Contaminant Level Goals and Regulations
                                        for Selected Contaminants
The Data Base is specifically identified to support the Drinking Water Contaminant Candidate
List that will identify contaminants for future regulatory consideration and the Administrator’s
determination as to whether to regulate these contaminants. These relationships are shown
in “NCOD Relation to the Contaminant Candidate List and Administrator’s Determinations.”
IV. Need for State, Public and Scientific Community Input
Section 126 is very specific about input from States, the public and the scientific community:
• EPA is to solicit recommendations from the Science Advisory Board, the States and
other interested parties concerning the development and maintenance of the data
base, including its structure and design, data input parameters and requirements, and
data use and interpretation. (Paragraph (g)(2))
• EPA is to periodically solicit recommendations from officials of the National Academy
of Sciences and the States, and any person may submit recommendations, with
respect to the contaminants that should be included in the data base. Such
recommendations shall include reasonable documentation that (a) the contaminant
occurs or is likely to occur in drinking water, and (b) the contaminant poses a risk to
public health. (Paragraph (g)(4))
EPA intends that this Stakeholders Meeting, and others that are planned, will give as many
people as possible the opportunity to provide input to the development of the NCOD.
Comments are requested on the questions posed in this document and on questions not
specifically addressed here that relate to NCOD development.
EPA Regional Role
The EPA Regional Offices will solicit the states and various programs within their respective
regions for input and recommendations concerning the development of the Data Base,
including its structure and design, data input parameters and requirements, and data use
and interpretation. The Regional Offices will also review and comment on all regulation and
guidance documents relating to or affecting the National Contaminant Occurrence Data
Base. This input will be provided to the EPA Office of Water.

-------
Requirements of the SDWA Amendments of 1996
[ Figure: timeline beginning with the SDWA Amendments of August 6, 1996, and showing three
rounds (Round 1, Round 2, Round 3) of the following actions, each on a cycle that repeats
every 5 years:
• Publish Drinking Water Contaminant Candidate List
• Publish Unregulated Monitoring List and Requirements for not more than 30 contaminants
• Publish Determinations (regs, no regs) for at least 5 contaminants, with Proposals at that
time or within 2 years ]
Requirements for the review of existing monitoring requirements (24 mo), or review and
revision of NPDWRs (each 6 yrs), are not included above.

-------
NCOD RELATION TO THE CONTAMINANT CANDIDATE LIST
AND ADMINISTRATOR’S DETERMINATIONS
[ Figure: flow chart with three columns of boxes ]
DATA SOURCES
• States and Public Water Systems
• Reliable Information from Other Public and Private Sources
NCOD
• Regulated Contaminant Occurrence (Parametric) Data
• Unregulated Contaminant Occurrence Data
• Other Contaminant Occurrence Data
USES AND ACTIONS
• Review of Current Contaminant Regulation and Monitoring Requirements
• Administrator’s Determination of Adequacy or Modification of Existing Regulations
• Contaminant Candidate List
• Contaminants Selected for Administrator’s Determination on Future Regulation
• Contaminants Needing More Information
• Unregulated Contaminant List Revised (every 5 years)
• Contaminants Selected for Monitoring under Unregulated Contaminant Monitoring Regulation
• Remaining Contaminants for which Information is Needed

-------
Part Two: Data Base Development and Use
V. What will the Data Base be used for?
It is anticipated that once the Data Base is assembled, it will have many other uses,
including supporting other Federal and State programs that rely on drinking water standards
and data for their operation, and assisting the public in understanding the quality of the
water they drink. A key factor in incorporating data into the Data Base will be the reliability
of the data to support scientifically sound Administrative decisions concerning future drinking
water standards and monitoring. Notably, reporting non-detections of contaminants (that
is, contaminants tested for but not found) may be as important as reporting detections in
determining the size and use of the National Contaminant Occurrence Data Base.
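The importance of non-detections can be illustrated with a small calculation. The following Python sketch is illustrative only: the function name, the reporting limit, and the sample values are hypothetical and are not drawn from any EPA data base. It shows that the apparent frequency of occurrence of a contaminant changes sharply when samples in which the contaminant was tested for but not found are left out of the record.

```python
def detection_frequency(results, reporting_limit):
    """Return the fraction of analyzed samples at or above the reporting limit.

    `results` holds one entry per sample analyzed; None marks a non-detection
    (the contaminant was tested for but not found).
    """
    if not results:
        raise ValueError("no samples reported")
    detects = [r for r in results if r is not None and r >= reporting_limit]
    return len(detects) / len(results)


# Hypothetical results in micrograms per liter for ten samples.
all_samples = [None, None, 2.0, None, None, None, 5.0, None, None, None]

# The same data set if only detections had been reported.
detections_only = [r for r in all_samples if r is not None]

with_non_detects = detection_frequency(all_samples, reporting_limit=0.5)
without_non_detects = detection_frequency(detections_only, reporting_limit=0.5)

print(f"Detections and non-detections reported: {with_non_detects:.0%}")
print(f"Detections only reported: {without_non_detects:.0%}")
```

With the non-detections included, two of ten samples show the contaminant; with detections alone, every reported sample does, and the number of samples actually analyzed cannot be recovered.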
The NCOD must be able to meet various OGWDW requirements including: the contaminant
identification process; the regulatory development process; the unregulated development
process; the regulatory review process; economic analysis for future regulation (ground
water disinfection rule, arsenic rule, radon rule, disinfection-by-products rule); and other
drinking water contaminant regulatory activities.
While the National Contaminant Occurrence Data Base is specifically to be used to support
the Contaminant Candidate Listing and Contaminant Selection processes, the data will be
used in other EPA drinking water activities. The data will be drawn on to review the current
regulations and monitoring requirements to determine whether they should be modified.
States may draw on the data to compare drinking water contaminant occurrence nationally
to their experiences locally through the required consumer confidence reporting. Other
environmental programs may also use the data to assist them in determining existing
contaminant levels for prevention, permit and remedial activities. The data should also be
able to assist in particular in guiding source water protection actions.
Preliminary EPA Staff Proposal: Use the data for determining which contaminants to
regulate in the future, which contaminants will be on the unregulated contaminant monitoring
list, and for reviewing existing drinking water standards and monitoring requirements for
possible future adjustment.
VI. How should the EPA make the data in the Data Base readily accessible to the
public?
Several options exist to make the data in the Data Base readily accessible to the public:
(1) User-friendly electronic connection via the Internet could provide electronic lists of
contaminants by geographic reference, maps of contaminant occurrence and standard
Consumer Confidence Report responses;
(2) A national summary of data in the Data Base could provide statistical overview of
contaminants being tracked in the Data Base; and
(3) Information could be provided through the Safe Drinking Water Hotline.
Relation to Consumer Confidence Reporting
The Consumer Confidence Reporting required under Section 114 of the SDWA Amendments
should be based on the same data that will be reported to the NCOD. Options are being
considered about what data may be required for Consumer Confidence Reports. Additionally,
options for the NCOD might include a report, available for download by electronic users of
the NCOD or in hard copy, that presents a standard set of drinking water contaminant
occurrence information for a particular public water system or area.
Preliminary EPA Staff Proposal: Use the Internet, National Summary and the Hotline to
make data readily accessible to the public. Continue to identify other means of making the
data accessible to the public. Determine the appropriate relationship to Consumer Confidence
Reporting.
VII. Which contaminants not currently regulated or on the unregulated contaminant
list should be included in the Data Base?
The process to identify and select contaminants for future regulation started with the 1991
EPA “Priority List of Substances Which May Require Regulation Under the Safe Drinking
Water Act.” (Federal Register, Vol. 56, No. 9, pp.1470-1473.) The Safe Drinking Water Act
also provides for monitoring of contaminants that, while not regulated, are considered to be
candidates for further evaluation in the regulatory process. These latter contaminants are
referred to as “unregulated contaminants,” which comprise a list of 45 contaminants. (The
Unregulated Contaminant Monitoring List and the Priority List of Substances which may
Require Regulation Under the SDWA are Appendices B and C.) The National Drinking
Water Advisory Council (NDWAC) is a group established under SDWA to advise EPA on
drinking water issues. It formed a working group to address occurrence and contaminant
selection and recently has developed a preliminary list of contaminants (including pathogens)
to be considered by EPA for listing and possible future regulation. Text Box 1 identifies
contaminants that the Working Group on Occurrence and Contaminant Selection has
indicated should comprise the first Contaminant Candidate List from which the Administrator
may choose not fewer than five contaminants to make determinations about whether they
should be regulated or not. The contaminants on these lists may be a starting point for
considering which contaminants to include in the Data Base. After February 1998, EPA is to
revise the Contaminant Candidate List every five years and is developing a “Contaminant
Identification Method” to accomplish this revision, which is to be supported by information from
the NCOD (SDWA Amendments, Section 102(a)).
The SDWA Amendments place some additional requirements on EPA. EPA is to solicit
recommendations from officials of the National Academy of Sciences and States, and the
public may submit recommendations, with respect to contaminants for inclusion in the Data
Base. For a contaminant to be considered, reasonable documentation should show that the
contaminant occurs or is likely to occur in drinking water and poses a risk to public health.
For some potential contaminants, only production and use data may be available. Should
these potential contaminants be included in the Data Base? If so, what data should be
obtained for useful analyses?
Preliminary EPA Staff Proposal: Consider all sources of information for potential
contaminants that may be included in the NCOD, and draw on EPA’s Contaminant
Identification Method to identify contaminants for inclusion in the Data Base.

-------
TEXT BOX 1
National Drinking Water Advisory Council
Working Group on Occurrence and Contaminant Selection
Proposal for the First Contaminant Candidate List
The National Drinking Water Advisory Council’s Working Group on Occurrence
and Contaminant Selection met on April 3-4, 1997, to develop recommendations
for the EPA in developing the first Contaminant Candidate List. The initial
Candidate List (below) was developed from a starting list prepared by the
Agency of approximately 370 contaminants. The initial Candidate List includes
contaminants from the STORET (Storage & Retrieval) database which had data
indicating concentrations in ambient waters that exceeded the health effect level
as calculated from the IRIS (Integrated Risk Information System). The STORET
database houses occurrence data on contaminants found in ambient ground and
surface water, and includes some drinking water data. It also contains
contaminants from the US Geological Survey’s National Water Quality
Assessment (NAWQA) program that were found in 10% or more of the samples
(e.g., MTBE, and several pesticides).
Additional contaminants were identified from the 1991 Drinking Water Priority
List which the Working Group felt may be of concern to drinking water, but where
additional data evaluation is necessary. Recommendations for microorganisms
will be developed after consultation with a number of experts in the field of
microbiology.
The following contaminants are proposed to be on the initial list:
Acetochlor                                    Endosulfan                            o-cresol
Acetone                                       EPTC (s-ethyl-dipropylthiocarbamate)  Phenol
Alachlor                                      Ethylene glycol                       Prometon
Aldicarbs                                     MEK (methyl ethyl ketone)             Propanil
Aluminum                                      MTBE (methyl t-butyl ether)           Sulfate
Butylate                                      Malathion                             Tebuthiuron
Chlorpyrifos                                  Methyl parathion                      Terbacil
DCPA (Dacthal and metabolites)                Metolachlor                           Triazines and degradates
DDE (p,p’-dichlorodiphenyldichloroethylene)   Metribuzin                            Trifluralin
Diazinon                                      Nickel                                Zinc
Diuron                                        Nitrobenzene
• The microorganisms for proposal should be drawn from the following group,
based on recommendations to be developed by an EPA-convened expert
panel:
Acanthamoeba     Cyclospora            Mycobacteria
Adenovirus       Hepatitis E virus     Norwalk virus
Aeromonas        Hepatitis A virus     Rotavirus
Astroviruses     Helicobacter pylori   Viruses (entero)
Campylobacter    Microsporidia

-------
VIII. What information is required for Reliable, Scientifically Sound Data
to Support the Administrator’s Decisions to Regulate a Contaminant?
A. Data Quality and Data Elements
A major issue regarding development of the National Drinking Water Contaminant
Occurrence Data Base concerns the quality and selection of data to be included
in the Data Base. Data quality will in turn affect use and interpretation of the data.
The SDWA Amendments of 1996 indicate that the Administrator shall assemble
and maintain the data base for reliable information from public and private
sources, in addition to that data for regulated and unregulated contaminants in
drinking water from public water systems. The characteristics of “reliable data”
need to be described for the purposes of data use and especially for the
Administrator’s decisions on contaminant regulation. A practical question is: how
can measures of reliability be made essential elements of the NCOD so that a
user can be informed that the data meet a given level of quality? Considerations
of “reliability” may have a major influence on the data base and which data are
used for different types of decisions.
Data reliability refers in part to the soundness of the previously applied monitoring
and laboratory procedures used to generate the data. Laboratories typically
adhere to defined quality assurance/quality control (QA/QC) procedures to ensure
that results are reproducible and are of known and acceptable precision.
According to Standard Methods for the Examination of Water and Wastewater,
American Public Health Association, 1992, “quality assurance is a set of operating
principles that will produce data of known and defensible quality.” In contrast,
quality controls are initial demonstrations of a laboratory’s capability to produce
credible results. EPA’s data quality requirements are addressed in “EPA
Requirements for Quality Assurance Project Plans for Environmental Data
Operations” (EPA QA/R-5, August 1994).
Measures of precision, accuracy, representativeness, and comparability (PARC)
are examples of indicators typically used to assess the quality of field methods
and laboratory results. These measures need to be transformed into data
elements that can be reported in the Data Base. Precision is the degree of
agreement among a set of repeated measurements or replicate samples.
Accuracy is a measure of confidence in a measurement. It is the extent of
agreement between an observed value and the true value. Accuracy can be
measured through the use of quality control standards derived from known
concentrations. Representativeness is the extent to which measurements actually
depict the true environmental condition or population being evaluated. A number
of factors may affect the representativeness of the data, including sampling
location. For example, samples collected just below a pipe outfall are not
representative of average conditions of a stream and will introduce bias to the
measurements. Likewise, samples taken just below the water table of an
unconfined aquifer will not be representative of water quality 200 feet deeper
where a public water well may be drawing its water. Comparability is the extent
to which data from one study can be compared directly to either past data or data
from another study. Standardized sampling and analytical methods as well as
consistent units of reporting and site selection procedures help ensure
comparability. Comparability is improved by the use of the same definition for any
particular data element across the data sources used. Measures of precision,
accuracy, representativeness, and comparability help to evaluate sources of
variability and error and thereby increase confidence in the data. (The Volunteer
Monitor’s Guide to Quality Assurance Project Plans, U.S. EPA Office of Wetlands,
Oceans and Watersheds, 1996).
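The precision and accuracy measures described above can be illustrated with a short calculation. The following is a minimal sketch in Python, assuming that precision is expressed as percent relative standard deviation of replicate results and accuracy as percent recovery of a known standard. The function names and sample values are hypothetical and are not drawn from any EPA system.

```python
from statistics import mean, stdev


def relative_standard_deviation(replicates):
    """Precision: sample standard deviation as a percentage of the mean."""
    if len(replicates) < 2:
        raise ValueError("need at least two replicate measurements")
    avg = mean(replicates)
    if avg == 0:
        raise ValueError("mean of replicates is zero; RSD is undefined")
    return 100.0 * stdev(replicates) / avg


def percent_recovery(observed, true_value):
    """Accuracy: observed result as a percentage of the known (true) value."""
    if true_value == 0:
        raise ValueError("true value must be nonzero")
    return 100.0 * observed / true_value


# Hypothetical replicate results (ug/L) for a QC standard prepared at 10.0 ug/L.
replicates = [9.8, 10.1, 10.0, 9.9, 10.2]
rsd = relative_standard_deviation(replicates)
recovery = percent_recovery(mean(replicates), 10.0)
```

Values such as these could be reported as data elements alongside each result, so that users can judge precision and accuracy without access to the laboratory's records.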
In designing the National Contaminant Occurrence Data Base, it may be possible
to create an index of data quality or reliability that could be included as a data
field and used as the basis for searching the NCOD on broad data quality criteria.
The index may be an overall score of data quality provided by the data source and
be based on very specific guidelines provided by EPA. The indicator may be tied
to a specific data collection effort or project, or another relevant unit of record.
Scoring may also be a good way to tier the data in the NCOD for different types
of decisions based on specific score ranges. Another approach may be to identify
specific data elements that should be addressed at each tier as “indicators” of
data quality, with tiers being related to the decisions to be made or the uses of the
data.
It may make good sense to define different “tiers” of data quality for different data
applications. For example, if the primary purpose of the NCOD is to support
regulation development, as specified by the Amendments, a higher order of data
quality will be needed than when the data are used for basic contaminant
identification. In general, the more stringent the data quality demands, the fewer
the number of data sets that will meet those demands. The range of potential
uses of the NCOD is extremely broad and includes, in addition to regulatory
development, pollution prevention, contaminant identification, co-occurrence
analysis, and public right-to-know, among other uses. In this context, it may be
useful to define two or three levels of data quality that are practical for the range
of potential uses of the NCOD. The general public may want the greatest number
of data sources at their fingertips and be less concerned with data quality. It
should be kept in mind, however, that the lower the quality standards, the greater
the danger of misuse and misinterpretation of the data because of their unknown
or poor quality.
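One way to picture the index and tier concepts discussed above is a simple checklist score mapped to tiers. The sketch below is illustrative only: the checklist items, the cut-off scores, and the tier labels are all assumptions made for the example, since the actual guidelines would be issued by EPA.

```python
# Hypothetical checklist items; a real index would follow EPA-issued guidelines.
CRITERIA = (
    "certified_lab",
    "approved_method",
    "detection_limit_reported",
    "qc_samples_within_limits",
    "sampling_protocol_documented",
)

# Hypothetical tier cut-offs: (minimum score, tier label), highest tier first.
TIERS = (
    (5, "Tier 1: regulation development"),
    (3, "Tier 2: contaminant identification"),
    (0, "Tier 3: public information only"),
)


def quality_score(source):
    """Count how many checklist items a data source satisfies."""
    return sum(1 for name in CRITERIA if source.get(name, False))


def assign_tier(score):
    """Return the label of the highest tier whose minimum score is met."""
    for minimum, label in TIERS:
        if score >= minimum:
            return label
    raise ValueError("score must be non-negative")


# Hypothetical data source that documents three of the five items.
example_source = {
    "certified_lab": True,
    "approved_method": True,
    "detection_limit_reported": True,
}
example_tier = assign_tier(quality_score(example_source))
```

Storing the score itself, rather than only the tier, would allow the cut-offs to be revised later without rescoring every data source.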
In general, data reliability also depends on the correct interpretation and
use of the NCOD data. This is particularly a concern when data from multiple
sources are being combined in an analysis. Understanding the behavior of the
data on a macro level, through an exploratory data analysis, can help to
determine correct application of the data. Such an analysis can also be useful to
identify outliers and characterize the temporal and spatial coverage of the data.
In the future, EPA may be able to provide general guidelines to those analyzing
data in the NCOD. Text Box 2 addresses factors contributing to reliable data and
associated data elements. The measurable or identifiable characteristics of data
quality should be captured in the data elements reported with the other
contaminant data to allow for comparability of data across sources or time.
Preliminary EPA Staff Proposal: Data for regulated and unregulated
contaminants submitted by States based on certified laboratory analyses are
considered reliable. Include indicator information for sample collection,
laboratory analysis and data quality, such as analytical method and detection limit,
to ensure reliable, scientifically sound data are used to make the appropriate
decision and keep the NCOD manageable for its principal uses.

-------
Text Box 2
What quality assurance and quality control steps are important to consider
in specifying data elements and selecting data for inclusion in the NCOD?
Specific quality assurance/quality control information will need to be associated
with the data on contaminant occurrence so that scientifically sound decisions on
future drinking water regulations can be made. But which of the following quality
control data elements should be used to define the quality of the data?
1. What sample handling protocols for sample collection, preservation, etc. were
followed?
2. What method did the lab use to analyze the sample? Is it an approved EPA
method? Is it a method generally accepted by the scientific community? What is
the precision and accuracy of the method?
3. What analytical equipment was used by the lab to perform the analysis?
4. Has the lab conducted precision, accuracy, and method detection limits studies
for the analysis in question? What is the % Relative Standard Deviation (RSD)?
What was the percent recovery to the true value?
5. Many of the methods specify quality control (QC) procedures which must be
followed to ensure accurate and precise data. Did the lab follow the QC
procedures specified by the method?
6. Were QC samples analyzed along with the samples? Contamination is a common
source of error in both sampling and analytical procedures. QC samples help
identify when and how contamination might occur. For most projects, there is no
set number of field or laboratory QC samples which must be taken. The general
rule is that 10% of samples should be QC samples. Data quality is determined
by evaluating the results of all the quality control samples and determining
precision and accuracy.
7. Did the lab method blank demonstrate low system background by verifying that
contamination does not exist above an established acceptable level?
8. Were duplicate samples and matrix spikes analyzed?
9. What detection limit was used by the lab? The term detection limit can apply to
monitoring and analytical instruments as well as to methods. In general, detection
limit is defined as the lowest concentration of a given pollutant that a particular
method or piece of equipment can reliably distinguish from zero. Readings that fall below the
detection limit are too unreliable to use in the data set. Furthermore, as readings
approach the detection limit (that is, as they go from higher, easier-to-detect
concentrations to lower, harder-to-detect concentrations) they become less and
less reliable. The instrument detection limit (IDL) is equal to three times the
standard deviation of a series of ten replicate measurements of the calibration
blank signal. The method detection limit (MDL) is the minimum concentration of
a substance that can be measured and reported with 99% confidence that the
analyte concentration is greater than zero.
10. Has the lab that produced the data demonstrated its capability to perform the
analyses through successful participation in performance evaluation (PE) studies?
Did it meet the acceptance criteria normally applied to data in Water Supply or
Water Pollution PE studies during the time frame of when the data was
generated?
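The instrument detection limit described in item 9 can be worked through numerically. The sketch below assumes the sample (n - 1) standard deviation is intended, which the text does not specify, and uses hypothetical blank readings. The flagging helper is likewise an illustration of how a below-limit reading might be labeled rather than silently dropped.

```python
from statistics import stdev


def instrument_detection_limit(blank_signals):
    """IDL: three times the standard deviation of ten calibration-blank replicates."""
    if len(blank_signals) != 10:
        raise ValueError("expected ten replicate blank measurements")
    # Assumption: sample standard deviation (n - 1 denominator).
    return 3.0 * stdev(blank_signals)


def flag_result(value, detection_limit):
    """Label a reading relative to the detection limit."""
    return "below detection limit" if value < detection_limit else "detected"


# Hypothetical calibration-blank signals, in concentration units (ug/L).
blanks = [0.10, 0.12, 0.11, 0.09, 0.10, 0.11, 0.10, 0.12, 0.09, 0.11]
idl = instrument_detection_limit(blanks)
```

Reporting the detection limit and the flag with each result lets later users decide how to treat non-detects in their own analyses.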
How should the factors identified above be reflected in the NCOD data elements?
Some specific items of information that affect data quality, based on the questions
and discussion above, are:
Sample Collection and Handling:
Sampling point/location
Collection time/date
Collection method
Sample container used
Sample volume
Storage temperature
Sample Dechlorinated (Y/N), with what?
Sample Acidified (Y/N), with what?
Preserved sample pH
Laboratory Analysis:
Sample type
Method Used for analysis
Instrument used
Contaminants
CAS number
Analytical results
Unit of measurement
Detection level
Data Quality:
Data Quality (Accepted/Rejected/Preliminary)
Have the analytical results met established data quality criteria discussed in this
section?
• Followed proper sampling protocols.
• Analysis performed by a certified lab.
• Followed method QC procedures.
• Demonstrated precision and accuracy for the method used.
• Method Blank analyzed and results were within established acceptable limits.
• QC samples analyzed and results were within established acceptable limits.
• Duplicate samples analyzed and results were within established acceptable limits.
• Matrix spikes analyzed and results were within established acceptable limits.
• Met Performance Evaluation Studies criteria during the time frame of when the
data was generated.
Which of these are key data elements that could be indicators of data quality for the
NCOD? To make the Data Base manageable, could criteria be developed to rate
data sources for quality or use, then reference the data source and report
analytical method and detection limit for data quality?

-------
B. Additional Information Requirements
In addition to data quality factors reflecting sampling and laboratory practices,
other information is needed to fully use the NCOD data. The law identifies data
on regulated and unregulated contaminants as necessary for inclusion in the
NCOD. The SDWA Amendments then indicate that the Data Base may include
data on other contaminants that may be of concern in the future and establishes
a process to identify these contaminants. Since the data on this range of
contaminants needs to allow comparability of data for the Administrator’s
determinations about which contaminants may require future regulation, the data
need to be of known and documented quality. Key questions may be:
• Should the data elements reported also characterize the environmental
setting to allow comparison?
• Should the data base incorporate data on chemical, microbial, and
radiological production, use, and contaminant disposal for contaminants
“likely to occur in drinking water”?
Therefore, the information to be included in the design of the Data Base may
address some or all of the following considerations:
• Contaminant Identification - the contaminant name; concentration
measurement; any co-occurring contaminants
• Occurrence Conditions - data on the condition in which the contaminant is
found that affects drinking water quality and/or the contaminant’s
production and use
• Sampling Point - all data on the contaminant location, date of collection,
weather conditions, collection depth
• Sample Collection - handling procedures, testing methods, detection limits
and sampling procedures
• Quality Assurance/Quality Control - procedures to ensure that results are
reproducible and are of known and acceptable precision, and the data can
be interpreted accurately
Reported contaminant data that do not address these considerations sufficiently
(that is, leave out some of this information) may be used as an initial indication of
contaminant occurrence or likely occurrence, but not as a confirming result for
purposes of regulatory decision.
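The distinction drawn above between a confirming result and an initial indication could be applied mechanically when records are loaded. The following minimal sketch assumes one or two representative fields for each of the five considerations. All field names are hypothetical and would need to be replaced by whatever data elements are finally adopted.

```python
# Hypothetical field names, grouped by the five considerations listed above.
REQUIRED_GROUPS = {
    "contaminant_identification": ("contaminant_name", "concentration"),
    "occurrence_conditions": ("occurrence_condition",),
    "sampling_point": ("location", "collection_date"),
    "sample_collection": ("analytical_method", "detection_limit"),
    "qa_qc": ("qc_status",),
}


def missing_groups(record):
    """Return the consideration groups for which the record lacks a value."""
    return [
        group
        for group, fields in REQUIRED_GROUPS.items()
        if any(record.get(field) in (None, "") for field in fields)
    ]


def permitted_use(record):
    """Complete records may confirm occurrence; incomplete ones only indicate it."""
    return "confirming result" if not missing_groups(record) else "initial indication"


# Hypothetical record that addresses every consideration.
complete_record = {
    "contaminant_name": "example contaminant",
    "concentration": 1.5,
    "occurrence_condition": "finished water",
    "location": "entry point 01",
    "collection_date": "1996-07-15",
    "analytical_method": "method-A",
    "detection_limit": 0.5,
    "qc_status": "accepted",
}
# The same record with the detection limit left out.
partial_record = {k: v for k, v in complete_record.items() if k != "detection_limit"}
```

Returning the list of missing groups, and not just a yes/no answer, tells the data source what would be needed to raise a record to confirming status.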
EPA will consider whether modifications to the Safe Drinking Water Information
System (SDWIS) need to be made to accommodate additional information for the
NCOD, since SDWIS will likely be the core data system for public water system
data. A listing of the SDWIS data elements for Unregulated Contaminant
Monitoring is found in Text Box 3. The Unregulated Contaminant Monitoring data
elements represent EPA’s first attempt, in 1994, to establish a minimum set of
information requirements for a contaminant occurrence data base. While these data
elements were first used to describe contaminant occurrence
information, key questions remain as to:
• whether they are adequate to reflect data reliability across potential data
sources for making future contaminant regulation determinations, and
• which data elements are needed to assemble an effective National
Contaminant Occurrence Data Base.
Preliminary EPA Staff Proposal: Consider major regulatory and public uses
of the NCOD and evaluate what minimum set of information requirements would
meet these needs.

-------
TEXT BOX 3
Safe Drinking Water Information System (SDWIS)
Unregulated Contaminant Monitoring
Minimum Data Elements Reported to EPA
PWS Information
PWS Identification Number
Sampling Point Information
Sampling Point Identification Number
Sampling Point Type
Water Source Type
Sample Information
Sample Identification Number
Sample Collection Date
Laboratory Information
Composite
Laboratory Analytical Results
Unit of Measure
Contaminant(s)
Analytical Results - Sign
Analytical Results - Value
EPA Analytical Method Number
Source: Environmental Protection Agency. Office of Water. State Reporting
Guidance for Unregulated Contaminant Monitoring. EPA 812-B-94-001.
August 1994.
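The minimum data elements in Text Box 3 can be pictured as a single record type. The sketch below is one possible representation. The attribute names, the data types, and the reading of the "Sign" element as "<" (less than the reported value) or "=" (measured value) are assumptions made for illustration and are not taken from the SDWIS reporting guidance.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class UcmSampleResult:
    """One unregulated contaminant result with the Text Box 3 elements."""
    pws_id: str
    sampling_point_id: str
    sampling_point_type: str
    water_source_type: str
    sample_id: str
    collection_date: date
    composite: bool
    unit_of_measure: str
    contaminant: str
    result_sign: str   # assumed convention: "<" or "="
    result_value: float
    epa_method_number: str

    def __post_init__(self):
        # Reject records that cannot be interpreted under the assumed convention.
        if self.result_sign not in ("<", "="):
            raise ValueError("result_sign must be '<' or '='")
        if self.result_value < 0:
            raise ValueError("result_value must not be negative")

    @property
    def is_detection(self):
        """True when the value is a measured concentration, not a less-than value."""
        return self.result_sign == "="


# Hypothetical record; every value below is a placeholder.
example_result = UcmSampleResult(
    pws_id="XX0000001",
    sampling_point_id="SP-01",
    sampling_point_type="entry point",
    water_source_type="ground water",
    sample_id="S-0001",
    collection_date=date(1996, 7, 15),
    composite=False,
    unit_of_measure="ug/L",
    contaminant="example contaminant",
    result_sign="<",
    result_value=0.5,
    epa_method_number="method-A",
)
```

Note that this record carries no detection limit or QC status field, which is the gap the key questions following Text Box 3 are concerned with.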

-------
IX. What options should be considered for the design and structure of
the Data Base?
The structure of the database must give the users the ability to analyze
occurrence data, and provide support to the contaminant selection process and
resulting Administrator’s determinations of which contaminants to regulate in the
future. The design of the NCOD will depend on a selection process that will
consider the principal options available and the cost and feasibility to design a
system based on the requirements of the SDWA amendments of 1996.
(1) One of the options would be consideration of the technological and structural
capability of SDWIS and other EPA and federal systems (STORET, Envirofacts, PCS,
NWIS, etc.) to accommodate a centralized system design. The analysis of such
an option would need to determine what additional data must be collected to
support the analysis of occurrence data that is not in the current structure of
SDWIS or other systems (e.g., concentration data at, above or below the MCL,
unregulated contaminant chemical data, emerging chemical data, co-occurrence
data, well data, hydrogeology data, etc.). EPA also will need to consider the costs
and benefits, pros and cons, of modifying SDWIS or other databases to
accommodate the NCOD needs.
(2) A virtual NCOD database would require either new electronic connectivity
software or an existing electronic connectivity tool to meet the
electronic data interchange requirements. If a distributed system is selected,
connections must be made to other databases for data transfer, searching, and
archiving. The electronic connection would need to provide data quality control,
archiving, public document storage, ad hoc searching, and
canned report capabilities. Access would be provided to EPA, other federal
agencies, States, and the general public. This option is presented in the
“Concept Flow Chart, Option 2” below.
(3) The feasibility of a stand-alone system should be considered. This would most
likely involve engineering an entirely new data base, as well as developing the
long-term maintenance approach. This option would allow full statistical analysis
of the data, as well as other types of analyses, including GIS.
(4) Summarized data in a scaled-down data base may be another option. SDWIS
may still need to be modified to obtain the necessary data. State and national
data would be summarized prior to entering them into a data base. Only major
trends could be provided, and statistical analysis would be limited.
(5) Other options may be considered that provide the quality of data needed along
with public accessibility.
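The virtual data base described in option (2) can be sketched as a search that queries each linked source and passes the results through a data quality screen. The example below is a simplified illustration only. The source names stand in for remote systems, the in-memory rows replace real electronic connections, and the screening rule (a record must report an analytical method) is an assumed placeholder for actual data quality objectives.

```python
def search_virtual_ncod(sources, contaminant, meets_dqo):
    """Query each linked source and keep only records that pass the DQO screen."""
    hits = []
    for source_name, fetch in sources.items():
        for record in fetch(contaminant):
            if meets_dqo(record):
                # Tag each record with its origin so users can trace it back.
                hits.append({**record, "source": source_name})
    return hits


def make_fetch(rows):
    """Build a stand-in query function over in-memory rows (no real connection)."""
    def fetch(contaminant):
        return [row for row in rows if row["contaminant"] == contaminant]
    return fetch


def has_method(record):
    """Hypothetical DQO screen: the analytical method must be reported."""
    return record["method"] is not None


# Hypothetical rows standing in for two linked data bases.
core_rows = [{"contaminant": "contaminant-x", "value": 1.2, "method": "method-A"}]
ambient_rows = [
    {"contaminant": "contaminant-x", "value": 0.4, "method": None},
    {"contaminant": "contaminant-y", "value": 3.0, "method": "method-B"},
]
sources = {"SDWIS": make_fetch(core_rows), "STORET": make_fetch(ambient_rows)}
results = search_virtual_ncod(sources, "contaminant-x", has_method)
```

Because the data stay in their home systems under this option, the screen is applied at query time, and different users could supply different screens for different decisions.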
See Appendix G for a description of EPA’s Envirofacts, which serves as a
model for a virtual data base for environmental data, drawing on EPA’s other
data bases in a user-friendly format, with existing queries for standard questions.
Preliminary EPA Staff Proposal: Conduct feasibility and data needs analysis
of data systems to determine NCOD options.

-------
NATIONAL DRINKING WATER CONTAMINANT OCCURRENCE DATA BASE
CONCEPT FLOW CHART
OPTION 2
[Flow chart; the recoverable elements are listed below in order of flow.]
DATA SOURCES
  SDWIS “CORE” for Public Water Systems
  STORET and Other Data Bases (e.g., USGS, States): Source Water Quality
LINKAGE AND SCREEN
  ELECTRONIC CONNECTION with SEARCH ENGINE to meet Data Quality Objectives
NATIONAL CONTAMINANT OCCURRENCE DATA BASE
  VIRTUAL DATA BASE
  SEARCH ENGINE with D.Q.O. Screen
  ARCHIVE for Administrative Decisions
STAKEHOLDER INPUT
  PUBLIC
USERS AND USES
  EPA Drinking Water Program
    Contaminant Identification*
    Regulatory Determination*
    Regulation Development*
    Data Gap Analysis
    Research & Development
    Review of Existing MCLs* and standards
  Public Access
    National Summary
    Hotline
    CWA 305(b) Report
    Internet
  Other EPA Programs, especially Source Water Protection
  State Drinking Water and Environmental Programs
*Indicates Principal Uses of NCOD

-------
X. What sources of data should the Data Base draw on and should historical
data be included?
A. Sources of Data
Several Federal data bases already contain data that could be considered
contaminant occurrence data for contaminants occurring or likely to occur in
drinking water of PWSs. The Safe Drinking Water Information System (SDWIS)
receives data from States for regulated and unregulated contaminants in PWSs.
The Storage and Retrieval (STORET) System includes contaminants found in
ambient ground and surface waters that could likely occur in drinking water.
These and other data bases will be discussed below. In order that the NCOD
may be developed in an orderly and manageable way, it may be necessary to
establish the Data Base incrementally. That is, the NCOD may be constructed
using existing EPA and other data sources that are most accessible first, and
then added to over time, as experience with other data bases is gained,
especially in proving their data reliability.
1. Data to be Included in the Data Base
The SDWA Amendments are specific with respect to the data to be included in
the Data Base:
- Regulated contaminants: Information on the detection of the contaminant at a
quantifiable level in public water systems, including detection of the
contaminant at levels not constituting a violation of its maximum contaminant
level (MCL).
- Unregulated contaminants: Monitoring information collected by public water
systems serving more than 10,000 people, and for a representative sampling of
public water systems serving 10,000 or fewer people; and other reliable and
appropriate monitoring information on the occurrence of contaminants in public
water systems.
- Reliable information from other public and private sources.
The National Academy of Sciences, the States and the public may recommend
other contaminants to be included in the Data Base. These recommendations
must include documentation that the contaminant occurs or is likely to occur in
drinking water, and the contaminant poses a risk to public health. The Data
Base may then include data on such contaminants, if so determined by EPA
through the contaminant identification process.
2. Potential Data Sources
Currently, the Safe Drinking Water Information System (SDWIS) maintained by
the EPA Office of Ground Water and Drinking Water (OGWDW) contains
information on regulated contaminants for most PWSs. SDWIS collects
information on regulated contaminants that are not in compliance with maximum
contaminant levels (e.g. concentrations above the MCL). The SDWIS database,
particularly the Sampling Business System, is structured to store and retrieve
much of the type of data (i.e., parametric or occurrence data) that would be
needed in an occurrence database. SDWIS may serve as a “core” of the
NCOD. Other information from “other sources” may be linked to the SDWIS
database. Other databases, particularly EPA systems, will need to be evaluated
to determine how user needs may be met by incorporating electronic data
interchange (EDI) of existing systems. No existing system may meet the current
needs of the NCOD. Analyses of existing systems, including STORET (source
water quality data) and other federal data systems, will determine whether the
current design of SDWIS for regulated and unregulated contaminant data can
meet the different uses of occurrence data as outlined under the 1996
Amendments to the Act.
a. Safe Drinking Water Information System (SDWIS)
The Safe Drinking Water Information System (SDWIS) is the data base for
information related to drinking water quality in the United States. It is actually
two distinct, although related, data systems: SDWIS/FED and SDWIS/LAN.
SDWIS/FED is the Federal component that resides on a mainframe computer
at the EPA’s National Computer Center (NCC). SDWIS/FED replaces the
Federal Reporting Data System (FRDS) and is the national repository for a
subset of State and EPA Regional inventory data on public water
systems, data related to compliance with drinking water regulations, and other
data such as unregulated contaminant monitoring data.
SDWIS/LAN is a local area network (LAN) - based system designed to meet the
needs of the States that directly implement the drinking water program. At this
time, five States use SDWIS/LAN. SDWIS/LAN contains all of the data needed
(e.g., complete PWS inventory, monitoring schedules and sample results) to
compute compliance with drinking water regulations and to perform other Public
Water System Supervision (PWSS) functions. A subset of SDWIS/LAN data to
meet the reporting requirements is transferred to SDWIS/FED on a quarterly
basis. This same data transfer function is also performed by States that do not
use SDWIS/LAN and have their own data systems. A summary of the current
SDWIS/FED reporting requirements placed on State PWSS programs can be
found in the “Consolidated Summary of State Reporting Requirements for the
Safe Drinking Water Information System (SDWIS)”, EPA 812-B-95-001,
November 1995.
SDWIS has been designed using Information Engineering and Computer Aided
Software Engineering (CASE) tools. Following this approach, modules called
business systems have been developed that represent major areas of similar
functions such as inventory and sampling. The Inventory Business System
(IBS) contains data on the name and address of the public water system, the
treatment plants and sources of water and other facility data such as locational
data, and other data such as the population served, number of service
connections, geographic areas, service areas, etc. The Sampling Business
System (SBS) contains data on sampling locations, analytical results, and
contaminants (See Appendix H for a complete list of current SBS data attributes
(individual data elements)).
SDWIS also contains business systems for maintaining monitoring schedules
and for non-compliance determinations for the Total Coliform Rule. In addition,
SDWIS/FED also maintains data on violations, enforcement actions, and
variances/exemptions for regulated contaminants that exceed the MCL under
the SDWA regulations.
States are currently required to report unregulated contaminant monitoring data
to SDWIS/FED. The purpose of unregulated contaminant monitoring (UCM) is
to assist EPA in determining the occurrence of unregulated contaminants in
drinking water and whether future regulations are warranted. The reporting
guidance for these data (EPA 812-B-94-001, August 1994) calls for reporting,
for 48 contaminants, a minimum set of data elements needed to
submit a given sample result to SDWIS/FED. A summary of the reporting
requirements, including a list of these data elements, can be found in Text Box
3. Only six States have reported these data, so little information exists on these
contaminants in SDWIS/FED. SDWIS is designed to store parametric data on
both regulated and unregulated contaminants occurring in drinking water. This
information is critical for EPA in determining contaminants for future regulatory
control. EPA Regional Offices will work with States to increase the priority for
reporting these data for all States. Appendix D gives more details about SDWIS.
b. STOrage and RETrieval System (STORET)
The Environmental Protection Agency’s Office of Water is about to complete the
re-engineering of its primary marine and freshwater ambient water quality and
biological monitoring and information systems, the STOrage and RETrieval
System (STORET), Biological Information System (BIOS), and the Ocean Data
Evaluation System (ODES). This project, begun in 1992, is scheduled for its first
production implementation in late 1997. STORET, BIOS, and ODES contain
over 250 million parametric observations from over 1,000,000 sampling stations
nationwide, collected primarily by States. These systems serve as the Agency’s
primary sources of point and non-point source ambient water quality and
biological monitoring data and their analytical tools support a wide range of EPA
water quality and ecosystem health assessment activities. These data may
provide much of the information on contaminants likely to occur in drinking water
of public water systems.
Implementation of the new system will begin in late 1997, initially in a
client/server architecture, probably using a UNIX/Oracle server and a PC-based
Windows 95 client workstation configuration.
STORET X will emphasize the delivery of data to the end user, in the form most
compatible with the intended analysis. A variety of data delivery formats is
envisioned which will facilitate the export of STORET X data onto the local
workstation, from which it may be portrayed statistically or graphically, or
imported into a Geographic Information System (GIS). Users will have broad
latitude in defining these export formats. In addition, certain aids to data
interpretation will enable data browsing and provide data summaries.
STORET may provide source water quality data for the NCOD for contaminants
likely to occur in drinking water of PWS. Appendix E gives more details about
STORET.
c. National Water Information System (NWIS) - U.S. Geological Survey
The National Water Information System (NWIS) is a database and a collection
of programs that facilitate storage, retrieval, and processing of water data by the
U.S. Geological Survey (USGS). Originally known as the Water Data Storage
and Retrieval System (WATSTORE), the system was designed and put into use
on a mainframe computer. Later revisions modified WATSTORE for use on a
distributed network of PRIME minicomputers. Currently NWIS is undergoing a
migration to UNIX-based workstations.
NWIS comprises four sub-databases: the Water Quality System, the Automated
Data Processing System (ADAPS) for surface-water data, the Ground-Water
Site-Inventory System, and the Water-Use Data System. Introductory sections
from the chapters for each system are included in the Appendices.
The Water Quality System stores data for about 17,000 parameters for
thousands of intermittent and long-term sampling sites nationwide, including
ground water and surface water. Every sampling site is characterized in its
header record. The ADAPS system is used for storing and processing the
streamflow data collected from the 9,000 gages operated by the USGS.
Time-series stage data are entered as the raw data. These are adjusted and
used to compute streamflow records. ADAPS also stores water-level data for
wells, lakes, and tidal stations. These data may augment STORET for source
water contaminants likely to occur in drinking water of public water systems.
The Ground-Water Site Inventory is a well index for most of the wells where
USGS collects data. Records include location, aquifer, depth, construction,
intended use, ownership, geophysical log availability, and other information.
The data from the USGS National Water Quality Assessment (NAWQA), which
is developing water quality data from field sampling in nearly 60 large hydrologic
study areas in the U.S., are reported to NWIS. These data are downloaded to
STORET. Appendix F provides more details concerning NWIS.
d. Other Federal Data Bases
Since the law gives a broad charge for inclusion of data, the NCOD may
incorporate data from the following data bases:
(1) EPA’s Permit Compliance System, which has information on municipal and
industrial dischargers;
(2) Comprehensive Environmental Response, Compensation and Liability
Information System, which has contaminant data from Superfund Sites;
(3) Resource Conservation and Recovery Information System, which has permit
and contaminant data for RCRA facilities;
(4) Pesticides in Ground Water Data Base, which includes data from source
water testing and pesticide registrants’ data;
(5) Toxic Release Inventory, which has data on chemical releases from
dischargers;
(6) Centers for Disease Control water quality and public health data bases; and
(7) National Oceanic and Atmospheric Administration’s water quality data base.
e. State Data Bases
In addition to the contaminant occurrence monitoring data required to be
reported by public water systems and States for the NCOD, States and public
water systems maintain other data bases for source water quality that potentially
could provide data on “contaminants likely to occur in drinking water.” For
example, some PWSs are sampling, testing and collecting data on contaminants
that are perceived to be of potential local concern for which monitoring and
reporting is not required under current Unregulated Monitoring requirements.
States have surface and ground water monitoring networks that are used for
sampling, testing and collecting data on ambient water quality that may provide
a detailed indication of contaminants that may be of concern in the future from
source waters. State public health laboratories that test for drinking water
contaminants may be able to report data electronically and directly to the NCOD
via SDWIS and reduce State reporting burden in the future. Electronic
connections to these and other data bases will be an important source of data
in the future.
f. Private Data Bases
Private data bases may be a vast source of water quality data for watersheds
and aquifers that could help complete the picture of contaminants that may or
may not be affecting our drinking water sources in the future. Private data
sources may include chemical, radiological and microbiological data from
product companies’ testing laboratories, university research laboratories, and
non-profit and volunteer monitoring activities. Private data bases may also
include chemical, radiological and microbial product use and waste disposal.
These EPA and other data systems will need to be evaluated to determine how
user needs may be met by incorporating electronic data interchange (EDI) of
existing systems. No existing system may meet the current needs of the NCOD.
Analyses of existing systems, including STORET (source water quality data) and
other federal data systems, will determine whether the current design of SDWIS
or another data system that accommodates both regulated and unregulated
contaminant data can meet the different uses of occurrence data as identified
in the SDWA Amendments.
Preliminary EPA Staff Proposal: Initially, draw on parametric data for drinking
water contaminants from SDWIS and for contaminants likely to occur in drinking
water from selected data sets in STORET, including data downloaded from
NWIS. After 1999, expand the Data Base to include other Federal, State and
private data sets.
B. Historical Data
Some additional questions may be important in considering whether historical
data should be included in the Data Base:
1) How recent should the data in the NCOD be and what is considered current
enough to base important decisions?
2) Are summary statistics of historical monitoring data, as reported by a data
source, sufficient for purposes of the NCOD, since some historical data may not
be as complete as more recent data?
The SDWA Amendments require that the NCOD be used in the Contaminant
Identification and Listing process and in the determination to regulate under
SDWAA Section 102 (a)(1)(B)(i) and (ii). Since the dates for the implementation
of these activities are 1998 for the First Candidate Contaminant List, and 2001
for the Second Candidate Contaminant List and First Determination to Regulate
additional contaminants, historical data should be considered for inclusion in the
Data Base. Initial assembly of the Data Base is to be completed by August
1999. Potentially available data to include in SDWIS to address these needs
comprise: Phases I and II unregulated contaminant monitoring data and State
parametric data for regulated contaminants. The U.S. Geological Survey has
reviewed extensive sets of ambient water quality data which have been included
in its National Water Information System (NWIS) through its National Water
Quality Assessment for hydrologic units serving major populations. NWIS data
is downloaded electronically to STORET. The data described above may
constitute the first phase of data input to the Data Base. For other data bases,
EPA may need to review and audit them for data reliability and compatibility with
the NCOD. This data base review may identify the quality of data to be entered
into the NCOD, allowing data of various qualities to be included and applied only
to those levels of decisions appropriate for their quality.
Preliminary EPA Staff Proposal: Include historical data for all states on
unregulated contaminants, because future contaminant selection will rely on this
data. Include historical parametric data from States with electronic capabilities
to report it. Identify selected data sets from STORET and NWIS that may be
useful in establishing the presence of contaminants in source waters that may
be used for drinking water. Establish a data base audit function to examine
data reliability issues for different types of uses of and decisions resulting from
the NCOD.
XI. What factors and methods may be important in selecting and interpreting
the data and its use by many users?
Interpretation and reporting of drinking water contaminant data may be one of
the most important functions of the NCOD. Recipients of the data interpretations
must have confidence in this interpretation and reporting. Reliability of data is
a key factor for data to be included in the Data Base, as identified in the SDWA
Amendments. Data quality factors as they affect data elements were discussed
in Section VIII, and will influence the interpretation of the data. The other data
elements reported will also affect the data interpretation and which decisions the
data can be used for.
The primary use of the NCOD is to support the Administrator’s determinations
of whether to regulate a particular contaminant in the future. If a determination
is made to regulate it, then the Data Base will be used to support the regulation
development and guide research in areas that the data cannot address. In
supporting these determinations, the NCOD data will need to be related to
geographic, environmental, and health effects data. The data will also be
analyzed for quality and completeness to address specific questions, such as:
• Is the potential contaminant occurring at significant levels in drinking
water or source water?
• Is the trend of occurrence of the contaminant increasing or decreasing
and the distribution expanding or contracting?
• Does the potential contaminant occur only regionally or locally?
• Does the potential contaminant occur at a level that may pose a public
health concern?
• What are types and levels of uncertainty concerning the occurrence of
the contaminant?
• What data indicate the need for additional supporting information
concerning the occurrence of the contaminant?
The data will be examined through the application of statistics and geographic
information systems (GIS), with one objective being reproducibility of the results
by others analyzing the data. A multidisciplinary team will evaluate the data for
any particular contaminant to ensure that the results are consistent with
application of the relevant fields’ principles and bodies of knowledge.
Considerations for both data quality and reported occurrence will guide analysis
of “outliers”; i.e., data points for any data element considerably beyond most of
the other data. An important consideration in the interpretation of the NCOD will
be to report results in an understandable format to the public.
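As a minimal sketch of how an outlier screen of this kind might be made
reproducible, the Python below flags values that lie far beyond most of the
other data. The choice of Tukey's fences (1.5 times the interquartile range)
is an assumption made for illustration; this document does not prescribe a
particular rule, and the function name and sample values are invented.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


# Invented concentrations in ug/l; most sit near a 0.5 ug/l detection level.
sample = [0.5, 0.5, 0.5, 0.6, 0.7, 0.8, 1.0, 1.2, 3300.0]
flagged = flag_outliers(sample)
```

Flagged values would be candidates for review against data quality
information, not automatic deletions.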
As an example of one type of initial statistical analysis, Text Box 4 describes
results of combining different data sets based on the submission of State data
for what is referred to as “Phase I” unregulated contaminant monitoring done in
the late 1980s and early 1990s. The chemicals examined are
tetrachloroethylene and trichloroethylene. The analysis raises issues relevant
to the development of and interpretation of data for the NCOD.
Preliminary EPA Staff Proposal: Evaluate specialized uses of statistical and
GIS analyses for interpreting NCOD data. Explore options for reporting results
to the public in an understandable and useful form.

-------
TEXT BOX 4
Analysis of Occurrence of
Tetrachloroethylene and Trichloroethylene
in Public Water Systems
It is useful to provide an example in order to illustrate the importance of an
exploratory data analysis as a guide to data quality, use and interpretation. The
unregulated contaminant monitoring data (Phase I) that were collected by EPA
over a several year period in the early 1990s reside on the EPA mainframe in
ASCII format. Selective results of an exploratory data analysis of the chemicals,
tetrachloroethylene and trichloroethylene, are discussed below. Both
compounds are known to occur at Superfund sites. There were 180,716
records in this particular analysis. Appendix J has the State Phase I monitoring
data for this analysis.
• The data base includes a field that indicates whether a measured concentration
is also the detection limit. From this field, it could be determined that the
detection limit for the two compounds varies between 0.1 and 5 µg/l, the
Maximum Contaminant Level (MCL) for both compounds. There were 40
instances where the detection limit actually exceeds the Maximum Contaminant
Level. These cases could represent potential outliers or errors in the data.
• Data from thirty-seven states were analyzed. Eleven other states were
excluded from the analysis because the data were considered suspect.
For example, in some of the states that were dropped, consistently high total
positive concentrations above the detection limit indicated that the states
selectively reported positive concentrations or targeted their monitoring to
locations where contamination was suspected. In other cases, the number of
detections and sites sampled were too low to be considered reasonable (e.g.,
District of Columbia, Virgin Islands). For the remainder of the data that were
kept in the analysis, there were not any obvious clues to suggest that the data
were not comparable based on the states’ collection and reporting practices.
Still, some caution remains in that the specific locations of the samples, inside
or outside the distribution system, are not specified and likely vary among sites.
In addition, replicate samples may be included in the data base. The units of
measure may also be inconsistent across records in some cases (for example,
data from FL and PA may have problems with consistent units, per Jim Walasek,
US EPA, personal communication).
• Tetrachloroethylene concentrations in the data base fall within the range 0 to
3300 µg/l. The median concentration value is 0.5 µg/l, the concentration most
frequently cited in the data base as the detection level. Similarly the maximum
trichloroethylene concentration in the data base is 3669 µg/l and the minimum
is 0. The median value is 0.5 µg/l. These results indicate that the data are
extremely skewed, with the majority of concentrations falling at or below the
detection level. Some of the very large concentrations may be data entry errors
in the original data base.
• Tetrachloroethylene data were collected during the period 1983-1992, with the
majority of samples collected between 1990 and 1992. The same is true for
trichloroethylene. Some of the data were obviously miscoded as a very few
dates are very old (e.g., early to mid 1900s).
• The cumulative frequency of detection (counts by observation) may be a
reasonable indicator of population exposure to contamination. The cumulative
frequency, however, can be misleading when trying to estimate the probability
or rate of occurrence of contamination events. Because a contaminant can
remain in the source water or distribution system for an extended time period,
there may not be a way to tie a specific sample to a unique contamination event.
As a result, a single event may be sampled multiple times. In addition, public
water systems may not collect samples at the same frequency per year,
compounding the problem of interpretation of the statistic.
• The proportion of monitoring sites in each state that have reported
concentrations above the detection limit is a potential way to characterize the
spatial distribution of the contaminants. In general, it can be used as a broad
indicator of the fraction of contaminated “wells” or “intakes” in a state. See next
bullet for details.
• Specifically, in this analysis, a site was counted as registering a “detection” if
the maximum concentration value recorded at a monitoring site was greater than
the detection limit. The proportion of contaminated monitoring sites in a state
was then calculated by dividing the number of monitoring sites with detections
by the total number of monitoring sites in the state. The states were ranked
according to descending values of this fraction. Results for trichloroethylene and
tetrachloroethylene are again very similar with the states CO, LA, NC, NJ, GA,
DE, AL, and IN consistently having the highest fractions of sites with
concentrations above the detection limit. (See attached graphics and tables).
Some cautions to consider, however, when interpreting this statistic include:
(1) The results could be biased if the spatial and temporal coverage of the
monitoring data are not similar across states. For example, states that collect
data over a greater area and/or with a greater frequency than other states are
more likely to find contamination.
(2) The results could be biased if the data do not represent monitoring of
independent contamination events over space. For example, data that are
spatially correlated, such as data from two different monitoring sites that sample
the same aquifer, could be measuring the same contamination event.
(3) By selectively looking at the maximum value at a monitoring site, good
information may be lost if the observations that are dropped represent sampling
of independent contamination events.
• The average number of data records or observations/site was calculated for
each state to determine whether a state that has a relatively high fraction of
contaminated sites also collects more data per site. As can be seen in the
attached tables, this is not true for the states monitoring for tetrachloroethylene
and trichloroethylene. The average varies between 1 and 6.47 observations/site
for both compounds.
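The site-level statistics described in the bullets above can be sketched in a
few lines of Python. This is an illustrative sketch only, not the procedure
EPA used. The record layout (state, site_id, concentration, detection_limit),
the function names, the placeholder state codes, and the sample values are all
assumptions made for the example. Because the detection limit varies by
record, the sketch also assumes that a site's maximum concentration is
compared with the detection limit reported on that same record.

```python
from collections import defaultdict

MCL_UG_L = 5.0  # MCL for both compounds in ug/l, as stated in the text


def detection_limits_above_mcl(records, mcl=MCL_UG_L):
    """Return records whose detection limit exceeds the MCL (possible errors)."""
    return [r for r in records if r["detection_limit"] > mcl]


def state_summary(records):
    """Per state: fraction of sites with a detection and observations/site."""
    sites = defaultdict(list)  # (state, site_id) -> records at that site
    for r in records:
        sites[(r["state"], r["site_id"])].append(r)

    totals = defaultdict(lambda: {"sites": 0, "detections": 0, "obs": 0})
    for (state, _site_id), recs in sites.items():
        top = max(recs, key=lambda r: r["concentration"])
        t = totals[state]
        t["sites"] += 1
        t["obs"] += len(recs)
        # A site registers a "detection" if its maximum value is above the limit.
        if top["concentration"] > top["detection_limit"]:
            t["detections"] += 1

    summary = {
        state: {
            "fraction_detected": t["detections"] / t["sites"],
            "obs_per_site": t["obs"] / t["sites"],
        }
        for state, t in totals.items()
    }
    # Rank states by descending fraction of sites with detections.
    ranked = sorted(
        summary.items(), key=lambda kv: kv[1]["fraction_detected"], reverse=True
    )
    return dict(ranked)


# Invented records; "AA" and "BB" are placeholder state codes.
SAMPLE = [
    {"state": "BB", "site_id": "1", "concentration": 0.0, "detection_limit": 6.0},
    {"state": "AA", "site_id": "1", "concentration": 0.5, "detection_limit": 0.5},
    {"state": "AA", "site_id": "1", "concentration": 2.0, "detection_limit": 0.5},
    {"state": "AA", "site_id": "2", "concentration": 0.5, "detection_limit": 0.5},
]
summary = state_summary(SAMPLE)
suspect = detection_limits_above_mcl(SAMPLE)
```

Keeping the site count, detection count, and observation count side by side
makes it easier to check whether a high fraction of detections in a state
simply reflects heavier sampling per site, which is one of the cautions noted
above.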
This analysis suggests issues for stakeholders to consider that may be important
to the development and interpretation of the Data Base. Among these issues
are the questions of:
1) What are the core statistics or indicators to include in the NCOD that
characterize data quality or aid in the use and interpretation of the data?
2) How recent should the data in the NCOD be and what is considered current
enough to support important decisions?
3) Are non-detections important to include in the NCOD even if they are the
majority of the monitoring data?
4) Are summary statistics of historical monitoring data, as reported by a data
source, sufficient for purposes of the NCOD?
5) What kind of location data are important to include in the NCOD?
6) Should information on definitions, assumptions, methods of estimation,
modeling, rounding, and weighting (where appropriate), as well as information
on chain-of-custody, purpose of collection, and any changes in the
aforementioned over time be available to the user upon request to the source,
if not available on-line? And should the availability of this type of information be
a prerequisite for data to be part of the NCOD?

-------
XII. What are the Next Steps beyond this Stakeholders Meeting?
The EPA will provide a summary of the May 21-22 Stakeholders meeting to its
participants. Additionally, EPA will make that summary available to other
individuals and groups to encourage continued public input.
Other Planned Stakeholder Input
Beyond the May 21-22, 1997, Stakeholder Meeting, EPA plans to make
presentations to the Science Advisory Board (SAB) in mid 1997. Additionally,
EPA is planning to hold follow-up meetings with Stakeholders and the SAB in
1998. The EPA would also like to identify opportunities to make presentations
to public and private organizations’ meetings where there is an identified interest
in the National Contaminant Occurrence Data Base and its use.
The EPA Regional Offices will be working with their respective States to solicit
their input for development of the NCOD. The Regional Offices will also work
with State drinking water agencies to obtain regulated and unregulated
contaminant data as a priority of future State program activities. States will be
encouraged to use direct electronic reporting from certified laboratories
performing the drinking water contaminant analyses to reduce State reporting
burden.
Some Near-Term EPA Activities
EPA will conduct detailed technical and feasibility analyses of NCOD
options using the input of this Stakeholders meeting and other meetings related
to this subject. Another proposed development is the preparation of a guidance
on data submittal to be used by any agency or person having data to input to the
NCOD. EPA will also need to develop a long-term maintenance plan for the
Data Base.
Phasing Development of the Data Base
A key consideration is the potential complexity of establishing the Data Base,
given the vast amount of data that could be put into it. A practical option may
be establishing it in “phases”, such as making it operational for public water
system data first, then for EPA ambient water quality data, followed by other
Federal and State ambient water data. Other options may be considered.
Preliminary EPA Staff Proposal: Work through EPA Regional Offices to solicit
State input on NCOD development and coordinate submission of data. Make
presentations to meetings of and provide other forms of outreach to
organizations concerned with drinking water quality and public health issues.
Continue to work with Stakeholders in a variety of forums to evolve the NCOD
to be the most useful and manageable tool possible for scientifically
sound data in contaminant selection and regulation.

-------
APPENDICES

-------
Appendix A
SDWAA Section 126, Occurrence Data Base
SEC. 126. OCCURRENCE DATA BASE.
Section 1445 (42 U.S.C. 300j-4) is amended by adding the following new subsection
after subsection (f):
‘(g) OCCURRENCE DATA BASE-
‘(1) IN GENERAL- Not later than 3 years after the date of enactment of the Safe
Drinking Water Act Amendments of 1996, the Administrator shall assemble and
maintain a national drinking water contaminant occurrence data base, using information
on the occurrence of both regulated and unregulated contaminants in public water
systems obtained under subsection (a)(1)(A) or subsection (a)(2) and reliable
information from other public and private sources.
‘(2) PUBLIC INPUT- In establishing the occurrence data base, the Administrator
shall solicit recommendations from the Science Advisory Board, the States, and other
interested parties concerning the development and maintenance of a national drinking
water contaminant occurrence data base, including such issues as the structure and
design of the data base, data input parameters and requirements, and the use and
interpretation of data.
‘(3) USE- The data shall be used by the Administrator in making determinations
under section 1412(b)(1) with respect to the occurrence of a contaminant in drinking
water at a level of public health concern.
‘(4) PUBLIC RECOMMENDATIONS- The Administrator shall periodically solicit
recommendations from the appropriate officials of the National Academy of Sciences
and the States, and any person may submit recommendations to the Administrator, with
respect to contaminants that should be included in the national drinking water
contaminant occurrence data base, including recommendations with respect to
additional unregulated contaminants that should be listed under subsection (a)(2). Any
recommendation submitted under this clause shall be accompanied by reasonable
documentation that—
‘(A) the contaminant occurs or is likely to occur in drinking water; and
‘(B) the contaminant poses a risk to public health.
‘(5) PUBLIC AVAILABILITY- The information from the data base shall be
available to the public in readily accessible form.
‘(6) REGULATED CONTAMINANTS- With respect to each contaminant for
which a national primary drinking water regulation has been established, the data base
shall include information on the detection of the contaminant at a quantifiable level in
public water systems (including detection of the contaminant at levels not constituting a
violation of the maximum contaminant level for the contaminant).
‘(7) UNREGULATED CONTAMINANTS- With respect to contaminants for which
a national primary drinking water regulation has not been established, the data base
shall include—
‘(A) monitoring information collected by public water systems that serve a
population of more than 10,000, as required by the Administrator under subsection (a);
‘(B) monitoring information collected from a representative sampling of
public water systems that serve a population of 10,000 or fewer; and
‘(C) other reliable and appropriate monitoring information on the
occurrence of the contaminants in public water systems that is available to the
Administrator.’.

-------
APPENDIX B
UNREGULATED CONTAMINANT LIST
GROUP 1 40 CFR 141.40 (n)(11)
Aldrin                  Metolachlor             Carbaryl
Dieldrin                Propachlor              3-Hydroxycarbofuran
Metribuzin              Dicamba                 Methomyl
Butachlor
GROUP 2 40 CFR 141.40 (n)(12)
Sulfate
GROUP 3 40 CFR 141.40 (e)
Bromobenzene            Chlorodibromomethane    2,2-Dichloropropane
o-Chlorotoluene         Chloroethane            1,1-Dichloropropene
p-Chlorotoluene         Chloroform              1,3-Dichloropropene
m-Dichlorobenzene       Chloromethane           1,1,1,2-Tetrachloroethane
Bromodichloromethane    Dibromomethane          1,1,2,2-Tetrachloroethane
Bromoform               1,1-Dichloroethane      1,2,3-Trichloropropane
Bromomethane            1,3-Dichloropropane
GROUP 4 40 CFR 141.40 (j)
n-Butylbenzene          p-Isopropyltoluene      Fluorotrichloromethane
sec-Butylbenzene        1,2,3-Trichlorobenzene  Naphthalene
tert-Butylbenzene       1,2,4-Trimethylbenzene
Hexachlorobutadiene     1,3,5-Trimethylbenzene
n-Propylbenzene         Bromochloromethane
Isopropylbenzene        Dichlorodifluoromethane

-------
Appendix C
Priority List of Substances Which May Require Regulation
Under the Safe Drinking Water Act
(Source: Federal Register, Vol. 56, No. 9,
January 14, 1991, pp. 1470-1474)
Inorganics
Aluminum
Boron
Chloramines
Chlorate
Chlorine
Chlorine dioxide
Chlorite
Cyanogen chloride
Hypochlorite ion
Manganese
Molybdenum
Strontium
Vanadium
Zinc

Pesticides
Asulam
Bentazon
Bromacil
Cyanazine
Cyromazine
DCPA (and its acid metabolites)
Dicamba
Ethylenethiourea
Fomesafen
Lactofen/Acifluorfen
Metalaxyl
Methomyl
Metolachlor
Metribuzin
Parathion degradation product (4-Nitrophenol)
Prometon
2,4,5-T
Thiodicarb
Trifluralin

Synthetic Organic Chemicals
Acrylonitrile
Bromobenzene
Bromochloroacetonitrile
Bromodichloromethane
Bromoform
Bromomethane
Chlorination/Chloramination by-products (Misc.), e.g.,
    Haloacetic acids, Haloketones, Chloral hydrate,
    MX-2 [3-chloro-4-(dichloromethyl)-5-hydroxy-2(5H)-furanone],
    N-Organochloramines
Chloroethane
Chloroform
Chloromethane
Chloropicrin
o-Chlorotoluene
p-Chlorotoluene
Dibromoacetonitrile
Dibromochloromethane
Dibromomethane
Dichloroacetonitrile
1,3-Dichlorobenzene
Dichlorodifluoromethane
1,1-Dichloroethane
2,2-Dichloropropane
1,3-Dichloropropane
1,1-Dichloropropene
1,3-Dichloropropene
2,4-Dinitrophenol
2,4-Dinitrotoluene
2,6-Dinitrotoluene
1,2-Diphenylhydrazine
Fluorotrichloromethane
Hexachlorobutadiene
Hexachloroethane
Isophorone
Methyl ethyl ketone
Methyl isobutyl ketone
Methyl-t-butyl ether
Naphthalene
Ozone By-products, e.g., Aldehydes, Epoxides, Peroxides,
    Nitrosamines, Bromate, Iodate
1,1,1,2-Tetrachloroethane
1,1,2,2-Tetrachloroethane
Tetrahydrofuran
Trichloroacetonitrile
1,2,3-Trichloropropane

Microorganisms
Cryptosporidium
Methyl-t-butyl ether

-------
Appendix D
Safe Drinking Water Information System (SDWIS)
The current PWSS Program reporting requirements originated in the late 1980’s in
response to the promulgation of various National Primary Drinking Water Regulations
(NPDWR) following the 1986 amendments to the SDWA. At that time, the U.S. EPA
developed a national database management system, called the Federal Reporting Data
System (FRDS), to maintain basic inventory, violation, enforcement, and
variance/exemption data, that were needed for Federal oversight of the State PWSS
programs. The reporting requirements included some, but not all, of the critical EPA
PWSS program functions, such as state grant allocations, significant noncompliance
tracking, and enforcement tracking. The implementation of the 1986 amendments, and
the recognition that various program oversight responsibilities need to be supported by
better data, made it clear that the existing FRDS (FRDS-II) reporting requirements
were inadequate to support all U.S. EPA program responsibilities. A new information
management philosophy was also developed, which considers the needs of both state
and federal users. This new approach allows for the sharing of data, lowering support
and maintenance costs, providing easier access to the system, and increasing overall
flexibility.
This new information system is known as the Safe Drinking Water Information System
(SDWIS). SDWIS is being designed following the recommendations of prior
analyses/reports that OGWDW should develop a data system that meets the needs of the
states, regions, and Headquarters. The Office of Water held many data user
requirements meetings around the country in the early 1990s to guide the development of
the System. Although each of these groups has the same ultimate goals, each is
responsible for different aspects of the PWSS program. The sharing of data between
these separate groups becomes very critical in the design of a system. Where a State
primacy agency may need sampling data to determine compliance with the regulations
and to determine if there is a public health threat or other problem with the water system,
the U.S. EPA may need the same data to judge occurrence and levels of contamination
for standard setting or to judge the appropriateness of regulatory requirements or
treatment technologies.
In the inventory business system, SDWIS holds information on water system operating
performance, including population served by the system. SDWIS also stores information
on water system facilities including data elements on: wells, intakes, water sources,
treatment techniques, and geographical information. Legal entities are also a “business
group” that includes data on legal and operational contacts for the water system. The
sampling business system contains much more information that will likely be needed in
the National Contaminant Occurrence Database (NCOD). This system holds information
on a sample description, methods used and contaminant concentration levels. In
contrast, State SDWIS data systems contain information on sample locations, laboratory
certification information, analytical methods used, and reasons for sampling (e.g. spill
occurrence in vicinity of treatment plant).
The Federal SDWIS receives a subset of information from State primacy agencies that
pertains to violations of, and compliance with, the SDWA regulations on exceeding
the maximum contaminant levels (MCL) of regulated contaminants. Unregulated
contaminant monitoring (Section 1445 of the SDWA and 40 CFR Section 141.40)
currently collected in SDWIS requires the submission of monitoring data for substances
suspected of occurring in drinking water. This information is critical for EPA in
determining priority contaminants for future regulatory control. Unregulated contaminant
monitoring data from six States resides in SDWIS. EPA Regional Offices will work with
States to increase the priority for reporting this data for all States.

-------
Appendix E
STOrage and RETrieval System (STORET)
Introduction
The Environmental Protection Agency’s Office of Water is about to
complete the re-engineering of its primary marine and freshwater ambient
water quality and biological monitoring and information systems, the
STOrage and RETrieval System (STORET), Biological Information System
(BIOS), and the Ocean Data Evaluation System (ODES). This project,
begun in 1992, and scheduled for its first production implementation in
late 1997, will represent a first for the Agency in the area of large
systems re-engineering. STORET, BIOS, and ODES contain over 250 million
parametric observations from over 1,000,000 sampling stations
nationwide. These data, collected primarily by States, represent an
investment of over $2.2 billion. These systems serve as the Agency’s
primary sources of point and non-point source ambient water quality and
biological monitoring data and their analytical tools support a wide
range of EPA water quality and ecosystem health assessment activities.
This new system will better meet the emerging data and information
needs associated with watershed level environmental protection. This
new system will also facilitate the data sharing activities and spatial
assessment requirements necessary for successful local watershed
protection programs.
Implementation of the new system will begin in late 1997, initially in a
client/server architecture probably using a UNIX/Oracle server and a
PC-based Windows 95 thin client workstation configuration.
Additionally, we may offer a version which will operate in a stand-alone
mode on a 32-bit Windows 95/Personal Oracle PC workstation. Final
determination of our implementation architecture will be made in mid
1997. We expect to precede our implementation and roll-out with a
period of “Beta” testing in selected EPA Regional Offices and States.
Background
The features of the new system are being carefully engineered to meet
the information requirements of our federal, state, and local clients
engaged in ambient water quality and biological monitoring activities of
all kinds. The process of identifying the functions which make up these
activities, and identifying the information generated by, and needed to
conduct, them is known as Information Engineering (IE).
IE employs a common repository of analytical tools to construct models
of data relationships and process flows to efficiently design an
information system to store data for an organization. To support the IE
approach the project team is using the Texas Instruments I-CASE tool
set, Information Engineering Facility to capture detailed data
requirements.
This project began in early 1992 with a series of system requirements
gathering workshops. EPA conducted over 15 Joint Requirement Planning
Sessions (JRPS) nationally involving well over 600 current STORET, BIOS,
and ODES users, as well as their middle and senior managers. These
workshops were attended by many State and local governments, and several
environmental organizations. From the results of these workshops the
Agency generated a high level data architecture, and formulated five
critical success factors for the new system:
1. It must be easy to get data in to and out of the system,
2. The system must have a menu access and browse capability,
3. The system must support the storage of quality assurance
and quality control (QA/QC) information on a project basis,
4. The system must be flexible and able to change with the
changing needs of its users,
5. The system must provide a wide range of standard output
formats, i.e., dBase, Lotus, ASCII... including the GIS
environment.
The immediate next steps conducted during 1993 and 1994 included the
completion of the business area analysis, construction of logical data
model, and the prototyping of system functional requirements. User
testing of portions of the new system began in late 1994. Users were
formally introduced to the new system during a National Workshop held in
Dallas in February 1995. At this workshop attendees began the testing
and validation of user requirements, a process which is crucial to user
acceptance of the new system. At the last National Workshop, held in
December 1996, users began testing the complete system prototype.
Roll-out of Production Version 1.0 of the new system for use by the
federal, state, and local users is planned for late 1997.
Database Design
The new system, designated STORET X for development purposes, has been
divided into 5 primary business areas, each one representing a closely
related set of activities and their associated data. These are:
1. Identify and describe organizations which conduct ambient
water quality and biological monitoring activities.
2. Identify and describe the projects or surveys within which
these activities are carried out.
3. Identify and describe the physical locations (sites, areas)
at which monitoring occurs.
4. Identify and describe water quality sampling, observation,
and measurement activities which occur at these sites.
5. Record the results of sample analyses and field
measurement.
The STORET X prototype embodies these five business areas. As
mentioned earlier we will be demonstrating this new system to current
and future clients beginning in September 1996.
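To make the five business areas more concrete, the sketch below models each
one as a small Python data class and links them the way the list describes.
It is a hypothetical illustration, not the STORET X schema: every class name,
field name, and sample value is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Organization:  # business area 1: who conducts the monitoring
    name: str


@dataclass
class Project:  # business area 2: project or survey, owned by an organization
    name: str
    owner: Organization


@dataclass
class Site:  # business area 3: physical location where monitoring occurs
    name: str
    latitude: float
    longitude: float


@dataclass
class Result:  # business area 5: outcome of a sample analysis or measurement
    characteristic: str
    value: float
    unit: str


@dataclass
class Activity:  # business area 4: sampling, observation, or measurement
    site: Site
    projects: List[Project]
    results: List[Result] = field(default_factory=list)


# Invented example: one organization, one project, one site, one result.
org = Organization("Example State Agency")
project = Project("Example River Survey", owner=org)
site = Site("Example Station 1", latitude=38.9, longitude=-77.0)
activity = Activity(site=site, projects=[project])
activity.results.append(Result("Trichloroethylene", 0.5, "ug/l"))
```

Because each activity carries references to its site and projects, a result
can be traced back to the place it represents and to the organization that
owns it.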
The following discussion highlights the key features of the new
system, with an emphasis on areas in which it differs from the legacy
systems it will replace.
Organizations
In STORET X, organizations will be the primary owners of data, and
will control access to it. Organizations will also own metadata, or
data describing their data. Organizations will own project
descriptions, and lists of organizations and people with whom they work.
Organizations will also control a broad set of lists representing their
preferences or usual practices associated with their monitoring
activities. These lists may include aids to data entry (e.g. substances
tracked by monitoring activities, habitat evaluation criteria, and so
forth), equipment used in the field, methods used in their labs,
bibliographic references they use, and many others.
Projects or Surveys
Monitoring activities are organized by Project or Survey. The
descriptions of an organization’s projects will be kept, in summary
form, in STORET X. Field activities and their analytical results will
be linked directly to all the projects they support. Projects may in
turn be linked to programs, and because programs may be defined broadly
to include the projects of several organizations, data from any field
activity may be easily shared among both organizations and projects.
Project descriptions will permit the linking of data quality
objectives and other quality control plan items to a broad spectrum of
data. In this way, the needs of users for data quality descriptors can
be met with a minimum of data entry effort.
Sites or Areas Monitored
As in the legacy systems, all data concerning field work is keyed to
the specific location at which the field work is conducted, so that
measurements of water quality obtained can be linked to the place they
represent. The concept of “site” in STORET X is broader than it was in
these older systems.
Location is very important to EPA, and the EPA standards for
locational data are strictly followed in STORET X. In addition, all
applicable federal standards (FIPS, NIST, and others) are used wherever
possible.
Each STORET X site has a point of reference, whose latitude and
longitude are fully defined. In addition, each site may include an area
boundary, a field of actual monitoring locations, and the descriptions
of any permanent sampling grid or transect found there. For facility
sites, additional locational data may be entered for the individual
end-of-pipe locations, and for well sites, a field of individual wells
may be described.
Sites may participate in external reference schemes, and may carry
identifiers from these schemes. For example, a site in STORET X might
have an NPDES number, and also be assigned a code to represent it within
a state regulatory program. In addition, any site which contributes
data to a project may be assigned a project-specific identifier to
assist project staff in easily identifying it.
Once a site has a defined reference point, with a latitude and
longitude consistent with EPA policies for locational data, it may be
assigned to one or more projects, and begin collecting samples. This
assures that all results are place-based.
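The rule that a site needs a defined reference point before it can join a project can be sketched in a few lines. This is an illustration only: the `Site` class, its attribute names, and the sample identifiers are hypothetical and are not part of any STORET X specification, and the coordinate range check is a simplification of EPA's locational data policy.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple


@dataclass
class Site:
    """Hypothetical stand-in for a STORET X site record."""
    name: str
    # Point of reference as (latitude, longitude) in decimal degrees.
    reference_point: Optional[Tuple[float, float]] = None
    # Identifiers carried from external reference schemes, keyed by scheme.
    external_ids: Dict[str, str] = field(default_factory=dict)
    projects: Set[str] = field(default_factory=set)

    def has_valid_reference_point(self) -> bool:
        if self.reference_point is None:
            return False
        lat, lon = self.reference_point
        return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0

    def assign_to_project(self, project_id: str) -> None:
        # A site may join a project only once its reference point is defined.
        if not self.has_valid_reference_point():
            raise ValueError(f"site {self.name!r} has no valid reference point")
        self.projects.add(project_id)


# Placeholder NPDES-style identifier, not a real permit number.
site = Site(name="Example Creek at Bridge", external_ids={"NPDES": "XX0000000"})
try:
    site.assign_to_project("PROJ-1")
    rejected = False
except ValueError:
    rejected = True

site.reference_point = (38.8951, -77.0364)
site.assign_to_project("PROJ-1")
site.assign_to_project("PROJ-2")
```

Keeping the project list on the site, rather than the reverse, is one arbitrary choice; either direction would express the same many-to-many link.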
Site Visits, Cruises or Trips
The collection of environmental data is always linked to a specific
site visit, to relate it to both space and time. Site visits are
treated as events on a trip (or cruise), and activities which are
related to multiple site visits are linked to the trip. Trip
descriptions will include the names of key participants, cooperating
organizations, and the sites to be visited. Certain Quality Control
(QC) activities, such as the preparation and handling of trip-level,
site-level, and sample-level QC samples, are linked directly to the
trip and associated with their corresponding individual samples.
Each site visit on the trip becomes an opportunity to make field
measurements, record observations about both the site and the
environment at the site during the visit, and to collect samples.
Single sites may be visited more than once during a trip, and sample
collection may occur repeatedly during each visit. Samples collected
may include biological catches/traps, sediment grabs, water, or air
samples.
Automated Data Loggers
Data which is generated by a mechanical device which operates
unattended in the field may be entered by describing the device
in lieu of a trip, and a period of continuous operation in lieu
of a site visit. Each “line” of data generated by the device
becomes a field measurement activity in STORET X.
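As a rough sketch of this mapping, each line of logger output can be turned into one field measurement record, with the device standing in for the trip and the deployment period standing in for the site visit. The line layout (ISO timestamp, characteristic name, value, comma separated) and all names below are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass
class FieldMeasurement:
    device_id: str        # stands in for the trip
    deployment_id: str    # stands in for the site visit
    timestamp: datetime
    characteristic: str
    value: float


def load_logger_lines(device_id: str, deployment_id: str,
                      lines: Iterable[str]) -> List[FieldMeasurement]:
    """Turn each non-blank line of logger output into one measurement."""
    measurements = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the logger file
        stamp, characteristic, value = [p.strip() for p in line.split(",")]
        measurements.append(FieldMeasurement(
            device_id, deployment_id,
            datetime.fromisoformat(stamp), characteristic, float(value)))
    return measurements


raw = [
    "1997-05-21T00:00:00, temperature_c, 14.2",
    "1997-05-21T00:15:00, temperature_c, 14.1",
    "",
]
records = load_logger_lines("LOGGER-A", "DEPLOY-1", raw)
```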
Field Monitoring Activity
Field monitoring activity may consist of water, air, or sediment
sample collection, biological specimen catch/trap events, and any
measurements or observations obtained while at the site. Each of these
field activities is linked to those analytical results it generates.
Measurements and Observations
Information gathered in the field through the process of measuring or
observing the environment during the site visit is recorded in STORET X
as part of the site visit description. These data may include physical
conditions of the site itself, status of any equipment permanently
located at the site, biological habitat assessments, weather
observations, and simple field-determined physical or chemical data.
Samples
Samples are described according to the medium sampled, and the intent
for which they were collected. Methods and equipment used to collect
samples are fully described, by linkage to lists of methods and
equipment. These lists will be available from EPA, or client
organizations may choose to supply their own lists.
STORET X will accept descriptions of the sample collection process
which address the complete spectrum of water monitoring and sampling of
the biological community. For large-area samples, such as trawls,
details such as the latitude and longitude of the end points, the gear
deployment depth, the bottom conditions under the trawl, and others can
all be recorded.
Samples can be created from other samples, by compositing, splitting,
or subsampling. Each new sample is linked to its “parents”, so that it
can be traced back to all the events which might influence its results.
A sample which is generated by a trawl (a “catch”) might be the parent
of a sample which is an individual fish. The fish in turn might be the
parent of a sample which is a specimen of liver tissue, and chemical
results for this liver specimen can thus be traced back to the spatial
coordinates of the original trawl.
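The parent linkage described above amounts to walking a small graph from a derived sample back to the field-collected samples it came from. The sketch below assumes a hypothetical `Sample` class and made-up identifiers and coordinates; a real trawl would carry two end points rather than the single coordinate pair used here.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Sample:
    sample_id: str
    description: str
    parents: List["Sample"] = field(default_factory=list)
    # Only samples collected directly in the field carry coordinates here.
    coordinates: Optional[Tuple[float, float]] = None


def trace_origins(sample: Sample) -> List[Sample]:
    """Return the parentless samples that a sample ultimately derives from."""
    if not sample.parents:
        return [sample]
    origins, seen = [], set()
    for parent in sample.parents:
        for origin in trace_origins(parent):
            if origin.sample_id not in seen:
                seen.add(origin.sample_id)
                origins.append(origin)
    return origins


# Trawl catch -> individual fish -> liver tissue specimen.
trawl = Sample("S1", "trawl catch", coordinates=(29.10, -89.25))
fish = Sample("S2", "individual fish", parents=[trawl])
liver = Sample("S3", "liver tissue specimen", parents=[fish])
origin_ids = [s.sample_id for s in trace_origins(liver)]
origin_coords = [s.coordinates for s in trace_origins(liver)]

# A composite has more than one parent.
w1 = Sample("W1", "water grab")
w2 = Sample("W2", "water grab")
composite = Sample("W3", "composite", parents=[w1, w2])
composite_ids = [s.sample_id for s in trace_origins(composite)]
```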
Results
Each result is attached to a field monitoring activity. If the
activity was the collection of a water sample, the results are qualified
by all the methods used to collect, handle, store, and process that
sample. The results may be further qualified by the identity of the lab
performing the analytical work, and equipment and methods used in this
process. Statistical information concerning confidence intervals may be
supplied, and for results which are not quantified, detection status and
quantitation status may be stored. Results which are counts or
percentages may be qualified by the range of some size or weight
variable which they represent.
Biological results are handled in different ways. For a “catch”, the
biota may be grouped and regrouped repeatedly for counting, weighing, or
measuring. For example, one grouping might be by taxon, and the counts
recorded for purposes of computing taxonomic diversity and richness.
Another grouping might be a user-defined histogram or class frequency
table of fish lengths within a species, and yet another might be to
record counts and weights of only adults, or only gravid females, or any
other category the analyst might request. A catch might be divided so
that a group contains only 1 individual, and a detailed description of
it recorded.
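The repeated regrouping of a catch can be illustrated with a few lines of counting code. The catch records, the 50 mm class width, and the variable names are invented for this example and do not come from STORET X.

```python
from collections import Counter

# Hypothetical catch: (taxon, length in mm, life stage) per individual.
catch = [
    ("Lepomis macrochirus", 82, "adult"),
    ("Lepomis macrochirus", 45, "juvenile"),
    ("Micropterus salmoides", 210, "adult"),
    ("Lepomis macrochirus", 97, "adult"),
    ("Ictalurus punctatus", 305, "adult"),
]

# Grouping 1: counts by taxon; richness is the number of distinct taxa.
counts_by_taxon = Counter(taxon for taxon, _, _ in catch)
richness = len(counts_by_taxon)


# Grouping 2: a user-defined length-class table within one species.
def length_class(length_mm, width_mm=50):
    lower = (length_mm // width_mm) * width_mm
    return (lower, lower + width_mm)


bluegill_lengths = Counter(
    length_class(length)
    for taxon, length, _ in catch
    if taxon == "Lepomis macrochirus"
)

# Grouping 3: adults only.
adult_count = sum(1 for _, _, stage in catch if stage == "adult")
```

The same list of individuals feeds all three groupings, which is the point of the passage: the catch is stored once and regrouped as the analyst requires.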
Planned Outputs
STORET X will emphasize the delivery of data to the end user, in the
form most compatible with the intended analysis. A variety of data
delivery formats is envisioned which will facilitate the export of
STORET X data onto the local workstation, from which it may be portrayed
statistically or graphically, or imported into a Geographic Information
System (GIS). Users will have broad latitude in defining these export
formats. In addition, certain aids to data interpretation will be
available on our server, to enable data browsing and to provide data
summaries.
Summary
STORET X is the first major change to EPA’s immensely popular STORET
System since its inception in 1964. With this new system, the water
monitoring community will have access to information and data structures
which accurately reflect the current and future way they do their jobs,
and which can be effectively used by decision makers to both plan and
evaluate the effectiveness of pollution prevention and abatement
programs.

-------
APPENDIX F
National Water Information System (NWIS)
CONTAMINANT OCCURRENCE DATA FROM THE U.S. GEOLOGICAL SURVEY
The National Water Information System (NWIS), described above, is
the comprehensive data repository within the U.S. Geological Survey.
Certain subsets of NWIS data have been packaged into datasets or
products that may be more convenient to use, or of greater specific
interest, for a national contaminant occurrence database for drinking
water. Some of the most significant of these datasets include:
o Data from Selected USGS National Stream Water-Quality Monitoring
Networks (WQN)—Available on CD-ROM and on the World-Wide Web at
http://wwwrvares.er.usgs.gov/wqn96/. This dataset contains data from
two USGS national stream-water quality networks, the Hydrologic
Benchmark Network (HBN) and the National Stream Quality Accounting
Network (NASQAN), operated during the past 30 years. HBN comprises
63 stations on relatively small, pristine streams. NASQAN comprises
618 stations on larger streams with greater human influence. Both
networks include a long list of water-quality parameters.
o Pesticides in the Hydrologic System--A four-book series edited by
Robert Gilliom, USGS, and available from Ann Arbor Press (800-858-5299).
This series compiles data from a broad array of studies nationwide.
o Pesticide Use Maps—Maps showing geographic distribution of use,
aggregated at the county and the state level, for 208 pesticides.
Available on the web at http://www.wr.usgs.gov/pnsp/use92.
o Nitrate Risk Map for U.S. Aquifers—A dataset that combines information
on occurrence, loading, and soil characteristics that influence the risk
of nitrate contamination. Available on the World Wide Web at
http://wwwrvares.er.usgs.gov/nawqa.nutrient.html. The map is also
available in fact sheet form (Fact Sheet 092-96) on the USGS home page
(www.usgs.gov) and as hard copies from:
Chief, NAWQA Program
U.S. Geological Survey
413 National Center
Reston, VA 20192
o Aquifers of the U.S.—Digital Map—This GIS coverage is
available or soon to be available from Wendy Danchuk, USGS, Madison,
WI, 608-238-9333.
o Contaminants in the Mississippi River, 1987-1992, USGS Circular
1133, edited by Bob Meade, brings together the most comprehensive
view ever developed of water quality on our Nation’s largest river.
o Additional national synthesis efforts at the USGS will be producing
national maps and datasets on volatile organic compounds, metals,
radon, and microbes. Availability will be announced on the USGS home
page (www.usgs.gov).

-------
APPENDIX G
Envirofacts
Envirofacts Warehouse on the Internet
URL: http://www.epa.gov/enviro
The Environmental Protection Agency (EPA) created the Envirofacts Warehouse to provide
the public with direct access to the wealth of information in its databases. This helps EPA
fulfill its responsibility to make information available to the public, as required by the 1986
Superfund Reauthorization Act and other federal legislation. The Envirofacts Warehouse
is available through the World Wide Web (WWW), allowing EPA to disseminate information
quickly and easily.
The Envirofacts Warehouse provides access to several databases and tools for users to
easily access the information contained in those databases. Currently, the components
of the Envirofacts Warehouse include program system, spatial, and demographic data;
metadata which describe data contained in the various databases; tools to access and
display the data; and other features that educate users on environmental issues.
National Database
Envirofacts currently contains a relational database of the national databases on
Superfund sites, hazardous waste handlers, discharges to water, toxic releases, and air
releases.
The national database also contains the Facility Index System (FINDS), which cross-links
facilities existing in multiple databases. Similarly, the Envirofacts Master Chemical
Integrator (EMCI) provides an index for EPA-regulated chemicals listed by program
system. The Locational Reference Tables (LRT) provide latitude and longitude coordinates
for EPA-regulated facilities. Other program system data will be added in the near future.
EPA Spatial Data Library System (ESDLS)
ESDLS stores spatial data for the Envirofacts Warehouse in ARC/INFO format. These
standardized spatial data enable users to visualize EPA facilities in relation to geographic
features such as roads, rivers, and county boundaries.
Demographic Data
The demographic database contains 1990 U.S. Bureau of Census data, which include, but
are not limited to, statistics on income, poverty status, race, and education level of the
population. This database integrates with ESDLS and the national database to enable
geo-demographic environmental analysis.
Metadata
Metadata contain information that describes the data. Users can access metadata on-line
for most of the databases contained in the Envirofacts Warehouse and will eventually be
able to access all Warehouse metadata.
Envirofacts Tools
EPA provides several applications that allow users to access information contained in the
national database including on-line query forms and mapping tools. Users can access the
Envirofacts query capability from the Envirofacts Warehouse Website. This capability
allows users to query the national databases. Information can be retrieved directly from
a selected database (e.g., Toxic Release Inventory System), or users can query all
program systems in Envirofacts simultaneously. Users can create queries based on the
facility name or number, various geographic criteria, Standard Industrial Classification (SIC)
code, and/or data elements unique to each database. Maps On Demand applications use
data in the Envirofacts Warehouse to generate maps.
Additional Features
The Envirofacts Warehouse also includes pages to educate the public about the
environment. An environmental factoids page provides links to EPA and non-EPA sites
that contain general information of environmental interest. Chemical reference pages link
to sites outside the Envirofacts Warehouse Website including EPA resources, other federal
agencies, and select university sources that describe chemicals.
Contact: Pat Garvey
U.S. EPA, Office of Information Resources Management
401 M Street, SW (3408)
Washington, DC 20460
(202) 260-3103
Internet: garvey.pat@epamail.epa.gov

-------
APPENDIX H
Safe Drinking Water Information System (SDWIS)
Sampling Business System Data Elements
FIELD NO     FIELD NAME
             INDIVIDUAL SAMPLE (T) / SAMPLE SUMMARY (S)

[illegible]  [illegible]
             Additional number/alphanumeric to identify the sample. Both
             LAB_SAMPLE_NUMBER and [illegible]
[illegible]  STATE_LABORATORY_NUMBER
             Laboratory number normally assigned to a Laboratory by the
             State
8            FEDERAL_LABORATORY_NUMBER
             Laboratory Number normally assigned to a laboratory by the
             EPA or used to designate a laboratory as a federal entity.
9            WATER_FACILITY_STATE
             Number/alphanumeric that uniquely identifies a Water System
             Facility (e.g. Treatment Plant/Distribution System/Well)
             within a Water System
10           SAMPLING_POINT
             Number/alphanumeric that uniquely identifies a point within
             a Water System Facility from which the sample is drawn.
             (Associated with SAMPLING LOCATION)
11           SAMPLING_LOCATION
             Alphanumeric that typically identifies a Sampling Point as
             an address or equivalent text description. (Associated with
             SAMPLING POINT)
12           SAMPLE_CATEGORY
             Identifies the sample category as either Total
             Coliform/Lead&Copper/Chemical/General
             Microbiological/Radionuclide/Water Quality Parameter
13           COMPLIANCE_INDICATOR
             When set to ‘Y’, indicates that the sample has been taken
             for compliance
14           COLLECTION_DATE
             The date on which the sample was collected
15           COLLECTION_TIME
             The time at which the sample was collected
16           SAMPLE_TYPE
             Indicates whether the sample is taken for ‘Routine’
             purposes or is a ‘Repeat’, ‘Replacement’, or
             ‘Confirmation’. Several types are available although not
             all may be used with samples taken for compliance.
17           REPEAT_LOCATION_CODE
             Indicates the location relative to the original Sampling
             Point at which the repeat/invalid replacement/confirmation
             sample was taken (upstream/downstream/original
             location/etc.)
18           LAB_RECEIPT_DATE_SAMPLE
             Date at which the Laboratory received the sample; cannot be
             prior to Collection Date
19           COLLECTOR_IDENTIFICATION_NUMBER
             Number/alphanumeric that identifies each sample Collector.
20           COLLECTOR_NAME
             Name of the sample Collector.
21           SAMPLE_VOLUME
             Value to indicate the size of the volume of water collected
             for the sample.
22           LEAD_COPPER_SAMPLE_TYPE
             Type of Lead&Copper sample (for purposes of Lead&Copper
             rule compliance)
23           SAMPLE_REJECTION_REASON
             Set of possible reasons to reject a sample prior to its
             analysis at the lab.
24           COLLECTION_METHOD_CODE
             Code that indicates the Method used to collect the sample.
25           ORIGINAL_LAB_SAMPLE_NUMBER
             LAB_SAMPLE_NUMBER of the sample that was originally taken
             and whose result required the current Repeat/Invalid
             Replacement/Confirmation sample to be sampled
28           COMPOSITE_INDICATOR
             When set to ‘Y’, indicates that the sample is a composite
             and that the LAB_COMPOSITE_NUMBER will be valued.
29           COMPOSITE_QUARTER
             Used only for Radionuclide composites. The quarter
             (1/2/3/4) to which the individual sample (collected during
             that quarter but that will be composited at the end of the
             fourth quarter) should be attributed for purposes of
             monitoring compliance.
30           ANALYTE_CODE
             Standard code used to represent a given analyte. (Only
             valued here for Sample Summary)
31           CAS_NUMBER
             Chemical Abstracts Service Number; may be used in place of
             ANALYTE_CODE. (Only valued here for Sample Summary)
35           FREE_CHLORINE_RESIDUAL
             “Field Result” value measured at the time/location of
             sample collection.
36           TOTL_CHLORINE_RESIDUAL
             “Field Result” value measured at the time/location of
             sample collection.
37           SAMPLE_WATER_TEMPERATURE
             “Field Result” value measured at the time/location of
             sample collection.
38           TEMPERATURE_UNIT_MEASURE
             Temperature Unit of Measure: either C (Celsius) or F
             (Fahrenheit)
39           TURBIDITY_MEASURE
             “Field Result” value measured at the time/location of
             sample collection.
40           PH_MEASURE
             “Field Result” value measured at the time/location of
             sample collection.
41           FLOW_RATE
             “Field Result” value measured at the time/location of
             sample collection.
[illegible]  [illegible]
             “Field Result” value measured at the time/location of
             sample collection.
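Several of the sample data elements above carry rules that lend themselves to automated checking. The following is a minimal sketch, assuming a sample record is held as a dictionary keyed by the field names in the table; the function name, the record layout, and the example values are hypothetical, and the checks cover only rules stated in the field descriptions.

```python
from datetime import date


def validate_sample(record):
    """Return a list of rule violations for one sample record (a dict)."""
    problems = []

    collected = record.get("COLLECTION_DATE")
    received = record.get("LAB_RECEIPT_DATE_SAMPLE")
    if collected and received and received < collected:
        problems.append("LAB_RECEIPT_DATE_SAMPLE is prior to COLLECTION_DATE")

    if (record.get("COMPOSITE_INDICATOR") == "Y"
            and not record.get("LAB_COMPOSITE_NUMBER")):
        problems.append("composite sample has no LAB_COMPOSITE_NUMBER")

    quarter = record.get("COMPOSITE_QUARTER")
    if quarter is not None:
        if record.get("SAMPLE_CATEGORY") != "Radionuclide":
            problems.append("COMPOSITE_QUARTER used for a non-radionuclide sample")
        elif quarter not in (1, 2, 3, 4):
            problems.append("COMPOSITE_QUARTER must be 1, 2, 3, or 4")

    unit = record.get("TEMPERATURE_UNIT_MEASURE")
    if unit is not None and unit not in ("C", "F"):
        problems.append("TEMPERATURE_UNIT_MEASURE must be C or F")

    return problems


good = {
    "SAMPLE_CATEGORY": "Radionuclide",
    "COLLECTION_DATE": date(1997, 3, 3),
    "LAB_RECEIPT_DATE_SAMPLE": date(1997, 3, 4),
    "COMPOSITE_INDICATOR": "Y",
    "LAB_COMPOSITE_NUMBER": "C-001",
    "COMPOSITE_QUARTER": 1,
    "TEMPERATURE_UNIT_MEASURE": "C",
}
bad = {
    "SAMPLE_CATEGORY": "Chemical",
    "COLLECTION_DATE": date(1997, 3, 3),
    "LAB_RECEIPT_DATE_SAMPLE": date(1997, 3, 1),
    "COMPOSITE_INDICATOR": "Y",
    "COMPOSITE_QUARTER": 2,
    "TEMPERATURE_UNIT_MEASURE": "K",
}
good_problems = validate_sample(good)
bad_problems = validate_sample(bad)
```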
FIELD NO     FIELD NAME
             INDIVIDUAL RESULT (T) / SUMMARY RESULT (S)
[illegible]  ANAL[illegible]DATE
             [illegible] that laboratory analysis begins
[illegible]  [illegible]
             [illegible] that laboratory analysis begins
[illegible]  [illegible]
             Date that laboratory analysis ends
52           ANALYSIS_COMPLETION_TIME
             Time that laboratory analysis ends
53           STATE_NOTIFY_DATE
             Date that the state receives the analytical result
54           DATA_QUALITY
             Code indicating whether or not the analytical result meets
             established data quality criteria
55           DATA_QUALITY_REASON
             Possible reasons that the code may not meet data quality
             standards
56           ANALYSIS_METHOD_CODE
             Standard analysis method code for the analyte for which the
             result is assessed
57           MONITORING_PERIOD_START_DATE
             Start date of the monitoring period to which the analytical
             result or sample summary is assigned. The monitoring period
             must be valid for the Water System that collected the
             sample and the rule against which the result is assessed.
58           MONITORING_PERIOD_END_DATE
             End date of the monitoring period to which the analytical
             result or sample summary is assigned. The monitoring period
             must be valid for the Water System that collected the
             sample and the rule against which the result is assessed.
59           VOLUME_ASSAYED
             The amount of water used during the laboratory assessment.
60           LAB_REJECTION_REASON
             Possible lab comments (Too Numerous to Count/Turbid Culture
             No Gas) that may cause the state to reject a
             microbiological result.
61           MICROBE_PRESENCE_INDICATOR
             Presence/Absence indicator: P indicates that the
             microbiological result is positive while A indicates a
             negative result.
62           TEST_TYPE
             Designates the result as either “presumptive” or
             “confirmed”
63           COUNT
             Value greater than 0 indicates a positive microbiological
             result
64           COUNT_TYPE
             Type of microbiological unit that is being counted per
             specified count unit. Count type varies with the
             microbiological organism whose count is being recorded.
65           COUNT_UNITS
             The units of measure associated with the microbiological
             analytical result count.
66           LESS_THAN_INDICATOR
             When set to ‘Y’, indicates that the analytical result is
             less than either the Lab Reporting Level (supplied by the
             lab) or the federal minimum detection limit. Typically set
             to ‘Y’ for a non-detect result.
67           LESS_THAN_CODE
             MRL - Lab Reporting Level: indicates that the lab will
             supply the minimum detection limit and will value the
             DETECTION_LEVEL and DETECTION_LEVEL_UNIT_CODE fields. Note:
             this value may be the federal minimum detection limit for
             that analyte, but SDWIS treats it as a Lab Reporting Level
             if the value is supplied by the lab. (Some laboratories
             wish to report a value that is more stringent than the
             federal minimum detection limit for the analyte.)
             MDL - Federal minimum detection limit: value carried as
             appropriate for each analyte and method; the source for
             each value is the Code of Federal Regulations (40 CFR Part
             141) et al. If the field contains MDL, SDWIS will look up
             the federal detection limit for the analyte and populate
             the DETECTION_LEVEL and DETECTION_LEVEL_UNIT_CODE fields.
             (The data for MDLs is stored in SDWIS as TSAMAA MDL_Measure
             and MDL_Msr_Unit_Code.)
68           DETECTION_LEVEL
             If non-detect or “less than Lab Reporting Level”: value
             supplied by Lab
69           DETECTION_LEVEL_UNIT_CODE
             Lab Reporting Level unit of measure
70           CONCENTRATION
             If detect, this is the concentration value of the result
             reported as a number.
71           CONCENTRATION_UNIT_CODE
             Unit of measure associated with the concentration value.
72           REPORTED_MEASURE
             If detect, this is the concentration value of the result;
             this field preserves precision.
73           REPORTED_MEASURE_COUNT_ERROR
             The counting error estimated by the lab due to some
             analytical anomaly, usually expressed as a value plus/minus
             the reported measure/unit of measure.
74           RESULTS_TYPE
             Used only in Sample Summary: list of possible types of
             Sample Summary results
75           COUNT_QUANTITY
             Count of each type of result within the Sample Summary
[illegible]  MEASURE
             Measure value that represents the result obtained from a
             sample analysis
76           [illegible]
             Unit of measure associated with the M[illegible]
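The LESS_THAN_CODE description above implies a simple branching rule for non-detect results: with MRL the lab supplies the detection level, and with MDL the system looks it up. A minimal sketch follows, assuming results are held as dictionaries keyed by the field names in the table. The `FEDERAL_MDL` lookup, the analyte code `ANALYTE_A`, and the numeric values are placeholders invented for illustration; they are not values from 40 CFR Part 141 or from SDWIS.

```python
# Hypothetical federal detection limits keyed by analyte code: (value, unit).
# Placeholder figures only, not regulatory values.
FEDERAL_MDL = {"ANALYTE_A": (0.0005, "mg/L")}


def resolve_detection_level(result):
    """Return a copy of the result with detection-level fields resolved."""
    resolved = dict(result)
    if resolved.get("LESS_THAN_INDICATOR") != "Y":
        return resolved  # a detect; nothing to resolve

    code = resolved.get("LESS_THAN_CODE")
    if code == "MRL":
        # The lab must supply its own reporting level.
        if resolved.get("DETECTION_LEVEL") is None:
            raise ValueError("MRL requires a lab-supplied DETECTION_LEVEL")
    elif code == "MDL":
        # Look up the federal limit and populate both fields.
        level, unit = FEDERAL_MDL[resolved["ANALYTE_CODE"]]
        resolved["DETECTION_LEVEL"] = level
        resolved["DETECTION_LEVEL_UNIT_CODE"] = unit
    else:
        raise ValueError(f"unknown LESS_THAN_CODE: {code!r}")
    return resolved


mdl_result = resolve_detection_level({
    "ANALYTE_CODE": "ANALYTE_A",
    "LESS_THAN_INDICATOR": "Y",
    "LESS_THAN_CODE": "MDL",
})
mrl_result = resolve_detection_level({
    "ANALYTE_CODE": "ANALYTE_A",
    "LESS_THAN_INDICATOR": "Y",
    "LESS_THAN_CODE": "MRL",
    "DETECTION_LEVEL": 0.0002,
    "DETECTION_LEVEL_UNIT_CODE": "mg/L",
})
detect_result = resolve_detection_level({
    "ANALYTE_CODE": "ANALYTE_A",
    "CONCENTRATION": 0.004,
    "CONCENTRATION_UNIT_CODE": "mg/L",
})
```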

-------
APPENDIX I
GLOSSARY
[These terms may be useful in facilitating the understanding of this
document.]
accuracy. How closely an instrument measures the true or actual value
of the process variable being measured or sensed.
ambient. Environmental or surrounding conditions.
background level. In toxic substances monitoring, the average
presence of a substance in the environment,
originally referring to naturally occurring phenomena.
community water system (CWS). A public water system which serves
at least 15 service connections used by year-round residents or
regularly serves at least 25 year-round residents. Also see non-
community water system, transient water system and non-transient
non-community water system.
contaminant. Any physical, chemical, biological, or radiological
substance or matter that has an adverse effect on air, water, or soil.
contaminants with known occurrence. Contaminants which have
been found in PWSs and/or in source waters.
contamination. The introduction into water of microorganisms,
chemicals, toxic substances, wastes or wastewater in a concentration
that makes the water unfit for its intended use.
cost/benefit analysis. A quantitative evaluation of the costs which
would be incurred versus the overall benefits to
society of a proposed action such as the establishment of an
acceptable dose of a toxic chemical.
dose. The actual quantity of a chemical to which an organism is
exposed.
Electronic Data Interchange (EDI). The computer-to-computer
exchange of business information in a standard format. All references
to EDI under EPA programs refer to the utilization of the ASC X12
Standards.
endemic (en-DEM-ick). Something peculiar to a particular people or
locality, such as a disease which is always present in the population.
exposure assessment. The determination or estimation (qualitative or
quantitative) of the magnitude, frequency,
duration, route, and extent (number of people) of exposure to a
chemical.
finished water. Water that has passed through a water treatment plant;
all the treatment processes are completed or “finished”. This water is
ready to be delivered to consumers. Also called PRODUCT WATER.
hazard evaluation. A component of risk assessment that involves
gathering and evaluating data on the types of
health injury or disease (e.g., cancer) that may be produced by a
chemical and on the conditions of exposure under which injury or
disease is produced.
human exposure evaluation. A component of risk assessment that
involves describing the nature and size of the
population exposed to a substance and the magnitude and duration of
their exposure. The evaluation could concern
past exposures, current exposure or anticipated exposures.
human health risk. The likelihood (or probability) that a given
exposure or series of exposures may have or will
damage the health of individuals experiencing the exposures.
indicator (chemical). A substance that gives a visible change, usually
of color, at a desired point in a chemical
reaction, generally at a specified end point.
margin of safety (MOS). Maximum amount of exposure producing no
measurable effect in animals (or studied
humans) divided by the actual amount of human exposure in a
population.
maximum contaminant level (MCL). The maximum permissible level
of a contaminant in water which is delivered to the free flowing outlet of
the ultimate user of a public water system, except in the case of
turbidity where the maximum permissible level is measured at the point
of entry to the distribution system. Contaminants added to the water
under circumstances controlled by the user are excluded from this
definition, except those contaminants resulting from the corrosion of
piping and plumbing caused by water quality.
maximum contaminant level goal (MCLG). The maximum level of a
contaminant in drinking water at which no known or anticipated adverse
effect on the health of persons would occur, and which allows an
adequate margin of safety. Maximum contaminant level goals are
non-enforceable health goals.
medium-size water system. A water system that serves greater than
3,300 and less than or equal to 50,000 persons.
monitoring. Measuring concentrations of substances in environmental
media or in human or other biological
tissues.
non-community water system (NCWS). A public water system that is
not a community water system. There are
two types of NCWSs: transient and non-transient.
non-transient non-community water system (NTNCWS). A public
water system that regularly serves at least 25 of the same nonresident
persons per day for more than six months per year.
parts per million (PPM). Parts per million parts, a measurement of
concentration on a weight or volume basis. This term is equivalent to
milligrams per liter (mg/L) which is the preferred term.
performance evaluation sample. A reference sample provided to a
laboratory for the purpose of demonstrating
that the laboratory can successfully analyze the sample within limits of
performance specified by the Agency. The
true value of the concentration of the reference material is unknown to
the laboratory at the time of the analysis.
permissible dose. The dose of a chemical that may be received by an
individual without the expectation of a
significantly harmful result.
population at risk. A population subgroup that is more likely to be
exposed to a chemical or is more sensitive to a chemical than is the
general population.
precision. The ability of an instrument to measure a process variable
and to repeatedly obtain the same result. The ability of an instrument to
reproduce the same results.
public water system. A system for the provision to the public of piped
water for human consumption, if such system has at least fifteen
service connections or regularly serves an average of at least
twenty-five individuals daily at least 60 days out of the year. Such
term includes: 1) any collection, treatment, storage, and distribution
facilities under control of the operator of such system and used
primarily in connection with such system, and 2) any collection or
pretreatment storage facilities not under such control which are used
primarily in connection with such system. A public water system is
either a “community water system” or a “non-community water system.”
raw water. 1) Water in its natural state, prior to any treatment. 2)
Usually the water entering the first treatment
process of a water treatment plant.
risk. The potential for realization of unwanted adverse consequences
or events.
risk assessment. A qualitative or quantitative evaluation of the
environmental and/or health risk resulting from exposure to a chemical
or physical agent (pollutant); combines exposure assessment results
with toxicity assessment results to estimate risk.
risk management. Decisions about whether an assessed risk is
sufficiently high to present a public health concern and about the
appropriate means for control of a risk judged to be significant.
safe. Condition of exposure under which there is a “practical certainty”
that no harm will result in exposed individuals.
Safe Drinking Water Act (SDWA). Commonly referred to as SDWA.
An Act passed by the U.S. Congress in 1974. The Act establishes a
cooperative program among local, state and federal agencies to ensure
safe drinking water for consumers.
safe water. Water that does not contain harmful bacteria, or toxic
materials or chemicals. Water may have taste and odor problems, color
and certain mineral problems and still be considered safe for drinking.
source water. Rivers, streams, lakes and ground water that supply
drinking water to the public.
supplier of water. Any person who owns or operates a public water
system.
toxic pollutants. Materials contaminating the environment that cause
death, disease, or birth defects in organisms that ingest or absorb them.
The quantities and length of exposure necessary to cause these effects
can vary widely.
toxic substance. A chemical or mixture that may represent an
unreasonable risk of injury to health or the environment.
transient water system. A non-community water system that does not
serve 25 of the same nonresident persons per day for more than six
months per year. Also called a transient non-community water system
(TNCWS).
unregulated contaminant. A contaminant in water for which no
maximum contaminant level (MCL) has been set by EPA.
virtual data base. A database that is created on a user’s workstation
which is the result of having queried one or more databases.
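The glossary definitions of community, non-transient non-community, and transient water systems, together with the medium-size threshold, can be read as a small decision rule. The sketch below is one such reading and makes several assumptions: the system is taken to already meet the public water system threshold, the function and parameter names are invented, and each input is a simple count supplied by the caller.

```python
def classify_public_water_system(year_round_connections,
                                 year_round_residents,
                                 same_nonresidents_over_six_months):
    """Classify a public water system using the glossary definitions.

    Assumes the system already qualifies as a public water system.
    """
    # Community: at least 15 connections used by year-round residents,
    # or at least 25 year-round residents regularly served.
    if year_round_connections >= 15 or year_round_residents >= 25:
        return "CWS"
    # Non-transient non-community: at least 25 of the same nonresident
    # persons per day for more than six months per year.
    if same_nonresidents_over_six_months >= 25:
        return "NTNCWS"
    # Otherwise a transient non-community system.
    return "TNCWS"


def is_medium_size(population_served):
    # Greater than 3,300 and less than or equal to 50,000 persons.
    return 3300 < population_served <= 50000


town = classify_public_water_system(400, 1100, 0)
school = classify_public_water_system(0, 0, 40)
rest_stop = classify_public_water_system(2, 5, 0)
```

Under these assumptions a school with its own well would fall under NTNCWS, while a highway rest stop would fall under TNCWS.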

-------
APPENDIX J
Data Tables
for
TEXT BOX 4

-------
TRICHLOROETHYLENE

             Total N of   Total N of   Fraction    Total N of     N               Cumulative   Cumulative N of
             monitoring   sites w/     of sites    observations   observations/   N of         detects/total N
State        sites        “detects”    w/ detect                  site            detects      observations
CO           121          91           0.75        170            1.40            140          0.82
LA           22           5            0.23        22             1.00            5            0.23
NC           472          54           0.11        768            1.63            134          0.17
IN           1361         124          0.09        3742           2.75            348          0.09
DE           214          17           0.08        338            1.58            26           0.08
NJ           1672         137          0.08        1772           1.06            170          0.10
AL           226          18           0.08        366            1.63            63           0.17
CA           6751         446          0.07        18328          2.71            6105         0.33
NY           1937         81           0.04        12535          6.47            688          0.05
KY           596          23           0.04        3825           6.42            73           0.02
MN           2348         67           0.03        2759           1.18            86           0.03
MO           289          5            0.02        339            1.17            5            0.01
OH           4616         99           0.02        6426           1.39            242          0.04
MD           1075         17           0.02        1877           1.75            50           0.03
PR           921          22           0.02        1356           1.47            43           0.03
AK           795          18           0.02        2084           2.82            23           0.01
WV           262          4            0.02        624            2.38            6            0.01
NM           1341         7            0.01        1596           1.19            11           0.01
UT           932          5            0.01        1278           1.37            10           0.01
WA           2280         17           0.01        3988           1.75            88           0.02
WY           316          3            0.01        328            1.04            4            0.01
GA           2108         20           0.01        2685           1.27            35           0.01
NV           90           1            0.01        192            2.13            3            0.02
VA           355          4            0.01        736            2.07            16           0.02
[illegible]  2791         13           0.00        3092           1.11            18           0.01
ND           576          1            0.00        617            1.07            1            0.00
[illegible]  61           0            0.00        70             1.37            0            0.00

FL, IA, IL, M[illegible], PA, RI, SC, SD, TN and WI did not monitor for Trichloroethylene
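The three ratio columns in this table follow directly from the four count columns. As a check on how they are derived, the short sketch below recomputes them for the Colorado and Louisiana rows (121 sites, 91 with detects, 170 observations, 140 detects; and 22, 5, 22, 5). The function and key names are illustrative, and rounding to two decimal places is an assumption about how the table was prepared.

```python
def derived_columns(sites, sites_with_detects, observations, detects):
    """Recompute the ratio columns of the table from the raw counts."""
    return {
        "fraction_sites_with_detects": round(sites_with_detects / sites, 2),
        "observations_per_site": round(observations / sites, 2),
        "detects_per_observation": round(detects / observations, 2),
    }


co = derived_columns(121, 91, 170, 140)
la = derived_columns(22, 5, 22, 5)
```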

-------
TETRACHLOROETHYLENE

             Total # of   Total # of   Fraction    Total # of     # of            Cumulative   Cumulative # of
             monitoring   sites w/     of sites    observations   observations/   # of         detects/total #
State        sites        “detects”    w/ detects                 site            detects      observations
CO           123          81           0.71        187            1.36            129          0.77
LA           22           4            0.18        22             1.00            4            0.18
DE           214          33           0.15        336            1.58            46           0.14
NC           476          51           0.12        772            1.62            88           0.11
AL           223          21           0.09        360            1.61            66           0.18
[illegible]  62           5            0.08        148            2.39            16           0.11
[illegible]  6798         527          0.06        18619          2.74            6363         0.34
NJ           2233         133          0.06        3329           1.49            270          0.08
PA           2500         119          0.05        4662           1.86            282          0.06
IN           1151         53           0.05        3160           2.15            154          0.05
[illegible]  622          32           0.05        1327           2.13            41           0.03
SC           1269         39           0.03        2398           1.89            108          0.05
RI           446          15           0.03        1564           3.61            60           0.04
WY           320          11           0.03        332            1.04            11           0.03
AK           791          22           0.03        1983           2.51            69           0.03
NY           1935         54           0.03        12408          6.41            544          0.04
MT           695          29           0.03        1629           1.62            201          0.12
MO           1016         25           0.02        1678           1.75            59           0.03
MN           2342         31           0.02        2753           1.18            43           0.02
WV           373          6            0.02        822            2.2             10           0.01
WI           4341         101          0.02        5926           1.36            357          0.06
TN           365          6            0.02        1346           3.69            27           0.02
PR           922          19           0.02        1358           1.41            41           0.03
IA           2130         44           0.02        2238           1.05            63           0.03
AZ           2666         25           0.01        2965           1.11            21           0.01
WA           2260         29           0.01        3988           1.75            124          0.03
IL           2395         35           0.01        6361           2.66            183          0.03
MO           289          4            0.01        340            1.18            6            0.02
GA           2104         26           0.01        2682           1.27            62           0.02
UT           912          8            0.01        1276           1.4             57           0.04
OH           7606         115          0.01        16277          2.09            291          0.02
NM           1341         5            0.00        1596           1.19            6            0.01
VA           355          0            0.00        738            2.07            0            0.00
KY           563          2            0.00        2446           4.34            10           0.00
ND           563          0            0.00        596            1.06            0            0.00

SD and FL did not monitor for Tetrachloroethylene

-------