September 22, 1998
EPA-SAB-DWC-ADV-98-004
Honorable Carol M. Browner
Administrator
U.S. Environmental Protection Agency
401 M Street, SW
Washington, DC 20460
Subject: An SAB Advisory on the National Drinking Water
Contaminant Occurrence Database
Dear Ms. Browner:
On June 18, 1998, the Drinking Water Committee (DWC) of the Science
Advisory Board (SAB) met to review the design phase considerations of the
National Contaminant Occurrence Data Base (NCOD). Review of the NCOD is
required in the 1996 Amendments to the Safe Drinking Water Act (SDWA). The
review was conducted in a public session under the provisions of the Federal
Advisory Committee Act (FACA).
This SAB advisory provides advice on an Agency work-in-progress. The
goal of an SAB advisory is to provide suggestions to the Agency for mid-course
corrections that will refine the ultimate product. In this case, the Agency is
engaged in the development of the NCOD in response to a firm deadline
contained in the 1996 SDWA Amendments. The SAB expects to conduct an additional review
after the Agency has completed this initial phase of database development and
has considered updates to the system. At that time, a significant number of new
participants will be added to the reviewing panel - by changes in DWC
membership and/or inclusion of additional consultants - to ensure independent
assessment of the Agency's work.
The materials provided to the SAB for review consisted of: a) a set of
briefing charts titled National Contaminant Occurrence Data Base Design Phase
Considerations; Briefing to and Questions for the Science Advisory Board dated
June 18, 1998; b) the National Drinking Water Contaminant Occurrence Data
Base - Development Strategy dated December 1997; and c) the NCOD Attribute
Type List dated April 30, 1998.
1. Background
Section 1445(g) of the SDWA requires the EPA Administrator to
"...assemble and maintain a national drinking water contaminant occurrence data
base, using information on the occurrence of both regulated and unregulated
contaminants in public water systems obtained under subsection (a)(1)(A) or
subsection (a)(2) and reliable information from other public and private sources."
In addition, the Act states that "In establishing the occurrence data base, the
Administrator shall solicit recommendations from the Science Advisory Board,
the States, and other interested parties concerning the development and
maintenance of a national drinking water contaminant occurrence data base,
including such issues as the structure and design of the data base, data input
parameters and requirements, and the use and interpretation of data."
The Agency intends to use the NCOD to help it identify contaminants for
future Contaminant Candidate Lists, to select contaminants for future regulation,
to develop new national primary drinking water regulations for selected
contaminants, to revise existing national primary drinking water regulations, and
to provide information to the public in a readily accessible form. The Agency
intends to build the NCOD on existing data sources (e.g., Safe Drinking Water
Information System-SDWIS and Storage and Retrieval of U.S. Waterways
Parametric Data-STORET) and to build in refinements later. NCOD data could
potentially include information on regulated and unregulated contaminant
occurrence, ambient monitoring data, and other data from research and special
studies. Both historical and future data are to be included in the data base.
Since May 1997, the Agency has worked to develop an NCOD strategy
and has interacted with stakeholders and other groups on technical issues
associated with SDWIS, STORET, microbiological contaminants, data quality,
sample test results, public health, environmental factors, public access, reporting
standards, and database design.
The Agency plan for completing the development of the NCOD includes
the following:
a) Decision on data elements (spring 1998)
b) Decision on the electronic platform (spring 1998, plus 1 month)
c) Design and development (April 1998 to April 1999)
d) Develop analytical plan to apply to listing, selection, and regulation
e) NCOD operational test (April to July 1999)
f) Guidance on data submission to NCOD (June 1999)
g) Plan NCOD long-term maintenance and data analysis (June 1999)
h) Public access (August 1999)
The charge to the Committee asked:
a) Are the data elements included for Sample Test Results adequate
for scientific analyses, recognizing that more detailed data will still
be stored by the laboratory?
b) What types of results should be reported for peer review by the
scientific community relative to regulatory decisions? How should
these results be reported?
2. General Comments
The Science Advisory Board (SAB) is pleased that the Agency is
organizing drinking water data to facilitate its effective use. The principal
recommendation of the SAB is that the Agency should consider and clearly
articulate the intended uses of these data, and the methods that will be used for
data analysis and presentation, before the NCOD design is completed. This will
enable EPA scientists to more effectively identify those data elements that are
essential for inclusion within the data base. The Committee also recommends
that the EPA pay special attention to the collection and organization of high
quality data in the future and not to invest heavily in previously collected data of
less well-defined quality that was gathered before the NCOD was designed.
3. Specific Charge Questions
3.1 Are the data elements included for Sample Test Results
adequate for scientific analyses, recognizing that more
detailed data will still be stored by the laboratory?
The Agency provided the DWC with a list of attributes for possible NCOD
inclusion. Some of the attributes would be reported while others would be
obtained from automated reference tables within the system. The list includes
over 120 separate attributes: about 10 percent of these are labeled as Sample
Test Results (STR). These include the following:
a) Concentration measure
b) Units of measure for concentration
c) Dead counts
d) Live counts
e) Detection limit measure
f) Detection limit type
g) Detection limit unit of measure
h) Lower 95% confidence measure
i) Upper 95% confidence measure
j) Percent recovery
k) Percent recovery standard deviation
l) Sign of the result
m) Validity indicator
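To make the shape of one Sample Test Results record concrete, the thirteen attributes listed above can be sketched as a single typed record. This is only an illustration: the field names, the types, and the use of "<" as the sign for a result below the detection limit are assumptions made for the example, not part of the Agency's attribute list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SampleTestResult:
    """One STR record; every field name here is hypothetical."""
    concentration: Optional[float] = None          # a) concentration measure
    concentration_unit: Optional[str] = None       # b) units of measure
    dead_count: Optional[int] = None               # c) dead counts
    live_count: Optional[int] = None               # d) live counts
    detection_limit: Optional[float] = None        # e) detection limit measure
    detection_limit_type: Optional[str] = None     # f) detection limit type
    detection_limit_unit: Optional[str] = None     # g) detection limit unit
    lower_95_confidence: Optional[float] = None    # h) lower 95% confidence
    upper_95_confidence: Optional[float] = None    # i) upper 95% confidence
    percent_recovery: Optional[float] = None       # j) percent recovery
    percent_recovery_sd: Optional[float] = None    # k) recovery std. deviation
    result_sign: Optional[str] = None              # l) sign of the result
    validity_indicator: Optional[str] = None       # m) validity indicator

    def is_non_detect(self) -> bool:
        # Assumed coding: "<" marks a result below the detection limit.
        return self.result_sign == "<"


record = SampleTestResult(
    detection_limit=0.5,
    detection_limit_type="MDL",
    detection_limit_unit="ug/L",
    result_sign="<",
)
print(record.is_non_detect())
```

Leaving every field optional reflects that chemical and microbial results populate different subsets of the list (for example, counts versus concentrations).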
Other attribute categories provide information on the sampling location,
water source, chemical identity and applicable drinking water standards,
distribution system, laboratory conducting the analysis, nature of the sample
collected, analytical techniques used, treatment techniques used on the water,
and zip code.
The Agency asked whether the NCOD attributes labeled as "Sample Test
Results" are adequate for the scientific analyses needed to identify contaminants
for future Contaminant Candidate Lists (CCL), to select contaminants for future
regulation, to develop new national primary drinking water regulations for
selected contaminants, to revise existing national primary drinking water
regulations, and to provide information to the public in a readily accessible form.
To support these uses, the Agency stated that the data base should be
designed to answer the following questions:
a) What is the contaminant?
b) At what concentration is the contaminant found?
c) Where and when is the contaminant found?
d) What is the type of water source?
e) Is water treatment associated with the occurrence?
f) At what concentration is the contaminant a health concern?
g) What number of people are exposed?
h) Is there co-occurrence with other contaminants?
i) Why was the sample collected?
j) What is the level of confidence in the measure of concentration?
These are important questions to ask; however, the situation under which
the STR data will be used is quite complex. Until the Committee has a clear
understanding of how the data will be applied to answer these questions in
support of the regulatory purposes noted above, it is not possible to fully
comment on whether these attributes are, or are not, adequate. Although some
reaction is possible as a result of observing elements in the list provided (see
Appendix A for some examples), the SAB does not feel that it is now useful to
dwell on these issues because its comments would only reflect a fragmented
picture of the uses intended for the database. Instead, the SAB recommends
that the Agency explicitly examine the intended uses of the contaminant
occurrence data. Doing so will lead the agency to a systematic approach to
define the specific data elements that need to be included in the NCOD.
The SAB understands that the Agency is currently developing a plan for
the analysis of data from the NCOD that will be broadly applied during
contaminant listing, selection, and regulation. This plan is now scheduled for
Agency review later in 1998. This analysis plan will lay out how the information
from the NCOD is to be reported and how it will be accessed by the public.
Overlapping the development of this plan, the Agency is settling on NCOD
design and development issues and will conduct an operational test of the
system during the April to July 1999 time frame. The SAB recommends that the
Agency move up its time-table for the development of the analysis plan.
As used in this report, this analysis plan would describe the use of data
from the NCOD and it should include at least:
a) a clear and formal statement of the purposes to which the data will
be put.
b) a formal statement of what the objectives of the data collection are
to be relative to its representativeness (i.e. representative of a
single water supply, representative of the nation as a whole,
determining whether contaminants are derived from the source
water or introduced in treatment and distribution, etc.). This will
translate directly into a sampling plan and decisions about what
data can or should be included in the database.
c) expectations of precision and accuracy that will be needed to meet
the stated objectives of the data collection activity.
d) sample test cases that the Agency can use to ensure that all
the data attributes required for the specified uses have been
identified.
Sample test cases should address the Agency's array of goals (e.g.,
regulatory development, exposure assessment, etc.), which are among the most
important questions to the Agency and its stakeholders. These test cases
should also provide a framework for developing quantitative statistical and
geographic procedures and facilitate the definition of specific input parameters
and sample and contaminant information needed to support scientifically
defensible statistical and geographic analyses. Sample test cases would also
help to identify a set of relevant data quality objectives pertaining to the input
parameters and contaminant measurement values used in the statistical
algorithms and geographic procedures. For example, some of the important
categorical factors uncovered by the test cases might relate to a) treatment
processes, b) sample characteristics, and c) methods used for measuring the
contaminants and how missing information would be handled in the analysis,
among other things. Each of these factors would have specific attributes
identified in the sample test cases or mock exercises.
An extension of the example may be illustrative of the utility of sample test
cases. If one wanted to evaluate the effect of a treatment process on a
contaminant, it would be important to capture changes in process from one
sampling episode to the next. At least two additional attributes would be needed
for the analysis: the location of the sampling points (i.e., source water and
treated water) and the detection limits for the analyses. Indicators of precision
and bias (in the measurement values, e.g., how non-detects were handled)
would be important data elements for each contaminant measurement reported
and included in the database. These factors, and others like them, would have
to be included in the data base to make a sensible analysis. The sample test
cases or mock exercises should make it clear whether one or more important
sample attributes that would be critical to the desired analyses have been
inadvertently omitted.
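A mock exercise of this kind can be sketched as a simple gap check: list the attributes each test case needs, compare them with the attributes the database is expected to carry, and report what is missing. The attribute names and the two test cases below are hypothetical and chosen only to mirror the treatment-process example; they are not drawn from the NCOD Attribute Type List.

```python
# Attributes the database is assumed to carry (hypothetical names).
NCOD_ATTRIBUTES = {
    "contaminant_id",
    "concentration",
    "concentration_unit",
    "sample_date",
    "detection_limit",
}

# Attributes each mock analysis would need (illustrative only).
TEST_CASES = {
    "treatment_effect": {
        "contaminant_id",
        "concentration",
        "sample_date",
        "detection_limit",
        "sampling_point_type",   # source water vs. treated water
        "treatment_process",     # process in use at each sampling episode
    },
    "national_occurrence": {
        "contaminant_id",
        "concentration",
        "concentration_unit",
        "sample_date",
    },
}


def missing_attributes(available, test_cases):
    """Return {test case name: sorted list of attributes it lacks}."""
    gaps = {}
    for name, required in test_cases.items():
        missing = sorted(required - available)
        if missing:
            gaps[name] = missing
    return gaps


gaps = missing_attributes(NCOD_ATTRIBUTES, TEST_CASES)
for name, missing in gaps.items():
    print(f"{name}: missing {', '.join(missing)}")
```

Running each intended use through such a check before the design is frozen is one way to surface an omitted attribute early.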
Finally, the development of an analysis plan should involve consultation
on major issues with experts such as engineers for treatment processes, with
analytical chemists for sampling and contaminant analysis, and with
microbiologists for sampling and analysis of microbes.
The Committee expects several positive outcomes from the analysis plan
that will ensure that the NCOD provides data for regulatory analyses that meet
the highest scientific standards. It is the Committee's opinion that the NCOD will
produce such benefits for the agency and the regulated community if it is
properly developed. Specifically, the Committee would like to point out some
obvious benefits relating to data quality.
a) Establishing a database that has defined standards for data quality
and completeness will have major benefits. The SAB recommends
that the Agency bias its effort toward influencing and collecting
good quality contemporary data first and only invest in the
inclusion of older data as a secondary priority. Furthermore, such
attributes will allow casually submitted data that may be of poorer
quality to be segregated from good quality data that will be needed
for certain types of analyses.
b) Data taken at "standard stations" like water treatment plant intakes,
water treatment plant outlets, water wells, and designated sampling
points in the distribution piping of drinking water systems and at
designated ambient sampling points used by the United States
Geological Survey (USGS) will prove most useful and should have
first priority. One-of-a-kind sampling programs or sampling
programs that do not have fixed sampling points should receive
secondary attention. The Agency should take the opportunity
provided by the NCOD to apply existing and emerging
technologies for presenting data and the Agency's analyses of the
data to the public (e.g., geographic information systems).
c) Sample data submitted by states, with analyses conducted by
certified drinking water laboratories using standard or draft
standard methods, will prove most reliable and should have priority
over sample data submitted from one-of-a-kind surveys and/or
analyses conducted by laboratories that are not certified.
d) Sample compositing requires special attention because it is only
appropriate for contaminants whose effects are associated with
total dose consumed over extended periods of time. It is not
appropriate for sampling which measures microbial contaminants,
chemicals with primary effects on development, or chemicals that
may lead to acute effects, such as nausea, vomiting, or diarrhea.
Where appropriate (e.g., carcinogenic chemicals), compositing can
be done in particular places or at specific times or over different
magnitudes of space or time. There are a variety of techniques
that can be used for compositing. Though composited data are
potentially powerful in certain circumstances, their interpretation
and their comparison with other data on the same contaminants
can be quite difficult.
e) The Agency should consider how it will report data with many
non-detects (NDs) determined by different methods and by different
laboratories, each with its own detection limit. For example, a
report could give one or more of the following:
(1) number of samples analyzed
(2) range of values for chemical contaminants
(3) reporting of microbes (yes/no presence, too many to count)
(4) number of samples with quantifiable levels
(5) number of NDs < 1st MDL - 1st Method
(6) number of NDs < 2nd MDL - 2nd Method
(7) 50% value determined by Maximum Likelihood Methods
(8) 90% value determined by Maximum Likelihood Methods
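The reporting items above can be sketched in code. The example below is a minimal illustration under stated assumptions, not the Agency's method: it assumes concentrations are lognormally distributed, treats each non-detect as left-censored at its own detection limit, and estimates the distribution with a simple expectation-maximization (EM) loop, one of several ways to obtain a maximum likelihood fit. The data layout, function names, and sample values are all hypothetical, and at least one quantified result is required.

```python
import math
from collections import Counter
from statistics import NormalDist

STD_NORMAL = NormalDist()


def censored_lognormal_mle(detects, nd_limits, iterations=200):
    """EM estimate of (mu, sigma) for ln(concentration).

    detects   -- quantified concentrations (at least one required)
    nd_limits -- detection limit of each non-detect (left-censored)
    """
    z = [math.log(x) for x in detects]
    limits = [math.log(d) for d in nd_limits]
    n = len(z) + len(limits)
    # Start from the quantified results alone.
    mu = sum(z) / len(z)
    sigma = max(math.sqrt(sum((v - mu) ** 2 for v in z) / len(z)), 1e-3)
    for _ in range(iterations):
        s1 = sum(z)
        s2 = sum(v * v for v in z)
        for c in limits:
            # E-step: moments of a normal truncated to values below c.
            a = (c - mu) / sigma
            lam = STD_NORMAL.pdf(a) / max(STD_NORMAL.cdf(a), 1e-300)
            s1 += mu - sigma * lam
            s2 += mu * mu + sigma * sigma - sigma * lam * (c + mu)
        # M-step: update the mean and standard deviation.
        mu_new = s1 / n
        sigma = math.sqrt(max(s2 / n - mu_new * mu_new, 1e-12))
        mu = mu_new
    return mu, sigma


def summarize(results):
    """results: list of (value, detection_limit, detected) tuples."""
    detects = [v for v, _, detected in results if detected]
    nd_limits = [dl for _, dl, detected in results if not detected]
    mu, sigma = censored_lognormal_mle(detects, nd_limits)
    return {
        "n_samples": len(results),
        "n_quantified": len(detects),
        "range_quantified": (min(detects), max(detects)),
        "nd_by_detection_limit": dict(Counter(nd_limits)),
        "p50_mle": math.exp(mu),
        "p90_mle": math.exp(mu + sigma * STD_NORMAL.inv_cdf(0.90)),
    }


# Hypothetical results from two methods with detection limits 0.5 and 1.0.
example_results = (
    [(None, 0.5, False)] * 3
    + [(None, 1.0, False)] * 2
    + [(v, 0.5, True) for v in (0.8, 1.2, 2.5, 4.0, 0.6)]
)
summary = summarize(example_results)
for key, value in summary.items():
    print(key, value)
```

Counting non-detects separately for each detection limit keeps results from different methods and laboratories distinguishable, which is the concern raised in item e).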
In summary, the Committee recommends the following steps to help the Agency
confirm that the most appropriate data elements are included in the NCOD. The
Agency should determine exactly how the data elements will be used in the regulatory
process, exposure assessment, etc. by developing a detailed Analysis Plan as a
critical step in database design; design report forms to address each of these
needs and consider how the reports can be organized to make the results user
friendly; and build the database requirements using this information with
additional assistance from experts in the field.
3.2 What types of results should be reported for peer review by
the scientific community relative to regulatory decisions?
How should these results be reported?
Again, there is no simple answer to this question; however, as indicated
above, the development of an analysis plan will be important in responding to this
question. Once the intended uses of the data are described and the possible
results from using the data are identified, the need for peer review and the
manner in which the results are to be reported can be addressed.
Peer review will undoubtedly occur in the context of a particular use of the
data. Part of the review will be focused on data quality, but peer reviewers will
also be interested in whether the data are sufficiently representative to
accomplish their intended purpose. The definition of uses of the data should be
explicit, not simply be couched in terms of "regulatory uses." With that further
level of specificity, data attributes would be identified with the uses of the data
rather than being defined a priori and without reference to specific uses. Using
this approach, it is probable that many of the attributes listed will be found to be
unimportant. More importantly, such an organized approach could minimize the
number of important attributes that might be overlooked.
In conclusion, the SAB appreciates the opportunity to review and
comment upon this needed data base. The Agency has made substantial
progress in developing this tool that will be important to future drinking water
regulations. The SAB is confident that, once the Agency completes its analysis plan
showing how the data are to be used in supporting future regulatory analyses,
it will be possible to determine the final attributes that will need to
be included in the NCOD. The SAB would be pleased to review that plan and to
provide additional advice on the adequacy of the Sample Test Results included
in the NCOD in supporting the stated regulatory needs.
Sincerely,
Dr. Joan Daisey, Chair
Science Advisory Board

Dr. Richard J. Bull, Chair
Drinking Water Committee
Science Advisory Board
-------
APPENDIX A
Examples of Attributes Listed That Will Impact Data Use
To conduct exposure assessments, there is a need for representative
data. The Sample Test Results (STR) attributes included do not seem to allow for
an assessment of representativeness. A data point does not represent a
concentration for a specific contaminant year round for all populations. How the
data are to be manipulated and used needs to be clear and this would affect the
attributes and data elements needed.
For example, the listed attributes will only provide answers about
concentration (with upper and lower 95% confidence bounds), location, and
point in time. However, we would also want to know an exposure level
representative of a longer term exposure concentration in other locations, or
exposure levels that would be applicable to all states, if a federal regulation is to
be developed that is to be applied to all states. Would the consideration of
factors such as the frequency of sampling, sample size, and number of sampling
locations (and how they are distributed) needed for this purpose already be
incorporated and expressed in final values in the attributes such that no more
data manipulation is needed? The critical point is, how representative is the
'concentration measure' that would be reported? If other information is needed
to make this 'concentration measure' representative, then it should be added to
the attributes.
Another example is whether the Agency would like to identify a
concentration at which the contaminant is a health concern. How is this
concentration determined? The item 'applicable drinking water standards' is
included as an attribute that might address this question, but MCLs are not
solely health based. Many are based simply on quantitation limits. The basis for
deriving the health concern concentration should be included in the data base if
this question is of interest to the Agency.
The fields of the NCOD that pertain to the microbial contaminants seem to
be significantly underdeveloped. The fields reflect the very narrow viewpoint of
data to come from the Information Collection Rule (ICR). The SAB assumes that
the NCOD will be used for purposes other than analyzing the ICR data; therefore,
the database needs to be developed within a broader context. For example, the
attribute "Sample Result-Percent Recovery" is to be reported for protozoan
analyses only. It will be just as important to have information on the percent
recovery for other microorganisms, such as viruses. In addition, there is no way
to report results of qualitative analyses, such as those from PCR (polymerase
chain reaction) or MPN (most probable number) analyses.
The SAB could develop some additional microbiological attributes for the
NCOD; however, this would not be the most effective way to compile a complete
set of attributes. The SAB recommends that the EPA convene a group of
experts (internal, external, or both) to consider the issue of microbial attributes
needed to support regulation once the Analytical Plan is completed.
-------
APPENDIX B
ABBREVIATIONS
CCL Contaminant Candidate List
DWC Drinking Water Committee
ICR Information Collection Rule
MCL Maximum Contaminant Level
MDL Minimum Detection Level
MPN Most Probable Number
NCOD National Contaminant Occurrence Database
ND Non-Detects
PCR Polymerase Chain Reaction
SAB U.S. EPA Science Advisory Board
SDWA Safe Drinking Water Act Amendments (1996)
SDWIS Safe Drinking Water Information System
STORET Storage and Retrieval of U.S. Waterways Parametric Data
STR Sample Test Results
USGS U.S. Geological Survey
-------
NOTICE
This report has been written as part of the activities of the Science
Advisory Board, a public advisory group providing extramural scientific
information and advice to the Administrator and other officials of the
Environmental Protection Agency. The Board is structured to provide balanced,
expert assessment of scientific matters related to problems facing the Agency.
This report has not been reviewed for approval by the Agency and, hence, the
contents of this report do not necessarily represent the views and policies of the
Environmental Protection Agency, nor of other agencies in the Executive Branch of the
Federal government, nor does mention of trade names or commercial products
constitute a recommendation for use.
-------
ENVIRONMENTAL PROTECTION AGENCY
SCIENCE ADVISORY BOARD
DRINKING WATER COMMITTEE
ROSTER
CHAIR
DR. RICHARD BULL, Battelle Pacific Northwest Laboratories, Molecular
Biosciences, Richland, WA
MEMBERS/CONSULTANTS
DR. JUDY A. BEAN, Department of Epidemiology and Public Health, University of
Miami, School of Medicine, Miami, Florida 33136
DR. LENORE S. CLESCERI, Rensselaer Polytechnic Institute, Materials
Research Center, Troy, New York
DR. MARY DAVIS (consultant), Professor of Pharmacology & Toxicology,
Department of Pharmacology & Toxicology, Robert C. Byrd Health Sciences
Center, West Virginia University, Morgantown, WV
DR. YVONNE DRAGAN, McArdle Laboratory for Cancer Research, University of
Wisconsin, Madison, Wisconsin
DR. JOHN EVANS, Harvard Center for Risk Analysis, Boston, Massachusetts
DR. ANNA FAN-CHEUK, California Environmental Protection Agency, Berkeley,
California
DR. CHRISTINE MOE (consultant), Department of Epidemiology, University of
North Carolina, Chapel Hill, NC
DR. LEE D. (L.D.) MCMULLEN, Des Moines Water Works, Des Moines, IA
DR. CHARLES O'MELIA, Department of Geography and Environmental
Engineering, The Johns Hopkins University, Baltimore, MD
DR. EDO D. PELLIZZARI, Research Triangle Institute, Research Triangle Park,
NC
DR. VERNE A. RAY, Medical Research Laboratory, Pfizer Inc., Groton,
Connecticut
DR. GARY A. TORANZOS, Department of Biology, University of Puerto Rico,
San Juan, Puerto Rico
DR. RHODES TRUSSELL, Montgomery Watson, Pasadena, CA
DR. MARYLYNN V. YATES, Department of Soil and Environmental Sciences,
University of California, Riverside, CA
SCIENCE ADVISORY BOARD STAFF
MR. TOM MILLER, Designated Federal Official, US EPA/Science Advisory
Board, 401 M Street, S.W. (1400), Washington, D.C. 20460
MS. MARY WINSTON, Staff Secretary, Science Advisory Board, 401 M Street,
S.W. (1400), Washington, DC 20460
-------
DISTRIBUTION LIST
Administrator
Deputy Administrator
Assistant Administrators
Deputy Assistant Administrator for Science, ORD
Director, Office of Science Policy, ORD
EPA Regional Administrators
EPA Laboratory Directors
EPA Headquarters Library
EPA Regional Libraries
EPA Laboratory Libraries
Library of Congress
National Technical Information Service
Congressional Research Service