United States
Environmental Protection
Agency

2006 Drinking Water
Data Reliability
Analysis and Action
Plan

For State-Reported Public
Water System Data in the
EPA Safe Drinking Water
Information System /
Federal Version
(SDWIS/FED)

Office of Water (4606M)
EPA 816-R-07-010
www.epa.gov/safewater
March 2008

-------
          2006 Drinking Water Data Quality Analysis and Action Plan

                                  Executive Summary

Safeguarding our nation's drinking water by developing effective and appropriate policy
decisions and conducting program oversight depends on data of known and documented quality.
The Safe Drinking Water Information System/Federal Version (SDWIS/FED) is the
Environmental Protection Agency's (EPA) principal database for the national drinking water
program. It contains data on public water systems (PWS) provided by the states to EPA.  It is
primarily used  for management of state and EPA programs and for informing the public about
the compliance status of their drinking water systems, and indirectly, the safety of their drinking
water.  EPA uses the information in SDWIS/FED for various analyses to support programmatic
decisions and identify trends, problems, and opportunities for improvement of the states' rule
implementation as well as program oversight. Consequently, the utility of SDWIS/FED
information for these purposes depends heavily on the quality of the data it contains.

EPA routinely evaluates state programs by conducting Data Verification (DV) audits, which
examine state compliance decisions and reporting to SDWIS/FED. EPA prepares triennial
summary evaluations based on the DVs. This document presents the results of EPA's third
triennial review of data quality in SDWIS/FED, covering data collected from 2002 through
2004. For the 38 states evaluated, we found that:

   •   Ninety-four percent of health-based violation data in SDWIS/FED were accurate.
   •   Approximately 81% of the maximum contaminant level (MCL) and surface water
       treatment technology (SWTR TT) violations were reported to SDWIS/FED.
   •   Including lead and copper treatment technology (LCR TT) violations, about 62% of the
       health-based violations (MCL and treatment technology violations) were reported to
       SDWIS/FED; only 8% of LCR TT violations were reported.
   •   Only approximately 30% of the monitoring and reporting (M/R) violations were reported
       to SDWIS/FED.
   •   Non-reporting was primarily due to compliance determination errors rather than data
       flow errors.
   •   In 2004, 60% of the health-based violations1 and approximately 30% of the monitoring
       and reporting violations were reported to SDWIS/FED on time.

Background

SDWIS/FED contains data about PWS facilities, violations (e.g., exceptions and exceedances) of
Federal drinking water regulations adopted by the states, and enforcement actions taken by the
state. The regulations include health-based drinking water quality standards, performance of
1 The health-based violations in this reference do not include lead and copper treatment technology violations
because they have open-ended compliance period end dates.

-------
treatment techniques, and/or process requirements. The focus of this report is on two types of
violations: (1) health-based violations (i.e., exceedance of a maximum contaminant level or non-
performance of a treatment technique or process), and (2) monitoring and reporting violations
(i.e., a water system did not monitor, did not report monitoring results, or was late in reporting
results to the state).

States manage their own processes and databases differently to document public water system
capabilities and their program management decisions concerning violations (or noncompliance),
and to record corrective actions undertaken. State data indicate that violations occur
infrequently at most public water systems. Violation data that states report to EPA
(SDWIS/FED) reflect only those major and minor noncompliance results that may lead to
adverse public health outcomes. Violations represent a small fraction of all the determinations
states make, which demonstrates the safety of the nation's water supply.

The first triennial review of data quality evaluated data for the period 1996-1998.  That
assessment, which resulted in a detailed data analysis report in 2000, produced an action plan
under which states and EPA worked together to improve data quality.  The plan resulted in
actions that included training state  staff, streamlining reporting to SDWIS/FED, making
SDWIS/FED error reporting and correction more user-friendly, improving DVs, following up
with Regions after DVs,  and encouraging states to notify water systems of sampling schedules
annually. Similarly, the second triennial review of data quality analyzed the data from the period
1999-2001 and findings were presented in the 2003 report.  The recommended action plan in the
2003 report included:
   •   Developing the state-specific compliance determination and quality improvement plans
       necessary to remedy the major problem areas,
   •   Conducting and improving data quality analyses and reporting the results,
   •   Implementing the Office of Ground Water and Drinking Water (OGWDW) information
       strategic plan and SDWIS/FED modernization, and
   •   Developing an automated monitoring requirement and sampling schedule tracking
       system by the states, and evaluating the timeliness of violation reporting and potential
       violation non-reporting.

This Review

Between 2002 and 2004, EPA conducted DV audits in  38 states and reviewed data on drinking
water system information, violations, and enforcement actions. See Table ES-1 for the list of DV
States. EPA evaluated 2,658 PWSs, of which 43% were Community Water Systems (CWS). See
Table ES-2 for the distribution of systems by system type and the size of population served. The
violations addressed by the DVs are shown in the Appendix B. The period of review by rule was
generally the two most recently scheduled monitoring periods for each water system  and
applicable rule. For the Total Coliform Rule (TCR) and the Surface Water Treatment Rule
(SWTR TT), the most recent four quarters were evaluated.

-------
             Table ES-1: States Subject to Data Verifications from 2002-2004

Region   States                              Region   States
1        CT, MA, RI, VT                      6        AR, NM, OK, TX
2        NJ, VI                              7        IA, MO
3        MD, PA, VA, WV                      8        CO, SD, UT, WY
4        AL, FL, KY, MS, NC ('02),           9        AZ, CA, R9 Tribes
         NC ('04), SC, TN
5        IL, MI, MN, OH                      10       AK, ID, WA
             Table ES-2: Number of Systems Included in Data Verifications
                               by System Type and Size

                                           System Type
System Size                     CWS      NTNCWS     TNCWS      Total
Very Small (500 or fewer)       572      637        696        1,905
Small (501-3,300)               277      123        36         436
Medium (3,301-10,000)           119      9          6          134
Large (10,001-100,000)          135      4          0          139
Very Large (>100,000)           44       0          0          44
Total                           1,147    738        773        2,658
Summary of Results

For the MCL/SWTR TT violations, 81% of the data were reported to SDWIS/FED. Figure ES-3
summarizes the data quality estimates by violation type. Of the non-reported violations, 74%
were due to compliance determination (CD) errors, where the states did not issue a violation
when a violation had occurred. Twenty-six percent of the non-reported violations were due to
data flow (DF) errors. Figure ES-4 summarizes the percentage of errors contributed from non-
reporting by violation type. Approximately 94% of the data in SDWIS/FED were accurate. The
overall data quality (DQ) of the MCL/SWTR TT violations was 77%. This means that 77% of
the noncompliance determinations on MCL/SWTR TT were correctly reported in SDWIS/FED.

-------
                 Figure ES-3: Data Quality Estimates by Violation Type
[Bar chart comparing MCL/SWTR TT violations, health-based violations, and M/R violations.
Completeness: 81.33%, 61.69%, 29.02%. Accuracy: 94.12%, 94.30%, 88.35%.
Overall DQ: 77.21%, 59.18%, 27.08%.]
     Figure ES-4: Percentages of Error Contribution to Non-Reporting of Violations
[Bar chart of CD vs. DF errors among non-reported violations. MCL/SWTR TT violations:
73.84% CD, 26.16% DF. Health-based violations: 84.45% CD, 15.55% DF. M/R violations:
92.03% CD, 7.97% DF. CD: Compliance Determination; DF: Data Flow.]

-------
For the health-based violations including LCR TT violations, 62% of the data were reported to
SDWIS/FED. Of the non-reported violations, 84% were due to CD errors. Approximately 94%
of the health-based violation data in SDWIS/FED were accurate. The overall
data quality of the health-based violations was 59%, i.e., approximately 59% of the
noncompliance determinations on health-based standards were correctly reported in
SDWIS/FED.

Only 29% of the monitoring and reporting violations were reported to SDWIS/FED. Ninety-two
percent of the non-reported violations were due to CD errors. Approximately 89% of the
monitoring and reporting violations data in SDWIS/FED were accurate. The overall data quality
of the M/R violations was 27%, i.e., 27% of the noncompliance determinations on M/R were
correctly reported to SDWIS/FED.

Data Reliability Improvement Action Plan

Appendix A is a joint plan of EPA and the Association of State Drinking Water Administrators
to achieve a goal of 90 percent complete and accurate data for health-based violations, as well as
to improve the quality of monitoring and reporting violation data and inventory data. Progress
toward this goal will be measured annually and assessed in 2009.

-------
Acknowledgements

The following people contributed to this analysis and the preparation of this report:

Project Lead        Drinking Water Protection Division
                    Office of Ground Water and Drinking Water

                    Chuck Job, Branch Chief, Infrastructure Branch

                    Leslie Cronkhite, Associate Branch Chief, Infrastructure Branch

Principal Author    Jade Freeman, Ph.D., Statistician, Infrastructure Branch

Contributing Author Lee Kyle, IT Specialist, Infrastructure Branch

Peer-review of the statistical methodology in this report has been provided by:

                    Anthony Fristachi, Exposure Analyst
                    National Center for Environmental Assessment
                    U.S. EPA Office of Research and Development

                    Tony R. Olsen, Ph.D., Environmental Statistician
                    Western Ecology Division
                    Health and Environmental Effects Research Laboratory
                    U.S. EPA Office of Research and Development

                    Arthur H. Grube, Ph.D., Senior Economist
                    U.S. EPA Office of Pesticide Programs

                    A. Richard Bolstein, Ph.D., Chairman (retired)
                    Department of Applied & Engineering Statistics
                    George Mason University

                    John Gaughan, Ph.D., Associate Professor
                    Epidemiology & Biostatistics
                    Temple University School of Medicine

                    Matthias Schonlau, Ph.D., Head Statistician
                    Statistical Consulting Service
                    The Rand Corporation

Collaboration on the Data Reliability Improvement Plan - Association of State Drinking
Water Administrators

-------
                                   Table of Contents

1.   Introduction	1
       1.1 Previous Activities 	2
       1.2 Regulatory Context  	2
       1.3 Changes in 2006 Analytical Method	3

2.   Overview of Data Verification	3

3.   Statistical Sample Design of Data Verification and Analytical Method	6
       3.1 Selection of States	6
       3.2 Selection of Systems within States	7
            3.2.1 Sample Frame	7
            3.2.2 Sample Design of Data Verification	7
            3.2.3 Sampling Procedure and Data Collection Activities	8
       3.3 Analytical Method: Weighting and Estimation	9

4.   Results from the Analysis of Data Verifications	12
       4.1 Analysis of Inventory Data	12
       4.2 Analysis of Violation Data	13
            4.2.1 Results from 2002-2004 Data Verifications	19
            4.2.2 Results from 1999-2001 Data Verifications	23
            4.2.3 Data Quality Estimates from 1999-2001 and 2002-2004	25
       4.3 Analysis of Enforcement Data	31

5.   Analysis of Timeliness of Violation Reporting in SDWIS/FED	31

6.   Conclusion	34

7.   Data Reliability Improvement Action Plan	35

8.   Future Analysis of Data Reliability	35

Appendix A: 2006 Data Reliability Improvement Action Plan	37

Appendix B: Violations Addressed by Data Verification (DV)	45

Appendix C: Definition of Public Notification (PN) Tier	47

-------
   2006 Drinking Water Data Quality Assessment and Action Plan
1.     Introduction

The Safe Drinking Water Information System/Federal Version (SDWIS/FED) is the
Environmental Protection Agency's (EPA) principal database for the national drinking water
program. Its two major uses are (1) to help manage state and EPA programs and (2) to inform
the public about the compliance status of public water systems (PWSs) and, indirectly, the safety
of drinking water. The Federal government uses SDWIS/FED data for program management for
90 contaminants (as of 2005) regulated in drinking water at approximately 158,000 PWSs in 56
state and territorial programs and on Indian lands. Data received by EPA from states in
SDWIS/FED include a limited set of water system descriptive information (e.g., system type,
population served, number of service connections, water source type), data on PWSs' violations
of regulatory standards and process requirements, and information on state enforcement actions.
These data, which EPA uses to assess compliance with the Safe Drinking Water Act (SDWA)
and its implementing regulations, represent the only data states are currently required to report to
EPA relative to drinking water safety. SDWIS/FED data can be accessed from the EPA web site
at www.epa.gov/safewater.

The utility of SDWIS/FED data for program management and public communication is highly
dependent on the quality of data housed by the system. To assess this quality, EPA routinely
conducts data verification (DV) audits in states and every three years develops a summary
evaluation called the Drinking Water Data Quality Assessment. DV auditors evaluate compliance data in
state databases and hard copy files, monitoring plans, and other compliance information
submitted by PWSs.  The auditors also examine sanitary surveys, correspondence between the
state and the water system, compliance determination decisions, and enforcement actions taken
by the state.  Based on this information,  the auditors confirm whether all required information
was submitted to and evaluated correctly by the state and whether required reporting elements
were submitted to SDWIS/FED.

This report includes (1) a description of the methodology used; (2) analyses of the data from the
2002 to 2004 Data Verifications, the most recent triennial evaluation period; and (3) analysis of
the timeliness of reporting in SDWIS/FED. The report also describes a plan to address
continued improvement in drinking water compliance data reported by states. This report is not
intended to evaluate states' performance; rather, it is a tool to identify the gap between the
states' violation data and SDWIS/FED and to provide a benchmark for the collaborative efforts
between the states and EPA to bridge that gap and improve the data quality in SDWIS/FED.
1.1     Previous Activities

In 1998, EPA launched a major effort to assess the quality of the drinking water data contained
within SDWIS/FED to respond to concerns regarding incorrect violations in the database. EPA

-------
enlisted the help of its stakeholders in designing the review, analyzing the results for data
collected between 1996 and 1998, and recommending actions to improve drinking water data
quality. The first Data Reliability Analysis of SDWIS/FED was published in October 2000.

Findings of the first Data Reliability Analysis, which indicated that data quality needed
improvement, were later updated by the second triennial assessment in 2003 (which included
data collected between 1999 and 2001). Together, these assessments included comprehensive
recommendations for EPA and state primacy agencies on quality improvements. The reports
identified near-term actions that had already been taken or were still needed to improve data
quality more immediately.  To implement the recommendations, the states and EPA have
conducted numerous activities and projects to improve data quality.  Activities undertaken have
included a) providing training for states; b) streamlining reporting to SDWIS/FED; c) making
SDWIS/FED error reporting and correction more user-friendly; d) improving data verifications;
e) following up with Regions on findings after data verifications; and f) encouraging states to
annually notify water systems of sampling schedules.

The Office of Ground Water and Drinking Water's (OGWDW) response to the data reliability
issues identified in the 2003 report included a commitment to conduct analyses which would
provide periodic data quality estimates (DQEs),  and provide input into program activities and
priorities necessary to improve the quality and reliability of the data.  Part of that commitment
was to publish the results of these analyses every three years.

1.2    Regulatory Context

States make a large number of determinations regarding public water systems' compliance with
drinking water regulations, and violations are a small fraction of these determinations. This
indicates the general safety of the nation's drinking water supply. For example, an
analysis of nitrate maximum contaminant level compliance data for Oklahoma from 2004
showed that only 3% of determinations resulted in violations.

The data considered for evaluating quality, particularly accuracy and completeness, consist of
the violations of health-based standards and monitoring and reporting requirements. These data
are important for two reasons: (1) state and EPA program management relies on them to
identify priorities, and (2) states and EPA use them to inform the public about the safety of its
drinking water. For federal program reporting purposes under the Government Performance and
Results Act (GPRA), violation data have become a major focus. EPA's 2006-2011 strategic plan
specifies a clean and safe water goal of "90% of the population served by community water
systems (CWS) meeting all health-based standards and treatments by 2011." A CWS which
meets all health-based standards and treatments does not have a violation of the federal
regulations for maximum contaminant levels (MCL) or treatment techniques. Because of the
importance of and emphasis on violation data, this data quality evaluation methodology
addresses whether states correctly identify and report the violations that should have been
reported to EPA according to state primacy agreements pursuant to Federal regulations.

-------
1.3    Changes in 2006 Analytical Method

In this analysis of 2002 to 2004 DV data, EPA uses a different method for evaluating the data
quality as described below.

   •   In the previous report, the DQEs were calculated without considering the sample design
       of DVs, i.e., the selection process by which the systems are included in the sample. In
       this assessment, the DQEs are calculated using statistical sample design-based unbiased
       estimation. The sample design and the estimation method for calculating sample
       statistics are described in detail in Section 3.

   •   The completeness measure of the violation data quality in the 2003 report represented the
       proportion of accurate data in SDWIS/FED out of all violation data that should be
       reported to SDWIS/FED. In this 2006 analysis, however, EPA redefined completeness
       of SDWIS/FED based on any violation data reported to SDWIS/FED, regardless of
       accuracy.

   •   Because of the changes in the estimation method described above and the non-random
       selection of states for DV audits, the results from this analysis will not be compared to
       those from the 2000 or 2003 assessments.

The statistical methodology for the analysis of DV data and the results are described in Sections
3 and 4. The additional analysis of the timeliness of reporting in SDWIS/FED is presented in
Section 5.

2.     Overview of Data Verification

EPA's OGWDW routinely conducts DV audits, which evaluate the management of state drinking
water programs.  During the DVs, EPA examines state compliance decisions, data on the system
compliance and violations in the state files, and the data required to be reported to SDWIS/FED.
During the DVs, EPA reviews data submitted by PWSs, state files and databases, and
SDWIS/FED, and compiles the results on the discrepancies among the data.  States have several
opportunities to respond to findings while DV personnel are on site, and provide additional
clarifying information if available. States also review the DV draft report before the final report
is produced, and their comments are  incorporated into the report.  EPA responds to every  state
comment, to explain in detail whether or not the state's additional information changed the
finding.

Until 2004, states were selected for DVs based on a number of factors; for example, states
that had not been audited for a long period of time were selected. Also, in order to
minimize the burden on EPA Regions and states, OGWDW tried to maintain an even distribution
of DV states across the Regions2. Further, resource constraints have affected the selection of
2 EPA is divided into 10 regional offices, each of which is responsible for several states and territories.


-------
certain states since it is more costly to conduct DVs in some states than others. Between 2002
and 2004, EPA conducted DV audits in 38 states and reviewed data on drinking water system
information, violations, and enforcement actions (Table 2-1). State files for a total of 2,658
PWSs were evaluated, of which 43% were community water systems (Table 2-2). The
regulations addressed by the DVs and the compliance period reviewed for each regulation are
shown in Table 2-3.
              Table 2-1: States Subject to Data Verifications from 2002-2004

Region   States                              Region   States
1        CT, MA, RI, VT                      6        AR, NM, OK, TX
2        NJ, VI                              7        IA, MO
3        MD, PA, VA, WV                      8        CO, SD, UT, WY
4        AL, FL, KY, MS, NC ('02),           9        AZ, CA, R9 Tribes
         NC ('04), SC, TN
5        IL, MI, MN, OH                      10       AK, ID, WA
      Table 2-2: Number of Systems Included in Data Verifications by Type and Size

                                           System Type3
System Size                     CWS      NTNCWS     TNCWS      Total
Very Small (500 or fewer)       572      637        696        1,905
Small (501-3,300)               277      123        36         436
Medium (3,301-10,000)           119      9          6          134
Large (10,001-100,000)          135      4          0          139
Very Large (>100,000)           44       0          0          44
Total                           1,147    738        773        2,658

3 Community water systems (CWSs) have at least 15 service connections or serve 25 or more of the same population
year-round. Nontransient noncommunity water systems (NTNCWSs) regularly serve at least 25 of the same persons
over 6 months per year. Transient noncommunity water systems (TNCWSs) provide water where people remain for
periods less than 6 months.

-------
Table 2-3: Period of Compliance for Rules Reviewed During 2002-2004 Data Verifications

Rule4                                                Compliance Period Reviewed
Inventory                                            Most Recent
Consumer Confidence Report (CCR)                     Most Recent
Total Coliform Rule (TCR), Surface Water             Most Recent 12-Month Period Available in
  Treatment Rule (SWTR), Total                       SDWIS/FED
  Trihalomethanes (TTHMs)
Nitrates                                             Most Recent Two Calendar Years
Phase II/V excluding nitrates                        1999-2001
Lead and Copper Rule (LCR), Interim                  Most Recent Two Samples
  Radionuclides Regulation
Enforcement, Public Notification                     Time Period Related to Violation
The review evaluated recent monitoring history to confirm that systems monitored according to
the required frequency. For many rules, the review evaluated one year of information (Surface
Water Treatment Rule, Total Trihalomethanes, Total Coliform Rule, and Consumer Confidence
Report). The two most recent monitoring periods or review cycles were reviewed for some rules
(interim radionuclides, Lead and Copper Rule, sanitary surveys). In other instances, the review
covered a defined period, such as the most recent 3-year monitoring period for the Standardized
Monitoring Framework outlined in the Phase II/V Rule.5
3.     Statistical Sample Design of Data Verifications and Analytical Methods

3.1    Selection of States

As mentioned in Section 2, the states are selected for DVs by considering the date of their last
verification, resource constraints, and burden on EPA Regions and states. This selection
4 CWSs were reviewed for inventory and each of the rules listed in this table. NTNCWSs are not subject to CCR,
TTHM monitoring, or the interim radionuclide regulation. TNCWSs are not subject to the requirements for CCR,
SWTR, TTHM, Phase II/V Rule, or interim radionuclide regulation.
5 The Standardized Monitoring Framework synchronizes the monitoring schedules for the Phase II/V regulation for
chemicals and the interim radionuclides rule across defined 3-year monitoring periods and 9-year monitoring cycles.

-------
procedure is a non-probability sampling method. Because of the subjective nature of the
selection process, non-probability samples add uncertainty when the sample is used to represent
the population as a whole.  The accuracy and precision of statements about the population can
only be determined by subjective judgment. The selection procedure does not provide rules or
methods for inferring sample results to the population, and such inferences are not valid because
of bias in the selection process.

When non-probability sampling is used, the results only pertain to the sample itself, and should
not be used to make quantitative statements about any population including the population from
which the sample was selected. Since the DV states were selected by a non-probability sampling
method, the results from the analysis only pertain to the DV states audited between 2002 and
2004. Therefore, it is not appropriate to make quantitative statements or inferences about the
entire nation from the selected states or comparisons with sampled state data quality results from
the previous years.

3.2    Selection of Systems within States

The DVs involve the evaluation of the states' compliance decisions and the agreement between
the data in the state files and SDWIS/FED. Since neither time nor resources allow a complete
census of consistencies between SDWIS/FED and state records, EPA uses a statistically random
sample of systems drawn from the total number of systems in the state. EPA uses the
results from the probability sample of systems within each state to estimate DV compliance
results for each state. The probability sample is designed to provide estimates with acceptable
precision while minimizing the burden on Regions and states imposed by visits from auditors.
EPA plans to further reduce the burden on Regions and states through use of electronic data
comparison.

3.2.1   Sample Frame

A sample frame is a list of all members of a population (in this case, the PWSs), from which a
random sample  of members will be drawn.  In other words, the sample frame identifies the
population elements from which the sample is chosen. The population elements listed on the
frame are called the  sampling units. Often these are groups or clusters of units rather than
individual units. For each state, EPA developed a sample frame (i.e., a list of the current
inventory of PWSs in the state) using SDWIS/FED,  from which a random sample of PWSs was
selected according to the sample design.

3.2.2   Sample Design of Data Verification

The unit of analysis  is the recorded action taken by systems, not the systems themselves. The
sample design for DVs is a stratified random cluster sample. In stratified sampling, the
population is divided into non-overlapping subpopulations called strata and a random sample is
taken from each stratum. Stratification increases the precision of the estimates when the
population is divided into subpopulations with  similar characteristics within each stratum. In
cluster sampling, groups, or "clusters," of units in the population are formed and a  random

-------
sample of the clusters is selected.  In other words, within a particular stratum, rather than
selecting individual units, clusters of units are selected.
In the analysis of DV data, systems are grouped into three strata according to the system type
(CWS, TNCWS, and NTNCWS) within each state. In the first stage of the sampling process,
systems are randomly selected within each stratum.  In the second stage, each action taken by the
system is recorded.  In other words, the system represents a cluster of actions.  A few examples
of these actions are:

   •   System inventory information that must be reported to SDWIS/FED,

   •   Violations of federal regulations (states also may report violations of state regulations), and

   •   Enforcement actions taken when violations occur.

3.2.3   Sampling Procedure  and Data Collection Activity

Once the current state inventory is retrieved from SDWIS/FED, the number of systems is
counted by size category (see Table 2-2 for size categories). The sample size for each system
type within a state is calculated based on the acceptable precision level for the estimates: in
most states, a margin of error of plus or minus five percent at a confidence level of 90 or 95
percent6. As discussed in Section 3.2.2, the sample design is a stratified random cluster sample.
The required sample size is given by

       $n'_h = n_h \cdot \mathit{deff}$

where $\mathit{deff}$ is the design effect of the clustering, assumed to be greater than 1.0, and
$n_h$ is the size of the sample (number of systems) required for stratum h (a specific state and
system type) if a simple random sample is drawn. $n_h$ is given by

       $n_h = \dfrac{N_h \, Z_\alpha^2 \, P_h (1-P_h)}{N_h \, M_h \, B_h^2 + Z_\alpha^2 \, P_h (1-P_h)}$

where  $n_h$ = Number of systems required for the sample in stratum h,

       $N_h$ = Total number of systems in the state in stratum h,

       $M_h$ = Average number of actions in each system in stratum h,

       $B_h$ = Acceptable precision level (margin of error) for stratum h,

       $Z_\alpha$ = The abscissa of the normal curve that corresponds to the confidence level, and

       $P_h$ = Proportion of discrepancy in violation data between DV results and SDWIS/FED in
              stratum h (estimated from the previous assessment).

6 For the three DVs that were conducted during the last quarter of 2004 (TX, VA, and IL), the confidence level for
CWSs was 95 percent and the margin of error was plus or minus seven percent. For NTNCWSs and TNCWSs, the
confidence level was 90 percent and the margin of error was plus or minus seven percent.

The design effect deff depends on the proportion of actions and decisions reviewed in the DV
that are consistent with the data in SDWIS/FED. This proportion is unknown before the DV;
therefore, the design effect is unknown. Lacking estimates of the design effect, the DV draws a
simple random sample within each stratum7. Because it excludes the design effect, this sample
may not be large enough to meet the precision targets.
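A minimal sketch of this calculation follows (in Python, for illustration; the report's own
calculation was done in an Excel spreadsheet, and the formula for $n_h$ is the reconstruction
shown above, so the sketch inherits that assumption):

    import math

    def required_sample_size(N_h, M_h, B_h, z_alpha, P_h, deff=1.0):
        # Variance term for a proportion at the chosen confidence level.
        v = z_alpha ** 2 * P_h * (1.0 - P_h)
        # Simple-random-sample size n_h with finite population correction,
        # per the reconstructed formula above.
        n_h = (N_h * v) / (N_h * M_h * B_h ** 2 + v)
        # n'_h: inflate by the design effect of clustering when an estimate
        # is available (deff is assumed to be greater than 1.0).
        return math.ceil(n_h * deff)

    # Hypothetical stratum: 600 CWSs, about 5 actions per system, +/-5%
    # margin at 95% confidence (z = 1.96), prior discrepancy P_h = 0.3.
    print(required_sample_size(N_h=600, M_h=5, B_h=0.05, z_alpha=1.96, P_h=0.3))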

The sample size is calculated in an Excel spreadsheet produced by EPA, and samples are drawn
from the frame using the spreadsheet's random number generator: a random sample of systems
is developed for each stratum. The DV auditors then collect data from the state files for each
sampled system on PWS inventory, violations, and enforcement.
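The draw itself amounts to simple random selection within each stratum. A minimal sketch is
shown below (the frame, strata, and sizes are hypothetical, and the report used an EPA-produced
Excel spreadsheet rather than Python):

    import random

    def draw_stratified_sample(frame, sizes, seed=1):
        # frame: stratum (system type) -> list of PWS IDs on the sample frame.
        # sizes: stratum -> required sample size n_h for that stratum.
        rng = random.Random(seed)
        return {stratum: rng.sample(ids, sizes[stratum])
                for stratum, ids in frame.items()}

    frame = {"CWS":    ["CWS-%04d" % i for i in range(600)],
             "NTNCWS": ["NTN-%04d" % i for i in range(250)],
             "TNCWS":  ["TNC-%04d" % i for i in range(400)]}
    sample = draw_stratified_sample(frame, {"CWS": 59, "NTNCWS": 40, "TNCWS": 45})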

3.3   Analytical Method: Weighting and Estimation

In this analysis, sample weights are applied to the data to adjust for the unequal probability of
selection of systems, i.e., the differences in the likelihood of some systems appearing in the
sample.  Weights, based  on the probability of selection, allow unbiased representation of the
population from an unequal probability sample.

In the 2002-2004 DV data analysis, EPA estimated proportions related to consistency and
accuracy among state files, the state database, and SDWIS/FED for inventory information,
violation data, and enforcement actions. A few examples of such proportions are the proportion
of inventory data that are consistent between SDWIS/FED and the state  file, the proportion of
violation data that are reported to SDWIS/FED, and the proportion of enforcement data that are
consistent between SDWIS/FED and the state file.  In this report, these proportions are presented
as percentages after being multiplied by 100. The proportion $P$ is estimated by

       $\hat{P} = \dfrac{\sum_{h=1}^{H}\sum_{a=1}^{n_h}\sum_{\beta=1}^{m_{ha}} W_h\, I_{ha\beta}}{\sum_{h=1}^{H}\sum_{a=1}^{n_h} m_{ha}\, W_h}$

where the sample weight $W_h = \dfrac{N_h}{n_h}$,

       $N_h$ = total number of clusters (systems) in stratum (system type) h, h = 1, ..., H,

       $n_h$ = number of sampled clusters in stratum h,

       $m_{ha}$ = number of data elements (reviewed actions) from cluster a in stratum h, and

       $I_{ha\beta}$ = 0 or 1 indicator for the βth data element from system a and stratum h,
       corresponding to a specific characteristic.

7 Future DVs can estimate deff using data from previous DVs and can incorporate the design effect into the sample
size calculation.
A simple illustration of the calculation procedure is presented here. Suppose there are three strata
(H=3), namely CWS, NTNCWS, and TNCWS, in State A. Also, suppose that the total number of
systems in each stratum (system type in State A) is 6, 9, and 15 (N1=6, N2=9, and N3=15),
respectively. Further, the number of sampled systems is 3 for each stratum (n1=n2=n3=3) and
there are three violations reviewed for accuracy from each sampled system (mha=3 for h=1,2,3
and a=1,2,3). Let Ihaβ be 1 if the violation was accurately reported to SDWIS/FED or 0 if the
violation was incorrectly reported. Suppose the compiled data are as shown in Table 3-1. The
proportion of violations accurately reported is estimated by the ratio of the sum of WhIhaβ to the
sum of mhaWh, which, in this case, is 55/90 = 0.6111, or 61.11%.
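A minimal sketch of this calculation (in Python; the report's computations were performed in
SAS) using the Table 3-1 data:

    # Stratum -> (N_h, per-system indicator lists I_haB) from Table 3-1.
    strata = {
        "CWS":    (6,  [[1, 0, 0], [1, 0, 1], [1, 1, 0]]),
        "NTNCWS": (9,  [[1, 0, 1], [1, 1, 0], [0, 0, 1]]),
        "TNCWS":  (15, [[0, 1, 1], [1, 0, 1], [1, 0, 1]]),
    }

    num = den = 0.0
    for N_h, systems in strata.values():
        W_h = N_h / len(systems)           # sample weight W_h = N_h / n_h
        for indicators in systems:         # each sampled system is a cluster
            num += W_h * sum(indicators)   # sum of W_h * I_haB
            den += W_h * len(indicators)   # sum of m_ha * W_h
    P_hat = num / den
    print(num, den, P_hat)                 # 55.0, 90.0, 0.6111...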

Sampling errors are also estimated for the proportion estimates. Sampling errors are measures of
the extent to which the values estimated from the sample (proportions in this analysis) differ
from the values that would be obtained from the entire population. Since there are inherent
differences among the members of any population, and data are not collected for the whole
population, the exact values of these differences for a particular sample are unknown.

To estimate the sampling errors, the Taylor series expansion method is applied. The Taylor series
expansion method is widely used to obtain robust variance estimators for complex survey data
with stratified, cluster sampling with unequal probabilities of selection. The Taylor series
obtains an approximation to a non-linear function. Applied to the variance of the proportion
estimate,

       $\widehat{Var}(\hat{P}) = \sum_{h=1}^{H} Var_h(\hat{P})$, where
       $Var_h(\hat{P}) = \dfrac{n_h(1-f_h)}{n_h-1} \sum_{a=1}^{n_h} \left(e_{ha} - \bar{e}_h\right)^2$,

       $e_{ha} = \dfrac{\sum_{\beta=1}^{m_{ha}} W_h\left(I_{ha\beta} - \hat{P}\right)}{\sum_{h=1}^{H}\sum_{a=1}^{n_h} m_{ha}\, W_h}$, and
       $\bar{e}_h = \dfrac{1}{n_h}\sum_{a=1}^{n_h} e_{ha}$,

with $f_h = n_h/N_h$ the sampling fraction in stratum h. With the sampling error, the margin of
error based on a 95 percent confidence interval is calculated as $t_{df,0.025}\sqrt{\widehat{Var}(\hat{P})}$,
where $t_{df,0.025}$ is the percentile of the t distribution with df degrees of freedom, df being
the number of clusters minus the number of strata.
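Continuing the sketch above (reusing strata, num, den, and P_hat from the earlier block; SAS
was used for the report's actual computations), the Taylor series variance and margin of error
can be computed as follows. The t percentile would come from a statistical table or, for
example, scipy.stats.t.ppf(0.975, df):

    import math

    var_total, n_clusters = 0.0, 0
    for N_h, systems in strata.values():
        n_h = len(systems)
        W_h = N_h / n_h
        f_h = n_h / N_h                    # stratum sampling fraction
        # Linearized residual e_ha for each sampled cluster (system).
        e = [W_h * sum(i - P_hat for i in ind) / den for ind in systems]
        e_bar = sum(e) / n_h
        var_total += n_h * (1 - f_h) / (n_h - 1) * sum((x - e_bar) ** 2 for x in e)
        n_clusters += n_h

    df = n_clusters - len(strata)          # clusters minus strata: 9 - 3 = 6
    t_975 = 2.447                          # t percentile for df = 6 (from a table)
    print(math.sqrt(var_total), t_975 * math.sqrt(var_total))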

                 Table 3-1: Example of Proportion Estimation Procedure

Stratum  Stratum   Number of  Number of  Weight      System   Total number   m_ha x W_h  Violations correctly   Sum of
index h  (State A) systems    sampled    W_h=N_h/n_h index a  of violations              reported? (Yes=1,      W_h x I_haB
                   N_h        systems                         m_ha                       No=0) I_ha1,I_ha2,I_ha3
                              n_h
1        CWS       6          3          2           1        3              6           1, 0, 0                2
                                                     2        3              6           1, 0, 1                4
                                                     3        3              6           1, 1, 0                4
2        NTNCWS    9          3          3           1        3              9           1, 0, 1                6
                                                     2        3              9           1, 1, 0                6
                                                     3        3              9           0, 0, 1                3
3        TNCWS     15         3          5           1        3              15          0, 1, 1                10
                                                     2        3              15          1, 0, 1                10
                                                     3        3              15          1, 0, 1                10
Total                                                                        90                                 55
In Section 4, various types of proportions of consistent, reported, and accurate data in
SDWIS/FED are calculated. These proportion estimates represent the data quality measures of
inventory, violation, and enforcement data in SDWIS/FED based on the DVs.

-------
4.     Results from the Analysis of Data Verifications

This section presents various proportion estimates for inventory, violation, and enforcement data.
Also, the margins of error are calculated for each point estimate. The margin of error is based on
a 95% confidence interval, which includes the true proportion with 95% confidence. All
calculations were performed using SAS®.

4.1    Analysis of Inventory Data

States are required to report eight inventory data elements to SDWIS/FED for grant eligibility.
These elements are 1) public water system identification number (PWS ID); 2) system status;
3) water system type; 4) primary source water type; 5) population served; 6) number of service
connections; 7) administrative contact address; and 8) water system name. The records for
population or service connections are considered to be consistent when there is less than a 10%
difference between the two records. Because the inventory data are analyzed at the system level,
the estimation approach can be based on stratified random sampling. The proportion of systems
for which the inventory elements were reported to SDWIS/FED without discrepancies, and its
sampling error, are then estimated as in Section 3.3, applied at the cluster (system) level (each
system contributing a single 0/1 indicator).
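As a minimal sketch of the consistency rule for the two numeric elements (the function name is
illustrative, and the report does not specify which record serves as the denominator, so the
larger value is assumed here):

    def consistent(state_value, sdwis_value, tolerance=0.10):
        # Population served and service connection counts are treated as
        # consistent when the records differ by less than 10 percent.
        if state_value == sdwis_value:
            return True
        baseline = max(abs(state_value), abs(sdwis_value))
        return abs(state_value - sdwis_value) / baseline < tolerance

    print(consistent(1000, 1080))   # True: 8% difference
    print(consistent(1000, 1150))   # False: 15% difference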

Inventory data quality of each data element is displayed in Table 4-1. The overall data quality of
the eight inventory (water system identification) parameters assessed was 87%.  In other words,
87% of systems from DV states between 2002 and 2004 had consistent data for all eight
inventory data elements between their state files and SDWIS/FED database, or 13% of systems
had at least one data element reported with a discrepancy. The highest discrepancy rate was for
the administrative contact address element.

-------
     Table 4-1: Percent of PWSs Reporting Grant Eligibility Inventory Data to SDWIS/FED
                                  without Discrepancy

Data Element                           Reported Data without Discrepancy
PWS ID                                 99.83% (+/-0.20%)
System status (active or inactive)     97.26% (+/-1.15%)
Water system type                      98.21% (+/-0.72%)
Primary source type                    99.31% (+/-0.35%)
Population served                      97.11% (+/-0.71%)
Number of service connections          96.22% (+/-0.93%)
Administrative contact address         95.97% (+/-1.32%)
PWS name                               99.87% (+/-0.09%)
All inventory data elements            87.4% (+/-1.94%)
4.2    Analysis of Violation Data

Federal regulations specify the outcomes that states must report to EPA as noncompliance
(violations) with (a) health-based drinking water quality maximum contaminant levels (MCLs)
and related requirements for their attainment; (b) specified monitoring and reporting (M/R)
requirements necessary to determine whether sampling, testing, and treatment process checking
occurred as stipulated in Federal regulations; and (c) health-based treatment techniques (TT) and
associated water system management processes for contaminants for which it is not
technologically or economically feasible to set an MCL.

Violation data are evaluated by comparing the following: 1) EPA's evaluation of the state's
compliance decision on the violations; 2) the assigned violations in the state files; and 3) the
violations reported to SDWIS/FED. All the findings from these comparisons can be grouped into
one of the categories shown in Table 4-2. The total number of violations identified during the
2002-2004 DVs is summarized below:

          •  Out of 198 TCR MCL violations, 163 violations were reported to SDWIS/FED.
          •  Out of 48 other (non-TCR) MCL violations, 21 violations were reported to
             SDWIS/FED.
          •  Out of 41 SWTR TT violations, 35 violations were reported  to SDWIS/FED.
          •  Out of 176 LCR TT violations, 5 violations were reported to SDWIS/FED.
          •  Out of 5,069 M/R violations, 1,589 violations were reported to SDWIS/FED.

The following measures of data quality of violation data in SDWIS/FED are evaluated:

   •   Completeness of SDWIS/FED describes how many violations that are required to be
       reported are being reported to SDWIS/FED, expressed as a percentage. This quantity is
       estimated based on the violations found by EPA and reported to SDWIS/FED (EPA=Yes
       and SDWIS/FED=Yes; 1, 4 from Table 4-2) out of all violations found by EPA
       (EPA=Yes; 1, 2, 3, 4 from Table 4-2).

   •   Non-reporting rate in SDWIS/FED describes how many violations that are required to
       be reported are not being reported to SDWIS/FED, expressed as a percentage. This
       percentage is the complement of the Completeness estimate, i.e., 100% - Completeness.

   •   Compliance Determination (CD) error rate in the non-reported violations describes
       how many non-reported violation data are the result of errors in states' compliance
       determinations (i.e., a violation was not reported because the state did not identify it as a
       violation), expressed as a percentage. This quantity is estimated based on the violations
       found by EPA but not reported to SDWIS/FED, where the assigned violation in the state
       file does not agree with EPA (EPA=Yes and SDWIS/FED=No and EPA≠State File;
       3 from Table 4-2), out of all violations found by EPA and not reported to SDWIS/FED
       (EPA=Yes and SDWIS/FED=No; 2 and 3 from Table 4-2).

   •   Data Flow (DF) error rate in the non-reported data describes how many non-reported
       violation data are the result of reporting problems from the state to SDWIS/FED,
       expressed as a percentage. This quantity is estimated based on the violations found by
       EPA but not reported to SDWIS/FED, where a violation was assigned in the state file
       (EPA=Yes, State File=Yes, and SDWIS/FED=No; 2a and 2b from Table 4-2), out of
       all violations found by EPA and not reported to SDWIS/FED (EPA=Yes and
       SDWIS/FED=No; 2 and 3 from Table 4-2).

   •   Accuracy of the data in SDWIS/FED describes how much of the violation data in
       SDWIS/FED are correct, expressed as a percentage. This quantity is estimated based on
       the violations found by EPA that agree with those reported to SDWIS/FED
       (EPA=SDWIS/FED; 1a, 1d, and 4a from Table 4-2) out of all violations reported to
       SDWIS/FED (SDWIS/FED=Yes; 1, 4, 5, 6 from Table 4-2).

   •   Compliance Determination (CD) error rate in SDWIS/FED describes how much of the
       violation data in SDWIS/FED are incorrect violation types as a result of errors in the
       state's compliance determination, expressed as a percentage. This quantity is estimated
       based on the violations found by EPA that disagree with those reported to SDWIS/FED
       but are missing in the state file (State File=No and EPA≠SDWIS/FED from Table 4-2),
       or the violations found by EPA that disagree with those found by the state, which were
       then reported to SDWIS/FED as found by the state (EPA≠State File=SDWIS/FED;
       1c and 4b8 from Table 4-2), out of all violations reported to SDWIS/FED
       (SDWIS/FED=Yes; 1, 4, 5, 6 from Table 4-2).

   •   Data Flow (DF) error rate in SDWIS/FED describes how many of the reported
       violation data are incorrect violation types due to reporting problems from the state to
       SDWIS/FED, expressed as a percentage. This quantity is estimated based on the
       violations found in the state files and confirmed by EPA but which disagree with those
       reported to SDWIS/FED (EPA=State File≠SDWIS/FED), or the violations found by EPA
       that disagree with those found by the state and with those reported to SDWIS/FED
       (EPA≠State File≠SDWIS/FED) (1b, 1e, and 4b9 from Table 4-2), out of all violations
       reported to SDWIS/FED (SDWIS/FED=Yes; 1, 4, 5, 6 from Table 4-2).

   •   False Positive rate of the violation data in SDWIS/FED describes how much of the
       reported violation data in SDWIS/FED are, in fact, false violations, expressed as a
       percentage. This quantity is estimated based on the violations not confirmed by EPA but
       reported to SDWIS/FED (EPA=No and SDWIS/FED=Yes; 5 and 6 from Table 4-2) out
       of all violations reported to SDWIS/FED (SDWIS/FED=Yes; 1, 4, 5, 6 from Table 4-2).

   •   Overall Data Quality Estimate in SDWIS/FED measures how many noncompliance
       determinations are correctly reported in SDWIS/FED among all noncompliance
       determinations (that are either violations or false-positive violations). This quantity is
       estimated based on the violations confirmed by EPA and correctly reported to
       SDWIS/FED (EPA=SDWIS/FED; 1a, 1d, and 4a from Table 4-2) out of all violations
       found by EPA, in the state files, or in SDWIS/FED (EPA=Yes or State File=Yes or
       SDWIS/FED=Yes; 1-6 from Table 4-2). When the false positive rate is 0%, this measure
       is the product of Completeness and Accuracy. These measures are illustrated in the
       sketch following this list.

8 If DV auditors determined it to be a CD error.
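As a minimal computational sketch of these definitions (in Python; the category counts below
are hypothetical and unweighted, whereas the report's estimates apply the Section 3.3 sample
weights to each reviewed action):

    # Hypothetical, unweighted counts by Table 4-2 category.
    counts = {"1a": 140, "1b": 5, "1c": 6, "1d": 8, "1e": 4,   # reported, found by EPA
              "2a": 7, "2b": 3, "3": 20,                       # found by EPA, not reported
              "4a": 10, "4b": 2,                               # reported, not in state file
              "5a": 4, "5b": 1, "6": 3}                        # not confirmed by EPA

    found_by_epa = sum(counts[k] for k in ("1a", "1b", "1c", "1d", "1e",
                                           "2a", "2b", "3", "4a", "4b"))
    reported     = sum(counts[k] for k in ("1a", "1b", "1c", "1d", "1e",
                                           "4a", "4b", "5a", "5b", "6"))
    not_reported = counts["2a"] + counts["2b"] + counts["3"]
    correct      = counts["1a"] + counts["1d"] + counts["4a"]   # EPA = SDWIS/FED

    completeness    = (found_by_epa - not_reported) / found_by_epa  # 1, 4 out of 1-4
    non_reporting   = 1.0 - completeness
    cd_error_nonrep = counts["3"] / not_reported                    # 3 out of 2-3
    df_error_nonrep = (counts["2a"] + counts["2b"]) / not_reported  # 2a, 2b out of 2-3
    accuracy        = correct / reported                            # 1a, 1d, 4a out of 1, 4, 5, 6
    false_positive  = (counts["5a"] + counts["5b"] + counts["6"]) / reported
    overall_dq      = correct / (found_by_epa + counts["5a"] + counts["5b"] + counts["6"])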

Since the DV states were not randomly selected, the states were treated as a fixed stratification
variable for this analysis. During the DVs, there were systems in the sample that did not have
any violations and did not require any reporting to SDWIS/FED. Thus, the actual number of
sample systems used for the calculations was less than the number of systems sampled for the
DVs. Furthermore, sub-domain analysis by rule or system type resulted in single-cluster strata
and/or single observations in some clusters. A single-cluster stratum does not contribute to the
calculation of variance estimates, which may underestimate the sampling errors. Therefore, the
strata were combined within each EPA region except for overall data quality estimations, where
the strata were combined within each DV state.

9 If DV auditors determined it to be a DF error.

-------
                    Table 4-2: Violation Data Comparison Categorization

Found by DV auditors: Yes.  Found in state file: Yes.  Reported to SDWIS/FED: Yes.

   1a. DV Auditors = State File = SDWIS/FED: No discrepancy in SDWIS/FED.
       Example: A TCR violation 3100-21(1) was found in the state file, confirmed by DV
       auditors, and correctly reported to SDWIS/FED.

   1b. DV Auditors = State File ≠ SDWIS/FED: Data Flow error.
       Example: A TCR violation record 3100-21 was found in the state file and confirmed by
       DV auditors; the violation was incorrectly reported to SDWIS/FED as 3100-22(2).

   1c. DV Auditors ≠ State File = SDWIS/FED: Compliance determination error.
       Example: A TCR violation record 3100-22 was found in the state file and reported to
       SDWIS/FED as 3100-22 when the violation should have been 3100-21.

   1d. DV Auditors = SDWIS/FED ≠ State File: No discrepancy in SDWIS/FED.
       Example: A TCR violation 3100-21 was reported to SDWIS/FED and confirmed by DV
       auditors, but the state issued 3100-22 in the file.

   1e. DV Auditors ≠ State File ≠ SDWIS/FED: Compliance determination error by the state
       and Data Flow error between the state file and SDWIS/FED.
       Example: A TCR violation record 3100-22 was found in the state file when it should
       have been 3100-21 according to DV auditors, while the violation was incorrectly
       reported to SDWIS/FED as 3100-23(3).

Found by DV auditors: Yes.  Found in state file: Yes.  Reported to SDWIS/FED: No.

   2a. DV Auditors = State File: Non-reporting; Data Flow error.
       Example: A TCR violation 3100-21 was found in the state file and confirmed by DV
       auditors, but not reported to SDWIS/FED.

   2b. DV Auditors ≠ State File: Non-reporting; Compliance determination error by the state;
       Data Flow error between the state file and SDWIS/FED.
       Example: A TCR violation 3100-22 was issued in the state file when it should have been
       3100-21, and the violation was not reported to SDWIS/FED.

Found by DV auditors: Yes.  Found in state file: No.  Reported to SDWIS/FED: No.

   3.  N/A: Non-reporting; Compliance determination error.
       Example: There should have been a TCR violation 3100-21, but the state did not issue a
       violation and did not report to SDWIS/FED.

Found by DV auditors: Yes.  Found in state file: No.  Reported to SDWIS/FED: Yes.

   4a. DV Auditors = SDWIS/FED: No discrepancy in SDWIS/FED.
       Example: There should have been a TCR violation 3100-21 issued in the state file, but
       the notice of violation (NOV) was not found in the state file, even though the violation
       was correctly reported to SDWIS/FED.

   4b. DV Auditors ≠ SDWIS/FED: Compliance determination error by the state and/or Data
       Flow error between the state file and SDWIS/FED.
       Example: There should have been a TCR violation 3100-21 issued in the state file, but
       the NOV was not found in the state file, while the violation was incorrectly reported to
       SDWIS/FED as 3100-22.

Found by DV auditors: No.  Found in state file: Yes.  Reported to SDWIS/FED: Yes.

   5a. State File = SDWIS/FED: False positive in SDWIS/FED.
       Example: A TCR violation 3100-21 was issued in the state file and reported to
       SDWIS/FED, but it should not have been a violation.

   5b. State File ≠ SDWIS/FED: False positive in SDWIS/FED.
       Example: A TCR violation 3100-21 was found in the state file, but the DV auditors
       concluded that there should not have been a violation in the first place. In addition, the
       state reported a different TCR violation type (3100-22) to SDWIS/FED.

Found by DV auditors: No.  Found in state file: No.  Reported to SDWIS/FED: Yes.

   6.  N/A: False positive in SDWIS/FED.
       Example: A TCR violation 3100 was reported to SDWIS/FED, but DV auditors
       concluded that there should not have been a violation in the first place. In addition, no
       evidence of a violation was found in the state files because the state rescinded the
       violation but has not removed it from SDWIS/FED.

(1) Acute TCR MCL violation.
(2) Monthly TCR MCL violation.
(3) Routine Major TCR Monitoring Violation.

-------
4.2.1   Results from 2002-2004 Data Verifications

The proportion estimates and the sampling errors for the violation DQE by violation types are
presented in Table 4-3. Eighty-one percent of the MCL and SWTR TT violations were reported
to SDWIS/FED. Seventy-four percent of the non-reported violations were due to compliance
determination errors and 26% were due to data flow errors. The reported violations in
SDWIS/FED were accurate at 94%. Overall, the DQE of the violation data was 77%. This means
that 77% of the noncompliance determinations on MCL/ SWTR TT standards were correctly
reported in SDWIS/FED.

Considering all health-based violations (MCL and TT violations, which include Lead and
Copper TT), 62 percent of the violations were reported to SDWIS/FED. This means that 38% of
the violations were not reported. Eighty-four (84) percent of the non-reported violations were
due to compliance determination errors and 16% were due to data flow errors.  The reported
violations in SDWIS/FED were accurate at 94%. Overall, the DQE of the health-based violation
data was 59%. This means that 59%  of the noncompliance determinations on all health-based
standards were correctly reported in  SDWIS/FED.

The quality of the health-based violations data was much lower than that of the MCL/SWTR TT
data because of the quality of data associated with the Lead and Copper Rule. The data quality
of the LCR TT violations was the lowest, at 7.6%: out of 176 LCR TT violations, only 5 were
reported to SDWIS/FED. The non-reporting was also mainly due to compliance determination
errors; specifically, 161 of the 171 non-reported violations were not recognized as violations
when they occurred.

Twenty-nine percent of the M/R violations were reported to  SDWIS/FED and 71% of the
violations were not reported. Ninety-two percent of the non-reported violations were due to
compliance determination errors and 8% were due to data flow errors.  The reported M/R
violations in SDWIS/FED were accurate at 88%. Overall, the DQE of the M/R violation data
was 27%, i.e., 27% of the noncompliance determinations on  M/R were correctly reported in
SDWIS/FED.
               Table 4-3: Data Quality Estimates (DQE) by Violation Type

                                    TCR MCL             OTHER MCL           TOTAL MCL           SWTR TT
% Completeness of SDWIS/FED         83.29% (+/-9.66%)   48.94% (+/-27.05%)  78.42% (+/-9.39%)   94.89% (+/-8.03%)
% Non-reporting on SDWIS/FED        16.71% (+/-9.66%)   51.06% (+/-27.05%)  21.58% (+/-9.39%)   5.11% (+/-8.03%)
% CD error on non-reported data     83.40% (+/-15.07%)  56.89% (+/-43.12%)  73.84% (+/-16.98%)  73.87% (+/-43.38%)
% DF error on non-reported data     16.60% (+/-15.07%)  43.11% (+/-43.12%)  26.16% (+/-16.98%)  26.87% (+/-43.38%)
% Accuracy of data in SDWIS/FED     96.65% (+/-2.62%)   79.22% (+/-20.56%)  94.91% (+/-3.00%)   91.07%
% CD error with data in SDWIS/FED   0%                  5.73%               0.57%               3.98% (+/-7.26%)
% DF error with data in SDWIS/FED   0%                  0%                  0%                  0%
% False positive data in SDWIS/FED  2.26%               15.05% (+/-17.96%)  4.52% (+/-2.90%)    4.95%
Overall data quality                80.95%              42.00% (+/-26.67%)  75.16%              86.63%

                                    MCL/SWTR TT         LCR TT              HEALTH-BASED        M/R
% Completeness of SDWIS/FED         81.33%              7.6% (+/-7.52%)     61.69%              29.02%
% Non-reporting on SDWIS/FED        18.67%              92.40% (+/-7.52%)   38.31%              70.98%
% CD error on non-reported data     73.84% (+/-16.30%)  91.76% (+/-11.80%)  84.45% (+/-10.35%)  92.03% (+/-1.75%)
% DF error on non-reported data     26.16% (+/-16.30%)  8.24% (+/-11.80%)   15.55% (+/-10.35%)  7.97% (+/-1.75%)
% Accuracy of data in SDWIS/FED     94.12%              100%                94.30%              88.35%
% CD error with data in SDWIS/FED   1.27% (+/-1.44%)    0%                  1.24%               3.18% (+/-1.54%)
% DF error with data in SDWIS/FED   0%                  0%                  0%                  0.99%
% False positive data in SDWIS/FED  4.60% (+/-2.90%)    0%                  2.79%               7.48%
Overall data quality                77.21% (+/-8.99%)   7.6% (+/-7.52%)     59.18%              27.08%

*CD = Compliance determination
*DF = Data flow

Note: TCR MCL + Other MCL = Total MCL; Total MCL + SWTR TT = MCL/SWTR TT; MCL/SWTR TT + LCR TT
= Health-Based Violations. M/R = monitoring and reporting violations.

-------
In general, the majority of non-reported data were due to compliance determination errors, i.e.,
the states did not issue violations when violations had occurred. The violations had not been
recognized or recorded by states as violations and, consequently, were not reported to
SDWIS/FED. We need to further examine the causes of such compliance determination errors.
These errors may be due to late reporting or rule interpretation discrepancies. Eliminating these
errors would significantly increase the completeness of the data in SDWIS/FED. For example,
84% of the non-reported health-based violations were due to compliance determination errors.
If these errors did not occur, the completeness of health-based violations in SDWIS/FED would
be 94% (62% + 38% x 84%). Similarly, the completeness of M/R violations would also be 94%
(29% + 71% x 92%).
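A minimal sketch of this projection (illustrative arithmetic only; the function name is
hypothetical):

    def completeness_without_cd_errors(completeness, cd_share_of_nonreported):
        # Current completeness plus the non-reported share attributable to
        # compliance determination errors.
        return completeness + (1.0 - completeness) * cd_share_of_nonreported

    print(completeness_without_cd_errors(0.62, 0.84))   # ~0.94 (health-based)
    print(completeness_without_cd_errors(0.29, 0.92))   # ~0.94 (M/R)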

The violation data are further evaluated by system type in Tables 4-4a-c.  The DQEs of
MCL/SWTR TT violations were not significantly different among the different system types.
Likewise, the DQEs of health-based violations were not significantly different between CWSs
and NTNCWSs. (The DQE of health-based violations for TNCWSs was not calculated since
LCR TT data were not collected for TNCWSs.)
          Table 4-4a: MCL/SWTR TT Violations Data Quality Estimates (DQE)
                              by Public Water System Type

                                     CWS                  NTNCWS               TNCWS
% COMPLETENESS OF SDWIS/FED          78.87% (+/-10.59%)   83.07% (+/-12.32%)   83.25% (+/-15.53%)
% NON-REPORTING IN SDWIS/FED         21.13% (+/-10.59%)   16.93% (+/-12.32%)   16.75% (+/-15.53%)
% CD ERROR ON NON-REPORTED DATA      61.24% (+/-25.44%)   49.15% (+/-35.34%)   96.21% (+/-6.14%)
% DF ERROR ON NON-REPORTED DATA      38.76% (+/-25.44%)   50.85% (+/-35.34%)   3.79% (+/-6.14%)
% ACCURACY OF DATA IN SDWIS/FED      93.15% (+/-4.35%)    83.42% (+/-13.55%)   97.37% (+/-3.68%)
% CD ERROR WITH DATA IN SDWIS/FED    1.06% (+/-1.47%)     6.62% (+/-10.75%)    0.29% (+/-0.60%)
% DF ERROR WITH DATA IN SDWIS/FED    0%                   0%                   0%
% FALSE POSITIVE DATA IN SDWIS/FED   5.80% (+/-4.13%)     9.96% (+/-11.71%)    2.34% (+/-3.60%)
OVERALL DATA QUALITY                 74.37% (+/-10.73%)   70.49% (+/-13.61%)   81.38% (+/-15.51%)

*CD = Compliance determination
*DF = Data Flow

   Table 4-4b: Health-Based Violations Data Quality Estimates (DQE)
                     by Public Water System Type

                                     CWS                  NTNCWS
% COMPLETENESS OF SDWIS/FED          53.39% (+/-12.17%)   40.86% (+/-15.19%)
% NON-REPORTING IN SDWIS/FED         46.61% (+/-12.17%)   59.14% (+/-15.19%)
% CD ERROR ON NON-REPORTED DATA      78.45% (+/-16.17%)   93.57% (+/-10.15%)
% DF ERROR ON NON-REPORTED DATA      21.55% (+/-16.17%)   6.57% (+/-10.15%)
% ACCURACY OF DATA IN SDWIS/FED      93.47% (+/-4.15%)    84.95% (+/-12.58%)
% CD ERROR WITH DATA IN SDWIS/FED    1.01% (+/-1.40%)     6.01% (+/-9.86%)
% DF ERROR WITH DATA IN SDWIS/FED    0%                   0%
% FALSE POSITIVE DATA IN SDWIS/FED   5.52% (+/-3.94%)     9.04% (+/-10.76%)
OVERALL DATA QUALITY                 51.22% (+/-11.75%)   36.67% (+/-13.02%)

*CD = Compliance determination
*DF = Data Flow

        Table 4-4c: M/R Violations Data Quality Estimates (DQE)
                     by Public Water System Type

                                     CWS                  NTNCWS               TNCWS
% COMPLETENESS OF SDWIS/FED          20.26% (+/-3.42%)    22.65% (+/-5.26%)    45.89% (+/-5.87%)
% NON-REPORTING IN SDWIS/FED         79.74% (+/-3.42%)    77.35% (+/-5.26%)    54.11% (+/-5.87%)
% CD ERROR ON NON-REPORTED DATA      91.62% (+/-2.36%)    87.05% (+/-5.07%)    96.87% (+/-2.31%)
% DF ERROR ON NON-REPORTED DATA      8.38% (+/-2.36%)     12.95% (+/-5.07%)    3.13% (+/-2.31%)
% ACCURACY OF DATA IN SDWIS/FED      82.10% (+/-4.96%)    81.39% (+/-7.93%)    94.81% (+/-2.71%)
% CD ERROR WITH DATA IN SDWIS/FED    5.25% (+/-2.84%)     4.10% (+/-2.70%)     1.83% (+/-1.99%)
% DF ERROR WITH DATA IN SDWIS/FED    0.73% (+/-0.74%)     1.45% (+/-2.15%)     0.60% (+/-0.59%)
% FALSE POSITIVE DATA IN SDWIS/FED   11.92% (+/-4.42%)    13.05% (+/-7.88%)    2.75% (+/-1.77%)
OVERALL DATA QUALITY                 18.38% (+/-3.12%)    20.51% (+/-4.77%)    44.17% (+/-5.79%)

*CD = Compliance determination
*DF = Data Flow

EPA has public notification (PN) requirements to ensure that the public is notified of violations
in a timely manner. The PN requirements define three tiers of notification based on the public
health significance of the violation, with Tier 1 being the most significant (see Appendix C for
the definitions of the PN tiers). The DQEs are also calculated by PN tier in Table 4-5.
Two-thirds of PN Tier 1 violations and less than two-thirds of PN Tier 2 violations were
reported to SDWIS/FED; there were no significant differences in DQEs between PN Tier 1 and
PN Tier 2. The DQEs for PN Tier 3, which consisted mostly of M/R violations, were
significantly lower than those for PN Tier 1 and PN Tier 2: only 30% of PN Tier 3 violations
were reported to SDWIS/FED. In all PN tier groups, the data in SDWIS/FED were highly
accurate. The overall data quality does not reflect false-positive violations in SDWIS/FED,
since they cannot be categorized into a PN tier.

                        Table 4-5: Data Quality (DQ) by PN Tier

                                     PN Tier 1            PN Tier 2            PN Tier 3
% COMPLETENESS OF SDWIS/FED          66.97% (+/-22.37%)   62.40% (+/-10.50%)   30.58% (+/-4.15%)
% NON-REPORTING IN SDWIS/FED         33.03% (+/-22.37%)   37.60% (+/-10.50%)   69.42% (+/-4.15%)
% CD ERROR ON NON-REPORTED DATA      32.77% (+/-23.93%)   87.21% (+/-9.04%)    91.44% (+/-1.87%)
% DF ERROR ON NON-REPORTED DATA      67.23% (+/-23.93%)   12.79% (+/-9.04%)    8.56% (+/-1.87%)
% ACCURACY OF DATA IN SDWIS/FED      100%                 98.65% (+/-1.52%)    95.46% (+/-1.82%)
% CD ERROR WITH DATA IN SDWIS/FED    0%                   1.35% (+/-1.52%)     3.46% (+/-1.68%)
% DF ERROR WITH DATA IN SDWIS/FED    0%                   0%                   1.08% (+/-0.58%)
OVERALL DATA QUALITY                 66.97% (+/-22.37%)   61.56% (+/-10.59%)   29.19% (+/-4.13%)

*CD = Compliance determination
*DF = Data Flow

4.2.2   Results from 1999-2001 Data Verifications

This section presents DQEs from the 1999-2001 data verification audits, recalculated using the
current statistical methodology described in Section 3.3. The states subject to DV audits during
1999-2001 are shown in Table 4-6. The DV results from Region 2 were not included in the
calculation because the state DV reports for those states were not finalized during the period of
this analysis. The DQEs are presented in Tables 4-7a and 4-7b. Because these estimates were
computed from a different set of DV states, in a different data quality assessment time frame,
and with a different statistical sample design, it is not scientifically valid to make a national
inference by directly comparing these results with those in Table 4-3. However, the DQEs from
those states that had repeated DV audits during both assessment periods are calculated and
compared in the following section.

             Table 4-6: States Subject to Data Verifications from 1999-2001

Region   States                          Region   States
1        MA, ME, NH                      6        AR, LA, NM, TX
2        NY, PR                          7        KS, MO, NE
3        VA, PA, DE                      8        MT, ND, UT
4        FL, GA, KY, MS, NC, SC, TN      9        HI, NV
5        IL, IN, OH, WI                  10       AK, ID, OR

Table 4-7b shows that 69% of the MCL and SWTR TT violations were reported to SDWIS/FED.
Seventy-nine percent of the non-reported violations were due to compliance determination errors
and 21% were due to data flow errors. The reported violations in SDWIS/FED were accurate at
91%. Overall, the DQE of the violation data was 64%. This means that 64% of the
noncompliance determinations on MCL/ SWTR TT standards were correctly reported in
SDWIS/FED.

      Table 4-7a: 1999-2001 Data Quality Estimates (DQE) for MCL and SWTR TT
      (95% confidence intervals shown in parentheses where reported)

                                     TCR MCL              Other MCL            Total MCL
% COMPLETENESS OF SDWIS/FED          76.71%               63.33% (+/-26.99%)   74.81%
% NON-REPORTING ON SDWIS/FED         23.29%               36.67% (+/-26.99%)   25.19%
% CD ERROR ON NON-REPORTED DATA      70.69%               68.55%               70.25%
% DF ERROR ON NON-REPORTED DATA      29.31%               31.45%               29.75%
% ACCURACY OF DATA IN SDWIS/FED      91.71% (+/-2.62%)    63.99% (+/-46.70%)   88.54%
% CD ERROR WITH DATA IN SDWIS/FED    1.61% (+/-2.96%)     36.01% (+/-46.70%)   5.54% (+/-7.82%)
% DF ERROR WITH DATA IN SDWIS/FED    0.64% (+/-1.26%)     0%                   0.56% (+/-7.82%)
% FALSE POSITIVE DATA IN SDWIS/FED   6.05% (+/-5.24%)     0%                   5.36%
OVERALL DATA QUALITY                 71.35%               40.52% (+/-28.53%)   67.14%

*CD = Compliance determination
*DF = Data Flow

    Table 4-7b: 1999-2001 Data Quality Estimates (DQE) for MCL/SWTR TT and MR
    (95% confidence intervals shown in parentheses where reported)

                                     SWTR TT              MCL/SWTR TT          MR
% COMPLETENESS OF SDWIS/FED          54.54% (+/-11.79%)   69.39% (+/-9.59%)    34.86% (+/-4.59%)
% NON-REPORTING ON SDWIS/FED         45.46% (+/-11.75%)   30.61% (+/-9.59%)    65.14% (+/-4.59%)
% CD ERROR ON NON-REPORTED DATA      92.43% (+/-13.88%)   79.05%               92.26% (+/-2.64%)
% DF ERROR ON NON-REPORTED DATA      7.57% (+/-13.88%)    20.95%               7.74% (+/-2.64%)
% ACCURACY OF DATA IN SDWIS/FED      100%                 90.84%               91.85% (+/-2.89%)
% CD ERROR WITH DATA IN SDWIS/FED    0%                   4.43%                1.08% (+/-0.93%)
% DF ERROR WITH DATA IN SDWIS/FED    0%                   0.45% (+/-0.9%)      0.20% (+/-0.25%)
% FALSE POSITIVE DATA IN SDWIS/FED   0%                   4.28%                6.87% (+/-2.58%)
OVERALL DATA QUALITY                 54.54% (+/-11.75%)   63.88% (+/-9.12%)    33.51% (+/-4.5%)

*CD = Compliance determination
*DF = Data Flow

Thirty-five percent of the M/R violations were reported to SDWIS/FED and 65% of the
violations were not reported. Ninety-two percent of the non-reported violations were due to
compliance determination errors and 8% were due to data flow errors.  The reported M/R
violations in SDWIS/FED were accurate at 92%. Overall, the DQE of the M/R violation data
was 33%, i.e., 33% of the noncompliance determinations on M/R were correctly reported in
SDWIS/FED.

4.2.3   Data Quality Estimates from 1999-2001 and 2002-2004

In order to evaluate the progress of data quality improvement, the DQEs from the states where
DV audits were conducted during the 1999-2001 data quality assessment period and again during
2002-2004 were calculated for comparison. The states with repeated DV audits in both
assessment periods can be identified from Table 2-1 and Table 4-6 and are listed in Table 4-8.
Since the LCR was not reviewed during the 1999-2001 DVs, LCR data were excluded from the
2002-2004 DV results for this evaluation.

The DQEs from these 18 states are presented in Tables 4-9a and 4-9b, which include point
estimates as well as the lower and upper bounds of 95% confidence intervals. A difference
between two DQEs is considered statistically significant only if the two confidence intervals,
defined by these lower and upper bounds, do not overlap. Sixty-seven percent of MCL/SWTR TT
violations, with a 95% confidence interval of (55%, 79%), were reported to SDWIS/FED during
1999-2001. Similarly, 80% of MCL/SWTR TT violations, with a 95% confidence interval of
(68%, 92%), were reported to SDWIS/FED during 2002-2004. Since the confidence intervals
overlap, there was no statistically
significant increase in the reporting of violations for these 18 states from 1999-2001 to
2002-2004. The overall data quality of MCL/SWTR TT violations was 64% with a 95% confidence
interval of (52%, 75%) during 1999-2001 and 75% with a 95% confidence interval of (64%, 87%)
during 2002-2004. Based on the confidence intervals, there was no statistically significant
increase in the overall data quality of MCL/SWTR TT violations for these 18 states from
1999-2001 to 2002-2004.

On the other hand, approximately 60% of SWTR TT violations, with a 95% confidence interval of
(44%, 76%), were reported to SDWIS/FED during 1999-2001. During 2002-2004, 93% of SWTR
TT violations, with a 95% confidence interval of (81%, 100%), were reported to SDWIS/FED.
Since these confidence intervals do not overlap, there was a statistically significant increase in
the reporting of violations for these 18 states from 1999-2001 to 2002-2004. However, the
accuracy of reported SWTR TT violations decreased significantly, from 100% to 78%. The
overall data quality of SWTR TT violations was 60% with a 95% confidence interval of
(44%, 76%) during 1999-2001 and 74% with a 95% confidence interval of (54%, 94%) during
2002-2004. Therefore, there was no statistically significant increase in the overall data quality
of SWTR TT violations for these 18 states from 1999-2001 to 2002-2004.

In general, the confidence intervals from the two periods overlap for all DQEs except the SWTR
TT completeness DQE. Therefore, there were no other statistically significant increases or
decreases in the DQEs for these states between the 1999-2001 and 2002-2004 assessments.
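
To illustrate the non-overlap rule used above, the sketch below checks two intervals for a shared
point. It is a minimal illustration only, not part of the statistical methodology in Section 3.3;
the example numbers are the completeness intervals quoted in this section.

    # Minimal sketch of the confidence-interval overlap test described above.
    def intervals_overlap(lo1, hi1, lo2, hi2):
        # True if [lo1, hi1] and [lo2, hi2] share at least one point.
        return max(lo1, lo2) <= min(hi1, hi2)

    # MCL/SWTR TT completeness: (55, 79) in 1999-2001 vs (68, 92) in 2002-2004.
    print(intervals_overlap(55, 79, 68, 92))    # True  -> not significant
    # SWTR TT completeness: (44, 76) vs (81, 100).
    print(intervals_overlap(44, 76, 81, 100))   # False -> significant increase
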
      Table 4-8: States Subject to Data Verifications during 1999-2001 and 2002-2004

Region   States                          Region   States
1        MA                              6        AR, NM, TX
2        -                               7        KS, NE
3        VA, PA                          8        UT
4        FL, KY, MS, NC, SC, TN          9        -
5        IL, OH                          10       AK, ID

              Table 4-9a: Data Quality Estimates (DQE) for MCL
from MA, VA, PA, FL, KY, MS, NC, SC, TN, IL, OH, AR, NM, TX, MO, UT, AK, ID
                      During 1999-2001 and 2002-2004
(Each cell shows the point estimate followed by the lower and upper bounds of the
95% confidence interval, in percent.)

TCR MCL
                                     1999-2001                2002-2004
% COMPLETENESS OF SDWIS/FED          70.13% (56.22, 84.03)    82.17% (67.46, 96.88)
% NON-REPORTING ON SDWIS/FED         29.87% (15.97, 43.78)    17.83% (3.12, 32.54)
% CD ERROR ON NON-REPORTED DATA      74.90% (51.93, 97.87)    83.54% (61.95, 100)
% DF ERROR ON NON-REPORTED DATA      25.10% (2.13, 48.07)     16.46% (0, 38.05)
% ACCURACY OF DATA IN SDWIS/FED      90.49% (81.28, 99.71)    96.01% (91.88, 100)
% CD ERROR WITH DATA IN SDWIS/FED    3.04% (0, 8.99)          0% (0, 0)
% DF ERROR WITH DATA IN SDWIS/FED    0% (0, 0)                0% (0, 0)
% FALSE POSITIVE DATA IN SDWIS/FED   6.47% (0, 13.54)         3.99% (0, 8.12)
OVERALL DATA QUALITY                 64.71% (50.90, 78.52)    79.46% (65.09, 93.82)

Other MCL
                                     1999-2001                2002-2004
% COMPLETENESS OF SDWIS/FED          74.57% (48.25, 100)      60.74% (30.84, 90.64)
% NON-REPORTING ON SDWIS/FED         25.43% (0, 51.75)        39.26% (9.36, 69.16)
% CD ERROR ON NON-REPORTED DATA      58.02% (12.32, 100)      21.14% (0, 50.34)
% DF ERROR ON NON-REPORTED DATA      41.98% (0, 87.68)        78.86% (49.66, 100)
% ACCURACY OF DATA IN SDWIS/FED      100% (100, 100)          82.84% (59.65, 100)
% CD ERROR WITH DATA IN SDWIS/FED    0% (0, 0)                0% (0, 0)
% DF ERROR WITH DATA IN SDWIS/FED    0% (0, 0)                0% (0, 0)
% FALSE POSITIVE DATA IN SDWIS/FED   0% (0, 0)                17.16% (0, 40.35)
OVERALL DATA QUALITY                 74.57% (48.25, 100)      53.95% (24.49, 83.42)

Total MCL
                                     1999-2001                2002-2004
% COMPLETENESS OF SDWIS/FED          70.63% (57.87, 83.39)    79.17% (65.95, 92.39)
% NON-REPORTING ON SDWIS/FED         29.37% (16.61, 42.13)    20.83% (7.61, 34.05)
% CD ERROR ON NON-REPORTED DATA      73.25% (51.66, 94.85)    67.09% (38.77, 95.41)
% DF ERROR ON NON-REPORTED DATA      26.75% (5.16, 48.34)     32.91% (4.59, 61.23)
% ACCURACY OF DATA IN SDWIS/FED      91.56% (83.40, 99.72)    94.40% (89.88, 98.92)
% CD ERROR WITH DATA IN SDWIS/FED    2.70% (0, 7.98)          0% (0, 0)
% DF ERROR WITH DATA IN SDWIS/FED    0% (0, 0)                0% (0, 0)
% FALSE POSITIVE DATA IN SDWIS/FED   5.74% (0, 12.00)         5.60% (1.08, 10.12)
OVERALL DATA QUALITY                 65.78% (53.02, 78.54)    75.62% (62.79, 88.46)

*CD = Compliance determination
*DF = Data Flow

Table 4-9b: Data Quality Estimates (DQE) for SWTR TT, MCL/SWTR TT, and MR
from MA, VA, PA, FL, KY, MS, NC, SC, TN, IL, OH, AR, NM, TX, MO, UT, AK, ID
                      During 1999-2001 and 2002-2004
(Each cell shows the point estimate followed by the lower and upper bounds of the
95% confidence interval, in percent.)

SWTR TT
                                     1999-2001                2002-2004
% COMPLETENESS OF SDWIS/FED          59.95% (43.68, 76.21)    93.30% (80.83, 100)
% NON-REPORTING ON SDWIS/FED         40.05% (23.79, 56.32)    6.70% (0, 19.16)
% CD ERROR ON NON-REPORTED DATA      93.23% (76.38, 100)      89.09% (62.11, 100)
% DF ERROR ON NON-REPORTED DATA      6.75% (0, 23.62)         10.91% (0, 37.89)
% ACCURACY OF DATA IN SDWIS/FED      100% (100, 100)          78.78% (60.28, 97.28)
% CD ERROR WITH DATA IN SDWIS/FED    0% (0, 0)                17.01% (0, 39.27)
% DF ERROR WITH DATA IN SDWIS/FED    0% (0, 0)                0% (0, 0)
% FALSE POSITIVE DATA IN SDWIS/FED   0% (0, 0)                4.21% (0, 10.00)
OVERALL DATA QUALITY                 59.95% (43.68, 76.21)    73.71% (53.90, 93.52)

MCL/SWTR TT
                                     1999-2001                2002-2004
% COMPLETENESS OF SDWIS/FED          66.73% (54.74, 78.72)    80.36% (68.10, 92.62)
% NON-REPORTING ON SDWIS/FED         33.27% (21.28, 45.26)    19.64% (7.38, 31.90)
% CD ERROR ON NON-REPORTED DATA      82.04% (63.88, 100)      67.72% (40.40, 95.05)
% DF ERROR ON NON-REPORTED DATA      17.96% (0, 36.12)        32.28% (4.95, 59.60)
% ACCURACY OF DATA IN SDWIS/FED      94.22% (88.26, 100)      92.90% (88.24, 97.55)
% CD ERROR WITH DATA IN SDWIS/FED    1.85% (0, 5.53)          1.63% (0, 4.11)
% DF ERROR WITH DATA IN SDWIS/FED    0% (0, 0)                0% (0, 0)
% FALSE POSITIVE DATA IN SDWIS/FED   3.93% (0, 8.44)          5.57% (0.35, 9.58)
OVERALL DATA QUALITY                 63.71% (52.70, 74.71)    75.46% (63.60, 87.33)

MR
                                     1999-2001                2002-2004
% COMPLETENESS OF SDWIS/FED          37.74% (31.69, 43.79)    28.06% (23.82, 32.31)
% NON-REPORTING ON SDWIS/FED         62.26% (56.21, 68.31)    71.94% (67.69, 76.18)
% CD ERROR ON NON-REPORTED DATA      91.63% (88.91, 94.34)    93.59% (91.60, 95.57)
% DF ERROR ON NON-REPORTED DATA      8.37% (5.66, 11.09)      6.41% (4.43, 8.40)
% ACCURACY OF DATA IN SDWIS/FED      92.01% (88.90, 95.11)    92.20% (89.33, 95.08)
% CD ERROR WITH DATA IN SDWIS/FED    1.44% (0.13, 2.75)       2.01% (0.37, 3.66)
% DF ERROR WITH DATA IN SDWIS/FED    0.28% (0, 0.63)          0.64% (0.05, 1.23)
% FALSE POSITIVE DATA IN SDWIS/FED   6.27% (3.57, 8.98)       5.15% (2.89, 7.40)
OVERALL DATA QUALITY                 36.14% (30.19, 42.08)    26.87% (22.71, 31.03)

*CD = Compliance determination
*DF = Data Flow

4.3    Analysis of Enforcement Data

Federal regulations specify the conditions under which enforcement actions must be taken with a
PWS to ensure public health protection if the system is in violation of the Federal-State drinking
water program. States must report a subset of these actions to EPA. EPA reports these data for
situations where EPA is the enforcement authority because the state has decided not to obtain
approval to implement the federal program (e.g., Wyoming, the District of Columbia, and Indian
lands).

Enforcement data reported to SDWIS were compared to those found in the state files during the
DV. The proportion of enforcement data in the state files that were in agreement with those
reported to SDWIS/FED (1a, 1c, and 5a from Table 4-2) was estimated as described in Section
3.3 and is presented in Table 4-8. The overall DQE for enforcement data was 86%.

      Table 4-8: Proportion Estimates of Enforcement Data in State Files Reported to
                            SDWIS/FED without Discrepancy

PWS Type     Proportion Estimate
CWS          73.14% (+/-9.65%)
NTNCWS       76.25% (+/-6.88%)
TNCWS        94.92% (+/-2.72%)
Overall      85.97% (+/-3.62%)

5.     Analysis of Timeliness of Violation Reporting in SDWIS/FED

This section presents the results of an analysis of the data in SDWIS/FED. The analysis
evaluates the timeliness of violation reporting based on the compliance period end date, which
provides a benchmark for comparison between fiscal years. Violations are due to be reported by
the end of the quarter following awareness of the violation or the compliance period end date.

Timeliness is calculated as the ratio of the number of violations reported on time to the baseline
number of violations that should have been reported, i.e.,

     Timeliness = (Number of Violations Reported on Time) / (Baseline Number of Violations Reported)

where "on time" means by the end of the quarter following the compliance period end date, and
the baseline is measured at a point in time in the future (in this case, between 4 and 7 quarters
after the violations are due to be reported). In other words, timeliness is the proportion of
eventually reported violations that were reported to SDWIS/FED on time.
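
As a worked example, the following sketch applies this definition to the FY2004 M/R figures
reported later in Table 5-2. It is illustrative only and is not part of the analysis procedure.

    # Timeliness = violations reported on time / baseline violations reported.
    def timeliness(on_time, baseline):
        return on_time / baseline

    # FY2004 M/R violations (Table 5-2): 32,742 on time of a 104,427 baseline.
    print(f"{timeliness(32_742, 104_427):.0%}")  # prints 31%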

To compute the timeliness, the violation data were extracted from archived SDWIS/FED
databases for each of five fiscal years (2000-2004). The violations were then grouped by PWS

ID, fiscal year, quarter, violation code, contaminant code, and basic PWS attributes, and the on-
time and baseline violations were summed. Table 5-1 shows the database extracted for the
analysis. The database does not include LCR or other violations with open-ended compliance
period end dates. For these violations, the compliance period end date is open until the system
returns to compliance.
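
A sketch of this grouping step appears below. The file name and column names are hypothetical
(the extracts came from archived SDWIS/FED tables, not a CSV); the sketch only illustrates the
group-and-sum logic described above, assuming the data have been exported to a flat file.

    # Hypothetical sketch of the grouping step; file and column names are assumed.
    import pandas as pd

    viol = pd.read_csv("archived_violations.csv")  # assumed export of the archive
    grouped = (
        viol.groupby(["pws_id", "fiscal_year", "quarter",
                      "violation_code", "contaminant_code"])
            [["on_time", "baseline"]]
            .sum()
    )
    # Timeliness per group: on-time count divided by baseline count.
    grouped["timeliness"] = grouped["on_time"] / grouped["baseline"]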

                Table 5-1: SDWIS/FED Database Analyzed for Timeliness

FY2000 on time (baseline: 01Q4 tables, archived 1/02)
  Quarter   Violations with end dates between   Archive date
  00Q1      10/1/99 to 12/31/99                 4/00
  00Q2      1/1/00 to 3/31/00                   7/00
  00Q3      4/1/00 to 6/30/00                   10/00
  00Q4      7/1/00 to 9/30/00                   1/01

FY2001 on time (baseline: 02Q4 tables, archived 1/03)
  Quarter   Violations with end dates between   Archive date
  01Q1      10/1/00 to 12/31/00                 4/01
  01Q2      1/1/01 to 3/31/01                   7/01
  01Q3      4/1/01 to 6/30/01                   10/01
  01Q4      7/1/01 to 9/30/01                   1/02

FY2002 on time (baseline: 03Q4 tables, archived 1/04)
  Quarter   Violations with end dates between   Archive date
  02Q1      10/1/01 to 12/31/01                 4/02
  02Q2      1/1/02 to 3/31/02                   7/02
  02Q3      4/1/02 to 6/30/02                   10/02
  02Q4      7/1/02 to 9/30/02                   1/03

FY2003 on time (baseline: 04Q4 tables, archived 1/05)
  Quarter   Violations with end dates between   Archive date
  03Q1      10/1/02 to 12/31/02                 4/03
  03Q2      1/1/03 to 3/31/03                   7/03
  03Q3      4/1/03 to 6/30/03                   10/03
  03Q4      7/1/03 to 9/30/03                   1/04

FY2004 on time (baseline: 05Q4 tables, archived 1/06)
  Quarter   Violations with end dates between   Archive date
  04Q1      10/1/03 to 12/31/03                 4/04
  04Q2      1/1/04 to 3/31/04                   7/04
  04Q3      4/1/04 to 6/30/04                   10/04
  04Q4      7/1/04 to 9/30/04                   1/05

       Table 5-2: Violation Reporting Timeliness to SDWIS/FED by Violation Type

                               FY2000    FY2001    FY2002    FY2003    FY2004
Number of Violations Reported on Time
  TCR MCL                       7,738     8,114     7,977     7,902     7,421
  Other MCL                       727       652       771     1,106     1,273
  SWTR TT                         932       918     1,045       774       540
  Health-Based Violations13     9,397     9,684     9,793     9,831     9,308
  M/R                          49,782    50,868    55,425    61,967    32,742
Number of Violations Reported for Baseline
  TCR MCL                      11,445    10,963    10,795    10,821    10,510
  Other MCL                     1,344     1,315     1,844     2,573     3,716
  SWTR TT                       1,574     1,627     1,585     1,252       932
  Health-Based Violations13    14,636    13,905    14,369    14,996    15,513
  M/R                          93,231   111,397   121,819   106,664   104,427
Percent Timeliness
  TCR MCL                         68%       74%       74%       73%       71%
  Other MCL                       54%       50%       42%       43%       34%
  SWTR TT                         59%       56%       66%       62%       58%
  Health-Based Violations13       65%       70%       68%       66%       60%
  M/R                             53%       46%       45%       58%       31%

13 These health-based violations do not include Lead and Copper Treatment Technology (LCR TT)
violations because they have open-ended compliance period end dates.

Table 5-2 shows the computed timeliness of the reported violations in SDWIS/FED. Late
reporting can affect the reliability of SDWIS/FED in informing the public and stakeholders about
the quality of their drinking water. Further, it hinders our ability to assess public health risk and
to address violations with enforcement actions in a timely manner. In 2004, 60% of the
health-based violations were reported on time, while only 31% of the M/R violations were
reported on time; the latter is a 27-percentage-point decline in timeliness for M/R violations
from 2003 (58%).

Additional information (in the form of pivot tables), which provides further detail on the
timeliness of violation reporting across several additional attributes, is available from EPA upon
request. Additional findings based on this information are the following:

•      Timeliness of reported health-based violations was similar across water system types.

•      Timeliness of reported monitoring violations was highest for TNCWSs, at 58%, and
       lowest for NTNCWSs, at 33%.

•      Timeliness was similar across quarters.

•      Timeliness generally decreased as system size decreased.

•      It was difficult to evaluate the timeliness of reported violations for new rules, because
       many of the violations under these rules have open-ended compliance period end dates.

6.     Conclusion

For the 38 states evaluated from 2002 to 2004, 90% of the reported violations in SDWIS/FED
were accurate. Approximately 81% of the MCL and SWTR TT violations were reported to
SDWIS/FED. Sixty-two percent of the health-based violations (including LCR TT violations)
and 29% of the monitoring and reporting violations were reported. Non-reporting was mostly
attributable to the fact that states did not issue violations when violations had occurred. In other
words, the violations were not recognized or recorded by the states as violations and,
consequently, were not reported to SDWIS/FED. Eighty-four percent of non-reported health-based
violations and 92% of non-reported M/R violations were due to compliance determination errors.

EPA considers non-reported violations to be a serious problem that could have public health
implications at many levels. Analyses based on such incomplete data in SDWIS/FED
compromise our ability to determine if and when we need to take action against non-compliant
systems, to oversee and evaluate the effectiveness of state and federal programs and regulations,
to alleviate burden on states, and to determine whether new regulations are needed to further
protect public health. Further, our ability to respond to public inquiries and to prepare national
reports on the quality of drinking water in a thorough and complete manner will be severely
limited.

Some of the discrepancies between the number of violations that should have appeared in
SDWIS/FED and those found by the DV auditors could reflect differences in rule interpretation,
in light of the flexibility provided to states in implementing rules under state primacy
agreements. State implementation of rules must be at least as stringent as the Federal
regulations but can differ in substantial respects within a reasonable scope of the regulation. It
is critical that EPA and the states continue to work together toward reducing non-reporting,
reporting errors, and late reporting of violations.

Additional findings include the following: the DQEs of health-based violations were not
significantly different between CWSs and NTNCWSs, and the DQEs of M/R violations for
TNCWSs were significantly higher than those for CWSs and NTNCWSs.

Further, the DQEs from 18 states where the DV audits were conducted during the data quality
assessment period of 1999-2001 and again during 2002-2004 were calculated for the purpose of
comparison. For those states, 67% of MCL/SWTR TT violations with a 95% confidence interval
(55%, 79%) were reported to SDWIS/FED during 1999-2001.  Similarly, 80% of MCL/SWTR
TT violations with a 95% confidence interval of (68%, 92%) were reported to SDWIS/FED during
2002-2004. Since the confidence intervals overlap, there was no statistically significant increase
in the reporting of violations for these 18 states from 1999-2001 to 2002-2004. The overall data
quality of MCL/SWTR TT violations was 64% with a 95% confidence interval of (52%, 75%)
during 1999-2001 and 75% with a 95% confidence interval of (64%, 87%) during 2002-2004.
Based on the confidence intervals, there was no statistically significant increase in the overall
data quality of MCL/SWTR TT violations for these 18 states from 1999-2001 to 2002-2004.

Finally, in 2004, 60% of the health-based violations and approximately 30% of the M/R
violations were reported to SDWIS/FED on time.

7.     Data Reliability Improvement Action Plan

Based on this analysis and on the results of previous efforts, EPA, working with its state co-
regulators through the Association of State Drinking Water Administrators, has developed a
Data Reliability Improvement Action Plan ("the plan") designed to achieve a data quality goal of
90 percent complete and accurate data for health-based violation reporting. The plan covers the
years 2007 through 2009 and addresses improving data quality for monitoring and reporting
violations and inventory (water systems' facilities) data. Principally, the plan focuses on actions
that EPA and states can take to address compliance determination issues and thereby improve
violation data quality. Progress toward accomplishment of the data quality goal will be
measured annually and assessed in 2009.  The plan appears in Appendix A.
8.     Future Analysis of Data Reliability

Several factors will change both the process and the results of the data verifications and the data
quality calculation for drinking water data. In the near term, the selection of states for DVs will
be based on probability sampling, beginning in 2005. Specifically, the selection of states for the
data verifications from 2005 to 2007 will be based on a probability sampling method, with every
state being selected within a 4-year time frame. This will allow data quality to be assessed
nationally over rolling multi-year periods. In the longer term (2008 and beyond), EPA is
evaluating the feasibility of electronic data verification (EDV), which would collect and evaluate
compliance sample results for regulated contaminants electronically for all CWSs. EPA believes
that the most cost-effective and complete process for evaluating data quality in the long term
may be the EDV process. In each state, we could evaluate the data once every one or two years
through the compliance determination processes recorded in the SDWIS/STATE software.
SDWIS/STATE is already designed and developed for states to manage their drinking water
programs. The advantages of this approach are that the software already exists and that all
compliance determinations are available for evaluation. The current DV process relies on a
sample of systems, and due to the inherently small number of large CWSs, large CWSs are not
well represented in the samples. EDV will allow us to use all systems instead of relying on a
sample from a DV. Additionally, the drinking water administrators in decentralized states can
have hands-on data in one location instead of going to regional drinking water offices. All states
using SDWIS/STATE will have the capability to calculate data quality in near real time and take
action on issues as they arise. Furthermore, EDV will allow states and EPA to reduce and
reallocate time and resources spent on manual data reviews while providing a more complete
picture of program implementation and leading to the identification of opportunities for program
improvement.
      Appendix A: 2006 Data Reliability Improvement Action Plan

Introduction

The past two Data Reliability Improvement Action Plans have drawn attention to actions
that can be taken to improve data quality and the usability of SDWIS/FED data. While
those plans focused largely on information system improvements and general activities
intended to improve data quality, this 2006 plan builds on current findings for more
recent data, and on capabilities not previously available, to concentrate on specific
factors that could result in real-time data quality improvements.

The philosophy of past data reliability improvement action plans was largely built on the
concept that we must improve the software of the information system, SDWIS/FED.
This has largely been done, with the last remaining step to be completed in 2007 with
SDWIS/STATE Web Release 2. This release fully web-enables SDWIS/STATE,
reducing the resources states need to implement the software and reducing the
complexity of data entry with fewer data entry screens and more drop-down lists. This
2007-2009 Data Reliability Improvement Action Plan primarily focuses on the actions of
those responsible for determining which data will be entered and how that will occur.
The largest challenge is ensuring that all data reflecting determinations of violations are
entered into SDWIS/FED, the federal database.

As indicated in this report, EPA found that 77 percent of all data on MCL/SWTR TT
violations in SDWIS/FED were complete and accurate. This is not satisfactory and needs
to be improved; it has also been the focus of media attention concerning the reliability of
the data used to make decisions about the nation's most important public health program
for safeguarding its citizens' water supply. To make a larger step forward over the next
three years (2007-2009), EPA and ASDWA in October 2006 set a data quality goal of 90
percent (completeness and accuracy) for future compliance reporting of health-based
violations in the federal database, SDWIS/FED. This plan is principally focused on
achieving that goal. Based on past analyses of state-specific results, eleven states have
already achieved this level of data quality for health-based violations, indicating that the
goal is achievable. The plan also addresses improving the data quality of monitoring and
reporting violations and inventory data; that is, improving the quality of all data used to
support the state and national drinking water programs. The plan is presented as a series
of issues and plan elements with assigned responsibilities and timeframes.
Issues

(1) Modify Data Verification Selection Processes: EPA continues to conduct triennial
    data quality analysis and to follow up on data verification by working with states to
    address identified differences and discrepancies from federal regulations. In the
    2005-2007 timeframe, EPA implemented probability-based selection (random
    selection) of states for data verification to enhance the representativeness of the data
    at the national level. The data quality results for these data will provide an indication
    of the extent of achievement of the 90 percent data quality goal set by EPA and
    ASDWA.  Consistent with the August 2005 recommendation of the special ASDWA-
    EPA Data Quality Subcommittee, the quality of results in the national database,
    SDWIS/FED, will be displayed by rule and significance (i.e., public notification tier).

(2)  Consider All Compliance Determinations in State Data Quality: In evaluating
    SDWIS/FED data quality, EPA only considers data in the national database and not
    in the state databases reflecting all compliance determinations resulting from the
    states' position as the primary enforcement authority for the federal program. EPA
    will develop an "electronic  data verification" (eDV) tool to enable states to track any
    discrepancies of their compliance determinations relative to federal regulations and
    correct these discrepancies prior to data quality calculations and allow calculation of
    data quality relative to all compliance determinations. EPA should augment its
    SDWIS/STATE software to allow states to obtain management reports on any
    discrepancies in state compliance determination in near-real time to allow for the
    possibility of improving health-based response and data quality.

(3)  Use Electronic Scheduling and Lab Reporting: Using automated monitoring
    requirements/schedule generators and incorporating electronic reporting from
    laboratories to states would improve the quality of data that states receive from water
    systems. Anecdotal information suggests that when states issue automated
    monitoring schedules to water systems, on-time monitoring and reporting by  those
    systems improves.  This step increases the probability that all data will be used by the
    state in determining compliance with public health drinking water standards and that
    appropriate determinations are made. Additionally, when states receive monitoring
    data electronically, data entry errors are reduced.  This  second  step helps ensure that
    the correct data are used in the decision process for determining compliance.  Water
    system or laboratory submission of data to states must comply  with the Cross-Media
    Electronic Reporting Rule (CROMERR), compliance with which will need to be
    considered in any effort to facilitate electronic reporting from laboratories to  states.

(4)  Consider Data Management early in Rule Development: Data management concerns
    should be considered during every phase of the rule development process, beginning
    with the initial rule concept. If this does not occur, rules with complex reporting
    requirements may emerge, overwhelming the capability of states to implement them
    and shifting valuable resources from taking actions on real health needs to reporting.
    Data management using electronic reporting can simplify data handling, but it does not
    necessarily mean a simpler process for protecting health, and it should not be used as a
    "crutch" for creating complex rules instead of focusing on simple, direct health
    management objectives for drinking water supply protection.
    Streamlined approaches to data management in states' business processes must be
    considered in rule development.

(5)  Improve State Capability in Compliance Determination: Data  reliability, as reported
    in the Triennial Data Reliability Report, appears to have marginally improved, even
    though this is not statistically significant.  State compliance determinations play an
    integral role in determining the reliability of the data on violations reported to the
    national database, SDWIS/FED.  Incorrect compliance determinations, when they do
    occur, are due in part to the complexity and number of drinking water rules. The need
    for training to facilitate correct determinations is critical, especially with the
    changing nature of state staff available to implement the drinking water regulations.
    Incorrect compliance determinations are a serious matter as they may affect public
    health.

(6)  Complete SDWIS Modernization: EPA should continue implementation of the
    OGWDW Information Strategic Plan to modernize and web-enable SDWIS/STATE
    to take advantage of newer technologies and system platforms. This action will save
    state resources by allowing data entry from anywhere in the state with web access,
    and will reduce data entry time with fewer screens and more drop-down lists. State
    deployment of SDWIS/STATE Web Releases 1 and 2 will take
    time because of different schedules and variation of available resources among states.
    For states using SDWIS/STATE, full use of all SDWIS/STATE modules and regular
    update of inventory data will facilitate improved data quality.

(7)  Evaluate Low Timeliness of Violation Reporting: Violation reporting timeliness is
    low and not improving.  Because the states have been taking steps to improve data
    quality and the calculation of data quality considers results which may be 3 to 5 years
    old in some cases, estimates of reporting timeliness may not be current.  EPA should
    use the reported results from the first year of using the modernized data flow to re-
    evaluate timeliness for each rule, as recommended by the Data Sharing Committee.

(8)  Update Out-of-date and missing Inventory Data: Key features of inventory data
    useful in examining compliance and for determining regulatory needs are not
    routinely updated and reported.  For example, consecutive systems or treatment
    objectives for recent rules are inventory data that are not reported for each system to
    which they apply. As a result, EPA cannot conduct analyses of national capability to
    treat certain contaminants.  Inventory data for grant eligibility are routinely reported
    for the purposes of ensuring adequate data for receiving grants.

2006 Drinking Water Data Reliability Improvement Action Plan

Element (1): Modify Data Verification Selection Processes
Description: EPA will calculate data quality with data from the 2005-2007 data verifications
from the random selection of states and display the results by rule and public notification tier.
Activity (a): EPA will calculate data quality with data from 2005-2007 from the random
selection of states and display by rule and public notification tier.
  (1) EPA - Calculate national estimate of data quality for health-based violations and
      separately for monitoring and reporting violations and inventory data for 2005-2007.
      [Completion: December 2008]
  (2) EPA - Calculate state estimates of data quality for all health-based compliance
      determinations and separately for all monitoring and reporting compliance
      determinations for 2005-2007. [Completion: December 2008]
  (3) EPA - Report data quality by rule and public notification tier for 2005-2007 using data
      verification results. [Completion: December 2008]

Element (2): Evaluate All Compliance Determinations
Description: Develop a tool to allow states to identify compliance determination discrepancies
from federal regulations more easily.
Activity (a): EPA will develop an "electronic data verification" (eDV) tool to enable states to
track any discrepancies of their compliance determinations relative to federal regulations and
correct these discrepancies prior to data quality calculations, and to allow calculation of data
quality relative to all compliance determinations.
  (1) EPA & States - Complete pilot test of eDV tool. [Completion: December 2008]
Activity (b): States will agree to provide contaminant occurrence and monitoring schedule data
to EPA to allow the Agency to conduct electronic data verification for all rules across all water
systems in a state, retrospectively, on an annual basis but not less frequently than every three
years, to allow regular assessment of data quality and to identify opportunities for state program
improvement.
  (1) EPA & States - EPA requests, and states provide, contaminant occurrence and schedule
      data for all water systems from at least nine states for testing the eDV tool.
      [Completion: July 2007]
  (2) EPA & States - Complete data sharing agreements for contaminant occurrence and
      monitoring schedule data. [Completion: December 2008]
  (3) EPA & States - EPA will receive state contaminant occurrence and schedule data for all
      water systems from all states through completion of a data sharing agreement.
      [Completion: Annually, beginning 2009]

Element (3): Use Automated Scheduling and Electronic Lab Reporting
Description: States and EPA will take steps to more fully utilize automated technology to
improve reporting of water system data to states.
Activity (a): States will utilize automated scheduling of water system monitoring to the extent
possible and report on progress in on-time monitoring and reporting by water systems at the
ASDWA-EPA Data Management Users Conference.
  (1) States - Report progress on state automated scheduling of system monitoring.
      [Completion: Annually - May 2007, May 2009, May 2010]
Activity (b): EPA will develop an electronic tool (a "lab-to-state" reporting tool) to allow
laboratories testing drinking water samples to report to states, rather than submitting paper
reports on monitoring results.
  (1) EPA - Develop "lab-to-state" reporting tool. [Completion: March 2007] [Status: Done]
Activity (c): The EPA Office of Ground Water and Drinking Water will work with the EPA
Office of Environmental Information (OEI) to incorporate CROMERR requirements in the
"lab-to-state" reporting tool and work toward OEI approval of the tool.
  (1) EPA - Review and approval of CROMERR compliance of the "lab-to-state" reporting
      tool. [Completion: August 2007] [Status: Done]
Activity (d): States not using the EPA-developed "lab-to-state" electronic reporting tool will
identify and use a similar tool.
  (1) States - Replace paper lab reports for compliance monitoring with automated lab
      reporting. [Completion: Ongoing through December 2009]

Element (4): Consider Data Management Early in Rule Development
Description: Implement a process to address data management in rule development.
Activity (a): EPA information systems staff will participate in early rule development through
preparation of issue papers on data management for each future rule and share these papers for
comment with states through the ASDWA-EPA Data Management Steering Committee.
  (1) EPA & States - Information systems staff participate in rule development.
      [Completion: Ongoing] [Status: Ongoing; completed issue paper on TCR/Distribution
      System reporting for DSMC input]
Activity (b): States will identify staff and participate in discussions of future rules to ensure that
business processes are considered.
  (1) States - Staff identified for participation in rules to consider state business processes.
      [Completion: Ongoing] [Status: Ongoing]
Activity (c): ASDWA and EPA will work toward agreement on a mutual generic timeline for
considering data management in rule development.
  (1) ASDWA & EPA - Reach agreement on generic timeline for including data management
      in rule development. [Completion: December 2007]

Element (5): Improve State Capability in Compliance Determination
Description: EPA Regions, to ensure that data reliability improvement (including
implementation of EPA Order 5360.1.A2) is included in annual agreements with states, will
work with states to identify the specific reasons for discrepancies in compliance determinations
and to identify training needs among states to facilitate the capability to make correct
determinations.
Activity (a): EPA Headquarters will develop an electronic data verification tool to allow EPA
Regions to compare the results of all state compliance determinations to the violation data
reported to EPA in SDWIS/FED.
  (1) EPA HQ & States - Complete testing of eDV tool. [Completion: September 2008]
  (2) EPA Regions & States - Use eDV tool to check compliance determinations and take
      appropriate action. [Completion: Ongoing beginning in 2009; quarterly check and take
      action]
Activity (b): EPA Regions will ensure that data reliability improvement steps are included in all
agreements and work plans with states and will identify specific reasons for discrepancies,
including non-reporting, of state determinations with federal regulations.
  (1) EPA Regions & States - Incorporate data reliability improvement steps in state-EPA
      agreements and state work plans. [Completion: Annually]
Activity (c): States will identify compliance determination training needs to EPA Regions.
  (1) States - Identify compliance determination training needs. [Completion: Annually]
Activity (d): EPA Headquarters will develop and provide capability for training on compliance
determination for states.
  (1) EPA HQ - Compliance determination training completed; revision underway.
      [Completion: Ongoing]

Element (6): Complete SDWIS Modernization
Description: Complete modernization, web-enablement, and deployment of SDWIS/STATE
Web Release 2; facilitate fuller use of SDWIS/STATE among states choosing to use it, and
regular update of inventory data, to improve data quality.
Activity (a): Development of fully web-enabled SDWIS/STATE and facilitation of fuller use of
the software for state program management.
  (1) EPA HQ - Develop SDWIS/STATE Web Release 2. [Completion: October 2007]
  (2) EPA Regions - Promote full state use of SDWIS/STATE software through state
      agreements. [Completion: Annually]
Activity (b): Deployment of web-enabled SDWIS/STATE with planned fuller use of modules by
states using SDWIS/STATE and update of inventory data.
  (1) States - Deploy SDWIS/STATE Web Release 2. [Completion: Beginning October 2007]
  (2) EPA Regions & States using SDWIS/STATE - Agree to steps toward fuller use of
      SDWIS/STATE in agreements and work plans. [Completion: Annually (or as
      appropriate)]
  (3) EPA HQ & States - Conduct workshop on SDWIS/STATE Web Release 2.
      [Completion: Summer 2008]

Element (7): Evaluate Low Timeliness of Violation Reporting
Description: Evaluate timeliness by rule with data reported to the modernized SDWIS/FED for
2006.
Activity (a): Evaluate timeliness by rule with data reported to the modernized SDWIS/FED for
2006.
  (1) Data Sharing Committee - Perform timeliness analysis in 2008 once all violation data
      are reported and processed; make recommendation to DMSC. [Completion: 2007]

Element (8): Update Out-of-date and Missing Inventory Data
Description: Evaluate regulatory requirements to determine the appropriate inventory reporting
relating the applicability of rules to systems, set a priority on the data needed, and work with
states to update the inventory data routinely reported to EPA.
Activity (a): Evaluate regulatory requirements to determine the appropriate inventory reporting
relating the applicability of rules to systems, set a priority on the data needed, and work with
states to update the inventory data routinely reported to EPA.
  (1) Data Sharing Committee - Evaluate inventory reporting and propose a priority on data
      to be updated. [Completion: 2008]

Appendix B: Violations Addressed by Data Verification (DV)

Code  Violation Name                               Type
1     MCL, Single Sample                           MCL
2     MCL, Average                                 MCL
3     Monitoring, Regular                          MR
4     Monitoring, Check/Repeat/Confirmation        MR
5     Notification, State                          Other
6     Notification, Public                         Other
7     Treatment Techniques                         Other
8     Variance/Exemption/Other Compliance          Other
9     Record Keeping                               Other
10    Operations Report                            Other
11    Non-Acute MRDL                               MRDL
12    Treatment Technique No Certif. Operator      TT
13    Acute MRDL                                   MRDL
21    MCL, Acute (TCR)                             MCL
22    MCL, Monthly (TCR)                           MCL
23    Monitoring, Routine Major (TCR)              MR
24    Monitoring, Routine Minor (TCR)              MR
25    Monitoring, Repeat Major (TCR)               MR
26    Monitoring, Repeat Minor (TCR)               MR
27    Monitoring and Reporting Stage 1             MR
28    Sanitary Survey (TCR)                        Other
29    M&R Filter Profile/CPE Failure               MR
31    Monitoring, Routine/Repeat (SWTR-Unfilt)     MR
36    Monitoring, Routine/Repeat (SWTR-Filter)     MR
37    Treatment Technique State Prior Approval     TT
38    M&R Filter Turbidity Reporting               MR
39    M&R (FBRR)                                   MR
40    Treatment Technique (FBRR)                   TT
41    Treatment Technique (SWTR)                   TT
42    Failure to Filter (SWTR)                     TT
43    Treatment Technique Exceeds Turb 1 NTU       TT
44    Treatment Technique Exceeds Turb 0.3 NTU     TT
46    Treatment Technique Precursor Removal        TT
47    Treatment Technique Uncovered Reservoir      TT
51    Initial Tap Sampling for Pb and Cu           MR
52    Follow-up and Routine Tap Sampling           MR
53    Initial Water Quality Parameter WQP M&R      MR
54    Follow-up & Routine E.P. WQP M&R (deleted)   MR
55    Follow-up & Routine Tap WQP M&R (deleted)    MR
56    Initial, Follow-up, or Routine SOWT M&R      MR
57    OCCT Study Recommendation                    TT
58    OCCT Installation/Demonstration              TT
59    WQP Entry Point Noncompliance                TT
60    WQP Entry Point Noncompliance (deleted)      TT
61    SOWT Recommendation (deleted)                TT
62    SOWT Installation (deleted)                  TT
63    MPL Noncompliance                            TT
64    Lead Service Line Replacement (LSLR)         TT
65    Public Education                             TT
71    CCR Complete Failure to Report               Other
72    CCR Inadequate Reporting                     Other
75    PN Violation for NPDWR Violation             Other
76    Other Non-NPDWR Potential Health Risks       Other

Applicable rules and contaminant codes (CCodes) covered by the DV include:
  SWTR 0200; IESWTR 0300; FBR 0500;
  DBP 0400, 0999, 1006/08/09/11, 2456, 2920, 2950; TTHM (pre-'02) 2941/42/43/44, 2950;
  TCR 3100 (including sanitary surveys);
  VOCs 2378/80 and 2955/64/68/69/76/77/79/80/81/82/83/84/85/87/89/90/91/92/96;
  SOCs 2005/10/15/20/31/32/33/34/35/36/37/39/40/41/42/43*/44*/46/47*/50/51/63/65/67,
    2105/10, 2274/98, 2306/26/83/88/90/92/94/96/98, 2400, 2931/46/59;
  other IOCs 1005/10/15/20/24/25/35/36*/45/74/75/85/94; nitrates/nitrites 1038, 1040, 1041;
  radionuclides 4000/06/10, 4100/01/02/74; LCR 1022, 1030, 5000; CCR 7000; PN 7500.
  * codes required for monitoring only

         Appendix C: Definition of Public Notification (PN) Tiers

Tier 1: Violations and Other Situations Requiring Notice Within 24 Hours

    1.  Violation of the MCL for total coliform, when fecal coliform or E. coli are
      present in the water distribution system, or failure to test for fecal coliform or E.
      coli when any repeat sample tests positive for coliform

    2.  Violation of the MCL for nitrate, nitrite, or total nitrate and nitrite, or when a
      confirmation sample is not taken within 24 hours of the system's receipt of the
      first sample showing exceedance of the nitrate or nitrite MCL

    3.  Exceedance of the nitrate MCL (10 mg/L) by non-community water systems,
       where permitted to exceed the MCL (up to 20 mg/L) by the primacy agency

    4.  Violation of the MRDL for chlorine dioxide, when one or more of the samples
       taken in the distribution system on the day after an exceedance of the MRDL at
       the entrance of the distribution system also exceed the MRDL, or when required
       samples are not taken in the distribution system

    5.  Violation of the turbidity MCL of 5 NTU, where the primacy agency determines
       after consultation that a Tier 1 notice is required, or where consultation does not
       occur within 24 hours after the system learns of the violation

    6.  Violation of the treatment technique requirement resulting from a single
       exceedance of the maximum allowable turbidity limit, where the primacy agency
       determines after consultation that a Tier 1 notice is required, or where
       consultation does not take place within 24 hours after the system learns of the violation

    7.  Occurrence of a waterborne disease outbreak, as defined in 40 CFR 141.2, or
      other waterborne emergency

    8.  Other violations or situations with significant potential to have serious adverse
       effects on human health as a result of short-term exposure, as determined by the
       primacy agency either in its regulations or on a case-by-case basis
Tier 2: Violations Requiring Notice Within 30 Days

    1.  All violations of the MCL, MRDL, and treatment technique requirements except
      where Tier 1 notice is required

    2.  Violations of the monitoring requirements where the primacy agency determines
      that a Tier 2 public notice is required, taking into account potential health impacts
      and persistence of the violation
   3.   Failure to comply with the terms and conditions of any variance or exemption in
       place
Tier 3:  Violations and Other Situations Requiring Notice Within 1 Year

    1.  Monitoring violations, except where Tier 1 notice is required or the primacy
       agency determines that the violation requires a Tier 2 notice

    2.  Failure to comply with an established testing procedure, except where Tier 1
       notice is required or the primacy agency determines that the violation requires a
       Tier 2 notice

    3.  Operation under a variance granted under §1415 or an exemption granted under
        §1416 of the Safe Drinking Water Act

    4.  Availability of unregulated contaminant monitoring results

    5.  Exceedance of the secondary maximum contaminant level for fluoride