APPENDIX A:

EPA ENTERPRISE ARCHITECTURE CORE COMPONENTS
Office of Water
EPA 816-R-04-001A
January 2004
www.epa.gov/safewater


The Exchange Network

The Exchange Network is the Environmental Protection Agency's (EPA) proposed approach
for the exchange of environmental data among EPA, states, and other parties with whom EPA
and states exchange information. The Exchange Network "vision" is to promote access to
and exchange of quality environmental data while reducing reporting burden and increasing
the efficiency of data exchanges between Exchange Network partners - the parties that
officially participate in the Exchange Network. During the early Exchange Network
implementation phase, "Exchange Network Partners" will include EPA, states, tribes, and
territories. In the future, the term "Exchange Network Partners" will likely include other
governmental and possibly non-governmental parties. The Exchange Network will gradually
replace the traditional approach to information exchange that requires states to feed data
directly into multiple EPA national program systems. These arrangements often vary from
state to state, region to region, and program to program. The Exchange Network will also
facilitate transparent and secure data exchanges that support specific analyses, such as the use
of indicators for measuring environmental results. While Exchange Network participation is
voluntary, EPA and states expect participation in the Exchange Network to become the
preferred method for routine intergovernmental transfers of environmental data.

The Exchange Network consists of both technical and organizational frameworks. The
organizational framework consists of the decision-making and operational structures for
building, maintaining, using, and evolving the Exchange Network. The technical framework
encompasses the hardware, software, protocols, and related technical decisions needed
for Exchange Network implementation.

The Exchange Network uses the Internet and Internet-based protocols to standardize and
streamline the information exchange process, and consists of nodes that support the exchange
of data among Exchange Network Partners. The data exchanged through these nodes will be
formatted according to agreed-upon, standardized Data Exchange Templates (DETs) that rely
on common, Internet-based protocols. The DETs depend on data standards that represent
documented agreements on quality, consistency, formats, and definitions of commonly shared
data. The suite of DETs will be compiled and tracked in the Exchange Network
Registry/Repository.

The data exchanges among Exchange Network partners are governed by Trading Partner
Agreements (TPAs). TPAs specify the appropriate DETs and explicitly define the quality,
timeliness, and format of the data. Initially, "Exchange Network partners" will include EPA,
states, tribes, and territories. In the future, the term "Exchange Network Partners" is likely to
include other governmental and possibly non-governmental parties.

As of December 2002, two states, Nebraska and Mississippi, had entered into TPAs with
EPA. TPAs and DETs are discussed further below.

-------
Exchange Network Rationale

The Exchange Network is both a strategic and collaborative approach between EPA and its
Exchange Network partners intended to address the following trends in environmental
information management:

•   Growing Complexity and Volume of Data - As the scale and complexity of environmental
    challenges (and their associated data) grow, environmental managers will collect, assess, and
    securely exchange more data.

•   Evolving State/EPA Roles - The devolution of environmental management from the federal
    to the state and local levels, and the attempts to use more "integrative" or "adaptive"
    management approaches, has dramatically broadened the universe of data and data exchange.

•   Increased Need for Integration - Integrated environmental management requires integrated
    environmental information and nearly always requires information integrated across media,
    program areas, and geographic, political, and organizational boundaries.

•   Growth of the Internet and e-Commerce - The Internet and its associated technologies are
    transforming information management approaches. They are also increasing public
    expectations for data access and presenting information security issues of a new magnitude.

On an individual basis, EPA's partners are responding to these trends by making major
investments in their internal, often integrated, systems. As part of these investments, many
states have been supplementing their use of EPA national systems. EPA is currently
developing its first Agency-wide Enterprise Architecture, and SDWIS is being used as a pilot
program data system to help establish the options, costs, and benefits of integrating with the
Enterprise Architecture. While individual Partner efforts are important, there is also a need
for a clear vision or framework for how Partners' systems will interoperate and collaborate.
Such collaboration, and the data flows that support it, are essential to meeting current and
future environmental challenges.

In the future, managing these interchanges on a system-by-system, program-by-program basis
will not be sufficient to respond to the identified trends and needs in information management.
Without a common framework, it is likely that individual Partners will build better and faster,
but incompatible, systems, and a tremendous opportunity will have been lost.

The Exchange Network is intended to be that framework, providing the vision of how these
systems will work together.

EPA's Node - Central Data Exchange (CDX)

The EPA has established a single portal on the Web for environmental data entering the Agency.
The Central Data Exchange (CDX) offers companies, states, tribes and other entities a faster,
easier, more secure reporting option. CDX provides built-in data quality checks, web forms,
standard file formats, and a common, user-friendly approach to reporting data across vastly
different environmental programs. A cornerstone of EPA's e-government initiative, CDX
currently accepts data for certain air, water, waste and toxics programs and will gradually expand
to support all Agency environmental reporting by 2004. Although its current focus is electronic,
CDX will eventually incorporate a facility that centralizes paper data collections as well. CDX is
part of a broader effort by states and EPA working together to build an Exchange Network to
integrate state and federal environmental data, reduce the burden of reporting, and improve data
quality.

CDX benefits to reporting entities include:
•   Reducing their reporting burden and associated costs.
•   Enabling automated, machine-to-machine transactions, eliminating tedious paper forms and
    redundant data entry.
•   Ensuring a secure electronic environment.
•   Improving data quality through built-in edit and data quality checks.
•   Offering faster, easier click-and-send reporting with one consistent point of entry for
    reporting, one streamlined set of procedures, and one password.
•   Confirming EPA's receipt of their data.
•   Translating and distributing incoming data to the appropriate data system.

CDX benefits to EPA include:
•   Centralizing receipt, security, user authentication, archiving, translation, distribution and
    related user support services for incoming data.
•   Eliminating redundant infrastructure and its associated cost.
•   Enabling the Agency to streamline and simplify compliance reporting for everyone.
•   Establishing EPA's presence on the Exchange Network.
•   Laying the groundwork for future data integration and quality improvement efforts with the
    states.

Existing Partner Information Systems

Exchange Network Nodes use two kinds of software to interact with back-end systems.
Middleware maps the location, type, and format of data in the back-end systems with the type
and format required for the XML schema. Database connectivity tools communicate between
the Middleware and the database that houses the partner's data.

Partners will map their existing data to the agreed-upon data exchange templates (DETs)
using their Middleware product. Mapping consists of identifying the location of the data in
the back-end database, defining the format of the stored data, and defining the format of the
output data (XML schema). Once the source data and the output data have been defined, the
Middleware translates from source to output and back.
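
The mapping step described above can be sketched in Python. This is only an illustration under
stated assumptions, not any Partner's actual Middleware: the back-end column names, the XML
element names (`Facility`, `FacilityName`, and so on), and the sample record are all hypothetical
stand-ins for a real database and a real DET.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: back-end column name -> (XML element name, Python type).
# A real mapping would be derived from the agreed-upon DET, not from this sketch.
FIELD_MAP = {
    "fac_id": ("FacilityIdentifier", str),
    "fac_nm": ("FacilityName", str),
    "pop_served": ("PopulationServedCount", int),
}


def row_to_xml(row, root_tag="Facility"):
    """Translate one back-end record (a dict) into an XML string."""
    root = ET.Element(root_tag)
    for column, (tag, _to_type) in FIELD_MAP.items():
        child = ET.SubElement(root, tag)
        child.text = str(row[column])
    return ET.tostring(root, encoding="unicode")


def xml_to_row(xml_text):
    """Translate the XML output back into a back-end record."""
    root = ET.fromstring(xml_text)
    row = {}
    for column, (tag, to_type) in FIELD_MAP.items():
        row[column] = to_type(root.findtext(tag))
    return row


# Hypothetical sample record, for illustration only.
record = {"fac_id": "XX0000001", "fac_nm": "Example Water Works", "pop_served": 2500}
document = row_to_xml(record)
print(document)
print(xml_to_row(document) == record)
```

Keeping the mapping in one table means the same definition drives both directions of the
translation, which is the "source to output and back" behavior described in the text.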

Partners can participate in the network regardless of their existing system architecture using these
standard tools. Stand-alone databases, data warehouses, integrated databases, and enterprise
integrated systems all can be connected to nodes.  While it will be easier to connect a smaller
number of systems to the node, any stable system that serves as a source of quality data can be
used.

Exchange Network partners that have integrated their systems already will be especially well
positioned for these connections. However, given the incremental nature of both integration
and flow development, it is likely that most partners will connect a number of
(non-integrated) systems to their nodes.

Existing technical architecture will determine the specific approach Exchange Network
partners will take when connecting their nodes to their existing systems. Update schedules for
databases and warehouses, back-up schedules, and quality control timing will
all influence how and when nodes can access data. While logically straightforward, mapping
the Middleware to the existing systems is not trivial; it will require planning and staff time.

Data Standards

Data standards are the documented agreements on data formats and definitions of common
data. Data standards are especially important tools for data integration and exchange because
they allow data from many compliant sources to be integrated. The benefits of data standards
are even greater for Exchange Network partners because they reduce ambiguity of the
information contained in DETs at the most rigorous level possible. Standards are especially
important for large-scale integration and aggregation efforts such as those performed by EPA.

Data Exchange Templates

DETs describe and enforce the format and specific restrictions, where applicable, of the data
being exchanged across the Exchange Network. Specifically, DETs are either XML Document
Type Definitions (DTDs) or XML schemas. Exchange Network implementation requires not only
that these DETs be developed and used, but also that their development and coordination be
harmonized to ensure compatibility across network flows. DETs will continue to be developed as
new data standards arise and existing standards are improved. Used together, the data standards
and DETs will provide partners with powerful tools for data access and integration.
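
The idea of a template that "describes and enforces" format can be illustrated with a simplified
structural check in Python. This is a stand-in, not real DTD or XML schema validation, which would
be done by a validating parser against the actual DET. The template below, its element names, and
its rules are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified "template": required child element -> value check.
# A real DET is an XML DTD or XML schema, not a Python dictionary.
TEMPLATE = {
    "FacilityIdentifier": lambda text: len(text) > 0,
    "PopulationServedCount": lambda text: text.isdigit(),
}


def check_against_template(xml_text, template=TEMPLATE):
    """Return a list of problems; an empty list means the document conforms."""
    root = ET.fromstring(xml_text)
    problems = []
    for tag, is_valid in template.items():
        value = root.findtext(tag)
        if value is None:
            problems.append(f"missing element: {tag}")
        elif not is_valid(value.strip()):
            problems.append(f"invalid value for {tag}: {value!r}")
    return problems


good = (
    "<Facility><FacilityIdentifier>XX0000001</FacilityIdentifier>"
    "<PopulationServedCount>2500</PopulationServedCount></Facility>"
)
bad = "<Facility><PopulationServedCount>many</PopulationServedCount></Facility>"

print(check_against_template(good))
print(check_against_template(bad))
```

The first call reports no problems; the second reports the missing identifier and the
non-numeric count, which is the kind of restriction a DET would enforce on a flow.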

Trading Partner Agreements

Trading Partner Agreements (TPAs) are documents that Exchange Network partners agree
upon for each flow. They define what flow(s) are exchanged, define the stewardship and
security expectations, and specify additional technical details for the exchange of information
among two or more Exchange Network partners. A TPA is, or can be defined as, a
stand-alone document, an addendum or supplement to an existing agreement, or part of an existing
agreement. If existing agreements and their amendments satisfy the minimum set of elements
that document the content and process of a data flow, then a separate, stand-alone document
is not required. For the purposes of this Plan, all such agreements are called TPAs.

Exchange Network partners will need to develop at least a basic internal strategy for
managing multiple TPAs across programs and with various offices and agencies. The
strategy should address priorities for Exchange Network flows to be documented in TPAs,
resource and staffing issues, and implications for current business and management processes
associated with data exchange.

Stewardship

The flow of quality data is fundamental to the Exchange Network. The concept of
stewardship refers to the responsibility for this data quality on the Exchange Network.  Data
partners will take responsibility for the data they place on the Exchange Network and for their
interactions with the Exchange Network itself.  These responsibilities will be spelled out in
Trading Partner Agreements.  The concept of stewardship is involved in each of the
components of the Exchange Network.  Two of the most important of these are Data
Stewardship and Node Stewardship.

Data Stewardship - By agreeing to host and exchange data and information, each trading
partner on the Exchange Network assumes and accepts certain data stewardship
responsibilities:
•   Assuring that responsibilities for data quality and integrity are clearly defined and
    understood inside the organization.
•   Assuring that data source, derivation and accuracy meet specifications.
•   Assuring that data formats and units of measure meet specifications.
•   Assuring that any other relevant data or metadata meet the specifications in the TPA.

Node Stewardship - Each partner, whether state, tribal, or federal, will be the steward of its
own node, making sure that it functions properly and that the data available complies with
agreed-upon terms:
•   Assuring that the hardware and software that create, manage, store and provide access to
    the data work properly.
•   Assuring that the data transmitted and received is complete.
•   Assuring that the data transmitted and received comply with agreed-upon formats and
    time schedules.
•   Assuring that data has not been altered.
•   Assuring that confidential and sensitive data has not been intercepted.
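
Two of the node stewardship checks above, completeness and absence of alteration, can be
sketched in Python. The source does not say how nodes perform these checks; the record count
and SHA-256 digest used here are assumptions chosen for illustration, and the function names
are hypothetical.

```python
import hashlib


def package_flow(records):
    """Sender side: bundle records with a count and a SHA-256 digest.

    Assumes each record is a single line of text with no embedded newlines.
    """
    payload = "\n".join(records).encode("utf-8")
    return {
        "payload": payload,
        "record_count": len(records),
        "sha256": hashlib.sha256(payload).hexdigest(),
    }


def verify_flow(package):
    """Receiver side: confirm the payload is complete and unaltered."""
    payload = package["payload"]
    received = payload.decode("utf-8").split("\n") if payload else []
    complete = len(received) == package["record_count"]
    unaltered = hashlib.sha256(payload).hexdigest() == package["sha256"]
    return complete and unaltered


flow = package_flow(["XX0000001,2500", "XX0000002,120"])
print(verify_flow(flow))
```

A digest of this kind detects accidental corruption. It does not by itself protect against
interception or deliberate tampering, which would require encryption and authenticated
signatures agreed upon by the partners.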

System of Registries

The System of Registries (SoR) is a centralized data registry that provides an authoritative
source of information critical to data integration and exchange between EPA and its partners.
The SoR supports the Agency's environmental information network by uniquely identifying
objects of interest to EPA, including information resources, facilities, chemicals, biological
organisms, and data elements. The SoR provides the means for coordinating the
management, access, and use of EPA's core registry systems. Information is accessed
through several registry systems, including the Information Resource Registry System (IRRS),
the Environmental Data Registry (EDR), the Terminology Reference System (TRS), the
Substance Registry System (SRS), the Chemical Registry System (CRS), and the Biology
Registry System (BioRS). The SoR also links to the Facility Registry System (FRS) and the
Environmental Information Management System (EIMS), two additional EPA core registry
systems. Some of the key registries include:

The Facility Registry System (FRS) - The FRS is a centrally managed database developed
by EPA's Office of Environmental Information (OEI). It provides Internet access to a single
source of comprehensive information about facilities subject to environmental regulations or
of environmental interest.  The FRS contains accurate and authoritative facility identification
records, which are subjected to rigorous verification and data management quality assurance
procedures. FRS records are continuously reviewed and enhanced by a Regional Data
Steward network and active state partners. The facility records are based on information
from EPA's national program systems and state master facility records and enhanced by other
Web information sources. The Central Data Exchange (CDX) registration, when fully
implemented, will also be used to create and update facility identification records. As of July
2002, FRS has over 1,133,484 unique facility records linking to over 1,497,987 program
interests.

The Facility Registry System also includes locational information that provides accurate
mapping of the facilities regulated by EPA.

In terms of benefits, the FRS will:

•   Reduce the long-term reporting burden for facilities, states and programs.
•   Improve data quality by helping to reduce errors in state and Agency facility information.
•   Provide better tools for cross-media environmental analysis.
•   Provide better public access to the Agency's environmental information.
•   Give facilities the flexibility to review and update their identification information.

The Office of Ground Water and Drinking Water has listed all of its community water systems in
the FRS.

The Substance Registry System (SRS) - Chemicals, Biological Organisms, and Miscellaneous
Substances - SRS serves as the nucleus for linking information about substances regulated by
EPA. The SRS search page includes queries for substances (such as chemicals,
organisms, and physical characteristics) in EPA regulations, data systems, and other
information resources.

The Chemical Registry (CRS) - Chemicals with Corresponding Information Resources -
CRS provides information on chemical substances and how they are represented in the EPA
regulations and information systems. The CRS search page includes queries for chemicals by
common identifiers.

The Biological Registry (BioRS) - Biological Organisms with Corresponding Information
Resources - BioRS provides information on biological entities and how they are represented
in the EPA regulations and information systems. The BioRS Search page includes queries for
biological organisms.

The Environmental Data Registry (EDR) - The EDR is a comprehensive, authoritative
reference for information about the definition, source, and use of environmental data. The
EDR supports the creation and implementation of data standards that are designed to promote
the efficient sharing of environmental information among EPA, states, tribes, and other
information trading partners. The EDR also catalogs data elements in application systems.
The EDR does not contain environmental data - it provides descriptive information to make
the data more meaningful.

Exchange Network Registry/Repository - The Exchange Network Registry/Repository is a
website that serves as the official record and location for the Exchange Network's DETs. The
Registry/Repository will also store other Exchange Network documents such as TPAs.
Trading partners will depend upon the Registry/Repository to access the templates to validate
flows they receive and properly structure flows they are sending. The Registry/Repository
will be used both manually by users to get copies of DETs for implementation, and
automatically as nodes request DET information "on-the-fly" during the process of a data
exchange. In addition, the Registry/Repository will be used to indicate the status of DETs,
including their compliance with applicable standards, their acceptance by EPA, and other
information. The Registry/Repository will also provide an ideal way for parties interested in
similar DETs to become aware of each other.
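
The "on-the-fly" lookup and status tracking described above can be sketched as a small in-memory
registry in Python. Everything here is hypothetical: the class and field names, the status values,
and the example location are illustrations, not the interface of the actual Registry/Repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetEntry:
    name: str
    version: str
    status: str           # hypothetical values such as "draft" or "accepted"
    schema_location: str  # hypothetical location of the DTD or schema file


class Registry:
    """In-memory stand-in for the Registry/Repository."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[(entry.name, entry.version)] = entry

    def lookup(self, name, version):
        """Return the entry a node would request during an exchange, or None."""
        return self._entries.get((name, version))

    def versions_of(self, name):
        """List every registered version of one DET, ordered by version string."""
        matches = [e for e in self._entries.values() if e.name == name]
        return sorted(matches, key=lambda e: e.version)


registry = Registry()
registry.register(DetEntry("Facility", "1.0", "accepted", "https://registry.example/facility-1.0.xsd"))
registry.register(DetEntry("Facility", "1.1", "draft", "https://registry.example/facility-1.1.xsd"))

entry = registry.lookup("Facility", "1.0")
print(entry.status)
print([e.version for e in registry.versions_of("Facility")])
```

Keying entries by name and version lets a node ask for exactly the template a flow claims to
follow, while the status field carries the acceptance information mentioned in the text.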

-------
APPENDIX B:

OGWDW INFORMATION ARCHITECTURE CORE COMPONENTS
Office of Water
EPA 816-R-04-001B
January 2004
www.epa.gov/safewater


BASELINE ARCHITECTURE

The Federal CIO Council defines Baseline Architecture as "the set of products that portray
the existing enterprise, the current business practices, and technical infrastructure,
commonly referred to as the 'As-Is' architecture." This section describes in some detail the
existing information architecture.

Business Processes

The following narrative provides an overview of the major programs implemented by the
Office of Ground Water and Drinking Water (OGWDW) under the Safe Drinking Water Act
(SDWA), their goals, purposes and objectives and a high level description of the major
business processes engendered by these programs.  Future iterations of this Information
Strategy Plan will include more detailed business process flow diagrams of the major
business processes as appendices.

The Safe Drinking Water Act and Program Management

Congress enacted the Safe Drinking Water Act in 1974 and has enacted major amendments in
1986 and 1996. The purpose of SDWA is to establish national enforceable standards for drinking
water quality and to guarantee that water suppliers monitor water to ensure that it meets national
standards. The 1974 SDWA restructured drinking water programs in two significant ways.

First, it established a level of responsibility for regulating drinking water systems above the
existing state programs by creating a federal program, the Public Water System
Supervision (PWSS) Program.

Second, it expanded the focus from water system planning and prevention of contamination, to
include developing standards, monitoring for contaminants, and taking enforcement action.

Federal law required the development of federal regulations. However, the law recognized that
protection of drinking water was still primarily a state responsibility. SDWA included a major
focus on delegating primary responsibility for program implementation (i.e., primacy) to the
states.

EPA's Director of OGWDW is the National Program Manager of the SDWA.  Accordingly,
OGWDW develops national policy and sets national goals and priorities for drinking water
programs. OGWDW consists of two divisions:  the Standards and Risk Management Division
and the Drinking Water Protection Division.

The Standards and Risk Management Division (SRMD) is responsible for setting drinking water
standards and monitoring requirements, establishing priorities for new standards, and researching
technologies that water systems can use to comply with new and existing standards.

SRMD includes the Technical Support Center. The Technical Support Center provides technical
and scientific support for the development of drinking water standards and their implementation.
In addition, it manages the implementation of the Information Collection Rule and the drinking
water laboratory certification program, and supports the Partnership for Safe Water, treatment
plant optimization, and analytical methods development.

The Drinking Water Protection Division oversees implementation of SDWA regulations through
various programs: the public water system supervision, underground injection control
(UIC), source water assessment and protection, sole source aquifer, and wellhead protection
programs. It is also responsible for maintaining drinking water information through
computer databases and the Internet, administering the Drinking Water State Revolving
Fund, and promoting consumer awareness of drinking water issues.

Other EPA Offices also have responsibilities for implementing SDWA:

•   The Office of Enforcement and Compliance Assurance enforces the statute and regulations;

•   The Office of Research and Development is responsible for research related to health risk
    assessment, health effects, engineering and technology, monitoring, and quality assurance for
    drinking water issues; and

•   The ten EPA Regional Offices implement drinking water programs in non-primacy states and
    provide liaison, coordination and oversight of the primacy states as defined below.  In
    performing these activities, the regional offices perform inspections of water systems, provide
    implementation assistance to primacy agencies and water systems, take enforcement action
    where appropriate, administer the PWSS grants, and generally represent EPA interests with
    the state and local governments.

State Primacy

SDWA provides that EPA may delegate responsibility for implementation and enforcement of
SDWA drinking water regulations to the states that meet the minimum federal requirements for
the stringency of their regulations and the adequacy of their enforcement procedures. Primacy
state programs operate in lieu of the federal drinking water program.

States and tribes must meet these requirements in order to obtain primary enforcement authority
("primacy") for the PWSS or UIC programs.

[Figure: Concept of Primacy - PWSS Primacy Revision Process. Legible labels: "EPA promulgates
new regs"; "State submits draft request"; "State submits complete request"; "up to 2 years";
"State has interim primacy from effective date of State regs or submission of complete request,
whichever is later"; "EPA review and determination"; "up to 90 days". Other labels are not
recoverable.]

As EPA promulgates new regulations, primacy states must adopt the new requirements under
state law and apply for primacy for those requirements.  One important requirement is that the
primacy agency provides inventory, violation and enforcement data to EPA on a regular basis.
This data is stored centrally at the federal level in the Safe Drinking Water Information System
(SDWIS-FED). Where a primacy agency fails to enforce regulatory requirements in a specified
period of time, the SDWA requires EPA to initiate appropriate enforcement action.  This is one of
the major uses of data submitted by primacy agencies to EPA.

In states without primacy, EPA has primary enforcement authority. These states are called
"Direct Implementation" or DI states because EPA directly implements the UIC and PWSS
programs in those states.

Changing SDWIS-FED to accommodate regulatory changes, and supporting the primacy business
process by which the primacy agency adopts new EPA regulations (shown in the above graphic),
is a major business process of the information management program.

The Public Water System Supervision Program

The public water system supervision program authorizes the regulation of the facilities that treat,
store and distribute drinking water to taps; the PWSS program implements the National Primary
Drinking Water Regulations developed and issued by EPA. The PWSS program also implements
programs to enhance water system operation.

[Figure: Public Water System Supervision Program. Over 167,000 public water systems
nationwide: 54,064 CWSs, 20,559 NTNCWSs, and 93,210 TNCWSs.]

PWSs are divided into community water systems, transient non-community water systems
(TNCWSs), and non-transient, non-community water systems because the risks to the populations
these systems serve vary.

As shown above, the majority of PWSs are TNCWSs. While these systems are numerous, they
do not serve the majority of the population because each system only serves a small number of
people. However, almost everyone is served by transient non-community water systems at some
point. (TNCWSs include roadside stops, commercial campgrounds, hotels, restaurants, and other
facilities that have their own water supplies and serve a transient population at least 60 days per
year.) Community water systems serve the vast majority of the population. A community water
system can be vast, serving millions of people (like New York City or Boston), or small, serving a
trailer park with 25 residents.
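
The statement that TNCWSs are the majority of systems can be checked with simple arithmetic
on the counts shown in the figure. In this Python sketch, the pairing of each count with a system
type is inferred from the surrounding text (about 54,000 community water systems, and TNCWSs
as the majority) and should be read as an assumption.

```python
# Counts read from the figure; the pairing of each count with a system
# type is inferred from the surrounding text and is an assumption.
systems = {"CWS": 54_064, "NTNCWS": 20_559, "TNCWS": 93_210}

total = sum(systems.values())
shares = {kind: count / total for kind, count in systems.items()}

print(total)  # 167833
for kind, share in shares.items():
    print(f"{kind}: {share:.1%}")
```

Under that pairing the total is 167,833 systems, consistent with the "over 167,000" in the figure,
and TNCWSs account for a little over half of all systems.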

There are currently over 160,000 water systems regulated by the Federal government in the U.S.
National Primary Drinking Water Standards establish either the maximum concentration of
pollutants allowed in, or the minimum treatment required for, water that is delivered to customers.
A Maximum Contaminant Level Goal (MCLG) is the maximum level of a contaminant in
drinking water at which no known or anticipated adverse health effects would occur. A
Maximum Contaminant Level (MCL) is enforceable. It is the maximum permissible level of a
contaminant in water that can be delivered to any user of a public water system. An MCL is set
as close to an MCLG as possible, taking into account the costs and benefits and feasible
technologies.
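
The relationship between an MCLG (a health goal) and an MCL (an enforceable limit) can be
illustrated with a short Python sketch. The contaminant and its values are hypothetical, and a
single-sample comparison is a simplification: actual compliance determinations follow the
monitoring requirements of each regulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Standard:
    contaminant: str
    mclg_mg_per_l: float  # non-enforceable health goal
    mcl_mg_per_l: float   # enforceable limit, set as close to the goal as feasible


def assess(sample_mg_per_l, standard):
    """Classify one sample result against the goal and the enforceable limit."""
    if sample_mg_per_l > standard.mcl_mg_per_l:
        return "exceeds MCL"
    if sample_mg_per_l > standard.mclg_mg_per_l:
        return "above MCLG, within MCL"
    return "at or below MCLG"


# Hypothetical contaminant and values, for illustration only.
example = Standard("contaminant X", mclg_mg_per_l=0.0, mcl_mg_per_l=0.010)

for result in (0.0, 0.004, 0.012):
    print(result, assess(result, example))
```

Only the first category, a result above the MCL, corresponds to the enforceable standard; a result
between the goal and the limit is permissible under the regulation.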

For some contaminants, there is not a reliable method that is economically and technologically
feasible to measure the contaminant, particularly at low concentrations. In these cases, EPA
establishes a treatment technique. A treatment technique is an enforceable procedure or level of
technological performance that public water systems follow to ensure control of a contaminant.

An example of a treatment technique involves protection of consumers from certain pathogens.
Reliably measuring the concentration of pathogens can be cost prohibitive.  EPA has found that
operation of filters at a certain level of performance would reliably remove the pathogens from
the water.  EPA implemented regulations requiring filtration at a specified level of performance.

In the regulatory scheme provided by the SDWA, EPA conducts and/or analyzes public health
research and other data regarding the public health impacts of a contaminant, evaluates treatment
and control technologies and associated costs, conducts risk assessments on public health impacts
of various levels of a contaminant, and establishes an MCL or treatment technique it determines is
economically achievable. EPA also establishes monitoring requirements for these
contaminants, which specify the number and types of samples to be collected, the frequency of
sampling, sampling locations in the water system, the analytical methods to be used, and related
technical requirements. Public water systems are responsible for providing the
necessary treatment or controls, conducting the necessary monitoring, and submitting monitoring
results to the primacy agency. The primacy agency (usually the state) has primary responsibility
for determining compliance and taking necessary enforcement actions.

EPA Regional Offices oversee and track primacy agency enforcement efforts and directly
enforce the regulations in DI states. Oversight and enforcement focus on actions against
significant non-compliers (SNCs). Significant noncompliance presents a potentially serious
public health concern (as opposed to a single monitoring violation, for example).

The primacy agency routinely submits certain information and data on PWSs and violations of
regulatory requirements to EPA. EPA compiles this data, performs quality control checks,
analyzes the data, identifies SNCs, and, where necessary, provides compliance assistance to the
primacy agency or public water system or, in some instances, takes federal enforcement action to
compel compliance. EPA also makes data available to the public, develops national trends and
statistics, prepares formal reports to Congress, and uses the data to assist in further policy or
regulatory development. SDWIS-FED is the EPA information management system that supports
this high priority business process.

EPA has developed SDWIS-STATE, an information management system designed to assist
smaller states with limited or no automated information management systems of their own.
SDWIS-STATE is of much broader scope and much larger than SDWIS-FED because it is
designed to help the states manage the entire PWSS program including additional state program
requirements. Information contained in SDWIS-FED is a small subset of information contained
in SDWIS-STATE. Currently, 25 states, 6 EPA Regions and 2 territories are using
SDWIS-STATE.

The Drinking Water State Revolving Fund Program

The Nation's 54,000 community water systems make significant investments to install, upgrade,
or replace infrastructure to continue to ensure the provision of safe water to their 254 million
customers. Installation of new treatment facilities can improve the quality of drinking water to
comply with national primary drinking water standards and protect public health. Improvements
are also needed to help those water systems experiencing a threat of contamination due to
inadequate distribution and transmission pipes.

Many public water systems find it difficult to obtain affordable financing for infrastructure
improvements that would enable systems to comply with national primary drinking water
standards and protect public health. Recognizing this fact, Congress established the Drinking
Water State Revolving Fund (DWSRF) as part of the 1996 SDWA Amendments. The goal of the
program is to provide states with a financing mechanism for ensuring safe drinking water to the
public.  States can use federal capitalization grant money awarded to them to set up an
infrastructure funding account from which assistance is made available to public water systems.

Loans made under the program can have interest rates between 0 percent and market rate and
repayment terms of up to 20 years. Loan repayments to the state will provide a continuing source
of infrastructure financing into the next century. The program also places an emphasis on small
and disadvantaged communities and on programs that emphasize prevention as a tool for ensuring
safe drinking water.

Congress provided $1.275 billion for the DWSRF program in fiscal year 1997. The amount of
funding each state was eligible to receive in 1997 was based on a formula used to award state
program grants under the Public Water System Supervision program.  Congress has provided an
additional $3.145 billion for the DWSRF program for fiscal years 1998 through 2001, including
$825 million for fiscal year 2001.  The amount of funding each state is eligible to receive for
fiscal years 1998 through 2001  is based on the total eligible need determined for each state by the
Drinking Water Infrastructure Needs Survey which the EPA released in January 1997.

Both publicly and privately owned community water systems and non-profit non-community
water systems are eligible for funding under the DWSRF program. Eligible projects include
installation and replacement of failing treatment facilities, eligible storage facilities and
transmission and distribution systems. Projects to consolidate water supplies may also be
eligible.

States develop a priority system for funding projects based on three criteria from the  Act. States
rank the projects and then offer loans to systems based on their ranking order. Priority is given to
those  eligible projects that:

•   address the most serious risk to human health;
•   are necessary to ensure compliance with the requirements of the SDWA; and
•   assist public water systems most in need according to  state-determined affordability criteria.

The Drinking Water SRF National Information Management System collects information that
provides a record of progress and accountability for the program. The system is managed by
OGWDW and data is made available to the public on the World Wide Web.

The Underground Injection Control (UIC) Program

Underground injection is the technology of placing fluids underground, in porous formations of
rocks, through wells or other similar conveyance systems. This technology is used for many
purposes including disposal of wastes and oil recovery. While rocks such as sandstone, shale,
and limestone appear to be solid, they can contain significant voids or pores that allow water and
other fluids to fill and move through them. Man-made or produced fluids (liquids, gases or
slurries) can move into the pores of rocks by the use of pumps or by gravity.  The fluids may be
water, wastewater or water mixed with chemicals. Injection well technology can predict the
capacity of rocks to contain fluids and the technical details to do so safely. Facilities across the
United States discharge a variety of hazardous and non-hazardous fluids into more than 400,000
injection wells.

-------
The Safe Drinking Water Act established the UIC Program to provide safeguards so that injection
wells do not endanger current and future underground sources of drinking water (USDW). The
most accessible fresh water is stored in shallow geological formations called aquifers and is the
most vulnerable to contamination. These aquifers feed lakes; provide recharge to streams and
rivers, particularly during dry periods; and serve as resources for 92 percent of public water
systems in the United States.

The UIC Program defines an injection well for a wide variety of injection practices that range
from more than 100,000 technically sophisticated and highly monitored wells which pump fluids
into isolated formations up to two miles below the Earth's surface, to the far more numerous on-
site drainage systems, such as septic systems, cesspools, and storm water wells that discharge
fluids a few feet underground.

The EPA groups underground injection into five classes for regulatory control purposes. Each
class includes wells with similar functions, and construction and operating features so that
technical requirements can be applied consistently to the class.

Benefits of the  UIC Program

Injection wells have the potential to inject contaminants that may cause underground sources of
drinking water to become contaminated. When wells are properly sited, constructed, and
operated, underground injection is an effective and environmentally safe method to dispose of
wastes.  The goals of the EPA's UIC Program are to prevent contamination by keeping injected
fluids within the well and the intended injection zone or, where fluids are injected directly or
indirectly into a USDW, to require that injected fluids not cause a public water
system to violate drinking water standards or otherwise adversely affect public health.  These
minimum requirements affect the siting of an injection well, and the construction, operation,
maintenance, monitoring, testing, and finally, the closure of the well. Injection wells require
authorization under general rules or specific permits.  Finally, states may apply to have primary
enforcement responsibility (primacy) for the UIC Program. To date, 33 states, Guam, the
Commonwealth of the Mariana Islands and Puerto Rico have obtained primacy for all classes of
injection wells.  Seven states share primacy with the EPA.  The EPA administers UIC programs
for the remaining states, the Virgin Islands, American Samoa and Indian Country.

At the present time, information management systems for the UIC program are scattered among
the states, EPA regions and headquarters.  Presently, a national schema or unified set of data
management requirements does not exist.

The Source Water Protection Program

Source water is untreated water from streams, rivers, lakes, or underground aquifers that is used
to supply private wells and public drinking water. Most public and some private well drinking
water is treated before it enters homes. While some treatment is usually necessary, ensuring that
source water is protected from contamination can reduce the cost of treatment and the risk to
public health.

Most source water is defined as surface or ground water. The majority of drinking water in large
metropolitan areas originates from a surface source such as a lake, stream, river or reservoir. The
land area that can have an impact on these water bodies is called a watershed, and can be
delineated on a map.

Most water in smaller communities originates underground and is pumped to the surface
through a well.  Ground water comes from natural underground layers, often of sand or gravel,
which contain water. These formations are called aquifers.  The land area that can have an impact
on the quality of this underground water is called the aquifer recharge area.

There are many contaminants that may be present in source water before it is treated.
These include:

    •   Microbial contaminants, such as viruses and bacteria,
    •   Inorganic contaminants, such as salts and metals,
    •   Pesticides and herbicides,
    •   Organic chemical contaminants, including synthetic and volatile organic chemicals,
    •   Radioactive contaminants.

Assessing the Risks

While many states, water systems, and localities have watershed and wellhead programs, the
1996 Safe Drinking Water Act Amendments placed a new focus on source water quality. States
have been given access to funding and required to develop Source Water Assessment Programs
(SWAP) to assess the areas serving as public sources of drinking water in order to identify
potential threats and initiate protection efforts.

[Figure: elements of a source water assessment; surviving labels: "Contamination source inventory", "Public distribution of findings"]

The source water assessment programs created by states differ since they are tailored to each
state's water resources and drinking water priorities. Each assessment includes the four major
elements shown above:

    •   delineating (or mapping) the source water assessment area
    •   conducting an inventory of potential sources of contamination in the delineated area
    •   determining the susceptibility of the water supply to those contamination sources
    •   releasing the results of the determinations to the public.

-------
Benefits of the Source Water Protection Program

Protection of drinking water at the source can be successful in providing public health protection
and reducing the treatment costs and challenges for public water suppliers.  Source water quality can
be threatened by many everyday activities and land uses, ranging from industrial wastes to the
chemicals applied to suburban lawns.  Private well owners are urged to test regularly for common
contaminants such as microbes and nitrate-nitrogen because there is no federal oversight of their
water source. Water systems are heavily regulated through the Public Water System Supervision
Program and respond to this threat to public health with regular water quality monitoring and
actions ranging from well closure to expensive treatment. In some  cases, source water protection
can eliminate or forestall the need to change or modify treatment processes.  Treatment is
expensive and  source water protection can save consumers significant money.

Whether a public water system relies on surface water, ground water, or a combination of the
two, protection of a water system's source is important. Prevention of contamination is one of the
most cost-effective methods of ensuring safe drinking water supplies.  If source water becomes
contaminated,  expensive treatment or replacement of the water source may be required before
safe drinking water can be delivered to users.  Treatment costs are passed on to every user served
by the public water system.

Once completed, source water assessment results can be used to focus prevention resources on
drinking water protection.  EPA strongly encourages linking the source water assessments to
implementation of source water protection programs.  The Source Water Protection (SWP)
Program is a non-regulatory program at the federal level.

At the present  time, information management systems for the SWP program are scattered among
the states, EPA regions and headquarters. Data on surface water sources of contaminants are held in
the Permit Compliance System (PCS), managed by the Office of Enforcement and Compliance Assurance.
PCS is one of the largest public data systems in the nation.  Presently, a national schema or
unified set of data management requirements for all source waters does not exist.

-------
The three major programs within the Office of Ground Water and Drinking Water are interrelated
in many ways. The common goal of PWSS, UIC, and SWP at the federal, state, and local levels
is to protect public health. The graphic above shows just a few ways that the programs relate to
each other. Integration of source water data at this time is very limited.

The Unregulated Contaminant Monitoring Program

The 1986 SDWA Amendments required EPA to establish a list of substances that were not
regulated at that time but had the potential for adverse public health impacts and to conduct a
national monitoring program at PWSs to determine their presence and concentrations in drinking
water supplies. The  Amendments required periodic revision of the list and re-sampling to be
conducted at five-year intervals. Two rounds of monitoring occurred under this provision.

The Round 1 dataset contains public water system monitoring sample results for 62 (then)
unregulated contaminants, generally collected between 1988 and 1992, from 40 states and
primacy entities. Round 1 data were stored in a database called the Unregulated Contaminant
Monitoring Information System (URCIS).

The Round 2 dataset (the second round of unregulated contaminant monitoring) contains public
water system monitoring sample data for 48 (then) unregulated contaminants, generally collected
between 1993 and 1997, from 35 states and primacy entities.  Round 2 data were incorporated in
the EPA Safe Drinking Water Information System, SDWIS/FED, that was modified to receive
parametric data.

The monitoring for unregulated contaminants was conducted by the PWSs and sent to the state
primacy agencies that forwarded the data to EPA for evaluation.

-------
The 1996 SDWA Amendments modified but continued the unregulated contaminant monitoring
program established by the 1986 Amendments.  Under the 1996 Amendments, EPA issued the 1999
Unregulated Contaminant Monitoring Rule (UCMR 1999) which established a list of 12
contaminants to be monitored nationally to determine their presence in public water supplies.

Under EPA's Information Integration Initiative of FY 2000, OGWDW and EPA's Office of
Environmental Information established a new database for UCMR. This database is able to
receive data directly from large laboratories with sophisticated automated data entry, using an
XML data format by way of the World Wide Web.  This effort also developed Web forms for use
by smaller PWSs for their entry of data and transmission over the Internet to EPA.  EPA holds the
data for a period of 60 days during which period states and PWSs are able to access the  data over
the Internet and submit comments to EPA on the results of their review. After this period, EPA is
free to use this data for rule making and to provide public access to the data.
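
The 60-day review period described above can be illustrated with a short date calculation. This is a minimal sketch only: the function names are hypothetical, and it assumes the period is counted in calendar days from the date EPA receives the data, which the text does not specify.

```python
from datetime import date, timedelta

# Assumption: the review period is 60 calendar days counted from the date
# EPA receives the data. The source text does not say how it is counted.
REVIEW_PERIOD_DAYS = 60


def public_release_date(received: date) -> date:
    """Return the first day after the review period has elapsed."""
    return received + timedelta(days=REVIEW_PERIOD_DAYS)


def in_review(received: date, today: date) -> bool:
    """True while states and PWSs can still review the data and comment."""
    return received <= today < public_release_date(received)


if __name__ == "__main__":
    received = date(2003, 1, 1)
    print("Received:", received)
    print("Available for rule making and public access from:",
          public_release_date(received))
```

Under these assumptions, data received on January 1, 2003 would remain in review through March 1, 2003.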

The National Contaminant Occurrence Database (NCOD)

The National Contaminant Occurrence Database (NCOD) was developed in response to the 1996
SDWA Amendments. The data collected and stored in this database, like the unregulated
monitoring data, is used to support EPA's decisions related to identifying contaminants  for
regulation and subsequent regulation development.  The NCOD contains occurrence data on
physical, chemical, microbial and radiological contaminants, both regulated and unregulated, in
public water systems. The data come from PWSs and other sources such as the U.S. Geological
Survey National Water Information System. Regulated occurrence data are
sample data from monitoring in public water systems for contaminants with health-based
standards established by EPA under the SDWA.

EPA uses NCOD data and the data generated under the UCMR (1999) to evaluate and
prioritize contaminants on the EPA Contaminant Candidate List (CCL). The CCL is a list of
contaminants EPA is considering for possible new or revised drinking water standards.

Data Model(s)

SDWIS-FED
SDWIS-FED is designed to support OGWDW in monitoring compliance with the Safe Drinking
Water Act. SDWIS-FED processes the following major categories of information:

    •   Characteristics of Public Water Supply Systems (includes administrative contact
       information, activity status, PWS type, population served, primary  source type,  and
       owner type).
    •   Water system facility and treatment data (includes flow data between sources of water
       through the treatment plant).
    •   Locational and geographic data to support geospatial applications and source water
       assessments.
    •   Violations of the national Primary Drinking Water Standards and other implementing
       regulations of the Safe Drinking Water Act.
    •   Enforcement and compliance assistance actions (formal and informal) and linkage data to
       violations.
    •   Sample data for unregulated contaminants.

-------
The SDWIS/FED database is a third normal form relational database comprising over 50 tables.
Many of the tables are look-up or association tables; about 11 of the tables contain actual data.
Several tables, and attributes within existing tables, are not populated because the initial
intent was to have a single data model for the SDWIS/FED and SDWIS/STATE applications. That
approach was determined not to be feasible, and the resources to remove the unused structure
from the FED database were not expended.
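
The distinction between data tables, look-up tables and association tables can be sketched with a small in-memory database. This is an illustration only: every table name, column name and code value below is hypothetical and is not taken from the actual SDWIS/FED schema, and SQLite stands in for the production database.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny normalized schema with hypothetical tables and rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Look-up table: one row per allowed code (hypothetical codes).
        CREATE TABLE source_type (
            code TEXT PRIMARY KEY,
            description TEXT NOT NULL
        );
        -- Data table: stores the code, not the repeated description.
        CREATE TABLE water_system (
            pws_id TEXT PRIMARY KEY,
            name TEXT NOT NULL,
            source_type_code TEXT NOT NULL REFERENCES source_type(code)
        );
        CREATE TABLE violation (
            violation_id INTEGER PRIMARY KEY,
            pws_id TEXT NOT NULL REFERENCES water_system(pws_id)
        );
        CREATE TABLE enforcement_action (
            action_id INTEGER PRIMARY KEY,
            description TEXT NOT NULL
        );
        -- Association table: many-to-many link of violations to actions.
        CREATE TABLE violation_action (
            violation_id INTEGER NOT NULL REFERENCES violation(violation_id),
            action_id INTEGER NOT NULL REFERENCES enforcement_action(action_id),
            PRIMARY KEY (violation_id, action_id)
        );
    """)
    conn.executemany("INSERT INTO source_type VALUES (?, ?)",
                     [("GW", "Ground water"), ("SW", "Surface water")])
    conn.execute(
        "INSERT INTO water_system VALUES ('XX0000001', 'Example System', 'GW')")
    conn.executemany("INSERT INTO violation VALUES (?, ?)",
                     [(1, "XX0000001"), (2, "XX0000001")])
    conn.execute("INSERT INTO enforcement_action VALUES (10, 'Example notice')")
    conn.executemany("INSERT INTO violation_action VALUES (?, ?)",
                     [(1, 10), (2, 10)])
    return conn


def violations_for_action(conn: sqlite3.Connection, action_id: int) -> list:
    """List the violations linked to one enforcement action."""
    rows = conn.execute(
        "SELECT violation_id FROM violation_action "
        "WHERE action_id = ? ORDER BY violation_id",
        (action_id,),
    )
    return [row[0] for row in rows]
```

In this sketch one enforcement action addresses two violations without either data table repeating the other's attributes, which is the role the association table plays.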

A data warehousing model and associated on-line analytical processing (OLAP) capabilities now
exist. Data is extracted and reformatted into the warehouse from the SDWIS/FED system
quarterly update. A test data warehousing environment will be operational shortly. It will likely
provide Intranet access in the short term. Production Internet access is still to be defined.

Inventory data is reported on an annual basis, while other data is reported quarterly with a one
quarter reporting lag. The data is frozen each quarter after all processing and validation is
completed.

SDWIS-STATE

SDWIS-STATE is a relatively new system developed to assist states that did not have automated
information management systems or the capability of developing one of their own. SDWIS-
STATE, as contrasted with SDWIS-FED, was designed to assist states in managing their entire
drinking water program on a day-to-day operational level. SDWIS-FED's focus is on
consolidating selected data as previously described on a national basis and making that limited set
of data available to the states, regions and the general public.

SDWIS-STATE operates on a client-server platform using a UNIX operating system or one of
several versions of the Windows operating system. It uses an Oracle database for the back end,
and the front end is written in C++. It contains 147 tables and 1,886 data elements. It addresses
726 analytes, 30 monitoring rules and 62 violation types. Currently, 25 states are using
SDWIS-STATE. Data that must be reported to EPA are periodically extracted from SDWIS-STATE
tables, converted to DTF format, and submitted to SDWIS-FED, where the data receive quality
control checks and are then entered into SDWIS-FED tables.

OGWDW Data Warehouse

Periodically, the data warehouse extracts SDWIS-FED data into staging tables modeled after
SDWIS-FED tables.  The warehouse then performs additional QA on the data and transforms it,
adding attributes, de-normalizing the data, and organizing it by subject.  Several data marts,
which contain subsets of the data in the form of multi-dimensional star-schema cubes, are also
periodically updated. The data marts support analysis tools in the form of OLAP cubes and pivot
tables, as well as an array of standard reports.
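
The extract, QA, transform and load sequence described above can be sketched in a few functions. This is a simplified illustration under stated assumptions: the field names (pws_id, state, pws_type) and the single QA rule are hypothetical, and plain Python lists stand in for the staging and warehouse tables.

```python
def extract(source_rows):
    """Copy source rows into a staging list without changing their shape."""
    return [dict(row) for row in source_rows]


def transform(staged_violations, systems_by_id):
    """Apply a QA check, then de-normalize by adding system attributes."""
    output = []
    for row in staged_violations:
        system = systems_by_id.get(row["pws_id"])
        if system is None:
            # Hypothetical QA rule: drop rows that reference an unknown system.
            continue
        output.append({**row,
                       "state": system["state"],
                       "pws_type": system["pws_type"]})
    return output


def load(warehouse, subject, rows):
    """Append transformed rows to the warehouse under a subject area."""
    warehouse.setdefault(subject, []).extend(rows)
```

After the transform step each violation row carries the system attributes it needs, so reports on that subject area do not have to join back to the source tables.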

The OGWDW data warehouse includes SDWIS-FED inventory (water system, water system
facility, treatments, contacts, and locational data) and compliance data;  samples datasets listed
above; and results of PWS audits performed by states.

Fact tables include current violations, violations organized to facilitate trend analysis, analytical
results from static and active sample datasets and data verifications findings.  Conformed
dimensions (which are basically the same as EPA Registries) facilitate information integration as
they can be used with different fact tables.
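
A conformed dimension is easiest to see with two fact tables that share it. The following is a minimal sketch with invented data: the identifiers, attributes and measures are hypothetical and do not reflect the actual warehouse contents.

```python
from collections import Counter

# Hypothetical conformed dimension: one row per water system, shared by
# every fact table that carries a pws_id key.
WATER_SYSTEM_DIM = {
    "XX0000001": {"state": "XX", "source": "Ground water"},
    "XX0000002": {"state": "XX", "source": "Surface water"},
    "YY0000001": {"state": "YY", "source": "Ground water"},
}

# Hypothetical fact tables: each row holds only the dimension key and a measure.
VIOLATION_FACTS = [
    {"pws_id": "XX0000001", "violations": 2},
    {"pws_id": "YY0000001", "violations": 1},
]
SAMPLE_FACTS = [
    {"pws_id": "XX0000001", "samples": 5},
    {"pws_id": "XX0000002", "samples": 3},
    {"pws_id": "YY0000001", "samples": 4},
]


def total_by(facts, measure, attribute):
    """Sum a measure from any fact table, grouped by a dimension attribute."""
    totals = Counter()
    for row in facts:
        totals[WATER_SYSTEM_DIM[row["pws_id"]][attribute]] += row[measure]
    return dict(totals)
```

Because both fact tables use the same dimension, the same `total_by` call works for violations grouped by source type and for samples grouped by state.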

-------
NCOD/UCMR

NCOD consists of static datasets for Rounds 1 and 2 (then) unregulated contaminants, for the 6-
year review of 16 regulated contaminants from a 15-state sample, and for the current flow of
unregulated samples data via the Safe Drinking Water Accession and Review System (SDWARS)
(UCMR 1999).  The static datasets have gone through extensive quality assurance and been
evaluated for national representativeness, documented in EPA analyses and they are available for
download from the web.

SDWARS, the UCMR (1999) transaction database, is housed at RTP; data is sent to it directly
from laboratories in a number of formats, including XML.  UCMR data from SDWARS is
extracted into a data warehouse. Pivot tables are created to facilitate access.

APPLICATIONS

SDWIS-FED
Reporting Toolkits
The reporting of public drinking water inventory and noncompliance information to SDWIS-FED
is supported by a variety of individually developed state data systems as well as a Personal
Computer (PC)-based, EPA-developed data entry tool (DTFWriter).  A full-featured local
database application (SDWIS-STATE), developed by EPA for use by Primacy Agencies, is
also available. Due to the complexity and pending obsolescence of DTFWriter, EPA decided to
develop Actions DTF, a short-term, stand-alone, single-purpose, PC-based application that
supports violations and enforcements data entry.

DTFWriter was developed using Clipper™. The system can  run on any computer supporting
PC/MS DOS version 3.0 or higher.

Actions DTF was developed to assist state and regional PC users in the creation of a data file
containing Violation or Enforcement actions information that can be input to the SDWIS-FED
System. The software creates records in the DTF format that is required for entry of data into
SDWIS-FED. DTF files are input to the SDWIS-FED national database on a quarterly basis from the
Primacy Agencies (states and EPA regions) that have been delegated PWSS oversight responsibility
by EPA.

Actions DTF is a Microsoft™ (MS) Access® Windows application installed on a PC at the user
site.

SDWIS-FED contains a number of other applications.

Data Entry subsystem - This batch software (CLIST, JCL, CoolGen, COBOL,  SAS, Assembler)
performs input data editing and validation, constructs "total replace" transactions, posts data to
the SDWIS-FED database, identifies, aggregates and creates error reports, and provides detailed
and high level summaries of update status.  Users are required to post the data to the EPA
mainframe, and communicate data processing instructions to  SDWIS-FED production control
staff.
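
The validate, post and summarize flow of the Data Entry subsystem can be sketched as follows. This is an illustration only and not the production batch software: the validation rules and error codes are invented, and "total replace" is assumed here to mean that accepted records for a water system replace everything previously stored for that system.

```python
from collections import Counter


def validate(record):
    """Return error codes for one record (hypothetical rules and codes)."""
    errors = []
    if not record.get("pws_id"):
        errors.append("E001_MISSING_PWS_ID")
    if record.get("population", 0) < 0:
        errors.append("E002_NEGATIVE_POPULATION")
    return errors


def process_batch(database, batch):
    """Validate a batch, post accepted records, and summarize the update."""
    accepted = {}
    error_counts = Counter()
    for record in batch:
        errors = validate(record)
        if errors:
            error_counts.update(errors)  # aggregate for the error report
        else:
            accepted.setdefault(record["pws_id"], []).append(record)
    # Assumed "total replace" behavior: discard whatever was stored for a
    # system and store the newly accepted records in its place.
    for pws_id, records in accepted.items():
        database[pws_id] = records
    return {
        "received": len(batch),
        "posted": sum(len(records) for records in accepted.values()),
        "errors": dict(error_counts),
    }
```

The returned dictionary plays the role of the high level update summary, while the per-code counts correspond to the aggregated error report.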

Data Retrieval subsystem - This is the software that creates the user interface for canned
retrievals of data from the SDWIS-FED database. There are over 15 standard reports designed
for interactive batch access; reports can be stored online or printed on high speed printers. The
subsystem also provides access to the Platinum Report Facility, an ad hoc data retrieval tool.

SNC/Exception Tracking System - This software provides support to EPA's enforcement and
enforcement oversight programs via generated SNC and exception records, three standard reports,
and an on-line system for evaluation of noncompliance and enforcement data to allow regional
modifications of the standard reports.

On-line Data Dictionary - This MS-ACCESS application provides the data dictionary for the
database.

Error Code Database - This MS-ACCESS application provides a look-up for users debugging
error reports to assist in understanding the nature of data entry errors and actions that need to be
taken to correct those errors.

Data Warehouse - EPA staff  operate, update and maintain a local data warehouse for data
distribution and reorganization to enable easier access to SDWIS-FED data. Extract-transform-
load (ETL) tools and procedures are utilized to extract data, transform it,  and post it to the
warehouse.

OGWDW Data Warehouse

There are two ways of accessing drinking water violations and inventory  data:
•   Through the mainframe—standard reports or ad hoc queries using PRF, as well as through
    use of the Oracle Transparent Gateway (OTG), and
•   Through the OGWDW data warehouse

MS-ACCESS custom queries provide access to the warehouse tables and many of the data marts.
Access is also available through several pivot tables, which can be downloaded from the web.

Numerous analysis tools in the form of pivot tables and OLAP cubes have also been built and are
continually refined. These  include:

•   GPRA, violations, and  inventory analysis tools for trends analysis.
•   Current violations and inventory (including contacts, locational data,  treatments, etc.).
•   An array of data quality analysis tools based on both data verifications and SDWIS-FED data
    that assess data quality, completeness and accuracy of violations data, % of correct
    compliance determinations, rule implementation, timeliness of violations reporting,
    completeness of various required inventory elements including the Surface Water Treatment
    Rule (SWTR) reporting and locational data.
•   Several samples analysis tools  for UCMR, 6-year review, and Rounds 1 and 2 datasets.

TECHNOLOGY

SDWIS-FED

The SDWIS-FED Reporting System is designed to operate on the IBM mainframe computer
system; the data are held in an IBM Database2 (DB2) database.

-------
The SDWIS-FED operating environment incorporates use of the following software:

•  IBM's Interactive System Productivity Facility (ISPF).
•  IBM's DATABASE 2™ (DB2) Relational Database Management System (RDBMS).
•  Platinum Technology's Platinum Report Facility™ (PRF).
•  User dialogue screens implemented using IBM's Dialog Management Services (DMS).
•  Control processing via IBM's Time Sharing Option (TSO) Command Lists (CLISTs).
•  Report production performed through a combination of original COBOL programs and
   COBOL programs modified to utilize SQL (Structured Query Language)  formulated from
   user-supplied selection criteria.
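
The last item above, SQL formulated from user-supplied selection criteria, can be illustrated with a small query builder. This is a sketch in Python against SQLite rather than the COBOL and DB2 environment the system actually uses, and the table name, column names and allowed criteria are hypothetical.

```python
import sqlite3

# Hypothetical columns a user may filter on. Column names are checked against
# this set because they cannot be passed as bound parameters.
ALLOWED_COLUMNS = {"state", "pws_type"}


def build_query(criteria):
    """Build a parameterized SELECT from {column: value} selection criteria."""
    clauses, params = [], []
    for column, value in sorted(criteria.items()):
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unsupported selection criterion: {column}")
        clauses.append(f"{column} = ?")
        params.append(value)
    sql = "SELECT pws_id FROM water_system"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY pws_id", params


def run_report(conn: sqlite3.Connection, criteria) -> list:
    """Run the generated query and return the matching system identifiers."""
    sql, params = build_query(criteria)
    return [row[0] for row in conn.execute(sql, params)]
```

Values are always bound as parameters and never pasted into the SQL text, so user input cannot alter the structure of the generated statement.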

EPA headquarters staff access the IBM mainframe via TCP/IP (Internet Protocol)-based
communications between desktop devices and servers. Staff at EPA's 10 regional offices and
state primacy agencies access the mainframe system through the Internet using IBM WebSphere
Host On-Demand™.

SDWIS-STATE

SDWIS-STATE uses client-server architecture and supports Oracle, MS SQL Server and IBM's
DB2 database system as well as several operating systems including UNIX, Windows NT,
Windows 98 and Novell. The servers are housed at the state primacy agency offices and EPA
provides SDWIS-STATE software.

OGWDW Data Warehouse

The data warehouse is in a SQL  Server database.  The ETL tool is Microsoft Data Transformation
Services (DTS). Several multi-dimensional OLAP cubes using MS Analysis Services software
are available.

Data access for ad hoc queries is accomplished through MS-ACCESS databases, which have
links to SQL Server data warehouse tables, data marts, and SDWIS-FED mainframe tables.
The Oracle Transparent Gateway is the means to access those tables that have not yet been pulled
into the warehouse.

TARGET ARCHITECTURE

Business Processes

The EPA Target Business Reference Model as presented in the document entitled "EPA Target
Environmental and Health Protection Architecture" (EHPA) developed by the Office of
Environmental Information presents EPA's model for information integration. The Model for
Information Integration (M4I) [1] was developed by the Information Integration Program and
accepted by the Agency in July 2002.  The M4I is a technical, strategic framework that proposes
an integration of data, applications and technology across the Agency. It consists of the
following high-level functions:

•   Connect and Exchange — Electronically connecting to transmit or access data
•   Process and Stage — Data collection, cleansing, validation and approval for use
•   Store for Use — Data storage, linkage and/or referencing for access and use
•   Use — Data manipulation (potentially from multiple sources) to aid in learning,
    discovery and problem solving

[1] Model for Information Integration, A Preview of the Core Components of the EPA's Target
Environmental Information Architecture (EIA), July 24, 2002.

Classifying major functions into these broad categories is intended to enable program and
system managers across the Agency to think of information integration in general terms and
to use common terminology to discuss and plan for their programs' functional needs.
Classifying systems by common functions helps identify areas where improvements to
services as well as reductions in costs can be made by eliminating redundancies through the
sharing of services.  OGWDW intends to employ this high level classification of functions as
it further refines its planning in support of system modernization that not only meets its
immediate programmatic business needs, but fully supports the enterprise business needs
through conformance with the Agency's enunciated Enterprise Architecture.

The EHPA further lists and  defines the following EPA business categories and subcategories:

•   Environmental Protection Services
           o   Pollution Prevention
               This area is where the Agency's non-command and control approaches to
               reduce or avoid pollution are centered, as well as its international voluntary
               efforts in such areas as ozone protection and climate change.  Pollution
               prevention incorporates the current pollution prevention program, including
               such activities as the Design for Environment Program, the Energy Star
               program, waste minimization and a variety of best practice efforts. Also
               included are the pollution prevention aspects of regional programs such as
               Great Lakes and Chesapeake Bay.
           o   Pollution Control & Public Health Protection
               This area includes the Agency's national standard-setting programs, such as
               those for ambient air quality and drinking water quality. It also includes
               source and facility permitting activities and other authorizations, along with
               supporting enforcement and compliance responsibilities.  This business area
               relies heavily on state involvement. Under the Criminal Enforcement
               activities of OECA, this area supports homeland security through its
               environmental investigations and forensics functions.
           o   Emergency Response and Remediation
               All cleanup operations, including Superfund sites, facility  spills,
               transportation accidents,  industrial accidents, oil spills, and other accidental
               releases of contaminants fall under this area. It is field-engineering oriented,
               under headquarters or regional office supervision.  This area also supports
               homeland security responses, such as the anthrax decontamination of the
               U.S. Capitol and the World Trade Center response in New York.
           o   Environmental and Human Health Assessment
               This area includes the Agency's responsibilities to monitor and evaluate
               current and future environmental conditions and human health risks.
               Encompassed here are activities that document, map and project many kinds
               of environmental trends.  It is at this level that activities such as the proposed
               EPA Situation Room are found. These involve the integration of the
               Agency's knowledge of all dimensions of human health and environmental
               quality, and many are driven by the use of environmental indicators.

-------
•   Shared Business Support Functions
           o   Research and Development
               EPA's Research and Development program supports the full range of top-
               level business areas. It conducts both basic scientific research as well as
               targeted research to support specific program needs.  Areas include
               environmental studies (non-human biota and ecosystems), human health
               studies, development of monitoring and modeling methods, and creation of
               methods, standards and procedures to ensure the quality of scientific
               technical results.  Under this general heading are the  activities of the
               Agency's formal scientific and technical panels.
           o   Assessments
               Grouped under this heading are analytical activities, such as risk
               assessment/risk management studies, economic impact analyses, social
               impact evaluations and legal reviews. Also within this area is the
               generation of environmental indicators: specification of methods to evaluate
               the state of the environment and to quantify relationships among Agency
               activities and their environmental results. These indicators directly support
               the business driver of supporting performance-based  environmental
               protection.
           o   Regulatory Process Management
               The Agency's regulatory processes include rulemaking activities, but also
               include the development of guidance documents and  other activities in which
               public comment is invited or required. Activities within the process include
               the development of the rules themselves, the process  of external review  and
               comment, formal promulgation and the development of formal policy and
               guidance documents to facilitate implementation.
           o   Information Management
               Management of information in its various forms includes: business-related
               information exchange from inside and outside the Agency; the processing of
               that information to conform to Agency systems; management of metadata
               standards governing Agency program data; data quality management
               operations to ensure the proper applications of standards to EPA data;
               integration of data to some form of enterprise repository; activities to ensure
               data security (data integrity, confidentiality and access); and activities to
               deliver data in appropriate form to Agency personnel, the public, EPA partners,
               stakeholders and regulated parties.  This area supports expectations for E-
               government and the need for better services over the  Internet for stakeholders
               and the public.
           o   Communications and Training

•   Program Management
       This level of the target business architecture hierarchy covers activities that guide and
       direct activities at the program level. It includes program planning and design, formal
       delegations of authority under regulatory programs, partnership development, program
       implementation, and program analysis to determine effectiveness in relationship to goals
       and objectives.

EPA's target business architecture foresees the Agency's future as one in which quick access to
authoritative and unambiguous information is essential.  It is also one in which relationships
among data (particularly the ability to draw clearer connections between program outputs and
environmental and public health protection) should be documented and made active in new or
revised applications. EPA's target business model is characterized by highly interrelated
functions that will ultimately rely upon highly integrated multimedia information to operate
efficiently.

The model emphasizes the new focus on pollution prevention.  This area will receive
increased attention in the future as the Agency works to realize the greater efficiency
and cost-effectiveness of preventing a problem rather than fixing it after it happens.

Review of OGWDW's baseline business architecture above shows that OGWDW's business
processes cover the full range of categories specified above.  The following brief examples
illustrate this fact:

        Pollution Prevention:
        OGWDW's Source Water Protection program is specifically designed through its
        Sole Source Aquifer and Wellhead Protection Programs to prevent contamination of
        groundwater sources of drinking water.

        Pollution Control & Public Health Protection:
        Development of Drinking Water Standards (MCLs) for protection of public health.

        Emergency Response and Remediation:
        Public health advisories are issued when public water supplies are known to be
        contaminated.

        The development of Vulnerability Assessments of public water supply systems, and
        of appropriate responses to those vulnerabilities, is receiving high priority in
        support of the newly emerging Homeland Security program.

        Environmental and Human Health Assessment:
        Both the ongoing National Contaminant Occurrence Database and Unregulated
        Contaminant Monitoring Program are aimed at assessing the potential for adverse
        human health impact of regulated and unregulated contaminants, and at providing
        data necessary to determine the need to develop or revise standards or otherwise
        regulate the release of these substances into the  environment.

Similarly the shared business support functions and program management support functions
can be mapped to the OGWDW baseline business architecture as presented previously in this
document.  The Agency's Enterprise Architecture Team is in the process of selecting tools
that will enable the consistent mapping of program information into the Enterprise
Architecture. A system called METIS may be the prescribed tool; if so, it will be adopted
when it is fully supported by the Agency and available to the programs. It is OGWDW's intent to
conform its target business architecture to the Agency's framework as outlined above.

While it is expected that many of OGWDW's business processes, as described in the high
level presentation in the baseline business process discussion, will remain the same over the
next 3 to 5 years, EPA will revise several business processes to meet the expanded
information needs of OGWDW, which generally  fall in the following categories:

•   State oversight and assistance, including enforcement oversight
•   National program oversight—key measures of program success, and program
    assessments
•   Information to the public
•   EPA research, including developing and evaluating regulations
•   Other needs, including Homeland Protection and capacity development efforts
•   Conformance to the Agency's evolving Enterprise Architecture requirements, including:
           o   Participation in the Exchange Network through adoption of XML as the data
               transfer language between states and EPA
           o   Use of the System of Registries, particularly the Facilities Registry, Chemical
               Registry, Biological Registry, and the proposed XML and Metadata Registries
           o   Use of the Central Data Exchange as OGWDW's data portal
           o   Continued development of the OGWDW data warehouse and use of the
               Agency's data repository
           o   Application of EPA data standards
           o   Development of Trading Partner Agreements

There are many unmet needs in the existing information and business processes, and many
opportunities to meet these needs more directly and effectively.  EPA requires:

    •   Sample data on regulated contaminants, over time, in order to better evaluate existing
        regulations and develop new ones, to perform research on effects of multimedia
        pollutants, and to evaluate the success of the drinking water program;

    •   More effective and direct processes for conducting enforcement oversight;

    •   More meaningful and accurate measures of program  success, to evaluate regions,
        various program initiatives including capacity development programs, infrastructure
        loans and drinking water resource security;

    •   More meaningful and accurate information to provide to the public.

    •   Optimization of data verification audits for:
           o   State oversight: audits are currently used to determine how well states are
               determining compliance with the regulations and reporting violations and
               inventory data to SDWIS-FED.  Evaluate ways to include simple evaluations
               and subjective assessments of state enforcement programs and capacity.
               Preserve a statistically representative sample at the state level.
           o   National measures: optimize data verifications to also provide statistically
               representative samples at the national level, stratifying samples across
               system size categories and ground water/surface water, in addition to
               water system type, to most effectively evaluate:
                  •   SDWIS-FED data quality
                  •   Impact of data quality on the current GPRA measure
                  •   The possibility of replacing the current GPRA measure,
                      which is based on reported violations, with statistically
                      representative samples from audits
                  •   Rule implementation

    •   Evaluate the implications of expanding the number and breadth of data verification
        audits.

-------
    •  Evaluate ways to provide more timely and complete enforcement oversight
           o   Calculate significant non-compliers (SNCs) on PCs
           o   Evaluate queries that can gather violations data from state data systems
               and calculate SNCs on the spot, to enable QA and follow-up during visits to
               states.
           o   Determine if there are ways to simplify and streamline the software used to
               calculate whether water systems are SNCs

    •  Investigate the methods and program implications of obtaining data from sanitary
       surveys

    •  Investigate the methods and program implications of obtaining parametric data on
       regulated contaminants over time
           o   Data flows—from states, from labs, through SDWARS?
           o   Program benefits
           o   Minimizing the misinterpretation and misuse of the data

    •  Investigate ways to improve the public's access to drinking water information via
       CCRs, enabling us to rely less on incomplete violations data

    •  Integrate drinking water data from several sources

    •  Integrate information
           o   UIC and SRF
           o   Integrate drinking water information across OGWDW and EPA

    •  Ways of integrating:
           o   Geospatial tools
           o   Data warehousing techniques
           o   Agency Enterprise Architecture initiatives
               •  Conformed dimensions/registries
               •  Repositories
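
As an illustration only, the stratified audit sampling called for under "National measures"
in the list above could be sketched as follows. The inventory records, field names, strata,
and 20 percent sampling fraction are hypothetical and are not drawn from SDWIS-FED. The
sketch simply draws the same fraction of systems from every combination of system size,
source water type, and water system type.

```python
import random
from collections import defaultdict


def stratified_sample(systems, strata_keys, fraction, seed=0):
    """Draw the same fraction of systems from every stratum (at least one each)."""
    rng = random.Random(seed)  # fixed seed so an audit draw can be reproduced
    strata = defaultdict(list)
    for system in systems:
        strata[tuple(system[key] for key in strata_keys)].append(system)

    sample = []
    for stratum in sorted(strata):
        members = strata[stratum]
        n = min(len(members), max(1, round(len(members) * fraction)))
        sample.extend(rng.sample(members, n))
    return sample


# Hypothetical inventory: 10 systems in each of 3 x 2 x 2 = 12 strata.
combos = [
    (size, source, system_type)
    for size in ("small", "medium", "large")
    for source in ("ground water", "surface water")
    for system_type in ("community", "non-community")
]
systems = [
    {"id": i, "size": size, "source": source, "type": system_type}
    for i, (size, source, system_type) in enumerate(combos * 10)
]

audit_sample = stratified_sample(systems, ("size", "source", "type"), fraction=0.2)
```

Sampling the same fraction from each stratum keeps every size, source, and type combination
represented in the audit. An actual audit design might instead weight strata by population
served or by expected data quality problems.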

OGWDW Data Warehouse Processing and Access:

Pull into the warehouse all SDWIS-FED data fields that will be included in the future XML
flows, warehouse them, and build an array of access tools from them.  Information
access should not change when the new data flow through CDX begins.

Internet access: work with Envirofacts to modernize and replicate the mainframe standard
reports, building them from warehouse tables that OGWDW provides.  Integrate the warehouse
tables and data marts into the central repository, registries, etc., using ETL tools when
those tools are ready for OGWDW's use.

Intranet access: post warehouse tables and some data marts  that supply pivot tables and
standard reports on an OW NT intranet server on the EPA Tree. This will be used as an
access server.

-------
And, of course, move the processing and warehouse storage from the mainframe to the NT
server on the VLAN.

Next steps

Once new staging tables modeled after the XML objects are built, they will be populated
with SDWIS-FED data, replacing the current staging tables.
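
A minimal sketch of a staging table modeled after an XML object is given below. It assumes
a hypothetical WaterSystem element and uses an in-memory SQLite database as a stand-in for
the server database. The element names, table name, and columns are illustrative only and
are not taken from the EPA drinking water schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical XML payload; element names are assumptions for illustration.
SAMPLE_XML = """
<WaterSystems>
  <WaterSystem>
    <SystemId>XX0000001</SystemId>
    <Name>Example Water System A</Name>
    <SourceType>ground</SourceType>
    <PopulationServed>1200</PopulationServed>
  </WaterSystem>
  <WaterSystem>
    <SystemId>XX0000002</SystemId>
    <Name>Example Water System B</Name>
    <SourceType>surface</SourceType>
    <PopulationServed>56000</PopulationServed>
  </WaterSystem>
</WaterSystems>
""".strip()


def load_staging(conn, xml_text):
    """Create the staging table if needed and load one row per WaterSystem element."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS stg_water_system (
               system_id TEXT PRIMARY KEY,
               name TEXT,
               source_type TEXT,
               population_served INTEGER
           )"""
    )
    rows = [
        (
            ws.findtext("SystemId"),
            ws.findtext("Name"),
            ws.findtext("SourceType"),
            int(ws.findtext("PopulationServed")),
        )
        for ws in ET.fromstring(xml_text).iter("WaterSystem")
    ]
    # INSERT OR REPLACE lets a resubmitted file overwrite earlier staged rows.
    conn.executemany("INSERT OR REPLACE INTO stg_water_system VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load_staging(conn, SAMPLE_XML)
```

Giving the staging table one column per child element keeps the mapping from XML object to
table row one-to-one, which is the sense in which the table is "modeled after" the object.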

The SETS mainframe tools will be replicated and streamlined to run off the server or a PC.

The possibility of loading DTP directly into the warehouse, bypassing SDWIS-FED, will be
explored.

With these steps taken, the mainframe can be phased out of the SDWIS data flow.

Data Model(s)

SDWIS-FED

Transitioning to an architectural environment employing staging tables and other
data warehousing technology will have a major impact on the current data model/structure of
SDWIS-FED, including the ultimate elimination of the system as it now exists. In the near
term, the transference of data edits/verification to states and EPA Regional Offices could
entail some structural changes as a consequence. Also, the replacement of DTP with XML
will potentially involve structural changes to SDWIS-FED. There will likely be a period of
operational overlap between SDWIS-FED and these new data structures until the new
systems are fully functional in the operating environment.

SDWIS-STATE

Transitioning from DTP to XML as the data exchange language between states and EPA, together
with the shift in data edit/validation responsibilities described above, will likely require
some structural modifications to the current SDWIS-STATE system. Full Web enabling of
SDWIS-STATE (beyond use of XML as the data exchange
language) will entail other structural adaptations.

UIC, SWP, SRF and GPRA

Information management systems for these programs will be reviewed to determine the
potential for (and benefit of) development of national standards, expansion of these systems
to satisfy unmet or evolving needs (including Homeland Security, Web and geographically
enabling these systems), and conformance with Office of Water and EPA Enterprise
Architecture policies and requirements (including consolidation and integration, where
practicable).

Tracking GPRA requirements is clearly an Enterprise-level activity, and OGWDW will adopt
any software that the Office of Water or the Agency develops and provides.  In the interim,
the OGWDW data warehouse will continue to  support this need. The warehouse has been
specifically designed to accommodate any course of action the Office of Water or the Agency
takes in this regard.

-------
Applications

SDWIS-FED
By 1/2004, a new application will be ready for distribution that is designed to run on local
desktops and/or servers and allow state and regional data providers to validate the data at
their convenience and frequency, without the burden of moving the data to an EPA platform.
It will be designed to operate in environments where states have implemented SDWIS-STATE
or their own data management systems. It will take the EPA XML schema as input,
and thus will take advantage of commercial off-the-shelf (COTS) XML parser software for
field and cross-field validations, which precludes the need to develop custom software for
that purpose.
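
The distinction between field and cross-field validation can be illustrated with the short
sketch below. It is not the planned application. The text above anticipates a schema-aware
COTS parser driven by the EPA XML schema, whereas this sketch hand-codes three rules using
only the Python standard library, which has no XML Schema validator. The element names,
allowed values, and rules are hypothetical.

```python
import xml.etree.ElementTree as ET

ALLOWED_SOURCE_TYPES = {"ground", "surface"}  # assumed code list, for illustration


def validate_system(elem):
    """Return a list of error messages for one hypothetical WaterSystem element."""
    errors = []

    # Field validations: each rule looks at a single element.
    source = elem.findtext("SourceType", default="").strip()
    if source not in ALLOWED_SOURCE_TYPES:
        errors.append(f"SourceType {source!r} is not an allowed value")
    population = elem.findtext("PopulationServed", default="").strip()
    if not population.isdigit():
        errors.append("PopulationServed must be a non-negative integer")

    # Cross-field validation: the rule depends on two elements together.
    status = elem.findtext("Status", default="").strip()
    deactivated = elem.findtext("DeactivationDate", default="").strip()
    if deactivated and status != "inactive":
        errors.append("DeactivationDate is present but Status is not 'inactive'")

    return errors


def validate_document(xml_text):
    """Validate every WaterSystem element; return (SystemId, errors) pairs."""
    root = ET.fromstring(xml_text)
    return [
        (ws.findtext("SystemId", default=""), validate_system(ws))
        for ws in root.iter("WaterSystem")
    ]


SAMPLE = """
<WaterSystems>
  <WaterSystem>
    <SystemId>XX0000001</SystemId>
    <SourceType>ground</SourceType>
    <PopulationServed>1200</PopulationServed>
    <Status>active</Status>
  </WaterSystem>
  <WaterSystem>
    <SystemId>XX0000002</SystemId>
    <SourceType>rain</SourceType>
    <PopulationServed>many</PopulationServed>
    <Status>active</Status>
    <DeactivationDate>2003-06-30</DeactivationDate>
  </WaterSystem>
</WaterSystems>
""".strip()

report = validate_document(SAMPLE)
```

In this sample the first system passes all three rules and the second fails all three. A
state or regional data provider could run such a check locally, as often as needed, before
submitting the file.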

Technology

SDWIS-FED
Data will flow from states to EPA in XML format, the current industry standard. The drinking
water draft schema will be published in February 2003.  The draft schema will be tested with
several volunteer states' data. The staging tables on the NT-Server will accept data from any
state ready to exchange XML formatted data through CDX as soon as the schema is judged ready.

-------
ATTACHMENTS

-------
Attachment 1: List of OGWDW Data Systems
List of current/planned information systems in OGWDW
#   Name                                            Acronym
1   Safe Drinking Water Information System          SDWIS
2   National Contaminant Occurrence                 NCOD-SDWARS
    Database/Safe Drinking Water Accession
    and Review System
3   Drinking Water Mapping Application              DWMA
4   Drinking Water National Information             DWNIMS
    Management System
5   Long-Term 2 Data Base                           LT-2
6   National Environmental Method Index             NEMI
7   Drinking Water Research Database(?)             DRINK
8   Contaminant Information Tool                    CIT
9   Contaminant Candidate List (Planned)            CCL

-------