STORET AND THE
WATER QUALITY ENTERPRISE:
AN INITIAL ASSESSMENT
FOR STORET 1995
OCTOBER 5, 1989
-------
STORET AND THE WATER QUALITY ENTERPRISE:
AN INITIAL ASSESSMENT FOR STORET 1995
PREPARED FOR:
SYSTEMS DEVELOPMENT CENTER
OFFICE OF INFORMATION RESOURCES MANAGEMENT
ENVIRONMENTAL PROTECTION AGENCY
401 M STREET, S.W.
WASHINGTON, D.C. 20460
SUBMITTED BY:
COMPEX CORPORATION
5500 CHEROKEE AVENUE
SUITE 500
ALEXANDRIA, VIRGINIA 22312
CONTRACT # 68-01-7444
DELIVERY ORDER # 006
October 5, 1989
HEADQUARTERS LIBRARY
ENVIRONMENTAL PROTECTION AGENCY
WASHINGTON, D.C. 20460
-------
PART I
STORET: A PROFILE
TABLE OF CONTENTS
SECTION
1.0 INTRODUCTION
2.0 BACKGROUND
2.1 HISTORY OF WATER QUALITY LEGISLATION
2.2 HISTORY OF STORET
3.0 STORET TODAY
3.1 DEFINITION AND DESCRIPTION
3.1.1 THE WATER QUALITY SUBSYSTEM
3.1.2 BIOLOGICAL DATA SUBSYSTEM
3.1.3 DAILY FLOW SUBSYSTEM
3.1.4 THE STORET SYSTEM
3.2 RELATED SYSTEMS
3.2.1 WATER QUALITY ANALYSIS SYSTEM
3.2.1.1 THE REACH FILE
3.2.1.2 MUNICIPAL/INDUSTRIAL FACILITIES DISCHARGE FILE
3.2.1.3 THE STREAM GAGE/FLOW FILE
3.2.1.4 THE DRINKING WATER SUPPLIES FILE
3.2.2 PERMIT COMPLIANCE SYSTEM
4.0 STORET USER PROFILE
4.1 WHO USES STORET
4.1.1 CATEGORIZATION OF USERS
4.1.2 FREQUENCY OF USAGE
4.2 USER INTERVIEWS
4.2.1 INTENTION OF THE INTERVIEWS
4.2.2 THE QUESTIONNAIRE
4.2.3 THE INTERVIEW PROCESS
4.3 ACTIVITIES ACCOMPLISHED BY STORET USERS
4.4 INTERVIEW FINDINGS
5.0 SUMMARY REMARKS
APPENDIX A INTERVIEW QUESTIONS
APPENDIX B SUMMARY OF INTERVIEWS
APPENDIX C GLOSSARY OF WATER QUALITY ACRONYMS
-------
STORET: A PROFILE
SDC
SECTION 1
1.0 INTRODUCTION
The COMPEX project staff is pleased to submit this report in
response to delivery order # 68-01-7444-006, which required a study
be made of the STORET Water Quality Analysis System.
This study constitutes one of the initial pilot projects being done
in conjunction with the EPA Systems Modernization Initiative (SMI)
for the Office of Information Resources Management (OIRM). The
SMI takes an agency-wide view of information systems and directs
the EPA to build and renew systems for more accessibility and
usability by a growing population of information users. The result
has been to study the feasibility of establishing an EPA Systems
Development Center (SDC). COMPEX is providing a central group of
people knowledgeable in systems analysis and programming who are
simulating the contractor support at the proposed SDC. This group
has begun with three ADP pilot projects, which are in
different stages of development.
STORET is an acronym for EPA's water quality STOrage and RETrieval
information management system. It was chosen as one of the ADP
pilot projects for the SDC feasibility study because of (1) the
importance of the system to EPA, (2) its high profile, and (3) its
potential for becoming a viable repository for the Contract
Laboratory Program Analytical Results Database (CARD) data. This
potential for a CARD data repository was reported by CARD users in
a mission needs statement completed for the Superfund Chemical
Analysis Data System (SCADS), of which CARD is a part.
STORET is one of EPA's oldest and largest automated systems, and
is maintained and supported by OIRM for the Office of Water. The
objective of the study for the STORET pilot project was to take a
"fresh look" at the system to better understand it and update some
of the overview-level documentation. This objective has been
accomplished by collecting and reviewing existing STORET
documentation, interviewing a cross-section of the system users,
and interacting with the STORET User Assistance Group.
October 5, 1989
Page 1-1
The outcome of this initial look at STORET is threefold. First,
STORET is defined and explained including its history, subsystems,
databases and relationship to other water quality systems.
Secondly, the project team is providing EPA with a profile of who
STORET users are, why they are using the system, and the extent to
which STORET is satisfying their informational needs. Lastly, this
initial look at STORET will give EPA a basis on which to judge
whether STORET will be a possible future repository for CARD data.
SECTION 2
2.0 BACKGROUND
2.1 History of Water Quality Legislation
Our nation's legislative efforts to restore and maintain the
quality of its waters date back to the Refuse Act of 1899. By
passing this law, Congress required states to obtain permits
administered by the U.S. Army Corps of Engineers for discharges
into the nation's waters. Since this time, several additional laws
relating to water quality have been enacted and administered by
other government organizations.
The evolution of these legislative endeavors culminated in the
creation of the Environmental Protection Agency (EPA) in 1970.
EPA's mandate was to mount an integrated, coordinated attack on
environmental pollution in cooperation with state and local
governments, private and public groups, individuals, and
educational institutions.
Soon after the creation of the EPA, one of the most comprehensive
pieces of legislation ever enacted to contend with water pollution
within the United States was passed: the Federal Water Pollution
Control Act and Amendments of 1972, also known as the Clean Water
Act (PL 92-500). In enacting this law, the objective of Congress
was to restore and maintain the chemical, physical and biological
integrity of the nation's waters.
To achieve the goals of the Clean Water Act, the law supported the
collection and dissemination of basic water quality data by EPA,
in cooperation with other federal departments and agencies, and
with public or private organizations concerned with water pollution
control and abatement. STORET, EPA's computerized water quality
database system, has been instrumental in helping organizations
fulfill their obligations under this law.
2.2 History of STORET
Prior to the 1960s, water quality data collected by local, state
and federal agencies was seldom presented in a consistent format.
In addition, an organization that secured data for one particular
study or requirement gave little thought to the possible reuse of
the information by others. Consequently, most data reports were
of limited use, and required costly and time-consuming extraction
and analysis to be applicable to other requirements or tasks.
As water pollution control personnel became increasingly aware of
the importance of sound data handling and processing systems, the
need was recognized for a better system of addressing information
requirements. The basic STORET concept, the storage and
retrieval of water quality data, evolved from ideas generated at
an informal conference in August 1961 at an office of the U.S.
Public Health Service called the Basic Data Branch, Division of
Water Supply and Pollution Control. This concept established a
single coding structure for water quality data, allowing data to
be stored in a computer and ultimately made available for others
to use. When initially implemented in 1964 on a Public Health
Service Honeywell computer in Cincinnati, STORET contained data on
approximately 140 sampling locations.
As the number of sampling stations grew, the need to efficiently
manage the rapidly expanding volume of data and number of users
became evident. In 1966, an Executive Order transferred the
jurisdiction of water pollution control from the U.S. Public Health
Service to the Federal Water Pollution Control Administration
within the Department of the Interior. STORET was moved from the
tape-oriented Honeywell system to a disk-oriented IBM system
operated and maintained by the Department of the Interior. In
1968, users were able, for the first time, to use modern
medium-speed card reading terminals located in each of the federal
regional offices to communicate with the central STORET system.
Additional improvements were made to the STORET system over time,
and soon it began to exceed the capabilities of the Department of
the Interior data center. As a result, in 1970, EPA contracted with
US Time Sharing to place STORET on its commercial time-sharing
system. This move greatly expanded the accessibility of the STORET
database to state and local agencies nationwide. Agencies could
now use a low-cost teletype terminal and have dial-up access to the
data in STORET.
Throughout the 1970s, the location of the STORET system changed
several times, as EPA contracted with various commercial
time-sharing systems. In 1971, EPA moved STORET to Boeing Computer
Services' time-sharing system. The system was initially accessed
using BCRE, Boeing's own telecommunications language. Later,
Boeing changed from BCRE to Time-Sharing Option (TSO). In 1974,
EPA contracted with Optimum Systems Inc. to house STORET, where it
was accessed by WYLBUR. STORET was moved to the ComNet time-sharing
system in 1977, where it was first accessed by the Alpha language
and later by TSO. Each time the system was moved or the
telecommunications language changed, parts of the software required
updating to be compatible with the new system. This meant much
time spent in retraining and reorienting the user community.
In 1972, the Daily Flow Subsystem was acquired from the U.S.
Geological Survey and incorporated into STORET, providing users
with access to a large amount of flow data as well as water quality
data (see Section 3.1.3).
A major change occurred to the STORET system in 1980, when EPA
purchased its own IBM 3081 mainframe computer, located at the
National Computer Center in Research Triangle Park, North Carolina.
Five years later, EPA upgraded its computer and purchased an IBM
3090. The transfer of STORET from the 3081 to the 3090 was a very
smooth transition, and the 1980s brought a period of stability for
STORET. Because the system was not moved and reconfigured every
few years, EPA could concentrate on improvements and enhancements
to the system, while focusing on system compatibility.
To accommodate users' increasing needs to store and retrieve
biological data (see Section 3.1.2), the Biological Data Subsystem
(BIOS) was designed and the first part implemented in 1987.
One of the most recent achievements for STORET occurred in 1988,
when the first stage of an interactive retrieval menu system was
designed and implemented to facilitate data retrieval from the
Water Quality Subsystem. These menus help users create the
language statements necessary for data retrieval. This new user
interface has the potential of greatly reducing the training time
necessary for new users.
Table 2.1 depicts the milestones of the STORET system.
HISTORICAL MILESTONES OF STORET
1961    Formation of original ideas for STORET
1964    Initial implementation of STORET by Public Health Service
1968    STORET moved to Department of the Interior
1970    EPA created
1970    STORET moved to US Time Sharing
1971    STORET moved to Boeing Computer Services
1972    Daily Flow Subsystem acquired from USGS
1974    STORET moved to Optimum Systems, Inc.
1977    STORET moved to ComNet
1980    STORET moved to EPA-owned IBM 3081 computer
1985    STORET transferred to EPA-owned IBM 3090 computer
1987    First component of BIOS implemented
1988    Interactive retrieval menu system developed
Table 2.1
SECTION 3
3.0 STORET TODAY
3.1 Definition and Description
STORET is a computerized information management system maintained
by the EPA for storage and retrieval of surface and ground water
quality data within and contiguous to the United States. It
contains data on both fresh and salt water and provides for the
sharing of that data among governmental and non-governmental
organizations. EPA owns STORET, but user-organizations own the
data they supply to the system, and are responsible for maintaining
that data.
STORET's original water quality database has developed into a set
of databases and subsystems capable of performing a broad range of
functions, including:
- data availability summaries
- tabular data reports
- statistical data analyses
- graphics and maps
- data preparation for export to other systems
Three primary subsystems comprise STORET: the Water Quality
Subsystem (WQS), the Biological Data Subsystem (BIOS) and the Daily
Flow Subsystem (DFS). Each contains a database (or file) and the
necessary software to retrieve the data. Data retrieval is
accomplished by using a different custom retrieval language for
each subsystem. Because these retrieval languages are complex and
users require extensive training to achieve proficiency, the STORET
User Assistance Group conducts an initial three-day training
seminar for new WQS users. Advanced training seminars are
available as well. However, the new interactive menu system
implemented in 1988 for the WQS has alleviated some of these
difficulties for the user. The BIOS subsystem also requires some
training, which takes two days to complete. The retrieval language
for the DFS is the most difficult to understand, as its syntax is
exacting and inflexible. Because of a lower demand for the
subsystem, Flow File training is conducted on an as-needed basis.
Although it has existed for over two decades, the term "STORET" is
used for different applications and subsystems, depending on the
user. Generally, when field users use the name "STORET," they are
referring to the WQS only. BIOS is considered a sister system to
the WQS. However, when members of the STORET User Assistance Group
refer to STORET, it represents all subsystems for which they are
responsible and for which they answer questions. Documentation
also uses the STORET acronym differently at different times. The
documentation produced for the Regional Forums on Water
Information, for example, contains a chart showing the WQS as
STORET's only subsystem, with BIOS representing a different system.
In the same document, another diagram depicts STORET as consisting
of the three subsystems discussed in this document.
3.1.1 The Water Quality Subsystem (WQS)
The WQS is the original and largest subsystem of STORET, written
primarily in PL/1 and IBM Assembler language. Many users consider
the WQS synonymous with STORET. The WQS contains station data that
describes the location of every collection site, sample data that
identifies the event of water collection from the station, and
observation data that records results of chemical and physical
analyses performed on the sample. Many observations can come from
one sample, and many samples can come from one station.
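The one-to-many relationships described above (one station to many samples, one sample to many observations) can be sketched as a small data model. This is only an illustration in Python, not the actual WQS record layout, which is implemented in PL/1 and Assembler; every class name, field name, station identifier, and parameter code below is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Observation:
    """One analytical result for a sample (hypothetical fields)."""
    parameter_code: str  # key into the parameter file; codes here are invented
    value: float


@dataclass
class Sample:
    """One water-collection event at a station."""
    collected_on: str  # ISO date string, for illustration only
    observations: List[Observation] = field(default_factory=list)


@dataclass
class Station:
    """One collection site, owned by the supplying agency."""
    station_id: str
    agency: str
    latitude: float
    longitude: float
    samples: List[Sample] = field(default_factory=list)


def count_observations(station: Station) -> int:
    """Total observations across every sample taken at the station."""
    return sum(len(sample.observations) for sample in station.samples)


# Made-up example: one station, two samples, three observations.
station = Station("EX0001", "EXAMPLE AGENCY", 38.88, -77.02)
station.samples.append(
    Sample("1989-06-15", [Observation("P0001", 7.2), Observation("P0002", 18.5)])
)
station.samples.append(Sample("1989-07-15", [Observation("P0001", 7.4)]))
print(count_observations(station))
```

Nesting samples under stations and observations under samples mirrors the ownership described in the text: an agency that maintains a station also maintains everything collected there.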
Those agencies that collect and supply data to the WQS are
responsible for updating and deleting that data. Creating,
updating and deleting data is limited to authorized users. Data
is owned by the supplying organization, but all users have read
capabilities. EPA provides the ADP facility, hardware and software
for the central repository, and assumes responsibility for system
software and hardware maintenance, including programming
enhancements, problem solving and modifications mandated by
Congress.
The STORET User Assistance Group estimates that the WQS contains
information from over 700,000 sampling sites throughout the United
States. It maintains data for approximately 25 million samples
containing 140 million observations. In addition, there are over
9,000 parameter codes that can be used to classify the chemical or
physical components of a water sample. These parameter codes are
located in a parameter file, which serves as a reference table for
the WQS, and is maintained by EPA's Water Quality Analysis Branch
(WQAB) within the Office of Water.
3.1.2 Biological Data Subsystem (BIOS)
BIOS, a national biological information management tool initiated
by an extensive requirements survey from field biologists, is the
newest subsystem to join STORET. It contains data on the
distribution, abundance and physical condition of aquatic
organisms, as well as descriptions of their habitats. BIOS
contains three data components: (1) field survey, (2) toxicity,
and (3) tissue residue. Only the field survey component has been
implemented. The other two components are still under development.
BIOS is implemented using PL/1 and assembly languages, and
interacts extensively with the WQS, allowing association of
biological and water chemistry data. In addition, BIOS accesses
two reference tables: (1) the parameter file for water quality
parameters and (2) the taxonomic file. A taxonomic numbering
convention is used to identify the organisms collected in a
biological field survey. Names and associated numbers are located
in the taxonomic file, which contains, by STORET User Assistance
Group estimates, approximately 70,000 taxa. Organisms can be
identified to any level of taxonomy: (1) phylum, (2) class, (3)
order, (4) family, (5) genus, and (6) species. This file is
administered by the STORET User Assistance Group, with new aquatic
species sent to both the National Oceanic and Atmospheric
Administration and the Smithsonian Institution for validation.
According to the STORET User Assistance Group, BIOS contains
information on approximately 3,000 stations identified as
biomonitoring sites, 5,000 sampling events and 52,000 taxonomic
observations. As is the case with the WQS, the supplying agencies
own their data and are responsible for maintaining that data.
3.1.3 Daily Flow Subsystem (DFS)
The DFS contains daily observations of stream flow and
miscellaneous water quality parameters gathered at U.S. Geological
Survey's (USGS) national network gaging stations. The USGS sends
a current file to EPA semiannually, which the EPA uses to replace
the old Daily Flow File.
Flow data is owned by the USGS, which is responsible for supplying
and maintaining it. Users access this data for stream flow
studies, or when flow data is needed for performing calculations
with WQS data.
The DFS is written primarily in the programming language PL/1,
although a few programs are written in FORTRAN. The STORET User
Assistance Group estimates that the database consists of 85
percent daily flow data and 15 percent miscellaneous field
observations. The DFS contains data from 28,000 USGS gaging
stations, with some dating back to the 1800s. There are 675,000
records in the system, and each record contains one year of data
or 365 observations.
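The record counts quoted for the DFS imply a rough size for the database. The arithmetic below is a back-of-the-envelope sketch, assuming every record is a full 365-observation year (leap days and partial years are ignored); the result is therefore an upper bound, not a figure reported by the STORET User Assistance Group.

```python
# Figures quoted in the text for the Daily Flow Subsystem.
RECORDS = 675_000        # one record holds one station-year of data
OBS_PER_RECORD = 365     # daily observations in a full record
STATIONS = 28_000        # USGS gaging stations represented

# Upper bound on stored daily observations (assumes every record is full).
max_observations = RECORDS * OBS_PER_RECORD

# Average number of station-years held per gaging station.
avg_years_per_station = RECORDS / STATIONS

print(f"{max_observations:,}")          # 246,375,000
print(f"{avg_years_per_station:.1f}")   # 24.1
```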
3.1.4 The STORET System
Figure 3.1 illustrates the STORET system. Although there are
related systems that interact with the STORET subsystems, this
figure represents what is directly supported by the STORET User
Assistance Group.
Figure 3.1: The STORET System (diagram not reproduced). Legible
labels: Water Quality Subsystem; Biological Subsystem (BIOS).
3.2 Related Systems
Many EPA systems and databases contain information on some aspect
of water quality. Many of these systems are stand-alone systems,
not directly linked with STORET or any other system. However,
because STORET contains basic water quality data used for many
different analyses and reports, some of these systems do retrieve
data from STORET. Following is an explanation of two of these
systems.
3.2.1 Water Quality Analysis System (WQAS)
The WQAS is a group of procedures that create reports and graphics
from environmental databases maintained by EPA on its central
mainframe computer. These procedures were developed by the Water
Quality Analysis Branch (WQAB) of the Assessment and Watershed
Protection Division. The objectives of this development were to
help users obtain data more easily from STORET and other water
quality files, and to provide users with more options in the
manipulation of data and production of output.
The WQAS was designed not only to take advantage of EPA's
communications network and IBM 3090 central computer at Research
Triangle Park, but also to be both device and data-independent.
The central concept provided access to data through any reasonable
access method, and allowed data, graphics, images and documents to
be transferred to any number of output devices, software packages
or computer systems. This design made workstation, graphics and
printing technology available to STORET system users.
The WQAB determined the best method for performing desired
functions was to develop a series of databases that had a common
link. Under the umbrella of the WQAS, several software procedures
and supporting databases were developed, including the Reach File,
Municipal/Industrial Facilities Discharge File, Stream Gage/Flow
File, and Drinking Water Supplies File.
3.2.1.1 The Reach File
The Reach File is an extensive database that identifies and
subdivides U.S. streams, lakes, reservoirs and shorelines to
provide a framework for organizing water resource data.
A "reach" is a length of stream or shoreline having relatively
uniform hydrological attributes. Reach boundaries may occur at
stream junctions, where streams enter and leave bodies of open
water, where deep narrow rivers flow into wide shallow rivers,
where stream slopes change significantly, where elevations change
suddenly, or at political boundaries. These reaches are linked to
form a skeletal structure representing the branching patterns of
surface water drainage systems. In the Reach File, each reach is
identified by an eleven-digit code.
There are several versions of the Reach File, all containing
different levels of detail. Some versions are still in
development. The data for this database is abstracted from USGS
topographic maps. Currently, over 650,000 miles of streams and
shorelines in the 48 contiguous states are identified. Eventually,
twice as many streams will be identified and represent the
information contained in 54,000 USGS topographic 1:100,000 scale
maps.
The Reach File is linked to STORET through the inclusion of a reach
number in the STORET station data, which is used to identify a
Reach File location. Several other water information systems are
linked similarly, which results in the use of the Reach File to
integrate these systems. Reach numbers reference each other in
such a manner that it is possible to traverse upstream or
downstream through the nation's rivers and open waters while
scanning other databases for any reach-indexed data along the
traversal path. This is the foundation of EPA's ability to
integrate data from other databases in hydrological order and by
river mile relationships.
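The traversal idea can be illustrated with a toy network. The sketch below is not the Reach File's actual structure or code: the reach codes are invented eleven-digit strings, the "drains into" table and the reach-indexed records are hypothetical, and the network is assumed to contain no loops.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Hypothetical network: each reach maps to the reach it drains into.
DOWNSTREAM: Dict[str, Optional[str]] = {
    "00000000001": "00000000003",
    "00000000002": "00000000003",
    "00000000003": "00000000004",
    "00000000004": None,  # outlet: nothing further downstream
}

# Hypothetical reach-indexed records drawn from other databases.
INDEXED: Dict[str, List[str]] = {
    "00000000001": ["STORET station A"],
    "00000000003": ["IFD facility B", "STORET station C"],
    "00000000004": ["DWS intake D"],
}


def downstream_path(start: str) -> List[str]:
    """Follow the drains-into links from a reach to the outlet."""
    path: List[str] = []
    current: Optional[str] = start
    while current is not None:
        path.append(current)
        current = DOWNSTREAM[current]
    return path


def upstream_reaches(start: str) -> List[str]:
    """Collect a reach and every reach that eventually drains into it."""
    tributaries: Dict[str, List[str]] = defaultdict(list)
    for reach, receiving in DOWNSTREAM.items():
        if receiving is not None:
            tributaries[receiving].append(reach)
    found: List[str] = []
    stack = [start]
    while stack:
        reach = stack.pop()
        found.append(reach)
        stack.extend(sorted(tributaries[reach], reverse=True))
    return found


def records_along(path: List[str]) -> List[str]:
    """Gather reach-indexed records in the order the path visits reaches."""
    return [record for reach in path for record in INDEXED.get(reach, [])]


print(records_along(downstream_path("00000000001")))
print(upstream_reaches("00000000003"))
```

Because every database stores only a reach number, the same traversal can scan station, facility, and intake records without those databases knowing about one another.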
3.2.1.2 The Municipal/Industrial Facilities Discharge File (IFD)
The IFD contains general information about each National Pollutant
Discharge Elimination System (NPDES) facility. This information
includes indirect discharges to publicly-owned treatment works,
standard industrial classification codes, latitude/longitude,
stream reach location, and categorization of process and discharge
type.
3.2.1.3 The Stream Gage/Flow File (GAGE)
The GAGE File contains information on 36,000 stream gaging
locations, including location of gaging stations, types of data
collected, frequency of data collected, media in which data are
stored, identification of the collecting agency, mean annual flow,
and the lowest average seven-day flow over a ten-year period.
3.2.1.4 The Drinking Water Supplies File (DWS)
The DWS File contains data on surface water supplies, including
locations of utilities, intakes and sources, and the hydrologic
cataloging unit numbers and reach numbers of their receiving
waters. This database contains data on 824 utilities serving
communities with populations greater than 25,000, and 6,840
utilities serving communities with populations less than 25,000.
3.2.2 Permit Compliance System (PCS)
The Office of Water Enforcement and Permits is responsible for the
PCS, a database management system written in ADABAS that contains
data on wastewater effluent composition, discharge monitoring, and
facility limits. It is logically linked with STORET so that it can
retrieve water quality data from its database. This allows the
user to analyze, summarize and report combined data from facilities
in PCS and water quality stations in STORET. PCS is also the only
autonomous system that can be accessed through STORET.
SECTION 4
4.0 STORET USER PROFILE
One objective of the pilot project was to provide EPA with a clear
view of who the STORET users are, why they are using the system,
and the informational needs that STORET addresses. To accomplish
this objective, STORET users across the country were interviewed
over the telephone. These users represented a variety of
government agencies and organizations.
4.1 Who Uses STORET
As a part of its function to provide service to the users of
STORET, the STORET User Assistance Group maintains (1) the STORET
users mailing list and (2) the STORET System User Activity Report.
Throughout this discussion the term "user" is defined as those who
are on the mailing list.
The STORET users mailing list contains over 1,600 names and
addresses of individuals and firms. The STORET User Assistance
Group maintains this list and updates it annually for the purpose
of sending out updated documentation. Each mailing list member
must return a postcard to be kept on the list. The list consists
of active users, occasional users or former users of STORET. Some
of those on the mailing list have never used STORET, but want to
keep abreast of the system.
The STORET User Activity Report is a computer-generated list of
all retrievals from STORET, by user name, over the past twelve
months. A retrieval occurs when a user enters the system and
extracts data. The data can either be sent to a file for further
use on the mainframe, downloaded to a local system, or used to
generate one of the many reports available from STORET. Retrievals
do not, however, reflect usage of STORET for data entry or browsing
on-line only. Furthermore, the Activity Report does not indicate
the size of the retrieval, or the amount of computer time required
to make the retrieval.
4.1.1 Categorization of Users
Based upon the mailing list and activity report, the users were
categorized in the following way:
o EPA Headquarters
o EPA Regional Offices
o Other Federal agencies
o State environmental agencies
o Regional/county/municipal government
o Universities
o Private industry
o STORET User Assistance Group (SUAG)
The STORET User Assistance Group is included as a separate category
because its usage of the system is different from that of other
EPA Headquarters users. SUAG is responsible for maintaining the
system and providing user assistance.
Figures 4.1 and 4.2 illustrate the distribution of STORET users,
by category, as found in the users mailing list and the User
Activity Report, respectively. Figure 4.2 is more useful for
analysis purposes, as it represents related activity.
An analysis of Figure 4.2 indicates that the category containing
the majority of STORET users is state government agencies (51
percent). The other categories in which there are a relatively
large number of users include: (1) private industry at 15 percent,
(2) EPA regional offices at 13 percent, and (3) other federal
agencies at 12 percent. It should be noted that private industry
includes companies that use STORET in conjunction with their
contractual work for EPA.
4.1.2 Frequency of Usage
The STORET User Activity Report was used to analyze the frequency
of usage of the STORET system. Figure 4.3 illustrates the
percentage of retrievals, by user categories, over the past year.
This analysis indicates that the largest share of retrievals is made
by users at state environmental agencies. This category accounts for
47 percent of all retrievals. EPA regional users account for 15 percent,
private industry 12 percent, and federal government agencies 8
percent.
A second analysis of usage frequency is shown in Figure 4.4, which
illustrates the frequency of STORET retrievals by individual user.
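The percentages and per-user averages discussed in this section follow from simple ratios. The sketch below reproduces the 47 percent share from the counts legible in Figure 4.3 (17,901 state-agency retrievals out of 37,941 in total). The per-user average is shown only as a formula, because the user counts behind Figure 4.4 are not legible; the user count used here is a made-up placeholder, not a figure from the report.

```python
def share_percent(part: int, total: int) -> float:
    """Share of a total, expressed as a percentage."""
    return 100.0 * part / total


def average_per_user(retrievals: int, users: int) -> float:
    """Average annual retrievals per user within one category."""
    return retrievals / users


TOTAL_RETRIEVALS = 37_941  # twelve-month total, as read from Figure 4.3
STATE_RETRIEVALS = 17_901  # state environmental agencies, Figure 4.3

state_share = round(share_percent(STATE_RETRIEVALS, TOTAL_RETRIEVALS))
print(state_share)  # 47

# Placeholder value for illustration only; not taken from the report.
HYPOTHETICAL_STATE_USERS = 200
print(average_per_user(STATE_RETRIEVALS, HYPOTHETICAL_STATE_USERS))
```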
Figure 4.1: Number of STORET Users By Category, From Users Mailing
List (pie chart not reproduced). Legible labels: EPA Regional
Offices, 12%; HQ EPA, 5%. Figures represent percentages of the
total 1,600 members on the STORET mailing list. Source: STORET
mailing list.
Figure 4.2: Number of STORET Users By Category, From User Activity
Report (pie chart not reproduced). Source: STORET User Activity
Report.
Figure 4.3: Number of STORET Retrievals By Category, From User
Activity Report (pie chart not reproduced). Legible values: EPA
Regional Offices, 15%; 47% (17,901). Figures represent percentages
of the total 37,941 retrievals over the twelve months covered by
the Activity Report. Numbers in parentheses represent the actual
number of retrievals within each category. Source: STORET User
Activity Report.
Figure 4.4: Average Annual Retrievals Per User, By Category Type
(bar chart not reproduced; the vertical axis runs from 0 to 300).
The averages were computed by dividing each group's annual
retrievals by the number of users within that group.
4.2 User Interviews
4.2.1 Intention of the Interviews
A broad range of STORET system users were interviewed to gain
information about how STORET is used by those involved in water
quality analysis. Due to the time constraints of this project, it
was estimated that between 50 and 75 individuals could be
interviewed. The interviews were conducted over the telephone in
order to expedite the process.
By using both the users mailing list and STORET User Activity
Report, individuals were selected to provide a cross-section of
users, which included:
o active and inactive users
o state coverage (users from as many states as possible)
o inclusion of other, non-EPA federal agencies (users from
each of the more heavily represented federal agencies)
o reasonable coverage of EPA users, both at headquarters
and the regions
o a representative sample of the smaller categories
Figure 4.5 illustrates the categorization of the users interviewed.
4.2.2 The Questionnaire
The questionnaire used for user interviews was prepared with the
assistance of OIRM and the STORET User Assistance Group. The
complete questionnaire is provided in Appendix A of this document.
-------
[Figure 4.5: Number of STORET Users Interviewed, By User Category.
Pie chart; legible category labels include EPA Regional Offices and
HQ EPA. A total of 66 STORET users were interviewed.]
-------
4.2.3 The Interview Process
Before the interview process began, the STORET User Assistance
Group wrote letters to all names on the STORET users mailing list,
notifying them that they might be contacted in conjunction with
this project. Users were informed that assisting with the
interviews was voluntary.
Each interview lasted from 2 to 30 minutes, averaging 10 minutes.
During the interviews, the interviewer asked additional questions
not on the questionnaire when such questions would lead to a better
understanding of STORET applicability. Similarly, certain questions
may not have been asked when they were not applicable or had been
answered indirectly through other responses.
The complete interview process took two months, beginning in
mid-June 1989 and ending in mid-August 1989. A total of 77
individuals or firms were contacted, with 66 interviews
successfully completed. The eleven non-respondents were either no
longer with the agency or department contacted, or had changed
positions and were no longer using STORET.
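The contact and completion counts reported above imply a response
rate, which can be checked with a short calculation. This is only an
illustrative sketch; the variable names are assumptions and do not
come from the report.

```python
# Counts reported in Section 4.2.3.
contacted = 77   # individuals or firms contacted
completed = 66   # interviews successfully completed

# Everyone contacted who did not complete an interview.
non_respondents = contacted - completed

# Share of contacts that resulted in a completed interview.
response_rate = completed / contacted

print(f"Non-respondents: {non_respondents}")
print(f"Response rate: {response_rate:.1%}")
```

The difference reproduces the eleven non-respondents cited in the
text, and the response rate works out to roughly six in seven.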
4.3 Activities Performed by STORET Users
Users interviewed represented a wide range of experience in water
quality analysis and STORET expertise. Some users were managers
and others were involved in the day-to-day technical use of STORET.
Some users were quite expert in the system, others were novices.
Some users were involved in only one aspect of the system (e.g.,
retrieval) and others were involved in all aspects (e.g., input,
retrieval and analysis).
One of the most important questions asked of all users involved the
nature of their work and how it led them to use STORET. Table 4.1
lists the users' activities, grouped under general headings, as
they were reported to the interviewers. Monitoring is generally
defined as periodic surveillance or testing to determine compliance
with requirements for pollutant levels in water, sediment or fish.
Compliance involves the establishment and enforcement of standards
and issuance of permits. Research can include many different
activities, from academic research done at a university to special
surveys or trend analysis done at a state environmental agency.
Classification refers to the categorization of waterbodies or
aquatic animals as to their physical, chemical or biological
composition. EPA administrative activities include activities
performed by EPA Headquarters personnel involving Headquarters'
needs that do not fit into the other general activities.
ACTIVITIES ACCOMPLISHED USING STORET
Monitoring
  o Monitoring surface water
  o Monitoring ground water
  o Monitoring ambient water quality
  o Monitoring of fish quality
  o Monitoring water for fish and wildlife
  o Monitoring water in proximity to waterbody projects
  o Effluent monitoring
  o Non-point source monitoring
  o Monitoring of sediments
  o Monitoring of drinking water sources
  o Monitoring product safety by private industry
  o Self-monitoring by private industry
Compliance
  o Compliance with standards
  o Modification of standards
  o Enforcement of standards
  o Compliance with legislation
  o Establishment of effluent limits
  o Issuing of permits
  o Regulation of chemicals
Research
  o Applied research
  o Trend analysis
  o Water quality research from non-point source projects
  o Modeling of water quality
  o Special surveys
  o Intensive surveys
Classification
  o Classification of waterbodies
  o [entry illegible in source]
  o Classification of fish
  o Classification of shellfish
EPA Administrative Activities
  o Use of STORET data in pesticide reports
  o Monitoring of surface water to determine chemical exposures
  o Retrieval of data for requests within EPA
  o Promotion and marketing of STORET for use with ground water data
  o Assistance with difficult retrievals
  o Scheduling of STORET training
  o Acting as regional liaison between users and EPA
Miscellaneous
  o Verification of STORET requirements to make sure water testing
    equipment conforms to those requirements
  o Management of public land
Table 4.1
-------
4.4 Interview Findings
General observations about how the system is used and the
interviewees' reactions to STORET are provided here:
STORET Data
o State agencies are responsible for the input and
retrieval of the data they use.
o Thirty respondents indicated they primarily use only the
data they input. Occasionally, these respondents access
other files such as the Reach and Daily Flow File.
o Three individuals indicated they use STORET basically as
a data repository. One of these respondents indicated
that, while he stores and retrieves data from STORET,
the analysis and reporting is done on the PC, after
having downloaded the data. Another respondent indicated
all data is stored on his own computer and he simply
uploads his data to STORET on a quarterly basis. The
third indicated he uses his own system more often because
of the faster access time, since he does not have to wait
for the STORET database to be updated before he can use
his data.
o State agencies are interested in data for their own state
as well as other states, when waterways cross state
boundaries.
o Twenty-two respondents said they input the same data to
STORET and at least one other system.
o Three respondents expressed concern for data accuracy
within STORET. One of these users said that he often
contacts the data source to verify data accuracy.
o Two interviewees indicated they have a backlog of data
to be entered into STORET. One of these users said she
has been trying to get her data into STORET for two
years. The process is taking a long time because only
one person enters data, and that person has other
responsibilities in addition to STORET data entry.
o There is a demand for both the current and historical
data from the system.
-------
STORET System
o New users of STORET have difficulty using its procedures,
especially if they do not have a computer background.
o Some users download to PCs to perform analysis and others
use only the mainframe. Thirty respondents said they
download data to their local systems for either some or
all of their analysis. The principal reason for
downloading is to do further analysis using PC software
(e.g., spreadsheets, databases, SAS and other statistical
packages, and graphics software). One user indicated
he downloads data because it is too expensive to do
analysis online.
o Three respondents specifically mentioned they did not
know how to download data, although they would like to
learn.
o Twenty-five respondents expressed dissatisfaction with
the documentation provided with STORET. They cited the
documents as difficult to use, infrequently updated and
too voluminous.
o There were mixed responses regarding the newly-introduced
STORET menu system. Two interviewees did not know about
it; four were interested in seeing the menus or starting
to use them; only one respondent said he had begun to
use them.
o Those respondents interested in non-point source data
criticized the way this data is handled and felt STORET
does not effectively deal with this data.
o Two respondents wanted to know more about SAS, both
within STORET and offline on their PCs.
o Three people expressed a need to have the various water
systems better integrated so that data from different
databases could be compiled.
o Fourteen respondents indicated problems they have
experienced were specific to their needs (e.g., being
able to generate reports in a different format than
currently available).
o Seven respondents conceded that their problems with
STORET are probably due to their lack of knowledge of
STORET, not STORET's lack of capability.
User Functions
o Regional EPA users interviewed use STORET for a variety
of purposes, including: performing ambient water,
environmental, and drinking water monitoring; program
management; and acting as regional representative or
contact to users. All but one of the regional EPA users
interviewed said they primarily rely on data input by
states and other federal agencies.
o Three respondents from the regional EPA offices indicated
they make retrievals for other EPA staff, or for users
within their regions. These retrievals are performed due
to the difficulty of the retrieval, or because the
requesting individual does not know how to access STORET.
o Forty people interviewed said they are beginning to use,
or are interested in using, BIOS. More than any other,
BIOS was the system users foresaw accessing in the
future.
o Fifteen respondents in state agencies indicated that,
because of their limited involvement with STORET, they
do not necessarily have a broad-based knowledge of EPA
water quality databases.
STORET User Assistance Group
o When asked what they like about STORET, thirteen
respondents said the STORET User Assistance Group is very
helpful and does an excellent job.
o Thirty-four people said they wanted to see some type of
user meeting reinstated. Those who did not feel these
meetings are necessary were primarily more experienced
users who felt the meetings would not benefit them
personally. Those interested in having some type of
contact with other users felt it would be valuable to be
able to exchange ideas with other users to further their
own knowledge and skills with STORET.
-------
Inactive Users of STORET
o Of the 66 people interviewed, 13 are classified as
inactive system users. Their reasons for not using
STORET, and the number of interviewees supplying each
response, are listed here:
  - The expense of using STORET (1)
  - Not knowing enough about what STORET could do for them (2)
  - Not knowing about STORET, but interested in finding out
    more (3)
  - Having their own system, which meets their needs (2)
  - No need for STORET at this time (1)
  - The weekend updating process (which limits the accessibility
    of recently entered data) is unacceptable (1)
  - Lack of training (1)
  - Current job responsibilities do not require STORET (1)
  - Using STORET data indirectly through files downloaded by a
    state university (1)
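As a consistency check, the per-reason counts listed above can be
tallied against the stated total of 13 inactive users. The sketch
below is illustrative; the dictionary name and the shortened reason
labels are assumptions made for the example.

```python
# Reasons given by inactive users and the number of interviewees
# supplying each response (labels shortened from the list above).
inactive_reasons = {
    "Expense of using STORET": 1,
    "Not knowing enough about what STORET could do": 2,
    "Not knowing about STORET, but interested": 3,
    "Own system meets their needs": 2,
    "No need for STORET at this time": 1,
    "Weekend updating process unacceptable": 1,
    "Lack of training": 1,
    "Job responsibilities do not require STORET": 1,
    "Uses STORET data indirectly via a state university": 1,
}

total_inactive = sum(inactive_reasons.values())
share_of_interviewees = total_inactive / 66  # 66 people were interviewed

print(f"Inactive users accounted for: {total_inactive}")
print(f"Share of all interviewees: {share_of_interviewees:.0%}")
```

The tally matches the 13 inactive users reported, so each inactive
interviewee appears to have supplied exactly one reason.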
Appendix B provides a detailed overview of the responses from all
user interviews.
-------
SECTION 5
5.0 Summary Remarks
STORET is one of the oldest and largest of EPA's automated
systems. It has grown over the past 25 years into a complex and
widely-used system. Its users span the country geographically and
represent a diversity of government and non-government
organizations, although state agencies account for about half of
all STORET activity. A cross-section of these users was
interviewed to gain a better understanding of who the users are,
for what purposes they are using STORET, and whether STORET is
meeting their informational needs.
STORET users employ the system in a variety of ways spanning the
spectrum of water quality needs. The activities STORET supports
include all types of water monitoring and aquatic life monitoring,
establishment and enforcement of standards, research projects and
special surveys, classification of waterbodies and aquatic life,
and a range of administrative activities. STORET represents a
vital ingredient in all of the water quality activities in which
these users are involved.
The sample taken for the user interviews was too small to draw
definitive conclusions about the overall usage of the STORET
system. However, several interesting facts were discovered about
the group. Almost half of the users interviewed download their
data to do analysis on their own PCs or minicomputers. One third
of the interviewees input their STORET data into another system as
well as STORET. More than half of the respondents said they
foresaw the need to use BIOS in the future. More than a third
expressed dissatisfaction with the STORET documentation, and half
expressed interest in the reinstatement of the STORET user
meetings. In general, those interviewed had very few complaints
about STORET, and the system appears to be successfully meeting
their needs. They also spoke highly of the STORET User Assistance
Group and commended the group for its work.
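The fractions quoted in this paragraph can be traced back to the
respondent counts given in Section 4.4, using the 66 completed
interviews as the base. The following sketch is illustrative only;
the labels and variable names are assumptions, while the counts are
those reported in the interview findings.

```python
TOTAL_INTERVIEWED = 66  # completed interviews (Section 4.2.3)

# Respondent counts taken from the interview findings in Section 4.4.
findings = {
    "download data for local analysis": 30,
    "input the same data to another system": 22,
    "beginning to use or interested in BIOS": 40,
    "dissatisfied with the documentation": 25,
    "want user meetings reinstated": 34,
}

# Convert each count to a share of all interviewees.
shares = {label: count / TOTAL_INTERVIEWED for label, count in findings.items()}

for label, share in shares.items():
    print(f"{label}: {findings[label]} of {TOTAL_INTERVIEWED} ({share:.0%})")
```

Read this way, "almost half" corresponds to 30 of 66, "one third"
to 22 of 66, "more than half" to 40 of 66, "more than a third" to
25 of 66, and "half" to 34 of 66.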
Because STORET is a viable and flexible system, many agencies and
organizations have a high interest in continuing to use the system.
Just as STORET has been updated in the past to keep up with the
changing water quality needs, so it must continue to keep pace in
the future to accommodate the ever-increasing activities of the
water quality enterprise.
-------
-------
APPENDIX A
INTERVIEW QUESTIONS
-------
-------
Interview Questions
Organization and User Characteristics
1.  The STORET User list shows that you work for ________. Is
    this information correct? If not, what is the name of the
    organization that you work for?
    EPA:     What is the organizational unit to which you are
             assigned?
    Non-EPA: Is this organization
             federal government ___
             state government ___
             non-profit ___
             private ___
             other ___
2.  What is the mission of your department? (e.g., monitoring,
    enforcement, research, etc.)
3.  What is the nature of your work?
4.  Do you analyze water quality data? How would you describe
    your position?
5.  What requirements of your job in particular lead you to use
    STORET?
    [After these initial questions, it should be clear
    whether the user is a supplier of data, a user of data,
    or both.]
SUPPLIER
6.  I would like for you to describe to me the process of
    supplying data to STORET. What data do you supply to
    STORET?
7.  How is the data collected? Is there a protocol that you
    follow?
8.  Do you supply this same data to any other automated system
    or repository in addition to STORET? If yes, what systems
    and why?
9.  I would like for you to describe to me the process of
    acquiring data from STORET. What data do you get/access
    from STORET?
10. Do you use the STORET reports as they are produced or do you
    do further work to them?
11. Does the system give you all the information you need? What
    else would you like?
12. Do you use any other automated system in addition to STORET
    to acquire the data you need? If yes, what systems and why?
13. Are there others who rely upon you for data or reports
    supplied by STORET? How many? Who?
14. What is the geographical scope of your interest? (e.g.,
    local/regional/entire country)
15. For what period of time are you interested in retrieving
    data? (e.g., months, years)
16. What sort of turn-around do you require on queries that you
    make of STORET? Do you get it?
GENERAL
17. What do you like about STORET?
18. What do you not like about STORET?
19. Over the next 5 to 10 years, what types of information will
    your agency need? (chemical, biological, locational,
    analytical, etc.) [For what purposes will this information
    be needed?]
20. What other databases do you foresee needing to access?
21. What should the STORET User Assistance Group be supporting
    that they don't?
22. How can training be improved?
23. Should the STORET Users Meetings be reinstituted?
24. Is there anyone else you know who uses STORET quite a bit
    that I could call?
-------
APPENDIX B
SUMMARY OF INTERVIEWS
-------
[Appendix B tables: the interview summary tables were printed
sideways in the original and are not legible in this copy.]
-------
-------
APPENDIX C
Glossary of Water Quality Acronyms
-------
-------
Glossary of Water Quality Acronyms
BIOS        Biological Data System
DWS         Drinking Water System
FRDS        Federal Reporting Data System
GAGE        Stream Gage/Flow File
GICS        Grants Information Control System
GIS         Geographic Information System
IFD         Industrial Facilities Discharge File
NEEDS       Needs Survey File
NPDES       National Pollutant Discharge Elimination System
ODES        Ocean Data Evaluation System
PCS         Permit Compliance System
REACH       Reach File
STORET      Storage and Retrieval System
USGS        U.S. Geological Survey
WATSTORE    National Water Data Storage and Retrieval System
WBS         Waterbody System
WQAB        Water Quality Analysis Branch
WQAS        Water Quality Analysis System
-------
PART II
AN INFORMATION ENGINEERING PERSPECTIVE
OF THE WATER QUALITY ENTERPRISE
-------
PART II
AN INFORMATION ENGINEERING PERSPECTIVE
OF THE WATER QUALITY ENTERPRISE
TABLE OF CONTENTS
SECTION
PAGE
1.0 INTRODUCTION II-1
2.0 THE INFORMATION ENGINEERING MODEL II-3
2.1 ENTITY ANALYSIS II-3
2.1.1 GLOBAL ENTITY RELATIONSHIP DIAGRAM II-4
2.1.2 ENTITY DESCRIPTIONS II-6
2.2 FUNCTION ANALYSIS II-11
2.2.1 FUNCTION HIERARCHY DIAGRAM II-11
2.2.2 FUNCTION DESCRIPTIONS II-13
2.3 ORGANIZATIONAL STRUCTURE II-15
2.4 INFORMATION NEEDS II-16
2.5 MATRIX ANALYSIS II-18
2.5.1 ENTITY TYPE/BUSINESS FUNCTION MATRIX II-19
2.5.2 ENTITY TYPE/INFORMATION NEEDS MATRIX II-21
2.5.3 BUSINESS FUNCTION/ORGANIZATION MATRIX II-23
2.5.4 CLUSTERED NATURAL DATA STORES MATRIX II-25
2.5.5 CLUSTERED NATURAL BUSINESS SYSTEMS MATRIX II-25
3.0 SUMMARY REMARKS II-27
APPENDIX A
SELECTED REFERENCES
-------
-------
AN INFORMATION ENGINEERING PERSPECTIVE
SDC
SECTION 1
1.0 INTRODUCTION
EPA's Systems Modernization Initiative (SMI) directs the EPA to
build and renew systems for more accessibility and usability by a
growing population of information users. Because of this
directive, EPA has chosen to implement the development of new
systems using a structured methodology. A structured methodology
defines procedures to be used throughout the system development
life cycle to improve communication with the user during system
definition, design and development. These structured methods also
ease the transition from one phase of the life cycle to the next,
and ensure that all work is consistent and accurate. After several
methodology and software comparisons were completed by Viar and
Company, Information Engineering Methodology (IEM) was the
methodology chosen for use in the pilot projects because of its
orientation toward defining the business structure, its full
implementation of the system life cycle, and its adherence to
standards.
Computer-Aided Systems Engineering (CASE) software was also studied
and compared to aid in the productivity of the pilot projects.
This software tool represents the automation of the system
development cycle, including systems analysis, design and
implementation. The software chosen as a result of the Viar study
was Information Engineering Facility (IEF) from Texas Instruments,
which conforms to the IEM.
By adopting a structured methodology and CASE tool, EPA's new and
revised environmental systems will be more thoroughly planned
through all stages of development; there will be more communication
between EPA program offices and the development team especially in
the analysis phase; the resulting systems and documentation will
more satisfactorily meet EPA's needs; and in the future, other
systems will benefit from reusable software and data sharing.
-------
SECTION 2
2.0 The Information Engineering Model (IEM)
One of the purposes of this STORET pilot project was to take a
"fresh look" at the entire enterprise of water quality from an
Information Engineering perspective, and create a logical model of
the enterprise. The first stage of this methodology, Information
Strategy Planning (ISP), provides a system to plan and manage
information use. During ISP, planners gain a broad view of the
information needs of the entire business enterprise, from which
they create a blueprint for future activities in support of these
needs.
A complete ISP could not be accomplished as a part of this
undertaking, as it requires intensive interviewing of upper-level
and middle management. Given the allotted time frame and the
vastness of the water quality enterprise, these interviews could
not be conducted.
This document presents an initial assessment of the water quality
enterprise and its information needs, using text and diagrams. As
mentioned, this document does not comprise the results of a
complete ISP. Diagrams completed with the input of the STORET user
interviews and various forms of documentation constitute a "straw
man," which may help individuals acquaint themselves with the water
quality business. It can also be used by interviewers conducting
an ISP to facilitate discussion and thought about the water quality
enterprise.
2.1 Entity Analysis
One of the first tasks the methodology dictates is to document the
items of interest, or entity types, to the water quality
enterprise. To understand what is important to this enterprise,
it is necessary to first understand what constitutes the
enterprise. Its overall mission, as stated by Congress in The
Clean Water Act (PL 92-500), is to maintain, and where necessary
restore, the chemical, physical, and biological integrity of the
nation's waters.
Items of fundamental relevance to water quality, about which data
must be maintained, are the data resources (entity types) of the
enterprise. Entity analysis addresses the definition of data
resources of interest to water quality and identifies items about
which it is necessary to keep information to accomplish the overall
mission of the enterprise. As part of the analysis, entity types
are identified and defined and relationships between entity types
are illustrated. Entity types are the resources about which some
group in the water quality enterprise needs to keep information.
2.1.1 Global Entity Relationship Diagram (ERD)
An ERD of the water quality enterprise is pictured in Figure 2.1.
Each entity type is illustrated in a box, and each box position is
arbitrary, having only to do with the aesthetics of the diagram.
The fact that all the boxes are the same size does not signify that
all the entity types are of equal importance. Relative importance
of one entity type compared to another is not indicated on the
diagram. Also, lines connecting entity types are not flow lines;
they indicate only that there is a relationship between the two
entity types.
A deficiency of the tool is apparent in this diagram.
Relationships, although defined in both directions, are presented
only for entity types reading, for the most part, from left to
right and top to bottom. Typically, the missing relationship is
merely the inverse of the one shown on the diagram. For example,
water "is polluted by" non-point source and non-point source
"pollutes" water.
-------
[Figure 2.1: Global Entity Relationship Diagram of the water
quality enterprise]
-------
2.1.2 Entity Descriptions
The ERD illustrated in Figure 2.1 depicts the entity types and
their relationships in pictorial form. The following pages expand
upon the Global ERD by listing each entity type, its description
and relationships to other entity types in text form. This listing
is compiled from the entity report generated from the Information
Engineering Facility (IEF).
Entity type: DISCHARGE_SITE
Description: A discharge site is the location where an effluent
             flows out from a facility. One facility may have
             many discharge sites.
Relationships:
    Sometimes IS_SITE_OF many SAMPLE_SITE
    Sometimes IS_OWNED_BY one TREATMENT_FACILITY
    Always DISCHARGES many EFFLUENT
    Always IS_OWNED_BY one FACILITY
Entity type: EFFLUENT
Description: Effluent is liquid waste that flows out of a
             facility into surface waters.
Relationships:
    Always DISCHARGES_INTO many WATER
    Always IS_DISCHARGED_BY one DISCHARGE_SITE
Entity type: FACILITY
Description: A facility is a commercial or government plant that
             discharges an effluent into the water. A facility
             may or may not do its own water or effluent
             monitoring.
Relationships:
    Sometimes IS_OPERATED_BY one GOVERNMENT_AUTHORITY
    Sometimes MAINTAINS many SAMPLE_SITE
    Always IS_REGULATED_BY many LEGAL_CONTROL
    Always OWNS many DISCHARGE_SITE
    Sometimes MAY_BE_PROSECUTED_BY many GOVERNMENT_AUTHORITY
Entity type: GOVERNMENT_AUTHORITY
Description: A government authority is a federal, state,
             regional, or local agency which prepares,
             promulgates, or enforces legal controls that address
             the nation's waters; monitors water quality; and may
             or may not operate facilities, water supplies, and
             treatment facilities.
Relationships:
    Sometimes OPERATES many FACILITY
    Sometimes MAKES many WATER_QUALITY_ASSESSMENT
    Sometimes MAINTAINS many SAMPLE_SITE
    Sometimes OPERATES many WATER_SUPPLY
    Sometimes OPERATES many TREATMENT_FACILITY
    Always PROMULGATES many LEGAL_CONTROL
    Sometimes MAY_PROSECUTE many FACILITY
Entity type: LEGAL_CONTROL
Description: A legal control is any law, regulation, standard,
             or permit pertaining to the production or
             disposition of water pollutants that a government
             authority promulgates and enforces for a facility.
Relationships:
    Sometimes IS_PROMULGATED_BY one GOVERNMENT_AUTHORITY
    Sometimes REGULATES many FACILITY
-------
Entity type: NON-POINT_SOURCE
Description: A non-point source is a pollution source which is
             diffuse and does not have a single point of origin
             or is not introduced into a stream from a specific
             discharge site. The pollutants are generally
             carried off the land by storm water runoff.
Relationships:
    Always POLLUTES many WATER
Entity type: OBSERVATION
Description: An observation is a parametric value obtained from
             a laboratory or instrument during a test done on a
             water sample. Many observations can come from one
             sample.
Relationships:
    Sometimes IS_AN_INDICATOR_FOR many WATER_QUALITY_ASSESSMENT
    Sometimes IS_REPORTED_BY one TEST
Entity type: POLLUTION_EVENT
Description: A pollution event is an occurrence at a specific
             date and time which results in pollution of ground
             water or surface water. An example is an oil spill.
Relationships:
    Always POLLUTES many WATER
Entity type: SAMPLE
Description: A sample is a water or effluent specimen collected
             at a certain time, place, and depth and used in the
             analysis process.
Relationships:
    Always IS_ANALYZED_BY many TEST
    Always IS_GATHERED_AT one SAMPLE_SITE
-------
Entity type: SAMPLE_SITE
Description: A sample site is the location where a sample (ground
             water, surface water, sediment, or effluent) is
             gathered.
Relationships:
    Sometimes IS_LOCATED_AT one DISCHARGE_SITE
    Sometimes IS_SITE_OF many SAMPLE
    Always IS_LOCATED_ON one WATER
    Sometimes IS_MAINTAINED_BY one FACILITY
    Sometimes IS_MAINTAINED_BY one GOVERNMENT_AUTHORITY
Entity type: TEST
Description: A test is a process by which samples are analyzed
             and examined based upon predetermined criteria. A
             test may occur in a laboratory or at the sample site
             by the use of an instrument.
Relationships:
    Sometimes REPORTS many OBSERVATION
    Always ANALYZES many SAMPLE
Entity type: TREATMENT_FACILITY
Description: A treatment facility is a sewage treatment plant
             that is either publicly or privately owned and is
             designed to treat domestic wastewaters.
Relationships:
    Always OWNS many DISCHARGE_SITE
    Always ARE_OPERATED_BY one GOVERNMENT_AUTHORITY
-------
Entity type: WATER
Description: Water is a body of water where sampling may be done.
             Water includes wells, aquifers, lakes, rivers,
             streams, oceans, estuaries, bays, etc.
Relationships:
    Sometimes IS_POLLUTED_BY many NON-POINT_SOURCE
    Sometimes FILLS many WATER_SUPPLY
    Sometimes IS_POLLUTED_BY many POLLUTION_EVENT
    Sometimes IS_OBSERVED_AT many SAMPLE_SITE
    Sometimes IS_POLLUTED_BY many EFFLUENT
Entity type: WATER_QUALITY_ASSESSMENT
Description: Water quality assessment is the determination made
             regarding the suitability of a given body of water
             for drinking, swimming, farming, fishing, fish
             production, or industrial processes.
Relationships:
    Always IS_MADE_BY many GOVERNMENT_AUTHORITY
    Always IS_MADE_USING many OBSERVATION
Entity type: WATER_SUPPLY
Description: A water supply is a reservoir of potable water which
             is distributed generally after treatment to the
             consumer.
Relationships:
    Always IS_FILLED_BY many WATER
    Always IS_OPERATED_BY one GOVERNMENT_AUTHORITY
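Each relationship line in the listing follows the same four-part
pattern: a frequency word, a relationship name, a quantity word, and
a target entity type. The sketch below shows one way such lines might
be held in a data structure and checked for a matching inverse. It is
not output from the IEF: the class and function names are
hypothetical, and reading "Sometimes"/"Always" as optional/mandatory
and "one"/"many" as cardinality is an assumption made for this
example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    source: str     # entity type the line is listed under
    optional: bool  # True for "Sometimes", False for "Always"
    verb: str       # relationship name, e.g. IS_FILLED_BY
    many: bool      # True for "many", False for "one"
    target: str     # entity type at the other end


def parse_relationship(source: str, line: str) -> Relationship:
    """Parse a listing line such as 'Always IS_FILLED_BY many WATER'."""
    optionality, verb, cardinality, target = line.split()
    if optionality not in ("Sometimes", "Always"):
        raise ValueError(f"unexpected optionality: {optionality!r}")
    if cardinality not in ("one", "many"):
        raise ValueError(f"unexpected cardinality: {cardinality!r}")
    return Relationship(source, optionality == "Sometimes", verb,
                        cardinality == "many", target)


# A few lines copied from the entity listing.
LISTING = {
    "WATER": ["Sometimes FILLS many WATER_SUPPLY"],
    "WATER_SUPPLY": [
        "Always IS_FILLED_BY many WATER",
        "Always IS_OPERATED_BY one GOVERNMENT_AUTHORITY",
    ],
    "GOVERNMENT_AUTHORITY": ["Sometimes OPERATES many WATER_SUPPLY"],
}

relationships = [
    parse_relationship(entity, line)
    for entity, lines in LISTING.items()
    for line in lines
]


def has_inverse(rel: Relationship, all_rels: list) -> bool:
    """True if some relationship points back from rel.target to rel.source."""
    return any(r.source == rel.target and r.target == rel.source
               for r in all_rels)


for rel in relationships:
    print(rel.source, rel.verb, rel.target,
          "| inverse listed:", has_inverse(rel, relationships))
```

In this small subset every relationship has a counterpart listed
under the entity type at its other end, which is consistent with the
remark in Section 2.1.1 that relationships are defined in both
directions.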
-------
2.2 Function Analysis
After the entity analysis stage, which involves the data resources
of the enterprise, the next stage is function analysis, involving
the activities of the enterprise. A function is an ongoing, broad
business activity. It consists of other functions or processes,
which together completely support one aspect of furthering the
mission of an enterprise. A function can be defined as what a
business does; it is the highest level of activity defined in IE.
In function analysis, the methodology specifies that the
enterprise's activities be documented independently of its
organization, current practices and existing information systems.
2.2.1 Function Hierarchy Diagram
Figure 2.2 illustrates a Function Hierarchy Diagram (FHD) of the
water quality enterprise. It is not an exhaustive list of all
water quality functions; it includes only the functions encountered in
the study of STORET and its related literature.
The order in which functions appear is not relevant, and the fact
that all are shown on the same level does not indicate that all
functions are of equal importance. The relative importance of one
function compared to another is not indicated on the diagram. In
an ISP, these functions would be broken down into one more level
of detail.
Figure 2.2
2.2.2 Function Descriptions
The FHD illustrated in Figure 2.2 depicts the functions in
pictorial form. The following pages expand upon the FHD by listing
each function and its description. This listing is compiled from
the function report generated from the IEF.
Function: AMBIENT_WATER_MONITORING
Description: Ambient water monitoring is all forms of surface
water monitoring conducted beyond the immediate
influence of a discharge pipe.
Function: CLASSIFICATION_OF_WATER
Description: Classification of water has to do with the
categorization of waterbodies as to their physical,
chemical, and biological composition.
Function: EFFLUENT_MONITORING
Description: Effluent monitoring is the monitoring of the
wastewater that flows out of a discharge pipe before
the wastewater mixes with the receiving stream.
Function: FISH_MONITORING
Description: Fish monitoring is the monitoring of fish
populations and types, fish tissue, and fish kills
in surface waters.
Function: GROUND_WATER_MONITORING
Description: Ground water monitoring is the monitoring of the
fresh water found beneath the earth's surface,
usually in aquifers, which supplies wells and
springs.
Function: ISSUANCE_OF_PERMITS
Description: Issuance of permits, which regulate the requirements
of environmental standards, includes those
activities involved in issuing and monitoring
permits granted to industrial facilities, water
treatment plants, or water supply plants.
Function: POLLUTION_CONTROL
Description: Pollution control includes the activities, plans,
projects, and technological development involved in
the prevention of pollution.
Function: RESEARCH
Description: Research includes those functions involved in
applied research (for example, at a university),
trend analysis, water modeling, and special or
intensive surveys.
Function: SEDIMENT_MONITORING
Description: Sediment monitoring is the monitoring of soil, sand,
and minerals that are washed from land into surface
waters.
Function: STANDARDS_DEVELOPMENT
Description: Standards development includes the activities
involved in setting the prescriptive norms which
govern actions and actual limits on the amount of
pollutants produced and disposed of as they relate
to water quality.
Function: STANDARDS_ENFORCEMENT
Description: Standards enforcement is legal action taken to
obtain compliance with environmental laws, rules,
and regulations. It also includes legal action to
obtain penalties or criminal sanctions for
violations.
Function: WATER_QUALITY_ASSESSMENT
Description: Water quality assessment is the function of
determining whether a given body of water meets all
standards for drinking, swimming, farming, fishing,
fish production, or industrial processes.
2.3 Organizational Structure
The next task specified by the methodology is the definition of
the enterprise's organizational structure. The following is an
organizational structure formulated from documentation about water
quality obtained from the Office of Water and from interviews with
the STORET users.
The Water Quality Enterprise
    EPA Office of Water
        Office of Drinking Water
        Office of Municipal Pollution Control
        Office of Water Regulations and Standards
        Office of Water Enforcement and Permits
        Office of Marine and Estuarine Protection
        Office of Ground Water Protection
        Office of Wetlands Protection
    EPA Regional Offices
    Other Federal Agencies
        U.S. Forest Service
        U.S. Army Corps of Engineers
        Bureau of Land Management
        Bureau of Reclamation
        National Park Service
        U.S. Geological Survey
    State EPA
    Bureau of Natural Resources
    Fish and Game Division
    Other State Agencies
    City/County Agencies
    Regional Authorities
        Interstate Commission
        Tennessee Valley Authority
    Academia
    Private industry
        Corporation
        Contractor
        Laboratory
        Research Institute
2.4 Information Needs
The next step in the methodology requires that information needs
be identified. An information need is a type of information
required by an enterprise to enable it to meet its objectives and
support its functions. The following list of information needs was
developed from STORET user interviews and Office of Water
documentation. The shorthand reference of each information need
is underlined and followed by one or more examples of the types of
data that might be required to meet this need.
1. Physical characteristics of a water sample
    Depth of sample
    Temperature
    Date and time
2. Chemical composition of a water sample
    Dissolved oxygen content
    Cadmium level
3. Biological composition of a water sample
    Bacteria counts
    Chlorophyll measurements
    Description of aquatic organisms
4. Daily flow in a stream or river
    Low flows
    High flows
    7Q10 flow
5. Information on Waterbody characteristics
    Waterbody type
    Fishable/swimmable status
6. Identification of the Reach of the water sample
    Geographic coordinates
    Reach number
    Waterbody name
7. Information about a Sampling site
    State
    County
    Lat/long
    Reach number
    Ecoregion
8. Characteristics of a City/county in the proximity of a
water sample
    Demographic data
    Congressional district
9. Geographic description of the vicinity of a water sample
    Topographical data
10. Fish population of a stream, river, or lake
    Type of fish
    Fish census data
11. Analysis of the Fish tissue
    Toxic levels
12. Information on a Fish kill
    Location
    Number of fish killed
13. Information on Publicly owned treatment works (POTW)
facility
    Wasteload allocation studies
    Grant application files
14. Information on Industrial facility
    Standard industrial classification (SIC) codes
    Categorization of discharge type
15. Information on Construction grants
    Municipality
    Grant amount
16. Information on Drinking water supplies
    Location of utility
    Location of intake
    Location of source
17. Information on existing Standards pertaining to water
quality
    Pollution limits
    Mixing zone policy
18. Results of an Intensive survey
    Chemical parameters
    Biological parameters
2.5 Matrix Analysis
Matrix analysis is a method of diagramming different object types
within the enterprise and recording interactions between them. A
matrix is similar to a chart or a graph, where all components of
one object type are listed along the vertical axis and all
components of another object type are listed along the horizontal
axis. The intersections of the components within the body of the
matrix are called cells of the matrix. Each cell contains one
character that describes the extent of involvement of one vertical
component with a corresponding horizontal component. Using a
matrix to illustrate these interactions makes an overview of the
enterprise easier to understand.
After a matrix has been developed, with both the vertical and
horizontal axes defined and interactions described in the cells,
the IEF Computer-Aided Systems Engineering (CASE) tool performs a
function called "clustering." Clustering is a mathematical
procedure that groups the components to show how they fit naturally
together and displays their affinity for each other. The tool
groups these components into clusters of high affinity. After
clustering, it is easier to see which components have a high degree
of affinity, or involvement. This is especially useful when
designing new systems, as designers can more easily understand how
information and functions naturally fit together and develop new
systems based on those natural fits.
In the following pages, five matrices have been constructed. The
first three matrices are the Entity Type/Business Function Matrix,
Entity Type/Information Needs Matrix, and Business
Function/Organization Matrix. Matrix components are abstracted
from diagrams or lists referenced in Sections 2.1 through 2.4. The
last two matrices shown are the Clustered Natural Data Stores
Matrix and Clustered Natural Business Systems Matrix. These two
are mathematically derived from the Entity Type/Business Function
Matrix during the clustering process.
2.5.1 Entity Type/Business Function Matrix
The Entity Type/Business Function Matrix maps entity types against
business functions. The matrix gives an overview of how entities
are used in conjunction with each function. The only values
permitted in each cell are "C" (create), "R" (read), "U" (update),
or "D" (delete). The strength of involvement signified by each
letter is based on its value. Values of "C" or "D" indicate a
greater strength of involvement than values of "U",
which in turn indicate a greater strength of involvement than
values of "R." A cell that is left blank indicates there is
no involvement at all.
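The ordering of cell values described above can be made concrete with a small sketch. The Python below is illustrative only: the numeric weights are assumptions chosen to preserve the stated ordering ("C" and "D" above "U", "U" above "R", blank lowest), and the names STRENGTH, validate_cell, and stronger are hypothetical rather than taken from the IEF.

```python
# Relative strength of involvement: "C" and "D" are strongest, then "U",
# then "R"; a blank cell means no involvement. The numbers themselves are
# illustrative assumptions, not IEF values.
STRENGTH = {"C": 3, "D": 3, "U": 2, "R": 1, "": 0}


def validate_cell(value: str) -> str:
    """Return the cell value unchanged, or raise if it is not permitted."""
    if value not in STRENGTH:
        raise ValueError(f"cell must be C, R, U, D or blank, got {value!r}")
    return value


def stronger(a: str, b: str) -> str:
    """Return whichever of two cell values shows the greater involvement."""
    return a if STRENGTH[validate_cell(a)] >= STRENGTH[validate_cell(b)] else b


# A matrix keyed by (entity type, business function). The single cell shown
# is the report's own ground water monitoring / sample example.
matrix = {("SAMPLE", "GROUND_WATER_MONITORING"): validate_cell("C")}

print(stronger("R", "U"))  # U
print(stronger("", "R"))   # R
```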
Figure 2.3 illustrates the matrix as it appeared immediately after
the data was input. The tool specifies that the value in each cell must
be assigned by determining whether the function creates, deletes,
updates, reads only, or has no involvement with information about
the corresponding entity. For example, is the function ground
water monitoring involved with the information about the entity
sample? The answer is that the function creates information about
the entity, and therefore a "C" was entered into the matching cell.
If the answer had been "reads only," an "R" would have been entered.
If there was no involvement between the function and the entity,
or not enough was known about the involvement, the cell was
left blank.
It should be noted that in this initial assessment, cells were
populated with only "C" or "R", or were left blank. As the matrix
was analyzed, it was assumed the function would either create,
update and delete the entity, or that the function would read it
only. There were no cases where the function would update or
delete the entity without being able to create it in the first place.
Therefore, a value of "C" was used for the former situation, as
that would indicate the greatest strength of involvement.
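The simplification described in this paragraph amounts to a three-way decision for each cell. A minimal sketch in Python follows; the function name cell_value and its two boolean parameters are hypothetical and serve only to restate the rule.

```python
def cell_value(creates: bool, reads: bool) -> str:
    """Map what a function does with an entity to a matrix cell value.

    Follows the simplification used in this assessment: a function that
    creates an entity is assumed to update and delete it as well, so "C"
    stands for all three. No involvement, or involvement that is not
    known well enough, is recorded as a blank cell.
    """
    if creates:
        return "C"
    if reads:
        return "R"
    return ""


print(repr(cell_value(creates=True, reads=True)))    # 'C'
print(repr(cell_value(creates=False, reads=True)))   # 'R'
print(repr(cell_value(creates=False, reads=False)))  # ''
```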
Figure 2.4 illustrates this same matrix after clustering has been
performed. The upper left part of the matrix contains most of the
"C's" grouped together, and the entities with the least amount of
involvement are listed at the bottom. Notice that the various types of
monitoring are heavily involved with observation, test, sample,
sample site and water, which represent STORET entities.
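The report does not describe the clustering algorithm used by the IEF tool. As a rough illustration of the visible effect only, the Python sketch below reorders rows and columns by their total involvement weight so that the heaviest cells move toward the upper left. This simple sort is a stand-in, not the IEF procedure; the weights are assumed, and the cell values are hypothetical apart from the SAMPLE / GROUND_WATER_MONITORING cell taken from the text.

```python
WEIGHT = {"C": 3, "D": 3, "U": 2, "R": 1, "": 0}  # assumed weights


def cluster(matrix, rows, cols):
    """Order rows and columns by descending total involvement weight."""
    def cell(r, c):
        return WEIGHT[matrix.get((r, c), "")]

    new_rows = sorted(rows, key=lambda r: -sum(cell(r, c) for c in cols))
    new_cols = sorted(cols, key=lambda c: -sum(cell(r, c) for r in rows))
    return new_rows, new_cols


# Toy data: only the SAMPLE / GROUND_WATER_MONITORING cell comes from the
# text; the other cells are hypothetical.
matrix = {
    ("FACILITY", "ISSUANCE_OF_PERMITS"): "R",
    ("SAMPLE", "GROUND_WATER_MONITORING"): "C",
    ("SAMPLE", "RESEARCH"): "R",
    ("OBSERVATION", "GROUND_WATER_MONITORING"): "C",
    ("OBSERVATION", "RESEARCH"): "C",
}
rows = ["FACILITY", "SAMPLE", "OBSERVATION"]
cols = ["ISSUANCE_OF_PERMITS", "RESEARCH", "GROUND_WATER_MONITORING"]

new_rows, new_cols = cluster(matrix, rows, cols)
for r in new_rows:
    print(f"{r:12}", " ".join(matrix.get((r, c), ".") for c in new_cols))
```

With this toy data the "C" cells gather in the upper left and the sparsely involved FACILITY row drops to the bottom, which mirrors the pattern described for Figure 2.4.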
Figure 2.3 Entity Type/Business Function Matrix, as input (rows: entity types; columns: business functions; cell values: C = create, D = delete, U = update, R = read only)
Figure 2.4 Entity Type/Business Function Matrix, after clustering
2.5.2 Entity Type/Information Needs Matrix
The Entity Type/Information Needs Matrix maps entity types against
information needs. The matrix gives an overview of how entities
interact with each information need. The matrix was completed
using a simple involvement indicator of "X." Figure 2.5
illustrates the matrix as it appeared immediately after the data
was input.
The tool specifies that the value in each cell must be assigned by
determining whether the entity contributes or does not contribute
to the corresponding information need. For example, does the
entity test contribute to the chemical composition information
need? The answer to this question was "yes," and therefore an "X"
was entered into the matching cell. If the answer to the question
was "no" or there was not enough information known about the
relationship between the entity and the information need, the cell
was left blank.
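One way such a matrix might be read is as a coverage check: an information need with no "X" in its column has no entity type recorded as contributing to it. The Python sketch below illustrates that reading under stated assumptions. The function name uncovered_needs is hypothetical, and only the TEST / Chemical composition cell is taken from the text. Because a blank cell can also mean that not enough was known, an uncovered need is a prompt for further inquiry rather than a finding.

```python
def uncovered_needs(marked, needs):
    """Return the information needs that no entity type contributes to."""
    covered = {need for _entity, need in marked}
    return [need for need in needs if need not in covered]


# Cells marked "X", as (entity type, information need) pairs. Only the
# TEST / Chemical composition cell comes from the text; the second pair
# is a hypothetical entry for illustration.
marked = {
    ("TEST", "Chemical composition"),
    ("SAMPLE_SITE", "Sampling site"),
}
needs = ["Chemical composition", "Sampling site", "Construction grants"]

print(uncovered_needs(marked, needs))  # ['Construction grants']
```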
Figure 2.6 illustrates the matrix after clustering. Most of the
"X's" appear at the top-left, representing the most involved
entities and information needs, and most blanks appear at the
bottom-right, representing least-involved entities and information
needs. No clear conclusion was drawn from these matrices.
Figure 2.5 Entity Type/Information Needs Matrix, as input
Figure 2.6 Entity Type/Information Needs Matrix, after clustering
2.5.3 Business Function/Organization Matrix
The Business Function/Organization Matrix maps business functions
against organizational structure. The matrix gives an overview of
how each organizational unit is involved with each business
function. The matrix was completed using a simple involvement
indicator of "X." Figure 2.7 illustrates the matrix as it appeared
immediately after the data was input.
The tool specifies that the value in each cell must be assigned by
determining whether the organizational unit is or is not involved
in each business function. For example, is the organizational unit
State EPA involved in the function Issuance of Permits? The answer
to this question was "yes," and therefore an "X" was entered into
the matching cell. If the answer to the question was "no" or there
was not enough information known about the involvement between the
organization unit and the function, the cell was left blank.
Figure 2.8 illustrates the matrix after clustering. Most of the
"X's" appear at the top-left, representing the most involved
functions and organization units, and most blanks appear at the
bottom-right, representing least-involved functions and
organization units. Notice that state, regional, and local
organizations have the highest involvement with a broad range of
water quality functions.
Figure 2.7 Business Function/Organization Matrix, as input
Figure 2.8 Business Function/Organization Matrix, after clustering
2.5.4 Clustered Natural Data Stores Matrix
The Clustered Natural Data Stores Matrix is mathematically derived
from the Entity Type/Business Function Matrix, and is illustrated
in Figure 2.9. It describes natural data stores, which are
hypothetical repositories or groupings of data. The matrix shows
entity types on both the vertical and horizontal axes, and cell
values are automatically completed by the computer. A value of "9"
in a cell indicates that a high degree of similarity exists between
the sets of business functions that reference occurrences of the
two entity types. A blank cell indicates that the two entity types are
referenced by few or no common business functions. All other
values represent a degree of affinity between none (blank) and
total (9). In other words, the matrix shows which entities relate
closely to each other because of the similar ways that they relate
to the functions. Notice that observation, test, sample and sample
site are naturally very closely related. Since these are the
primary entity types in STORET, this clustering would suggest that
STORET is a natural data store.
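The report does not give the formula the IEF tool uses to compute these values. As one plausible illustration, the Python sketch below scores the overlap between the sets of business functions that reference two entity types using Jaccard similarity (shared functions divided by combined functions), scaled to the range 0 to 9. The formula, the function name affinity, and the sample data are all assumptions made for illustration.

```python
def affinity(functions_a: set, functions_b: set) -> int:
    """Score 0-9 for the overlap between two sets of business functions.

    Uses Jaccard similarity (shared / combined) scaled to 0-9. This
    formula is an assumption; the report does not state the IEF one.
    """
    union = functions_a | functions_b
    if not union:
        return 0
    return round(9 * len(functions_a & functions_b) / len(union))


# Hypothetical data: the business functions that reference each entity type.
used_by = {
    "OBSERVATION": {"AMBIENT_WATER_MONITORING", "GROUND_WATER_MONITORING", "RESEARCH"},
    "SAMPLE": {"AMBIENT_WATER_MONITORING", "GROUND_WATER_MONITORING", "RESEARCH"},
    "FACILITY": {"ISSUANCE_OF_PERMITS", "RESEARCH"},
}

print(affinity(used_by["OBSERVATION"], used_by["SAMPLE"]))    # 9
print(affinity(used_by["OBSERVATION"], used_by["FACILITY"]))  # 2
```

Entity types referenced by identical sets of functions score 9, and those with little in common score near zero, which corresponds to the blank-to-9 scale described above.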
2.5.5 Clustered Natural Business Systems Matrix
The Clustered Natural Business Systems Matrix (illustrated in
Figure 2.10) is also mathematically derived from the Entity
Type/Business Function Matrix, as is the Clustered Natural Data
Stores Matrix. It is used to describe natural business systems, which
are hypothetical groupings of business functions. The matrix shows
business functions on both the vertical and horizontal axes, and
cell values are automatically completed by the computer. A value
of "9" in a cell indicates that a high degree of similarity exists
between the sets of entity types referenced by the two
business functions. A blank cell indicates that the two business
functions reference few or no common entity types. All
other values represent a degree of affinity between none (blank)
and total (9). In other words, the matrix shows which business
functions relate closely to each other because of the similar ways
that they relate to the entities. Notice that all types of
monitoring functions are naturally very closely related, as are
standards functions and pollution control.
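Once pairwise affinity values exist, candidate natural business systems can be read off as groups of functions linked by high values. The Python sketch below shows one simple way to form such groups: any two functions whose affinity meets a threshold are placed in the same group, directly or through intermediate functions. The threshold, the function name group_by_affinity, and the scores are hypothetical; the scores were chosen only to echo the observation above that monitoring functions group together, as do standards functions and pollution control.

```python
def group_by_affinity(items, scores, threshold=7):
    """Group items linked, directly or indirectly, by scores >= threshold."""
    parent = {item: item for item in items}

    def find(x):
        # Follow parent links to the group representative, shortening
        # the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in scores.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for item in items:
        groups.setdefault(find(item), []).append(item)
    return sorted(groups.values())


functions = [
    "AMBIENT_WATER_MONITORING", "FISH_MONITORING",
    "STANDARDS_DEVELOPMENT", "STANDARDS_ENFORCEMENT", "POLLUTION_CONTROL",
]
# Hypothetical 0-9 affinity scores; unlisted pairs are treated as blank.
scores = {
    ("AMBIENT_WATER_MONITORING", "FISH_MONITORING"): 9,
    ("STANDARDS_DEVELOPMENT", "STANDARDS_ENFORCEMENT"): 8,
    ("STANDARDS_ENFORCEMENT", "POLLUTION_CONTROL"): 8,
    ("FISH_MONITORING", "STANDARDS_DEVELOPMENT"): 2,
}

for group in group_by_affinity(functions, scores):
    print(group)
```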
Figure 2.9 Clustered Natural Data Stores Matrix (entity types on both axes)
Figure 2.10 Clustered Natural Business Systems Matrix (business functions on both axes)
SECTION 3
3.0 Summary Remarks
Information Engineering seeks to elucidate the information needs
of a business or enterprise apart from existing systems or
structures. During Information Strategy Planning (ISP), which is
the initial stage of Information Engineering, planners gain a broad
view of the information needs of the enterprise. The water quality
enterprise is an abstraction. Treating it as an enterprise for the
purpose of performing an ISP is difficult because it does not fit
the typical business mold. Thus, concepts like organizational
structure and top management cannot be well defined for the
enterprise as a whole. Many organizations and
individuals have a strong interest in water quality but have
no relation or responsibility to each other.
However, it is valuable to look at water quality apart from the
current systems, structures, and organizations. New or revised
systems need to be developed based on functions and information
needs that logically fit together but may not be considered
related because the organization has changed over time.
Therefore, it is worthwhile to determine the data resources, functions,
information needs, and organizational structure of the enterprise
and to analyze their relationships and involvement with each other.
The products of this initial assessment give this kind of
overview of the water quality enterprise and will prove useful
as a foundation for the ISP interview process.
APPENDIX A
SELECTED REFERENCES
- Superfund Chemical Analysis Data System - Mission Needs
Statement (2/89)
- Ground-Water Data Requirements Analysis (5/87)
Prepared as a result of a joint effort between the Environmental
Protection Agency's Office of Ground-Water Protection and the
Office of Information Resources Management. Describes the
information needs of EPA and State decision makers, identifies
existing data management policies and systems, and recommends
specific projects to improve ground-water data management.
- Appendices: Ground-Water Data Requirements Analysis (5/87)
Provides the basis of the analysis and foundation for the
findings, recommendations and conclusions of the Ground-Water
Requirements study. It is the result of over 300 structured
interviews with EPA Headquarters, Regions, state governments,
local governments and other federal organizations as well as a
thorough document review.
- Manager's Guide To STORET
This guide is intended to help the water quality manager simplify
preparation of reports and graphics from raw data stored in
STORET. The five topics addressed are:
. Monitoring Programs
. Existing Water Quality and Historical Trends
. Pollution Sources and Control Programs
. Biological Monitoring
. Lake Water Monitoring.
The guide describes analysis techniques which are applicable to
programs initiated under various federal regulations and to
functions of the Office of Drinking Water and the Office of Solid
Waste.
- Surface Water Monitoring: A Framework For Change (9/87)
This report presents the findings and recommendations of a major
study initiated in December 1985 by EPA's Assistant
Administrator for Water, addressing the Agency's surface water
monitoring activities. The project's principal objectives were
to: 1) Determine where EPA's surface water monitoring program
should be heading in the late 1980's to ensure that it can meet
the information needs of water quality managers in the 1990's;
and 2) Identify where specific adjustments to the current program
are needed, and how they should be made.
- System Requirements For Tissue Residue Components For The
Biological Data System (BIOS) (8/89)
Describes the proposed data elements, draft prototype and
suggested format retrievals for a Tissue Residue File.
- System Requirements For Toxicity Testing Components For The
Biological Data System (BIOS) (8/89)
Describes the system design and proposed data elements required
for implementing the BIOS Toxicity Testing File. Data will
address toxicity levels found during effluent and ambient testing
of identified sites.
- Requirements Statement For A Field Survey File (BIOS) (6/86)
A requirements study to determine if the proposed Field Survey
File of STORET would meet the needs of the user community.
The intent of the system is to provide a biology-oriented data
management system to serve the needs of those conducting
biological sampling in the nation's waterbodies. This study is
based on an extensive telephone and personal survey of key
individuals from federal and state agencies as well as other
groups such as academia.
- Regional Forum On Water Information Handbook (1989)
During 1989, the Office of Water Steering Committee for Water
Quality Data Systems conducted Regional Forums to introduce,
demonstrate and assess interest in recent developments and
applications for the following data systems:
. BIOS
. Menu-driven STORET retrieval user interface
. Ground-water and surface-water data management capabilities of
STORET
. Waterbody System (WBS)
. Water Quality Analysis System (WQAS)
. Reach file
. NEEDS file General Query
. Ocean Data Evaluation System (ODES)
This handbook provides background and descriptions of these
systems as well as guidance for accessing them.