EPA-600/5-78-007
May  1978
Socioeconomic Environmental Studies Series
                                                  Office of Air, Land and Water Use
                                                 Office of Research and Development
                                                 U.S. Environmental Protection Agency
                                                             Washington, D.C. 20460

-------
                RESEARCH REPORTING SERIES

Research reports of the Office of Research and Development, U.S. Environmental
Protection Agency, have been grouped into nine series. These nine broad cate-
gories were established to facilitate further development and application of en-
vironmental technology.  Elimination of traditional grouping was  consciously
planned to foster technology transfer and a maximum interface in related fields.
The nine series are:

      1.   Environmental  Health Effects Research
      2.   Environmental  Protection Technology
      3.   Ecological Research
      4.   Environmental  Monitoring
      5.   Socioeconomic Environmental Studies
      6.   Scientific and Technical  Assessment Reports (STAR)
      7.   Interagency Energy-Environment Research and  Development
      8.   "Special" Reports
      9.   Miscellaneous Reports

This  report has been  assigned  to the SOCIOECONOMIC ENVIRONMENTAL
STUDIES series. This series includes research on environmental management,
economic  analysis,  ecological impacts, comprehensive planning  and fore-
casting, and analysis methodologies. Included are tools for determining varying
impacts of alternative policies; analyses of environmental planning techniques
at the regional, state, and local levels; and approaches to measuring environ-
mental quality  perceptions, as well as analysis of ecological and economic im-
pacts of environmental protection measures. Such topics as urban form, industrial
mix, growth policies, control, and organizational structure are discussed in terms
of optimal  environmental performance. These interdisciplinary studies and sys-
tems analyses are presented in forms varying from quantitative relational analyses
to management and policy-oriented reports.
This document is available to the public through the National Technical Informa-
tion Service, Springfield, Virginia  22161.

-------
                                          EPA-600/5-78-007
                                          May 1978
              DATA BASE SYSTEM FOR

STATE WATER QUALITY MANAGEMENT INFORMATION SYSTEM
                Grant No. S-801000
                 Project Officer
                 Harry C. Torno
        Office of Air, Land and Water Use
       Office of Research and Development
      U.S. Environmental Protection Agency
              Washington, D.C. 20460
                  Prepared for
       OFFICE OF RESEARCH AND DEVELOPMENT
      U.S. ENVIRONMENTAL PROTECTION AGENCY
              WASHINGTON, D.C. 20460

-------
                                      DISCLAIMER

This report has been reviewed by the Office of Research and Development, U.  S. Environmental Protection
Agency, and approved for publication. Approval does not signify that the contents necessarily reflect the
views and policies of the U. S. Environmental Protection Agency, nor does mention of trade names or
commercial products constitute endorsement or recommendations for use.

-------
                                       ABSTRACT

The Pennsylvania State Water Quality Management Information System demonstration project (S-801000)
is a jointly funded effort between the U.  S. Environmental Protection Agency (EPA) and the Department of
Environmental Resources, Pennsylvania Bureau of Water Quality Management (BWQM). The project was
inaugurated to provide systems which would enhance the speed and precision with which decisions may be
made in the Water Quality Management field and thereby increase the effectiveness of the program as a
whole.

The objectives of the first grant period (starting January  1, 1969) were to enhance and  demonstrate a
State-wide Water Quality Management  Information System which could be made available for use by
Federal, other State, and inter-State water pollution control agencies, and to provide a base for a Water
Quality Management Data Systems techniques training program for Federal, State, and inter-State Water
Quality Management personnel.

A  portion of the proposed system was implemented utilizing standard keypunch card data entry for
processing on a UNIVAC (RCA) SPECTRA 70/45 and provided predetermined periodic reporting capability.
This portion of the project was documented in EPA Publication 600/5-74-022 entitled, Demonstration of a
State Water Quality Management Information System. This portion of the system is referred to as WAMIS
(Water Management Information System) Release I.

Systems concepts, systems design techniques, software, computer hardware and telecommunications
capability have all experienced marked changes since the beginning of the project in 1969.

The Pennsylvania Bureau of Water Quality Management was encouraged to revamp its original systems
design concepts based upon initial reports of the capabilities inherent in the EPA developed Data Base
Management System (DBMS) known as the General Point Source File System (GPSFS). The modified
portions of WAMIS Release I, together with the modules which were to be included in the system as part of
the third year demonstration project, are collectively referred to as WAMIS Release II. This report
addresses the current design concept, design technique,  systems input/output capabilities, together with
illustrative documentation of the Pennsylvania system.

-------
                       TABLE OF CONTENTS


 ABSTRACT
 INTRODUCTION

                            SECTION  I

 RELEASE I  (CURRENT) WAMIS

                            SECTION  II

 WAMIS RELEASE II

     General
     System Concepts
     Data Entry
     Information Retrieval
     Data Management
     Operations
     Conversion
     Resources

                            SECTION III

 SYSTEM DESCRIPTION

     Development
     Analysis
     Design
     Implementation Plan
     Implementation
     System Specifications

                            SECTION  IV

 THE EXTERNAL SPECIFICATIONS

     Description
     Constructs
     Relationships
     Natural Hierarchy
     Data Elements
     Identities
     Forms
     Reports
     Training and User Guides
     General Documentation Standards
     Development Concerns

-------
BIBLIOGRAPHY

    APPENDICES

          A.    External Specifications
          B.    Internal Specifications
          C.    Project Control  Book
          D.    DMS 1100  Schema
Please Note:     Copies of Appendices A through D are available on a loan basis from either the
               Department of Environmental Resources, Bureau of Water Quality Management, or the
               U. S. Environmental Protection Agency, Office of Research and Development.

-------
                                      INTRODUCTION

In the early 1960's, it became apparent that the accelerating accumulation of information relative to water
quality management required automated data management techniques. High speed electronic computers
already had demonstrated the ability to handle masses of information for other purposes. Therefore, it
appeared that water quality management data systems using computers would become essential in guiding
many of the policy decisions of the coming decades.

The 1963 conference of Pennsylvania State Sanitary Engineers formed a Joint Committee on Water Quality
Management Data. Organizations represented on the committee were the Army Corps of Engineers, Soil
Conservation Service, State and  Interstate Water  Pollution Control Administrators,  U. S. Geological
Survey, U. S. Public Health Service, and later, the Federal Water Pollution Control Administration. In May,
1967, the Committee issued a report entitled Water Quality Management Data Systems Guide. The primary
purpose of this manual was to provide a guide to the development of water  quality management data
systems for agencies with little or no experience in this field.

The Commonwealth of Pennsylvania has a massive investment in physical facilities to aid in the management
of its water resources. In administering the wise use of this investment and in assuring compliance
with federal and state statutes designed to protect the water resources, the Commonwealth Bureau of
Water Quality Management is faced with managing a large amount of information.

Because of the magnitude of information to be collected, stored, retrieved and analyzed, the concept of a
state-wide water quality management information system (WAMIS) was developed. WAMIS
reflects the objectives of the Joint Committee manual and additionally incorporates objectives related to the
specific problem areas particular to the State of Pennsylvania.

Some of the areas of water quality management for which WAMIS provides information support are:

      1.    Facility inspections

      2.    Progress in terms of enforcement, construction and water quality upgrading

      3.    Permit processing activities

      4.    Identification of problem areas and priorities

      5.    Determination of specific treatment, research and budgetary needs

-------
    6.   Treatment plant operation reports

    7.   Treatment plant operator certification

    8.   Word processing activities

    9.   Ground and surface water quality

    10.  Planning

The subsequent  sections of  this report define the system objectives as perceived at the time of project
initiation and describe how these objectives were modified through experience with the project during later
years.

The Bureau of Water Quality Management's (BWQM) current version of WAMIS is described in Section I
(Release I WAMIS). The system design objectives of the new version are described in Section II (Release II
WAMIS). The development cycle, the external specifications, training, and user guides are introduced in
Section III. Section III also describes the general documentation standards applicable to all WAMIS Release
II documents that were developed. The purpose of the standards is to ensure that people working on WAMIS
are able to develop the system and control changes. All communications regarding Release II WAMIS are
published in formally identified documents prepared in accordance with this standard. These formal
documents are presented in the appendices. The appendices are available from EPA or the Bureau of
Water Quality of the Commonwealth of Pennsylvania.

-------
                                         SECTION  I
Release I (Current) WAMIS
The Release I system as designed and implemented by Price Waterhouse in 1970-1971  established a
workable water quality management information system. The systems were implemented and integrated
into the BWQM on-going program efforts. As Bureau personnel gained familiarity with the capabilities and
limitations of the system, it became apparent that increased capabilities were needed to meet the data
processing requirements of the Bureau.

On the Release I  system, the user is locked into utilizing keypunched cards as input with periodic and fixed
reporting capabilities. The user has to wait an unacceptable length of time for software modifications for the
production of new reports. Therefore, the Release I system cannot respond to changing  information needs.

Shorter turnaround time between the input of information and output on a WAMIS report was required. The
Bureau functions of grant processing, enforcement and permit processing require turnaround time as short
as 24 hours.

An analysis  of the  current system showed  that 29 official forms  are  used  to  record  the values of
approximately 390 data elements. The information is gathered principally by Bureau personnel at the seven
regional offices throughout the state and recorded on card/column-oriented forms. Significant time is spent
transferring data  from working documents to the data processing forms.  These EDP forms are mailed to
Harrisburg for review and keypunching.

Thirteen reports are produced by the Release I system. These reports are distributed by  the central WAMIS
office to the regions at varying frequencies  and are used as a source of information for  decision-making
responsibilities. Manual data files are maintained by each region and are used to obtain more current facility
status and history than the Release I system can provide. There is also no provision for  remote job entry to
allow regions to specify types and contents of reports on a timely basis.

The existing WAMIS systems have the ability to produce reports containing information from very large files.
It is very difficult, however, to produce reports containing only desired data from selected records in the files.
Therefore, users must search very large printouts covering many data records to find the specific
information they need.

It is very difficult to relate information in one  system to information  in another  system. This capability is
needed by end users and management to support the decision-making processes required for responsible
water quality  management. The ability  to  relate data in  different systems  would  also reduce data
redundancy.

-------
The activities and, therefore, the information requirements of the BWQM, are controlled by the Bureau's
management environment. This environment is established by the state and federal laws, budget and public
opinion. The dynamics of this environment require that the EDP system be flexible and able to respond to
changes in data management needs.

These observed requirements, namely fast turnaround, general reporting, related information, and flexibility,
were the basis for the development of the specifications for the new (Release II) WAMIS system.

-------
                                        SECTION II
                                   WAMIS RELEASE II
General
The Commonwealth of Pennsylvania's water quality management system is described in detail in the EPA
Publication No. 600/5-74-022 titled, Demonstration of a State Water Quality Management Information
System. The objectives of the existing system were to demonstrate a state-wide water quality management
information system, including case status reports, project status reports, water quality control and plant
operation control systems.

Development and implementation of the system was performed in three phases. A comprehensive system
design was developed and the facility status,  water quality and contact modules (a name and address
system) were made operational. The third phase anticipated that the plant operation control (POC), project
status, grants and history modules would become operational. Time and funds did not permit  the imple-
mentation of the third phase.

The system described in this report is the demonstration of the third phase. The report covers an overall
coordinated systems design that interrelates the existing modules, a plant operation control (POC) system,
and a PREP system which meets the needs of project status, grants and historical applications.

The third phase of this project includes an overall system design because experience with the existing water
quality system has shown that the data in the various modules are highly interdependent; therefore, the
design integrates the various modules into a more cohesive system. This design recognizes the fact that
data (i.e., data elements) have multiple users and should not be restricted by the definition of a particular
computer file or record.

It was decided that the overall WAMIS system should be designed to operate on a generalized data base
management system. The Bureau of Water Quality Management was provided with information by the U. S.
Environmental Protection Agency (EPA) which indicated that the General Point Sources File (GPSF),
developed by EPA and available as public domain software, would be able to handle the WAMIS design
requirements. Support for GPSF was later discontinued by EPA and other data base systems were
investigated.

A Management Information System (MIS) is designed to retain information in a "data base" (a data base
being defined as a central repository for interrelated network type data). The input consists of
transactions (add, delete, or modify), and the output consists of reports. The third phase demonstration
(Release II) involves three interrelated information modules: Facilities, POC and PREP. These modules form
the central data base for the WAMIS system.

-------
The facility (FAC) module is included in the data base because all other modules are related to information
defined by this module. The facility module defines data about any entity over which the Bureau of Water
Quality Management has regulatory control. Examples are wastewater treatment facilities, water supplies,
bathing places, industrial waste discharges, dams and encroachments. Data is maintained about the facility
locations, constructions, inspections and populations served for each entity contained in the module.

The POC module was developed to support the EPA NPDES self-monitoring program. The objectives of
POC are to monitor the reporting requirements for the plant operations for all wastewater dischargers. The
POC module will compare daily plant discharge parameters and values with the permitted standards in the
NPDES permit. It will then generate exception reports showing violations of the permitted standards.
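
The comparison performed by the POC module can be illustrated with a minimal sketch. It is not taken from the WAMIS programs; the record layout, the parameter names (BOD5, TSS) and the limit values are hypothetical, and the sketch assumes every permitted standard is a maximum value.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One daily self-monitoring value (hypothetical layout)."""
    facility_id: str
    day: str
    parameter: str
    value: float


def exception_report(readings, permit_limits):
    """Return one row per reading that exceeds its permitted maximum."""
    violations = []
    for r in readings:
        # Limits are keyed by facility and parameter; unlisted pairs are not checked.
        limit = permit_limits.get((r.facility_id, r.parameter))
        if limit is not None and r.value > limit:
            violations.append((r.facility_id, r.day, r.parameter, r.value, limit))
    return violations


# Hypothetical permit limits and daily readings.
permit_limits = {("FAC001", "BOD5"): 30.0, ("FAC001", "TSS"): 30.0}
readings = [
    Reading("FAC001", "1978-05-01", "BOD5", 25.0),
    Reading("FAC001", "1978-05-02", "BOD5", 42.0),
    Reading("FAC001", "1978-05-02", "TSS", 30.0),
]

for row in exception_report(readings, permit_limits):
    print(row)
```

Only the second BOD5 reading is listed, since a value equal to the limit is treated here as compliant. Standards expressed as minimum values or ranges would need an additional comparison.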

The PREP module is designed to monitor and report on any generalized time-constrained activity. These
activities include the processing of permit applications, processing of operator certification, and  the
monitoring of construction grants processing. The PREP module will also provide historical information
about completed activities.
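
As an illustration only, the monitoring of a time-constrained activity can be sketched as a comparison of each open activity's start date and allowed processing time with the current date. The field names, activity identifiers and time allowances below are assumptions made for the example and do not come from the PREP specifications.

```python
from datetime import date, timedelta


def overdue_activities(activities, today):
    """List open activities whose allowed processing time has elapsed."""
    late = []
    for a in activities:
        due = a["started"] + timedelta(days=a["allowed_days"])
        # Completed activities are history and are never reported as late.
        if a["completed"] is None and today > due:
            late.append((a["id"], a["kind"], (today - due).days))
    return late


# Hypothetical activities of the three kinds named in the text.
activities = [
    {"id": "P-1", "kind": "permit application",
     "started": date(1978, 1, 3), "allowed_days": 90, "completed": None},
    {"id": "C-7", "kind": "operator certification",
     "started": date(1978, 3, 1), "allowed_days": 60, "completed": date(1978, 4, 10)},
    {"id": "G-2", "kind": "construction grant",
     "started": date(1978, 4, 1), "allowed_days": 120, "completed": None},
]

for item in overdue_activities(activities, date(1978, 5, 1)):
    print(item)
```

Each reported row carries the number of days past the due date, which gives a simple basis for ordering follow-up work.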

Since the scope of the total Release II is rather large, a smaller portion known as Release IIa will be
implemented with batch input capability and on-line retrieval capability. The FAC, PREP and POC modules
are now operational. The other modules that comprise the total Release II design, such as Water Quality,
Contact, Planning, etc., will be implemented as resources allow. These modules will continue to be
supported by Release I until they can be included in Release II. The information in Release IIa is detailed in
Section III.

Release IIa will replace the facility status system of Release I. The data from Release I will be transferred to
Release IIa as discussed in the section on conversion. The Release IIa system meets the requirements of
the third phase of the WAMIS project.

System Concepts
The objectives of the Release II system are to provide more flexibility of  data input and retrieval using
current data base management system (DBMS) technology. The system will use a UNIVAC 1110 computer
and the  DMS 1100  DBMS as the basic data management system. Five major advantages  that this
technology offers over conventional processing techniques are:

    Program Development - Data base  management  software will simplify application program development by  performing
    automatically many routine functions under software control.

    File Structures - Release II file structures offer powerful cross-referencing capabilities that eliminate redundant data and make
    possible fewer files.

-------
    System Design - Fewer 'overhead' programs for use with specific applications are needed. This cuts down on sorts, merges, file
    extracts, duplicate updating and scanning of files.

    Operations - Fewer files means less set-up time and less external manual management of programs and data.

    Flexibility - Changes required by users can be implemented with ease.

Data Entry
The input to a data management system is extremely flexible. The commonly used method of input (coding
a form, keypunching and inputting data to the system in a batch mode) is made somewhat obsolete by data
base management system software. (BWQM does intend to use batch input for large volumes of routine
data to our system, particularly for input to the POC module.)

The most user-oriented approach to inputting data in a data management system is to have users directly
key input to  the data base by using a cathode ray tube terminal (CRT) in either an on-line (interactive) or
demand (batch) mode of operation. The Release II system is applicable to terminal input and it is expected
that as the Bureau of Water Quality fine tunes the Release II system, regional personnel will utilize the direct
input mode for PREP and facility data. This will allow the user to store new information in the data base
immediately, without the time delay involved in keypunching and batch update.

Regional telecommunication capability will be utilized for data entry to the system and information retrieval
from the system. To the maximum extent possible, Bureau working documents will be the source used to
update the data base.  Input errors will be displayed during the input process to enable immediate correction
of errors by  the terminal operator. (This will reduce time spent on encoding data onto keypunched forms.)
Computer printed forms containing previously stored information will be used as turnaround  documents.
This will make separate reports unnecessary for activities that develop large amounts of update data for the
data base. Examples of these activities are facility inspections and enforcement actions. We anticipate that
the initial updates of the data base through a cathode ray terminal (CRT) will be done in a demand (batch)
mode. The Bureau of Water Quality expects to utilize real-time (interactive) capabilities in the future.
Real-time will be used where the user requires less than 24-hour turnaround between data input and output.
Real-time capabilities with CRT input will be developed as implementation of the project progresses.
Development of the real-time capability is expected in 1978.

For CRT input, a form will be displayed on the CRT screen and the terminal
operator will fill in the spaces and submit the data to the data base for storage. The design for this is
included in Appendix B. An editing program will inform the terminal operator when errors are found in the
input. The editing program checks codes, field lengths and, where necessary, logical consistency.

-------
Information Retrieval
Just as inputs in the data base environment are not constrained to keypunching, outputs from a data base
environment are extremely flexible. The Bureau of Water Quality has designed standard reports to manage
the routine large-volume type reporting requirements, particularly for the POC modules. These reports are
found in Appendix A in the section titled Reports.

For much of the Release II reporting capability, the Bureau of Water Quality will be utilizing the DMS 1100
Report Writer. The Report Writer is a UNIVAC software product that can be used with the DMS 1100 data
base system. It allows the user to select information from the data base and format the information into a
useful report. Production of reports  using the Report Writer is more  expensive than producing a standard
report to get the same information. The advantage that the Report Writer has over standard reports is that
reports can quickly be developed to meet the users' needs. This is necessary in an agency like the Bureau of
Water Quality where information requirements are dynamic.

As the Bureau of Water Quality gains experience with reporting needs, a decision will be made as to which
reports are to be standard reports and which reports will continue to be produced by the Report Writer. When
a report proves to be needed on a routine basis, a program will be written to produce it.

Initially, the basic output from the data base is the retrieval of information resulting from the input of ad hoc
queries at a telecommunication terminal. Using a few instructions in conjunction with  selection criteria, the
requested information is displayed on the CRT. For example, information on a project's priority, status and
history will be retrieved using the ad hoc Report Writer capability.
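
The idea of an ad hoc query, that is, selection criteria combined with a short list of fields to display, can be sketched in general terms. The sketch below does not reproduce the DMS 1100 Report Writer command language, which is not described in this report; the function, field names and sample records are all hypothetical.

```python
def ad_hoc_query(records, criteria, fields):
    """Return the chosen fields of every record that satisfies all criteria."""
    selected = [r for r in records
                if all(r.get(key) == wanted for key, wanted in criteria.items())]
    return [{f: r.get(f) for f in fields} for r in selected]


# Hypothetical project records.
projects = [
    {"project": "A-12", "region": 3, "priority": 1, "status": "under review"},
    {"project": "B-07", "region": 3, "priority": 4, "status": "approved"},
    {"project": "C-31", "region": 5, "priority": 2, "status": "under review"},
]

result = ad_hoc_query(projects,
                      {"region": 3, "status": "under review"},
                      ["project", "priority"])
print(result)
```

Because the criteria and the field list are supplied at request time, no program change is needed when a user asks a new question of the same data.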

Requests for standard reports are also entered via terminals. If error-free, these result in searches of the
data base for the requested information and formatting of this information as required. Sorting, totals, tallies
and code conversions are performed and the report can be read from the CRT or printed at a high speed
terminal.

On standard reports that will be produced by a program, the data that are coded for input to  permit cost-
effective storage will be displayed in expanded notation (e.g., codes will be translated into English before
print) on the printouts. In this way users will be able to read the reports without  referring to code lists.
Reports produced by the Report Writer will not  contain expanded codes because the  Report Writer cannot
translate codes before printing a report.  The Report Writer reports are short,  only  one or two pages of
printout; therefore, the users do not find coded output to be a problem on these reports. Standard reports
would be much longer and,  therefore, translation of codes on standard reports relieves the user of referring
to a code list in order to understand the report.
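
Code expansion on a standard report amounts to a table lookup performed just before printing. The following sketch assumes hypothetical code tables and field names; the actual WAMIS code lists are given in the appendices and are not reproduced here.

```python
# Hypothetical code tables, keyed by field name.
CODE_TABLES = {
    "status": {"A": "Active", "I": "Inactive", "P": "Proposed"},
    "type": {"STP": "Sewage treatment plant", "IW": "Industrial waste discharge"},
}


def expand_codes(row, code_tables):
    """Replace coded values with English text; unknown codes pass through unchanged."""
    return {field: code_tables.get(field, {}).get(value, value)
            for field, value in row.items()}


stored = {"facility": "FAC001", "status": "A", "type": "STP"}
print(expand_codes(stored, CODE_TABLES))
```

The stored record keeps the short codes, so storage cost is unaffected; only the printed line carries the expanded text.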

-------
Data Management
The data that is input to the Release IIa data base originates in the BWQM Regional Offices. The source
documents are permits, enforcement reports, grant applications and other documents that are used for
information and communication in the Bureau's ongoing program activities. Since the source documents
are records of the Bureau's activities, the information on the source documents is considered accurate.
Therefore, the Release II system only performs edits to check the data for acceptance to the data base.

The  system performs  edits on all coded input to check that the code is valid and  of proper length. The
identities of the records are checked to be sure that all appropriate records required to establish a proper
hierarchy are in the data base. For example, a municipality record cannot be added to the data base unless
the county record for that municipality is already in the system. Data fields are checked for length and type
of characters allowed. Numerical fields are validated to make sure there are no letters included in the data.
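
The edits described above can be illustrated by a short sketch for the municipality example. The field names, the three-digit code format, the 30-character name limit and the sample values are assumptions made for the example and are not the actual WAMIS field definitions.

```python
def edit_municipality(record, counties, max_name_len=30):
    """Return rejection messages for a record; an empty list means it is accepted."""
    errors = []
    code = record.get("municipality_code", "")
    # Code check: proper length and valid characters.
    if len(code) != 3 or not code.isdigit():
        errors.append("municipality code must be 3 digits")
    # Field length check.
    if len(record.get("name", "")) > max_name_len:
        errors.append("name exceeds field length")
    # Numeric field check: no letters allowed.
    if not record.get("population", "").isdigit():
        errors.append("population must be numeric")
    # Hierarchy check: the owning county record must already be in the data base.
    if record.get("county_code") not in counties:
        errors.append("county record not in data base")
    return errors


counties = {"22"}  # hypothetical county codes already stored
good = {"county_code": "22", "municipality_code": "014",
        "name": "Sampletown", "population": "50000"}
bad = {"county_code": "99", "municipality_code": "14A",
       "name": "Sampletown", "population": "5O000"}  # letter O in the number

print(edit_municipality(good, counties))
print(edit_municipality(bad, counties))
```

Returning every message rather than stopping at the first error matches the practice of giving the submitting region a complete account of why a record was rejected.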

Maintaining accurate and logical data is the responsibility of the regions that submit the data for input. The
regions receive an error list for all data submitted. The listing describes each record submitted, its status
(either accepted or rejected), and provides a message to tell the user why the system rejected any data.
After an update is run, the regions receive a report containing the updated information which they  use to
assure that the data is correct and that it provides the information that was on the source document.

By using the same forms for both the source document and data input, coding errors are reduced. The data
only has to be transferred one time, either during keypunching or during terminal input. This avoids coding
errors that tend to occur when information on a source document is transferred to a special coding form.

The management  of  the data base operation including updates,  deletes,  report production, etc. is a
responsibility  of the  EDP manager who  is part  of the  WAMIS  Section.  His responsibilities  include
management  of contracts,  defining system work,  and overseeing  the data base operation.  Data base
administration (DBA), which is a team effort shared by the analysts, programmers, and the EDP manager, is
responsible for the data base activities. These activities include scheduling of updates, storage allocation,
internal data structure, recovery, security, and any other activities required to maintain a cost-effective data
base operation.

DBA activities also control the deletion of data from the data base. All requests to delete records from the
data base must be approved and executed under DBA control. This is important  because deletion of a
record from the data base system also deletes any records that are related to the deleted records.
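
The consequence of deleting a record that owns other records can be shown with a small sketch. It models the hierarchy as a simple owner-to-members table and is not the DMS 1100 mechanism itself; the record keys are hypothetical.

```python
def delete_with_dependents(record_id, children, records):
    """Delete a record and every record owned by it, directly or indirectly."""
    removed = []
    stack = [record_id]
    while stack:
        current = stack.pop()
        if current in records:
            del records[current]
            removed.append(current)
        # Queue the members owned by the record just removed.
        stack.extend(children.pop(current, []))
    return removed


# Hypothetical hierarchy: county -> municipality -> facilities.
records = {"county:22": {}, "muni:22-014": {}, "fac:001": {}, "fac:002": {}, "county:36": {}}
children = {"county:22": ["muni:22-014"], "muni:22-014": ["fac:001", "fac:002"]}

removed = delete_with_dependents("county:22", children, records)
print(sorted(removed))
print(sorted(records))
```

A single request to delete the county removes four records in this example, which is why such requests are reviewed before they are executed.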

-------
Operations
In the simplest case, "operations" are the following processes of Release II WAMIS:

    Receipt of data from BWQM staff and rejection of unacceptable data.

    Incorporation of received data into the data base.

    Receipt of requests from BWQM staff for information (reports) and preparation of the same from the data base, using the Report
    Writer or a standard report.

For the non-systems staff member, these are all the operations there are, and the user manuals provide the
information required to interface properly with the system. The Release II operations are diagrammed in
Figure 1.

Certain "system" staff are required to handle the other aspects of the operations:

         Recovery in case of error

         Monitoring of operating costs

         Reorganization of data files.

As mentioned earlier, the process of using the system will produce a continuing list of desired
modifications. These will be collected and, on a periodic basis, form the rationale for beginning a new
development cycle which will consist of additions, modifications, or deletions to existing system
components.

Conversion
"Conversion" is the process whereby BWQM people stop using  Release I WAMIS and begin using Release
II WAMIS. This involves distribution of new forms and procedures, training, incorporation of existing data
into the new data base, and the destruction of old forms and procedures. Old programs (Release I) and data
files will be removed to archives for reference.

This would clearly be a major activity even if it began with a completely operational version of Release II
WAMIS capable of performing all of the processing required by the external specifications.

The users will input data into the Release IIa system as portions of the system become available to the user.
The data will be input in small amounts at a time. This is being done as opposed to inputting or converting all
of the Release I data at one time. Controlled input to the data base will give users time to become familiar
with the data base environment. BWQM feels that users cannot deal with a data base as a black box, but
must be familiar with the logical relationships between data and must understand the data base  structure.

-------
FIGURE 1
RELEASE II WAMIS OPERATION
(Diagram not reproduced. Legible labels: REQUEST FOR INFORMATION; REPORTS; ADD, MODIFY, DELETE and RENAME of
CONSTRUCTS, with identity and other data elements; ADD and DELETE of RELATIONSHIPS.)

-------
Resources
The Release II system was developed in a time share environment using terminals in a demand mode for
the writing and debugging of programs. The larger programs were coded and keypunched and input to the
UNIVAC 1110 computer by remote card reader. A total of 210 programs were developed for the Release IIa
subsystem. The total Release II system will require approximately 450 separate programs. Using the top-
down modular techniques, only 10 of the total programs were large enough to require keypunching and
batch input.

The Bureau of Water Quality has four programmers and one computer systems analyst working on the
design and development of the Release II system. Combined with the other obligations, the Bureau has
been able to commit approximately 2 1/2 systems persons to Release II work per year.

To develop a major system using a computer in  a time share mode and with the limited number of staff
available to the Bureau required that the Bureau's management accept development times in excess of
two years. In order to maintain direction and develop a system that would meet the needs of the Bureau
over such a long period of time, a very rigid method of design and documentation was used. The design and
documentation methodology is discussed in Section III and detailed in the appendices.

In order to maintain the Release II system, system staff will be required to operate, enhance and modify the
system. Staff will also be allocated to provide data base administrative support. Data base administrative
support is  required to  make  decisions concerning  update frequencies,  operating  cost  analysis and
scheduling. Systems staff will be required to perform the programming and other systems changes that the
data base administrators identify. Total systems staff required to maintain Release IIa after implementation
is completed will be no less than one analyst and two computer programmers. Contracted services may be
used for additional systems support, but only for activities involving vendor-supplied software, such as
operating system and DMS1100 changes. BWQM will support all of the 210 programs written in-house.

-------
                                         SECTION  III
                                  SYSTEM DESCRIPTION

Development
The WAMIS Release II Development Cycle, diagrammed on Chart 2, consists of four basic activities:

    Analysis       - The production of external specifications based on communication with BWQM staff.

    Design        - The production of an Implementation Plan, given certain constraints such as the computers and languages
                 to be used.

    Implementation  - The production of the system (machine instructions, user manuals, etc.)

    Operations     - The day-to-day operation of the system

Although it is not readily apparent from the diagram, the development cycle is often continuous. Once
operations begin, it  is envisioned that user requirements will change. In this case, a new cycle begins at
analysis and ends with upgraded operations, resulting in a modification to the Release II system.

Following is a discussion of the details of the development cycle.

Analysis
Analysis in the case of Release II WAMIS has been based on a very high  confidence in the capabilities of
the BWQM technical staff and in the belief that the system will be more useful if it is a step in an orderly
evolutionary process  of BWQM  information requirements. As a result,  a large portion of the analysis
function  has been delegated to BWQM technical staff by the formation of  committees known as Task
Forces. The Task Forces served to assure that BWQM staff had a major part in the Release II design.

Task forces were set up for each major information module. These Task Forces included technical and
systems personnel from the central and regional offices. The major Task Forces are Facility, PREP, POC,
Water  Quality and  FEUDS (planning  and modelling). An Executive Task Force  consisting  of upper
management and system staff coordinated and directed the activities of the other Task Forces. The major
goal of the Task Forces was to define the external specifications of the system. This involved defining data
requirements, reports, forms and aiding in fine tuning the data base. The definitions that were developed are
listed in Appendix A.

The basic methodology of defining the external specifications can be summarized in a simple procedure:

    Step 1.   BWQM needs as expressed by individuals and/or the Task Forces were defined in a rigorous form in the
             specifications.

-------
FIGURE 2    WAMIS RELEASE II DEVELOPMENT CYCLE
(Diagram not reproduced. Legible labels: PENNSYLVANIA BUREAU OF WATER QUALITY MANAGEMENT; PROGRAMMING.)

-------
    Step 2.    The Specifications were published and the people involved determined whether their needs would be met by
              implementation of the specifications.

This procedure was repeated until each person was satisfied that the system as defined in the specifica-
tions would fill his particular needs. The final product is a specification of a system that will assist the Bureau
in performing its functions but that will not have traumatic impact as it becomes operational because the
user has been involved in the development of the specifications.

Design
The design of Release II WAMIS proceeded in a top-down manner (McGowan). This simply means that the
design was produced in levels. The first level, level 1, was the overall system design. Level 2 was a more
detailed design of the overall system documented in level  1. Design was performed in levels until sufficient
detail was provided for programming to be started. This process assures that the programs will work in the
total system.

The top-down process allowed continual monitoring by DER systems and Water Quality personnel to ensure
that the Bureau's priorities and interests took precedence in all design decisions. The process consisted of:

    Step 1.     The designers were authorized to proceed with a given part of the project. Initially, the project was the entire system
              design phase.

    Step 2.     The design was developed which resulted in the definition of sub-projects (levels).

    Step 3.     The design was evaluated and the design was approved or Steps 1 and 2 were redone.

    Step 4.     After approval, the newly defined sub-projects were classified as follows:

              Class 1.    Projects for which design was authorized. These then could proceed, independent of each other,
                        through Steps 1-4 for each project.

              Class 2.    Projects for which no  further  design was  required. They  were  ready for implementation when
                        authorized.

Implementation Plan
The implementation plan consists of two kinds of documents:

     Internal Specifications (IS) See Appendix B.

     Project Control Books (PCB) See Appendix C.

-------
The internal specifications contain the description of how the system was designed. They were prepared in
accordance with the WAMIS Doc. IS-STD Standard (Appendix B).

The project control books contain the instructions as to how the system was to be implemented. PERT
diagrams and event definitions were the  basic tools for directing  the implementation effort.  These
documents were prepared in accordance with the standard: WAMIS Doc. PCB-STD (see Appendix C).

The implementation plan was divided so that the information required to maintain the system (IS) would not
be confused with the information required to control the original  development (PCB). After the system is
operational, the internal specifications become the so-called "documentation" of the system - i.e. a guide
to the people responsible for maintenance and operation of the system in the future.

Implementation
As all design tasks were finished and authorized to be implemented, the implementation proceeded as
follows:

    Step 1.     The project control books for all levels directly superior to the authorized project were obtained by the programmers
              and used to find the description of events, due dates, etc.

              The project control books were used to identify the complete definition of the authorized project.

    Step 2.     References to the internal  specification documents were also obtained. These contain definitions of routines
              (programs, macros, subroutines, etc.) and records (files, tables, work areas).

There are two kinds of implementation activity depending on the class of the project:

    Class 1.    Projects which have an approved design are implemented in strict accordance with the design, preferably by the
              designer or under his leadership. In the case of Release  II, most of the programming and other implementation
              activities were done by the designer of the project.

    Class 2.    Projects requiring no design approval can  be implemented in any  manner consistent with good professional
              practices and general standards.

System Specifications
WAMIS Release II consists of the following components which are independent of the computer installation
that will be used to operate the system:

    1.    A general introduction document (WAMIS Doc. Gen, Appendix A). This document is designed to be updated as required so
         that it always presents a current overview.

-------
    2.   The External Specifications as described in Appendix A.

    3.   The Internal Specifications (Appendix B).

    4.   The project control book (Appendix C).

WAMIS Release II, as implemented by BWQM, is operating on the following configuration:

    Univac DMS-1100 software

    Univac 1110 hardware configuration.

WAMIS Release II consists  of  the following components which are developed to  fit the configuration
defined above:

    Parameters, instructions (i.e. programs), data, etc. required to cause the configuration to perform in accordance with the external
    specification (in the form of cards, listings, etc.).

    Internal specifications describing the above in a context suitable for maintenance.

    Users Manuals. These, combined with (1) the external specifications and (2) the user manuals for the DMS-1100 and the UNIVAC
    1110 configurations contain the information required by DER to effectively use the system.

-------
                                         SECTION  IV
                           THE  EXTERNAL SPECIFICATIONS
Description
The external specifications describe the system in terms of its inputs, outputs and storage content (data
base). They do not contain any design decisions such as what computers (if any) or what kind of storage
media (disk, tape) will be used.

The purpose in preparing a system definition which is independent of method of implementation is twofold:

    1.   Potential users of the system have a description of what the system will do for them, which they can evaluate, as
        opposed to a description of how the system will operate, which they cannot evaluate without experience on the system.

    2.   Designers can concentrate on a well-defined problem, producing the best solution within the budgeting and physical
        constraints imposed.

The specifications will be kept up-to-date after the system is operational. If events should force the redesign
of the system for an alternate configuration, the external specifications ensure compatibility. The external
specifications can be used to develop a similar system for other water quality management organizations
regardless of the hardware configuration they may wish to use.

A detailed description of the WAMIS Release  II External Specifications is provided in Appendix A. The
external specifications of WAMIS Release II are contained in various formal documents beginning "WAMIS
Doc. ES-II" as identified in the following discussions. The WAMIS Doc. Gen section introduces and describes
the external specifications; therefore, it should be considered a part of the external specifications and
prerequisite reading for anyone using the external specifications. This section introduces the technical
terminology required by the reader and introduces the individual parts of the external specifications.

Constructs
The data base contains information about places, people, projects, measurements, etc. as specified in the
External Specifications (Appendix A). The totality of information retained about any single person, place,
measurement  etc.  is hereafter called a "construct", consistent with  the use of the term  in physics and
philosophy. A construct  is  the  memory or conceptualization  of a real or imagined entity  (Infotech
International Limited).

The term "construct" is preferred over more traditional terms such as record, repeating group, or entry in an
array  because the latter tend to describe physical storage areas. The purpose of the external specifications
is to specify what data the system is to retain, not the detailed technical methodology to be employed in the
retention process.

-------
As  shown in Chart  1, the  input  forms are edited  and result  in the  DBMS performing  the following
transactions regarding constructs:

    ADD       A new construct is to be placed in the data base.

    MODIFY   An existing construct is to be modified.

    DELETE   An existing construct is to be removed from the data base.

    RENAME   The identity of an existing construct is to be changed.

When it was decided that information about facility inspections was to be retained, a new construct type, FI,
was defined. When  the information regarding the occurrence of a given inspection is  entered,  a new
construct (the given inspection) is thereby added.
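
The four construct transactions can be illustrated with a minimal Python sketch. It is an illustration only and
not the DMS-1100 implementation: the dictionary standing in for the data base, the function names, the
identities "INSP-001" and "INSP-002" and the data element names are all assumptions made for the example, and
the type code FI is our reading of the facility inspection construct type.

```python
# Hypothetical in-memory "data base": (construct type, identity) -> data elements.
data_base = {}

def add(ctype, identity, elements):
    """ADD: place a new construct in the data base."""
    data_base[(ctype, identity)] = dict(elements)

def modify(ctype, identity, changes):
    """MODIFY: change non-identity data elements of an existing construct."""
    data_base[(ctype, identity)].update(changes)

def delete(ctype, identity):
    """DELETE: remove an existing construct from the data base."""
    del data_base[(ctype, identity)]

def rename(ctype, old_identity, new_identity):
    """RENAME: change the identity; the data elements move with it."""
    data_base[(ctype, new_identity)] = data_base.pop((ctype, old_identity))

# Entering the occurrence of one inspection adds one construct of type FI.
add("FI", "INSP-001", {"inspector": "unknown", "result": "pending"})
modify("FI", "INSP-001", {"result": "satisfactory"})
rename("FI", "INSP-001", "INSP-002")
```

The sketch assumes every transaction is valid; the rejection of invalid transactions is a separate concern of the
edit phase.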

Relationships
The basic information retained in the data base is:

    1.    There exist certain constructs (facilities, enforcement actions, municipalities, etc.).

    2.    These constructs have certain characteristics (data elements); e.g., facilities have a location, a daily flow, etc.

    3.    There are relationships between certain constructs.

Some examples of relationships are:

    "A plan is associated with a demand center."

    "An establishment is responsible for a facility."

    "An operator works at a facility."

The general practice is to confuse relationships with data elements. Thus, one might define for facilities a
data element called "responsible  establishment". This works to a certain extent, but is basically wrong. It
obscures the fact that a significant complexity of the  system exists. The external specifications of WAMIS,
therefore, contain  definitions of relationships in lieu of defining data elements for one construct as being
links or pointers to another.  In this example, a relationship "establishment is responsible for a facility" was
defined.

The relationships in WAMIS Release II are specified under both constructs entering into the relationship. A
description of the characteristics of each relationship  is found in Appendix A.
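
The difference between a relationship and a pointer-style data element can be sketched in Python. This is a
hedged illustration, not the WAMIS design: the Relationship class, the helper names and the identities "E-1",
"F-1" and "F-2" are hypothetical. The relationship is stored as its own object, so it can be looked up from
either construct entering into it, and it is added or deleted without modifying either construct.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Relationship:
    name: str
    left: tuple   # (construct type, identity)
    right: tuple  # (construct type, identity)

# Relationships are kept apart from the data elements of either construct.
relationships = set()

def add_relationship(name, left, right):
    relationships.add(Relationship(name, left, right))

def delete_relationship(name, left, right):
    relationships.discard(Relationship(name, left, right))

def partners(name, construct):
    """Constructs linked to `construct` by `name`, looked up from either side."""
    found = []
    for rel in relationships:
        if rel.name != name:
            continue
        if rel.left == construct:
            found.append(rel.right)
        elif rel.right == construct:
            found.append(rel.left)
    return sorted(found)

establishment = ("ES", "E-1")
add_relationship("is responsible for", establishment, ("FC", "F-1"))
add_relationship("is responsible for", establishment, ("FC", "F-2"))
```

With a "responsible establishment" data element on the facility, the second lookup direction (all facilities of
an establishment) would require scanning every facility; here both directions use the same stored object.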

-------
Natural Hierarchy
A special relationship exists between certain constructs; this is an identity subordination. For example, a
sample (SM) is related to a facility (FC) by the fact that in order to identify a sample, the corresponding
facility must be identified. When a construct must be logically related to another construct to be meaningful,
it is described as being in Natural Hierarchy. If a construct is in Natural Hierarchy to another, the identities of
both constructs must be provided to retrieve  the subordinate construct. A relationship between the two
constructs could have been defined.  However, it is desirable to minimize the number of relationships in the
system. Insofar as possible, constructs are  placed in a natural hierarchy. In the case of a water quality
sample, the hierarchy is:

    FC - facility

         SM - sample

             CP - custodial period

             SC - substrate component

             VL - value

Therefore, to identify the pH of a sample at a facility, the identity of FC, SM and VL would have to be given.

The External Specifications show the natural  hierarchy of all constructs.
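
The identification rule for subordinate constructs can be sketched as follows. This is a minimal Python
illustration under stated assumptions: the PARENT table encodes only the water quality sample hierarchy shown
above, with CP, SC and VL taken to sit directly under SM (as the pH example suggests), and the function names
and sample identities are hypothetical.

```python
# Parent of each construct type in the natural hierarchy; None marks the top.
PARENT = {"FC": None, "SM": "FC", "CP": "SM", "SC": "SM", "VL": "SM"}

def identity_path(ctype):
    """Construct types whose identities must be given, from the top down."""
    path = []
    while ctype is not None:
        path.append(ctype)
        ctype = PARENT[ctype]
    return path[::-1]

def full_key(ctype, identities):
    """Build the complete key for a construct, or fail if an identity is missing."""
    needed = identity_path(ctype)
    missing = [t for t in needed if t not in identities]
    if missing:
        raise ValueError(f"missing identities for: {missing}")
    return tuple((t, identities[t]) for t in needed)

# Identifying the pH of a sample at a facility needs FC, SM and VL.
key = full_key("VL", {"FC": "F-1", "SM": "S-7", "VL": "pH"})
```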

In the case of the most simple report, a user selects constructs of a given type (facilities, for example),
orders and displays the desired information. It is also possible to reference, in the same report, constructs of
other types (samples, unit processes, etc.) provided they are in natural hierarchy.

Data Elements
For each construct type there is a list of its attributes called "data elements" contained in WAMIS Doc.
ES-II-DED (Data Element Dictionary, Appendix A). Each data element is assigned an arbitrary identity number
such as FC92477 (facility name) and is described according to type, length, etc.

The types of data elements are:

    A/N - (alpha/numeric) - This means a sequence of characters as they were input without processing of any kind except to sort for
    reports. This is data such as names, addresses, descriptions, and other strings of letters.

    NUM - (numeric) - This means a number. The value is assumed to have units and a decimal location of some kind.

-------
   CODE - This means the value must be one of a specific list. It is planned that for each code value there will be a corresponding
   expansion for reports defined in WAMIS Doc. ES-II-CODES (Appendix A).

   DATE - Defined with a format (DDMMYY, DDDYY, etc.).

   ID - Identity data element.

The data element identity number is used in all retrieval requests.
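
A type check per data element might look like the following Python sketch. It is illustrative only: the function
name, the treatment of ID as "any non-empty string" and the use of Python date formats ("%d%m%y" standing
for DDMMYY) are assumptions, and the real code lists are those of WAMIS Doc. ES-II-CODES.

```python
from datetime import datetime

def conforms(value, etype, codes=(), date_format="%d%m%y"):
    """Return True if the input string `value` meets the standard for `etype`."""
    if etype == "A/N":
        return True                  # kept exactly as input, no processing
    if etype == "NUM":
        try:
            float(value)
        except ValueError:
            return False
        return True
    if etype == "CODE":
        return value in codes        # must be one of a specific list
    if etype == "DATE":
        try:
            datetime.strptime(value, date_format)
        except ValueError:
            return False
        return True
    if etype == "ID":
        return value != ""           # assumption: an identity may not be blank
    raise ValueError(f"unknown data element type: {etype}")
```

For example, conforms("12A", "NUM") is False while conforms("128", "NUM") is True.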

Identities
The construct VL (value) contains only three data elements called PARAMETER, FINDING and REMARK.
At least one of these data elements must be an identity in order for a transaction (ADD, MODIFY, DELETE
or RENAME) to take place. In the construct VL, the identity element is PARAMETER. The non-identity data
elements (FINDING and REMARK) can be changed by the MODIFY transaction. If an identity element is to be
changed, a special transaction type, RENAME, must be performed. Depending on the system methodology,
a RENAME can be simple or complex. It could, for example, result in records being moved from one part of a
file to another part of the file.

All identity elements have a description, for example, "(05VLSM)". This example says: "This is an identity
data element for the construct type VL. It is element 05 in a sequence of identities. Element 04 is an element
of construct type SM."
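
One possible reading of this description format is sketched below in Python. The layout assumed here (a
two-digit sequence number, the two-letter construct type of the element, then the two-letter construct type of
the preceding element) is inferred from the single example "(05VLSM)" and is not confirmed by the
specifications; the function and field names are hypothetical.

```python
import re

# Assumed layout: "(" + 2-digit sequence + own type + previous element's type + ")".
_DESCRIPTION = re.compile(r"\((\d{2})([A-Z]{2})([A-Z]{2})\)")

def parse_identity_description(text):
    """Split an identity description such as "(05VLSM)" into its parts."""
    match = _DESCRIPTION.fullmatch(text)
    if match is None:
        raise ValueError(f"not an identity description: {text!r}")
    sequence, own_type, previous_type = match.groups()
    return {
        "sequence": int(sequence),
        "construct_type": own_type,
        "previous_sequence": int(sequence) - 1,
        "previous_construct_type": previous_type,
    }

parsed = parse_identity_description("(05VLSM)")
```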

The External Specifications provide the complete list of construct types and identities.

Forms
Input to the system is on working forms whenever possible and is interpreted by the edit phase of the
system. The basic content of all forms is given in WAMIS Doc. ES-II-FORMS. The detailed layout of the
form will not be shown because the system must be independent of layout up to final implementation. When
input forms must be designed, they are designed to be the form that BWQM personnel use for information
record keeping during their working routine (working forms). This reduces the work of transcribing data from a
source document to a special input form.

Input for Release II will be collected on working forms and entered via terminals.  The data is edited and
errors (invalid fields, etc.) are communicated to the data collector for correction. These corrections can  be
mixed with ordinary subsequent input as desired. The Data Base Administrator decides when  the edited
data is to be entered in the data base.

-------
All forms will be read and edited resulting in the basic update transactions: ADD, MODIFY, DELETE or
RENAME a construct and ADD or DELETE a relationship.

Transactions will be rejected if they call for creation of a construct or relationship that already exists, the
modification or deletion of one that does not exist, or have invalid identity elements. Individual data
elements will be rejected if they do not conform to the standards for their type (for example, VALUE = 12A
instead of 128, where VALUE must be a number).
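
The transaction-level rejection rules can be sketched in Python as a pure function over a batch of input. This
is a simplified illustration, not the WAMIS edit phase: it covers only ADD, MODIFY and DELETE, treats a
blank identity as the only kind of invalid identity, and the function names and facility identities are
hypothetical.

```python
def reject_reason(action, key, known):
    """Return None if the transaction is acceptable, else the reason for rejection."""
    ctype, identity = key
    if not identity:
        return "invalid identity element"
    if action == "ADD" and key in known:
        return "already exists"
    if action in ("MODIFY", "DELETE") and key not in known:
        return "does not exist"
    return None

def edit_batch(transactions, existing):
    """Split a batch into accepted transactions and rejects with reasons."""
    known = set(existing)            # work on a copy; the data base is untouched
    accepted, rejected = [], []
    for action, key in transactions:
        reason = reject_reason(action, key, known)
        if reason is not None:
            rejected.append((action, key, reason))
            continue
        accepted.append((action, key))
        if action == "ADD":
            known.add(key)
        elif action == "DELETE":
            known.discard(key)
    return accepted, rejected

accepted, rejected = edit_batch(
    [
        ("ADD", ("FC", "0001")),     # already in the data base
        ("ADD", ("FC", "0002")),
        ("MODIFY", ("FC", "0003")),  # never added
        ("DELETE", ("FC", "0002")),
        ("ADD", ("FC", "")),         # blank identity
    ],
    existing={("FC", "0001")},
)
```

Returning the rejects with their reasons, rather than raising on the first error, mirrors the practice of
communicating all errors to the data collector for correction.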

No attempt will be made to screen out illogical data, although data will be checked for length, type and valid
codes. Instead, reports will be produced showing the information that was input. Errors will be corrected by
updates. The reason for this is that the data entry clerk should not receive rejects which are not a result of
typing errors. The latter will usually result in invalid data, which will be screened out. If it is found to be
desirable to have logic checks, the external specifications will be modified to show, for each data element,
the logical requirements it must meet. BWQM has found through past experience that logic checks are not
very useful except in such simple cases as where a date field is checked to be numeric.

Reports
The principal outputs of Release II WAMIS are user-specified reports, most of which will be the result of ad
hoc queries. A large number of people are to be trained in the method for obtaining information from the
system, which, for the simplest case, is:

    SELECTION - For a given construct type, name the criteria that each individual construct must meet to be selected.

    Example: Select Facilities, Type = Sewage Treatment Plant.

    ORDER - Name the data elements which control the order in which the selected constructs are to be listed.

    Example: Design Average Flow - descending.

    DISPLAY - Name the data elements to be shown on the report and describe report constants (headings, pagination) and the
    positioning of the output data.
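
The three-part report request can be sketched in Python. This is an illustration of the SELECTION, ORDER and
DISPLAY steps only, not the Report Writer: the report function, the readable data element names (the real
system refers to elements by identity number) and the sample facilities are all hypothetical.

```python
def report(constructs, select, order_by, descending, display):
    """Apply SELECTION, then ORDER, then DISPLAY to a list of constructs."""
    chosen = [c for c in constructs if select(c)]                     # SELECTION
    chosen.sort(key=lambda c: c[order_by], reverse=descending)        # ORDER
    return [tuple(c[name] for name in display) for c in chosen]      # DISPLAY

# Made-up facilities for the example.
facilities = [
    {"name": "Plant A", "type": "Sewage Treatment Plant", "design_average_flow": 1.5},
    {"name": "Mill B", "type": "Industrial Waste", "design_average_flow": 4.0},
    {"name": "Plant C", "type": "Sewage Treatment Plant", "design_average_flow": 3.2},
]

rows = report(
    facilities,
    select=lambda c: c["type"] == "Sewage Treatment Plant",
    order_by="design_average_flow",
    descending=True,
    display=["name", "design_average_flow"],
)
```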

It is possible to provide machine readable output for input to other systems (such as STORET).

There are systems outputs other than reports - these are edit messages, file maintenance reports, and lists
regarding the user-supplied report specifications (errors, predicted results, etc.).

WAMIS Release IIa will produce the reports listed in WAMIS Doc. ES-II-REPORTS under the headings POC
Module, PREP Module, and Enforcement Module. The constructs that are being implemented in support of
this are:

-------
   CY - County
        MM - Municipality
             AF - Administrative File
   ES - Establishment
        AP - Act 339 Payment
        NP - Need Project
   FC - Facility and/or Resource Sampling Station
        FP - Facility Permitted Value
        OR - Operations Report
             OD - Operations Report Day
                 OV - Operations Report Value
             OS - Operations Summary
   OM - Operations Report Parameter
   PP - PREP Project
        GR - Grant
        PE - PREP Event/Task
             SE - Sub-event/Task

A loss of capability due to conversion of facilities from Release I to Release II will be avoided by including
such reports as are provided at present if they are not made obsolete by Release II reporting capability.


Training and User Guides

User guides will be necessary to link the WAMIS information contained in the external specifications to the
hardware and software. These will be provided as a set of documents identified as WAMIS Doc. USER. The
exact list will be provided as part of this section when the design is complete.


The user guides are considered part of the system. Their preparation, their distribution and the initial training
in their use are considered part of implementation. User guides are part of the design of the appropriate
subordinate design module. See the edits and retrieval designs for examples (Appendix B, WAMIS IS).


General Documentation Standards

In order for system documentation to have continuing utility, there must be a method for communicating
changes. This section describes the method used for all WAMIS Release II  documents. The documents are
updated as they become obsolete so that at all times the formal documents contain the current information
required to use, evaluate, or control the system.


Each page published will begin with a heading containing the following elements:

-------
    Document identity (WAMIS Doc. xxx)

    Publication control number (level xxx)

    Pagination (Page xxx)

Each document will have a distribution list and a document status page. The purpose of the latter is to show
the publication history and the concurrence status.

The resulting library of documentation should contain all communication regarding Release II WAMIS. The
complete detailed documentation is found in the appendices.

Development Concerns

The major problems encountered in the development of the Release II system involved the length of the
project, the complexity of the data base, and the training of the system personnel in data base techniques.

The development time required from the finalization of the External Specifications to the implementation of
the Release IIa sub-system was approximately two and one-half years. A project of this length is difficult to
manage because of staff turnover, budget ramifications, and changes in priorities.

In order to reduce the effect of staff turnover on the project, extensive documentation and a modular
programming approach were used for systems development. This facilitated the training of new staff to a
productive level of systems work without incurring an extremely long training period (Maynard, J).

All of the critical systems development work  was performed with the Bureau systems  staff.  Therefore,
budget limitations had a limited effect on the project. If the development work on a project of  this size is
performed by contractors, changes in budget priorities and contract difficulties could adversely affect the
development of the system.

The Release IIa data base, which is now in operation, contains data required by all Bureau programs. This
avoids the situation of having a data base design for a special Bureau function, which  may fluctuate in
priority and, therefore, result in a system that is supporting a low priority activity.

The complexity of the Release IIa system was a problem because the DMS1100 data base system was
released by UNIVAC during our development activities. Therefore, the DMS1100 system had not been
implemented in many user sites, nor in support of a data base as complex as our design. This meant that
there were no DMS1100 users that Bureau systems staff could contact for advice. Being a ground-breaking
effort, Release II encountered numerous quirks and inadequate UNIVAC documentation for the DMS1100
system, forcing us to become involved in trial and error activities.

-------
A less significant problem, but still time consuming, was the retraining of our systems staff for data base
management system development. Most of our personnel had experience on large batch systems oriented
toward tape storage and sequential files. This meant that we lost significant time in retraining our people in
the area of random storage data base management system concepts and the UNIVAC DMS1100 operation.

As stated before, most of the above problems arose because of the length of the project. If an agency with
limited systems staff considers attempting a data base project of this magnitude, similar problems can be
expected. The impact of these problems could be reduced by breaking the project into discrete steps. Each
step would be implemented and placed into operation before development on the next step was started.
For example, our Release IIa system, which contains three major modules, could be broken down into three
steps. In this way, each development step would be less than a year's worth of effort, which would avoid
some of the problems with priority changes, budget, and personnel turnover.

An attempt to segment a  large project into smaller steps would  require good design and extensive
documentation to assure that all the pieces of the system would fit together when all of the steps are
completed.

-------
                                    BIBLIOGRAPHY

Data Base Systems, Infotech International Limited, Nicholson House, Berkshire, U.K., 1975

Demonstration of a State Water Quality Management Information System, EPA Publication 600/5-74-022

Modular Programming, Maynard, J., Petrocelli Books, New York, 1972

Top-Down Structured Programming Techniques, McGowan, Clement L. and Kelly, John R.,
Petrocelli/Charter, New York, 1975

-------
                                   TECHNICAL REPORT DATA

 1. REPORT NO.: EPA-600/5-78-007
 3. RECIPIENT'S ACCESSION NO.:
 4. TITLE AND SUBTITLE: Data Base System For State Water Quality Management Information System
 5. REPORT DATE: 11/77 (Date of Preparation)
 6. PERFORMING ORGANIZATION CODE:
 7. AUTHOR(S): John Kitch, Associated Staff
 8. PERFORMING ORGANIZATION REPORT NO.:
 9. PERFORMING ORGANIZATION NAME AND ADDRESS:
         Commonwealth of Pennsylvania
         Department of Environmental Resources
         Bureau of Water Quality Management
         Harrisburg, PA 17120
10. PROGRAM ELEMENT NO.:
11. CONTRACT/GRANT NO.: S-801000
12. SPONSORING AGENCY NAME AND ADDRESS:
         Office of Air, Land and Water Use
         Office of Research and Development
         U.S. Environmental Protection Agency
         Washington, D.C. 20460
13. TYPE OF REPORT AND PERIOD COVERED: Final
14. SPONSORING AGENCY CODE: EPA/600/16
15. SUPPLEMENTARY NOTES:
         Appendixes will be updated periodically.
         Source programs are available from performing organization.
16. ABSTRACT:
         This report describes the WAMIS Release II Data Base Management System
         as developed by the above performing organization. It includes System
         Design, Development Procedures, Overview of Data
         and a discussion of problems and recommendations. The appendixes, which
         are available from the performing organization, contain the System detail.
17. KEY WORDS AND DOCUMENT ANALYSIS
     a. DESCRIPTORS: Data Base Management System; Top Down Design; Water Quality Information Management
     b. IDENTIFIERS/OPEN ENDED TERMS: Information Sciences; Automatic Indexing; Documentation
     c. COSATI Field/Group: 05/B Behavioral and social sciences
18. DISTRIBUTION STATEMENT: Release Unlimited
19. SECURITY CLASS (This Report): Unclassified
20. SECURITY CLASS (This page): Unclassified
21. NO. OF PAGES: 27
22. PRICE:

EPA Form 2220-1 (9-73)
U.S. GOVERNMENT PRINTING OFFICE: 1978-260-880/104

-------