FEASIBILITY STUDY: USE OF USER PROMPTED GRAPHICS
DATA EVALUATION (UPGRADE) SYSTEM BY EPA
Prepared for:
Environmental Protection Agency
and
President's Council on Environmental Quality
under
Contract EQ7AC021
Prepared by:
Automation Industries, Inc.
Vitro Laboratories Division
14000 Georgia Avenue
Silver Spring, Maryland 20910
November 1978
-------
FOREWORD
The User Prompted GRAphic Data Evaluation (UPGRADE) System provides
an on-line, graphic, statistical analysis tool for a wide spectrum of
multi-media environmental data. This feasibility study of its potential
use in EPA defines a user community, its experience in utilizing UPGRADE
to assist in data analysis, summarizes the findings and conclusions, and
recommends a specific course of action.
The Management Information and Data Systems Division (MIDSD) has
reviewed the study in draft form, and has concurred with the recommen-
dation that UPGRADE be co-sponsored by EPA. Furthermore, MIDSD
recommends that UPGRADE be used by the present set of interested users
on the NIH-DCRT computer complex, until EPA's requirements either exceed
the usage threshold at NIH or the graphics analysis support capabilities
and requirements evolve in other directions. A review and update of this
feasibility study by August 1, 1979, to determine alternatives for use
of UPGRADE in FY80, was recommended.
-------
ACKNOWLEDGEMENTS
Key contributors to this feasibility study include: from EPA,
Dr. Lance Wallace, Dr. Wayne Ott, Dr. James Reisa, Mr. Elijah Poole,
and Mr. Charlie Poole; from CEQ, Dr. Douglas Buffington; from Vitro
Laboratories, Messrs. Mark Dorlester, Joseph Higgins, Hartley Holte,
Alan Rundquist, Gary Sitek, Char-1-es Wellander, and John Terebey; and,
Mr. Larry Milask and staff from Sigma Data Computing Corporation.
-------
TABLE OF CONTENTS
I. MANAGEMENT SUMMARY 1-1
A. Background and Objective 1-1
B. UPGRADE 1-1
C. Comparison to EPA Systems 1-1
D. User Needs 1-2
E. Usage Summary 1-5
F. Alternative Solutions 1-6
G. Evaluation of Alternative Solutions 1-6
H. Recommendation 1-7
I. Project Plan Outline 1-7
II. NEEDS ANALYSIS AND EVALUATION CRITERIA II-l
A. Mandates II-l
B. UPGRADE II-3
C. Program and Computer Environments II-7
D. User Requirements II-8
E. Ranked Major Outputs Required from UPGRADE 11-17
F. UPGRADE Acceptance Criteria 11-17
G. Summary of Savings and Benefits 11-20
III. FEASIBLE ALTERNATIVE SOLUTIONS III-l
A. OPTION I - EPA Direction of UPGRADE III-l
B. OPTION II - EPA Co-Sponsorship of UPGRADE III-3
C. OPTION.Ill - EPA Limited Use of UPGRADE III-3
D. OPTION IV - No EPA Involvement/Use of UPGRADE III-3
E. UPGRADE Changes III-3
F. Major Benefits III-5
G. Functional Advantages and Disadvantages III-5
IV. EVALUATION OF ALTERNATIVE SOLUTIONS IV-1
A. Cost Assumptions IV-1
B. One-Time Costs IV-1
C. Recurring Costs IV-3
D. Development Lead-Times IV-5
E. Cost/Benefit and Cash Flow Analysis IV-6
F. System Acceptance Ratings IV-6
V. RECOMMENDATIONS V-l
A. Initial Recommendation V-l
B. Subsequent Recommendation V-l
C. Time Phasing of Enhancements V-l
D. Project Plan V-l
-------
LIST OF APPENDICES
APPENDIX PAGE
A HISTORY OF UPGRADE A-l
B COMPARISON OF UPGRADE TO OTHER
EPA SYSTEMS B-l
C INTERAGENCY AGREEMENT BETWEEN
THE COUNCIL ON ENVIRONMENTAL
QUALITY CCEQ) AND THE ENVIRONMENTAL
PROTECTION AGENCY CEPA) C-l
D UPGRADE EVALUATION REPORTS D-l
E CONVERTING UPGRADE TO OTHER SYSTEMS E-l
F POSSIBLE CORE SAVINGS IN UPGRADE F-L
G UPGRADE REPORTS G-l
ii
-------
LIST OF FIGURES
NO. TITLE PAGE
1-1 SUMMARY 1-4
II-l COMPARISON OF EXISTING EPA SYSTEMS II-6
II-2 USER SUMMARY II-9
II-3 USER SUMMARY 11-10
II-4 USER SUMMARY 11-11
II-5 INTERFACING STORET DATA TO UPGRADE 11-12
II-6 INTERFACING STORET DATA TO UPGRADE 11-13
II-7 DATA BASE COMMENTS 11-15
II-8 DATA BASE COMMENTS 11-16
II-9 UPGRADE COMMENTS 11-18
11-10 UPGRADE COMMENTS 11-19
III-l UPGRADE CURRENT USES III-7
III-2 UPGRADE FY 79 FUNDED USES III-8
III-3 UPGRADE PROJECTED USES 111-10
IV-1 LIFE CYCLE ONE-TIME COSTS (IN $1,000) IV-4
IV-2 DEVELOPMENT LEAD-TIMES IV-7
IV-3 DEVELOPMENT LEAD-TIMES IV-8
IV-4 DEVELOPMENT LEAD-TIMES IV-9
IV-5 OPTION I ($000) ALTERNATIVE 1 (NIH) IV-11
IV-6 OPTION I ($000) ALTERNATIVE 2 (COMNET) IV-12
IV-7 OPTION I ($000) ALTERNATIVE 3 (RTF) IV-13
IV-8 OPTION I ($000) ALTERNATIVE 4 (GSA) IV-14
IV-9 OPTION II ($000) IV-15
IV-10 UPGRADE ACCEPTANCE CRITERIA IV-20
V-l PROJECT PLAN ALL USER NEEDS PROGRAMMING V-3
ill
-------
LIST OF FIGURES (Continued)
NO. TITLE PAGE
B-l TOTAL PHOSPHATES AT HURON RIVER STATION B-10
B-2 GRAPH BACKGROUND ELEMENTS B-ll
B-3 GRAPH DATA ELEMENTS B-12
B-4 ORIGINAL GRAPHS WITH TEXTUAL ANNOTATION B-13
B-5 EXAMPLES OF GRAPHICAL ANNOTATION B-14
B-6 POSSIBLE COMPOSITION LAYOUTS B-15
F-l OVERLAY STRUCTURE F-2
F-2 ROOT SEGMENT F-3
F-3 OVERLAY A F-4
F-4 OVERLAY C F-5
F-5 OVERLAY D F-6
iv
-------
SECTION I
-------
I. MANAGEMENT SUMMARY
A. Background and Objective
Since its establishment in 1970, EPA has placed priority on the develop-
ment and acquisition of environmental data bases to provide a basis for
analysis of environmental status and trends. Much data has thus been compiled
to date, and the current emphasis is to develop tools to support analysis of
these data. In particular, tools are needed to support technical and policy-
level analyses and involving multi-media (e.g., air vs. specific health effects)
Such tools must meet a wide range of criteria - from management access and
confidence to rapid response and cost-effectiveness to proven analytical
accuracy - if their application to EPA needs is to be both successful and
practical.
The objective of this study is to determine the feasibility of adapting
CEQ's User Prompted GRAphic Data Evaluation (UPGRADE) system for EPA use to
satisfy these needs. UPGRADE has been successfully used by CEQ since 1975,
and has been adapted for use in support of environmental analysis needs of
other federal and state agencies. An Interagency Agreement between EPA and
CEQ provides specific guidelines for this study (see Appendix C).
B. UPGRADE
UPGRADE (User Prompted GRAphic Data Evaluation) is a versatile system for
analyzing computer stored data on the environment, natural resources, public
health, and related topics. Employing ordinary English language instructions,
step-by-step analysis, and graphic display, UPGRADE is designed for effective,
immediate use by managers and scientists with little or no computer training.
UPGRADE attempts to interactively guide and prompt the user by English lan-
guage instructions and responses through all phases of data selection, proces-
sing, and display. Most UPGRADE users have been able to prepare their own
"presentation-ready" copy of analysis results after less than an hour's demon-
stration of the system. UPGRADE is presently being used for environmental and
health studies at CEQ and has been planned for future use by other Federal,
State, academic, and private organizations.
C. Comparison to EPA Systems
The major EPA data gathering and data base systems consist of SAROAD and
STORET. EPA is also planning to install an interactive graphic system called
ADROIT to work in conjunction with the STORET data base. A detailed compari-
son of these systems with UPGRADE revealed the following:
SAROAD - A batch-data entry, storage, and retrieval system for air data
STORET - A batch system with some graphics and analysis capabilities, for
water data
ADROIT - An interactive graphic system geared for water data that could
add other media data for cross-correlation. It has no mapping
capability and is more oriented to a computer trained user. This
system is presently being rewritten so that it can run on the
COMNET computer.
1-1
-------
UPGRADE - An interactive graphic system designed for cross-correlation of
various media data. It has mapping capabilities and is geared to
the non-computer trained EPA researcher.
UPGRADE was found to provide the most capability and required the least
amount of training for the non-computer trained EPA researcher. However,
UPGRADE would not replace these existing systems; it would supplement EPA's
analysis capabilities by adding the ability to correlate environmental data to
health data (i.e., multi-media capability). Cost-effectiveness is highly
favorable with respect to alternatives.
D. User Needs
EPA's evaluation of UPGRADE was actively conducted by six individual
users. Two other potential users have identified requirements and another is
using UPGRADE on a trial basis. One of the six evaluators is now actively
using UPGRADE under a separate IAG with CEQ. Each evaluation report has been
included in Appendix D. Other EPA evaluators have either expressed no inter-
est or their evaluations were not sufficiently detailed to permit quantitative
analysis. These comments are included in Appendix G. The evaluations brought
a variety of results, notably that UPGRADE increases the user's analysis
capabilities at equal or significantly reduced cost. However, each user has
indicated a need or desire for some form of improvement or enhancement to the
UPGRADE system as a whole. These requirements were categorized as:
Essential - User cannot effectively use UPGRADE without the capability
Necessary - Would make the use of UPGRADE easier, should be incorporated
as money and priority allows
Desirable - Nice to have, would enhance the system
1-2
-------
Those requirements termed "Essential" are listed below, along with re-
questing EPA office, and a CEQ User Support Group time and cost estimate:
CEQ Estimate
Requirements
Data compression or
restructured Data Base
Refine Terse Mode
Internal Data
Management System
Limited Batch Mode
Moving Means
Multiplot
All Essential requirements
other sponsors of UPGRADE
months .
Requester
OAQPS/RTP and Region X
ORD/OMTS/Las Vegas
ORD/OMTS/Las Vegas and
Region X
ORD/HQ
Region X
Region III and ORD/HQ
TOTAL
Cost Time
$ 7,500 4-6 weeks
10,000 5-8 weeks
5,000 4-5 weeks
Available in FY 79
10,000 8 weeks
10,000 8 weeks
$42,500
except "Moving Means" have already been funded by
and are scheduled for implementation in the next few
Those requirements termed "Necessary" are:
Requirements
Improved STORET
Interface
Improved SAROAD
Interface
Data Extraction from
Tape Storage
Super Terse Mode
(Full Batch Capabilities)
Add User Analysis
Requester
ORD/OMTS/Las Vegas and
Region X
OAQPS/RTP and Region X
and III
ORD/HQ
ORD/OMTS/Las Vegas and
OAQPS/RTP, and ORD/HQ
ORD/OMTS/Las Vegas and
CEQ Estimate Yr.
Cost Time Planned
$15,000 12 weeks 1
15,000 12 weeks 1
3,000 2 weeks 1
35,000 24 weeks 2
9,000 4 weeks 2
Routines
3D Plots
OAQPS/RTP
Region III
Evaluate Available
Packages
1-3
-------
SUMMARY
wr
System Acceptance Rating
(0-600)
Total Present Value Net Savings
($000) (LIFE CYCLE)
Total Present Value Costs
($000) (LIFE CYCLE)
RECOMMENDATION
OPTIONS
I
EPA DIRECTION OF UPGRADE
ALTERNATIVES
1
NIH
417
210.8
2381.3
2
COMNET
373
(2040.5)
4632.6
3
RTP
315
(1455.7)
4047.8
4
GSA
429
(1439.7)
4031.8
II
EPA CO-
SPONSORSHIE
OF UPGRADE
505
1099.7
1492.5
X
III
EPA LIMITED
USE OF
UPGRADE
385
Not Applicable
User Dependent
(30.8/User/Yr)
IV
NO EPA IN-
VOLVEMENT/USE
OF UPGRADE
80
Not Applicable
Not Applicable
Figure 1-1
-------
Requirement
Increase Plot Size
Add User Defined
Models
Contour Plots
Data Save
Capabilities
Requester
OAQPS/RTP
ORD/OMTS/Las Vegas
Region 111
ORD/HQ and
ORD/OMTS/Las Vegas and
Region III
CEQ Estimate
Cost Time
7,500
20,000
6 weeks
16 weeks
Yr.
Planned
1
2
Available in FY79
30,000 24 weeks
TOTALS Yr
Yr
1
2
$ 70,500
64,000
TOTAL $134,500
Those requirements termed "Desirable" are:
Requirement
Overlying Points on
Graph
Improved Alphanumeric
Axis Description
Interactive SPSS
E. Usage Summary
Requester
ORD/HQ
ORD/HQ
ORD/HQ
CEQ
Cost
$10,000
5,000
15,000
Estimate
Time
8 weeks
4 weeks
12 weeks
Yr.
Planned
3
3
3
TOTAL $30,000
UPGRADE, being an operational system, is being used now by the CEQ, DOE,
and EPA to correlate aspects of water and air quality to mortality rates, to
produce demographic maps, and to accomplish other related studies. During FY
79, the planned and funded UPGRADE uses include regional Health and Energy
facility siting studies, CEQ air quality emission data analysis, USGS NASQAN
biological data analysis, and EPA/OTS mortality studies.
The addition of more and varied data for prospective users would invite
many more studies and outputs. Typical data additions would include: Ambient
Air, Multiple causes of death, Morbidity, Drinking water, Altitude, Phytoplank-
ton, Hanes II, and Wisconsin's Industrial Effluent data. These data added to
other data accessible by UPGRADE could produce: maps on air quality for dif-
ferent pollutants and aggregations, variations in contributing causes of death
with water and air quality, and morbidity; studies on drinking water versus
cancer, altitude versus heart disease, and phytoplankton distributions and
indices of trophic state; analysis of CO exposure in the blood and CO levels
in breath samples; and correlations of source and ambient water data in the
1-5
-------
Wisconsin area. Cost estimates have been made for the first three of the
above data additions. The cost averages $55,000 per data type.
F. Alternative Solutions
There are four options for EPA related to the use of the UPGRADE system.
They are:
1. OPTION I - EPA Direction of UPGRADE
EPA would be the main UPGRADE sponsor or would take a version of
UPGRADE for its own use. For this option, EPA requires a minimal User Support
Staff of six. There are hardward alternatives associated with this option.
They are: to remain on the NIH-DCRT computer; move to the EPA-COMNET computer
(requiring 20% reprogramming of UPGRADE); move to the EPA-RTP computer
(requiring 80% reprogramming of UPGRADE); or move to a commercial computer
(requiring no reprogramming). This option would best serve more than 15
dedicated EPA users.
2. 'OPTION II'- EPA Co-Sponsorship of UPGRADE
EPA would co-sponsor UPGRADE at a level that would entitle EPA to
heavily influence the development to meet EPA requirements. For this option,
EPA requires a dedicated UPGRADE coordinator and CEQ liaison attuned to EPA
requirements. The system would continue to run at NIH-DCRT. This option
would best serve a dedicated EPA user base of 5 to 15.
3. OPTION III - EPA Limited Use of UPGRADE
EPA would be a user of UPGRADE, only able to buy certain "goods"
from available stock, but not able to influence the development of UPGRADE.
This option would best serve a user base of 1 to 4.
4. OPTION IV - No EPA Involvement/Use of UPGRADE
EPA should exercise this option if no users are found for UPGRADE.
G. Evaluation of Alternative Solutions
Each option for satisfying EPA's requirements was evaluated on a cost
basis over a 5-year life cycle and against a set of 10 system acceptance
criteria. To perform the cost evaluation, assumptions were made from discus-
sions with EPA evaluators and the tests performed by the CEQ User Support
Group. The assumptions concerned percent of system use, average computer/an-
alysis time, average data requirements and outputs. Estimates were provided
by the CEQ User Support Group for system enhancements and conversions. The
costs are summarized by non-Recurring, Recurring and Net Cost Savings. The
quantification of benefits was difficult because a number of studies that were
performed as part of the individual EPA evaluations would not have been attemp-
ted without UPGRADE. This is due to the separate physical locations of the
data and the laborious process to read listings, extract values, perform
statistical analysis and plot the results. This intangible benefit has been
given a nominal cost value for comparison purposes in the Cash Flow Analysis.
The anticipated dedicated EPA user base was set at "9" based on the Evaluation
1-6
-------
reports. More less-dedicated users or less more-dedicated users will not
affect the cost analysis. The resulting total life cycle costs and savings
are summarized on Figure 1-1. As can be seen, Option 11 provides the greatest
Net Cost Savings for the given user base.
To generate a System Acceptance Rating, each criteria was defined and
weighted according to relative importance to EPA as a whole using the UPGRADE
system. Each option was then rated as to overall satisfaction of the criteria,
taking into account functional advantages and disadvantages, and the incor-
poration of user defined requirements. Figure 1-1 shows that Option II has
the highest System Acceptance Rating. The sensitivity of the rating is great-
est for the number of estimated EPA users, followed by Response Time Benefits
and Analysis Benefits.
H. Recommendation
Based upon the foregoing analysis, the most cost effective solution for
implementation of the UPGRADE system is Option II. It has the highest System
Acceptance Rating and the greatest Net Cost Savings for the presently anti-
cipated volume of users. Its cost savings exceed other options by at least
$350,000 over the five-year life cycle. Other options may be selected at EPA
descretion should the number of users significantly change. EPA, as a co-
sponsor of UPGRADE, may wish to foster a maintenance program as defined in
Appendix F, software configuration control, and exert influence to accomplish
EPA user requirements termed "Necessary" and "Desirable" in accordance with
the schedule in Section IV. In the event that UPGRADE must be moved off the
NIH-DCRT computer, EPA may wish to consider the alternate site analysis of
Option I and locate UPGRADE on a commercial service bureau such as the GSA-
Boeing computer site. A version may be retained on the NIH-DCRT computer as a
development system because of its relatively low cost.
I. Project Plan Outline
• Designate EPA Coordinator
• Put Initial Users on the System
• CEQ User Support Group accomplish Program Modifications
• EPA/CEQ prepare detail schedule for programming "Necessary"
and "Desirable" System Enhancements
• Write IAG for execution
1-7
-------
SECTION II
-------
II. NEEDS ANALYSIS AND EVALUATION CRITERIA
A. MANDATES
The logical follow-up to the National Environmental Policy Act (NEPA)
was the creation of a single agency to oversee the mandates of the Act.
Thus, the Environmental Protection Agency was established on December 2,
1970, through an executive reorganization plan, to consolidate certain
federal environmental activities set forth by NEPA. Since its inception,
EPA's mission has been defined, and re-defined, through enactment of the
following laws:
• The Clean Air Act Amendments of August 1977
• The Federal Water Pollution Control Act Amendments of 1972
• The Federal Insecticide, Fungicide, and Rodenticide Act
Amendments of 1972
• The Marine Protection, Research and Sanctuaries Act of 1972
("Ocean Dumping")
• The Noise Control Act of 1972
• The Toxic Substances Control Act of 1977
• The Resource Conservation and Recovery Act of 1977
• The Safe Drinking Water Act Amendments of November 1977, and
• The Clean Water Act Amendments of 1977
EPA's priority mandate for the first years of operation has been the
development and acquisition of environmental data bases to provide a base for
analysis of the environmental characteristics and trends. Tools to develop
analyses of these data, especially in areas of cross-correlation, such as
ambient environmental quality data with health effects data, have received
secondary attention. The need to examine many environmental variables to
identify potential cause-effect relationships is increasing.
The National Academy of Sciences has reported that most EPA environ-
mental data are not analyzed adequately and, when published, contain no
interpretation of the data. It further states that although EPA maintains
large environmental data banks, "Little effort has been made to relate these
data to data on health and ecological effects although opportunities exist
to do so, for example, with the following: data on morbidity and mortality
from the National Center for Health Statistics (NCHS); data on cancer from
the National Cancer Institute (NCI); ...."
Several additional reports have been written by the General Accounting
Office criticizing EPA's performance in and response to specific program
areas. Typical critiques include:
• AIR
"Federal Programs for Research on the Effects of Air Pollutants"
"Pollution from Cars on the Road-Problems in Monitoring Emission
Controls"
II-l
-------
• WATER
"Implementing the National Water Pollution Control Permit Program:
Progress and Problems"
"Better Data Collection and Planning Is Needed to Justify Advanced
Waste Treatment Construction"
"Problems and Progress in Regulating Ocean Dumping of Sewage
Sludge and Industrial Wastes"
"Continuing Need for Improved Operation and Maintenance of Municipal
Waste Treatment Plants"
"National Water Quality Goals Cannot Be Attained Without More
Attention to Pollution from Diffused or 'Nonpoint' Sources"
• RADIATION
"The Environmental Protection Agency Needs Congressional Guidance
and Support to Guard the Public in a Period of Radiation Prolifer-
ation"
"Efforts by the Environmental Protection Agency to Protect the
Public from Environmental Nonionizing Radiation Exposures."
• PESTICIDES
"Federal Pesticide Registration Program: Is It Protecting the
Public and the Environment Adequately from Pesticide Hazards?"
"Special Pesticide Registration by the Environmental Protection
Agency Should Be Improved"
• NOISE
"Noise Pollution - Federal Program to Control It Has Been Slow and
Ineffective"
• GENERAL
"Environmental Protection Issues Facing the Nation"
"GAO Reviews of Federal Environmental Research and Development"
The addition of appropriate data from these program areas to UPGRADE
could Increase the quantity and meaningfulness of data analysis that could
be performed by EPA's research staff by reducing the effort presently re-
quired to locate, peruse, and identify significant interactions, and lucidly
display the results.
The National Academy of Sciences, in the previously mentioned report,
has cited CEQ's use of UPGRADE as an effective method of responding to these
cross media analysis requirements.
II-2
-------
A need for rapid Investigation of available monitoring data on any
given pollutant has also persisted. High interest in a given pollutant by
Congress and the public and other requests for fast turn-around analyses
and maps, charts, graphics and other information aids will predictably
continue in the future.
An on-line interactive system with user prompting and rapid graphics
output would support these requirements.
B. UPGRADE
1. Definition
The CEQ undertook the responsibility of designing a new, versatile
computer system which would analyze computerized data relating to the environ-
ment, natural resources, public health, and related topics. Their end-product
is UPGRADE - User Prompted GRAphic Data Evaluation. One of the prime criteria
in the design of UPGRADE was flexibility. The system can be used with data
from any field of scientific endeavor that requires graphic display and/or
statistical analysis. The system was developed to:
• provide easier access to computerized environmental data
• make these data available to a larger portion of the nation's
scientists and managers
• facilitate more efficient and convenient environmental assessments
• foster increased uses for available environmental data
• provide better capabilities for identifying correlations between
factors represented in different computerized data (e.g., air
quality and health), and
• improve environmental research and data collection programs through
the insights and feedback provided by users of the UPGRADE system.
Although UPGRADE is designed to run on an IBM 360/370 computer equipped
with TSO CIBM's acronym for "Time Sharing Option"), other computer manufacturers
also have systems which perform the same functions as TSO, and UPGRADE can be
adapted to run under these systems.
The data presently available on the UPGRADE system have been selected from
several major Federal data banks which contain information relating to the
environment, natural resources, public health, and related topics.
The analytical and graphic features presently available on UPGRADE include
basic statistical summaries, data sorting and ranking, data transformations,
data partitioning, scatter plots, least squares regression with Nth-order
polynominal curve-fitting, polygon (point-to-point) plots, shaded bar graphs,
and shaded maps.
II-3
-------
UPGRADE also provides extensive plot-modification capabilities. The
user can interactively tailor any graphic display by altering the statistics
plotted, the axis scales or annotations, the plotted symbols, the ranges
and levels of shading, or a variety of other specifications. These features
were developed because it is not always possible to anticipate at the outset
what format or results will be most informative, and varying analytical
demands required flexible capabilities for visual presentation.
The graphic display and statistical results of UPGRADE analysis can be
produced immediately on an appropriate remote terminal video screen (with
optional hard copy capability) or a standard typewriter terminal. Automatic
sequencing options are also available, permitting the user to specify that
similar graphs be made for a selected series of variables or data sets.
UPGRADE data bases are carefully screened for adequacy of the variables
measured, measurement frequency, period of record, geographic location,
quality assurance, and other criteria, depending on the type and intended
use of each data. Thus far, most of UPGRADE'S data have been selected from
data banks maintained by the Environmental Protection Agency (primarily
SAROAD and STORET), the National Institutes of Health, the U.S. Geological
Survey, and various state agencies.
The designers of UPGRADE envision a community of users from a variety
of disciplines representing Federal Government, state and local agencies,
academic institutions, and private interests. As more data sets are inte-
grated into the system, they can be made available to all the users.
2. Use
CEQ and collaborating Federal agencies have been screening and selecting
data for use in UPGRADE since 1975. Since that time, UPGRADE has been used
to support:
• CEQ Annual Reports
• Other CEQ analysis for reports to the White House and/or OMB
• State of New Jersey environmental analysis
• CEQ reports supplemented to the Annual Report
• Department of Energy Atlas
• Mapping of NCHS cancer death rate data
Specific examples of its use are contained in Appendix A.
3. EPA Systems vs. UPGRADE Capabilities
In terms of EPA requirements for increased analytical capabilities,
existing EPA systems and a proposed EPA system were compared to UPGRADE.
II-4
-------
Figure II-l lists the requirements and the relative ability of each system to
satisfy those requirements. A detailed comparison of each system's capabil-
ities is contained in Appendix B.
The salient points of the comparison are that SAROAD is basically a
batch-data entry, storage, and retrieval system for air data. It has some
batch statistical capabilities. STORE! is a batch data entry, storage and
retrieval system for water data. It has some graphics and some batch analysis
capabilities. ADROIT is an interactive graphic system currently being re-
programmed by an EPA contractor to operate on the EPA-COMNET computer
primarily with water data. It has the ability to correlate water quality
data with other data. However, there are no computer run time statistics
available at this time for comparison, and it is geared for a computer-
trained user. UPGRADE has the capability to correlate data from various
media (including STORET and SAROAD), is fully interactive, has mapping capa-
bilities and is geared for the non-computer trained user. UPGRADE is
operational now, accessible to EPA users and contains a significant EPA-
applicable data base. UPGRADE does not supplant existing systems; it adds
more analysis capability to EPA researchers. STORET and SAROAD are EPA water
and air data bases. ADROIT adds water analysis capabilities and UPGRADE adds
the capability to correlate environmental data with health information.
There are two other interactive graphics systems, one by CALCOMP and the
other by Tektronix. These are EPA systems and are being compared to UPGRADE
by Region III personnel. Judged on the graphic features alone, the CALCOMP
system was rated higher, with UPGRADE ranked second. However, the results
indicate that UPGRADE is easier to use and costs about one-half of the other
systems. The evaluation report generated by Region III is included in
Appendix D.
An Interagency Agreement was signed by CEQ and EPA to analyze UPGRADE
capabilities and its ability to satisfy EPA needs. The remainder of this
study, therefore, is a cost benefit analysis on the use of UPGRADE.
4. Interagency Agreement
Executed in September 1977, the Interagency Agreement between the
Council on Environmental Quality and the Environmental Protection Agency
(See Appendix C) provides that EPA will join CEQ and DOE as a co-sponsor in
the development of the UPGRADE system, sharing all rights to the use and
sponsorship of the system.
The LAG prescribes five tasks:
a. System Installation Survey
Presently, UPGRADE is supported by the NIH Computer Center.
Identify the possible EPA installations which might support
UPGRADE and determine the effects. These include:
• Government IBM TSO environment
• Government UNIVAC 100 (RTF) environment
II-5
-------
Requirements
Data
Access
For:
Air Quail LV
U.icor Quality
Demographic
Health
Other Envir.
User's Daca
UVIA Ll.sLlin-
D.it.i Manipulation
Basic Statistics
Polygon Plots
Bur Charts
Regress! oa
I'crcoiul hjs
Moping
Co isolation
Interactive vs. Batch
User background rcq.
Date of information
STORET
NO
Main EPA UB
NO
NO
NO
ONLY IF IN STORET
YFS. BATCH JOB
FILTERING
BATCH JOB
STRAIGHT I INC
NO
REG PROGRAM
l>rGKCE=l
STAND program
LOG program
RUG program
Batch
SOME TEXT EDITOR &
KOLI.OW MANUAL
June '77
ADROIT
NO. bur could
SET UP FOR IMMEDIATE
COULD BE ADDED
COULD BE ADDED
COULD BE ADDED
YES, THROUGH MTS
TO TERMINAL OR OTHER
I/O DFVKL"
FULL TRANSFORM
CAPABILITY
YES; PLUS AGGREGATION
AND OTHER STATISTICS
STRAIGHT OR HASH
SHADED WIDTH & NUMBER
CONTROL
DrfiREE=l to 9
FULL USER PROGRAMMED
NO
NO as standard
CAN BL USER PROGKAMMED
INTERACTIVE
FORTRAN or BASIC helpful
not req'd; learn manual
June '75*
SAROAD
Main EPA nfl
NO
NO
NO
NO
ONLY IF IN SAROAD
YES, BATCH JOB
FILTERING
BATCH JOB
NO
NO
NO
NO
Bn f rh
•Some text editor &
follow manual
'71, '73 '74
UPGRADE
liY TAPE KltOri SAMOA!)
BY TAPE FROM STORET
VERY LI1TLL
BY TAPE FROM NCIIS
SOME: oil spill, etc.
YES, CFNERAL INTERFACE
TO TER1I1NAL OR DISK
FILTERING : TRANSFORM PLANNED
YES; PLUS SAS
STRAIGHT OR DASH
SHADED
NUMDFR CONTROL
DFRRFF-1 r-n fi
PARTITIONING SUItROHTTNES
NASQAN, COUNTY,
AS PART OF REGRESSION
TNTFRAPTTVr
No computer's data analysis
-------
• COMNET/ALPHA environment
• Commercial IBM TSO environment
b. User Needs Survey
This task has two elements. The first is the identification
of offices which have a high and immediate interest in UPGRADE,
and identification of their requirements not directly met by
UPGRADE. The second element is the survey of potential users
to determine overall EPA requirements for the system. This
effort was redirected towards working with UPGRADE evaluators
in key offices. The EPA Project Officer determined that EPA
Staff Coordinators in each major office would be used to
survey the technical staffs, on an informal basis, to identify
potential users. As the basis for evaluation, they would also
support these users in the practical use of UPGRADE in solving
a current problem. This provides a more effective base than
the approach originally envisioned.
c. System Design Analysis
The user requirements will be analysed in terms of the imple-
mentation times and costs to permit evaluation of their cost
effectiveness by EPA management. A program plan for the added
software requirements will be prepared so as to show the time
frames in which they could be expected to be added to the
production system, adding that factor for management consider-
ation in setting priorities.
d. System Management Requirements
The CEQ User Support Staff is developing a production soft-
ware configuration control system to assure the orderly
meeting of the many user community requirements for software
extensions to the system without degrading current production
system performance by the errors commonly introduced with
software modification/extension. The specification for
software changes to meet (1) new user requirements and (2)
transport UPGRADE to a new computer environment if necessary
will be designed to keep the control and maintenance of
UPGRADE software at a production level.
e. Documentation
The CEQ User Support Group will prepare a user manual specifi-
cally addressing the requirements of the EPA user community.
C. PROGRAM AND COMPUTER ENVIRONMENTS
The potential users of UPGRADE in the EPA are non-computer trained
scientists and analysts doing environmental research, problem solving, air
and water quality reports, trend analysis and correlation studies. Since
II-7
-------
all potential users of this system cannot conduct an individual evaluation,
there are designated coordinators and potential users doing an active evalu-
ation of UPGRADE while others have indicated their intent to use the system
if acquired. Six active evaluations have been conducted by: the Office of
Air and Waste Management (OAWM) (1); the Office of Research and Development
(ORD) (2); the Office of Toxic Substances (OTS) (1); Region III (1), and
Region X (1). Evaluations are also under way in the Office of Planning and
Management (0PM) (1) and the ORD (1). ORD also has another potential user
with defined requirements. Each evaluation report is included in Appendix D.
Other EPA evaluations have either expressed no interest or their evaluations
were not detailed enough to be included in this analysis. These comments
are included in Appendix G. Additional statistics were gathered via per-
sonal communications, but some evaluators/users were unable to quantify
their evaluation. A summary of each user's evaluation (use) objective,
current capabilities, data volumes and costs (if available) is contained in
Figures II - 2, 3 and 4. There are a variety of results, notably that UPGRADE
increases the user's analysis capabilities at equal or significantly reduced
cost. However, each user has indicated a need for some form of improvement
or enhancement to the UPGRADE system as a whole. These are addressed in the
remaining paragraphs of this section and are addressed as Data Base require-
ments and UPGRADE requirements.
D. USER REQUIREMENTS
1. Current Data Base Capabilities
Data available to UPGRADE users resides in the Integrated Data Base (IDB)
or UPGRADE Data Base (DB).
This currently consists of:
- County Level Health Data
- County Level Drinking Water Data
- County Level Demographic Data
- General Water Quality Data
- Aquatic Pesticide Data
- National Stream Quality Accounting Network
- General Air Quality Data (limited user-defined subset)
Programs currently exist to extract user designated data from EPA's Water
Quality Data Base (STORET) and Air Quality Data Base (SAROAD) residing in
the computers at COMNET and Research Triangle Park (RTP). Once extracted,
these data sets can be loaded on the computer for access by UPGRADE. There
is also the capability of adding user-supplied data sets that can be kept
current or modified as desired. However, the additions to the IDB require
the intervention of the CEQ UPGRADE User Support Group and sometimes can re-
quire as long as 1-2 weeks for data extraction, format, and load. The inter-
face processes are shown in chart form on Figures II-5 and 6.
2. Additional Data Base Requirements
The evaluation reports have indicated several areas of concern by
potential users in regard to data for UPGRADE. The primary area of concern
for users of air data is the data base structure which severely limits the
II-8
-------
USER SUMMARY
CATEGORIES
Program objectives and prob-
lem definition
Current support systems and
data bases
Interfacing activities re-
quired to reach UPGRADE
Data volumes and special
handling requirements
Current coats vs. UPGRADE
coses
Deficiencies/strengths In
current systems
Projection of future needs
expannlon
OAWM/OAQPS
RTF
Reduction of Air Quality
Data for Trend Analysis
(CEQ Annual Report)
SARD AD
Data extraction and Listing
Extract SAROAD data and
load on UPGRADE. FTS
call. MODEM and terminal
1 yr hourly data - 8,760 pti
required data extraction
and loading on UPGRADE by
CEQ User Support Croup
Mould be comparable for
this application but did
not make use of Inter-
active capability
Current system has no In-
teractive graphic
capabilities or on-line
analysis capabilities
UPGRADE Data Base Improve-
ments and expanded plot
capabilities
REGION
X
Graphs for Regional Profiles
Trend analysis and summary
reporting of Air data
Current Regional SAROAD Data
Ixical mini-computer programs
No Interactive Graphics
Extract SAROAD data and load
on UPGRADE. FTS call,
MODEM and terminal
Not available
No current data on SAROAD
Not available
Current system does not have
programs comparable to
UPGRADE, but can be added.
Region Data Base Is more
current than SAROAD
Add and maintain current
Region data In custom data
set
ORD/OMTS
Las Vegas Lab
Correlate Phytoplankton data
with Water Quality data
Waiting for UPGRADE
Load data on UPGRADE.
Require access to terminal
and MODEM
2,400 lines of STORET data
70.000 lines of Phytoplan-
ton data. Require CEQ
User Support Group to load
data
Potential User
Potential Vast
ORD/OMTS
Las Vegas Lab
Correlate mortality rates
with drinking water con-
stituents
Unknown
FTS call, MODEM and terminal
250 counties, 40 Water
Quality and 20 Disease
variables per county, about
SDK points
3000 hrs versus 120 hours
(o Graphics
JPCRADE Data Base Improve-
ments and additional
analysis routines
Figure II-2
-------
USER SUMMARY
CATEGORIES
Program objectives and prob-
lem definitions
Current support systems und
data bases
Interfacing activities re-
quired to reach UPGRADE
Data volumes and special
hand] inn requirements
Current costs vs. UPGRADE
coats
Deficiencies/strengths in
current systems
Projection of future needs
expansion
ORD/OHEE/HERL
CINN
Correlate cardiovascular
mortality to water hard-
ness
STORET
FTS call, MODEM, and term-
inal
5,000 records, requires CEQ
User Support Croup to
load data
Evaluation In process.
Current system would be
10 times expected coats
No Graphic analysis and
plotting capabilities
Unknown
REGION
III
versus time
(monthly readings)
CALCOMP and Interactive
Graphing Package (IGF)
(Tektronix) STORET
FTS call
12 points on plot
1 parameter. Required
CEQ User Support Croup to
load data S(10n rornrHa
CALCOHP-$17.28 machine +
3.75 Programmer
IGP -$11.47
UPGRADE-} 6.52
UPGRADE - Easier to use
- Output looks good
Environmental Profiles
Public Info. Hedlth Related
Effects 208 Planning
OTS
mortality. Trend analysis.
Quick Response analysis
Access to STORET and SAROAD
files
FTS call, MODEM and terminal
Depends on study if not in
the IDE, extract from
STORET
Some studies could not be
accomplished, others would
cost 5-10 times more
without UPGRADE
No Graphic analysis and
plotting capabilities. No
correlation capabilities,
More data for more
studies
ORD/OHTS
I1Q
Correlate causes of death
tilth environmental varlablei
also intercorrelations of
health, demographic A en.va,
Access to SAROAD and STORET
FTS call. No 1200 Baud
MODEMS as yet, so operating
at 300 baud
Often work with 3082 counties
Cannot do health analysis
without UPGRADE
No access to health variables.
Plotting not available to
More data for more studies
Figure II-3
-------
USER SUMMARY
CATEGORIES
0PM
OE
Program objectives and problem
definitions
Analysis of ambient
levels for lead
Use ADP for storing and
analyzing pollution source
data
Current support systems and
data bases
Manual Investigation
access to SAROAD
PCS, ROEDS, LEDS and CAS -
little access to parametlc
data
Interfacing activities
required to reach UPGRADE
ITS call. MODEM and
terminal
UPGRADE tested while the
user was on detail at CEQ
Data volumes and special
handling requirements
S-10K records require
CEQ User Support Group
to lead data
Not supplied
Current costs vs. UPGRADE
coats
Evaluation In process
Not applicable - no
alternative exists
Deficiencies/strengths In
current systems
Availability of data on
UPGRADE
Projection of future needs
expansion
Additional analysis
No centralized parametric
data for air and water
Centralized data on
emissions, effluent
Figure II-4
-------
INTERFACING STORET DATA
TO UPGRADE
FLOW CHART
NOTES:
LOGON TO EPA/COMNET
COMPUTER FACILITY
ALPHA TEXT EDITOR TO SET
UP JOB
RETRIEVE DESIRED DATA
USING STORET RET PGM
W/MORE=4 OPTION
COPY RETRIEVED DATASET
TO COMNET TAPE
TRANSFER TAPE FROM
COMNET TO NIH/DCRT
INSTALLATION
COPY TO NIH
LIBRARY TAPE
PROCESS DATA THROUGH
UPGRADE PRE-PROCESSOR
RUN DATA TO DISK
• CATALOG DATA FILE
IN UPGRADE DYNAMIC
ALLOCATION CATALOG
• CATALOG FILE ON ISO
EXEC UPGRADE
AND ENTER WATER
QUALITY INTERFACE
Figure II-5
REFER TO STORET MANUAL
RET- RAW DATA
MOREA=COPYABLE TO TAPE
IBM UTILITY IEBGENER
COMNET SCRATCH TAPE
BY COURIER SERVICE USUALLY
WYLBUR TEXT EDITOR
IEBGENER
REFORMATS STORET DATA
AND COMPUTES OVERVIEW
STATISTICS
IEBGENER
TSO CAT COMMAND
11-12
-------
Interfacing SAROAD Data to UPGRADE
Flow Chart
Notes
SAROAD run at RTF
(NADB & INTRFAC2)
SARTRAN
UPGRADE
Preprocessor
UPGRADE
\/
Vitro Mapping
Processor
Using:
get:
AEROS Manuals
Inventory by Site
Inventory by Pollutant
Special listings
Run Totals
Tape for next Step (CEQ.SAROAD,
PSI Format)
add Site Information
get Tempfile in STORET More=4 format
produces:
basic statistics
UPGRADE Data Set (CEQ.UPGRADE.
SAROAD)
produces:
Graphic and Statistical Analyses
Map tape
produces;
CALCOMP Plotter Maps
Figure II-6
11-13
-------
amount of useful data per data set. For example, only one year of hourly
data can be stored on a data set (8,760 points in time X 50 variables per
point) because of the fixed space reserved for 50 variables per point when
there may be only one variable required. Reduction of this fixed space to
one dependent upon the number of variables specified is essential to the
storing of usable air data on the IDS. Current data in the IDB is a concern
of users who are doing work with current year data. There seems to be a time
delay in the loop wherein data comes from the regions into STORET or SAROAD
and is extracted by User request for UPGRADE. This could be overcome by the
regions directly storing their current year data as a "custom" data set and
keeping it current. However, they are dealing with hourly data and required
more than one year's worth of data per data set. This is a current restriction
as previously discussed. The STORET and SAROAD extraction process is another
area of concern. Users feel that the automation of this process or much
quicker access to these Data Bases is required. Automation of this process
would allow the user to directly initiate the extraction process and eliminate
the time required by the CEQ UPGRADE User Support Group to accomplish this
task. If EPA uses UPGRADE heavily, ORD/OMTS/HQ recommends that the ability
to extract data from tape storage be added to reduce the amount of on-line
storage required. A ranked summary of the user's data base comments and
additional requirements are included in Figures II-7 and 8. A time and cost
estimate is also included where applicable. It should be noted that the only
essential requirement is the data base restructuring or data compression
requirement. This requirement is already funded by other UPGRADE users and
is scheduled for completion early in FY 79. The ranking is:
E = Essential - Cannot use UPGRADE without the capability.
N = Necessary - Would make the use of UPGRADE easier, should be
incorporated as money and priority allow.
D = Desirable - Nice to have; would enhance the system.
3. Additional UPGRADE Requirements
Current capabilities of UPGRADE are described in Paragraph II.B.I.
These evaluations have brought out some pertinent points; (1) UPGRADE is an
easy system for those without computer training because of its English language
and conversational prompting; (2) It has the capability to allow the user to
correlate data from almost any accessible data base with other data, such as
air and/or water with mortality rates; (3) The correlations are readily
evident through the use of graphical analysis on a Video terminal with instant
copy available; (4) It invites initiation of studies previously shunned be-
cause of the laborious data extraction and manual correlation required.
However, the evaluations have also indicated the need for additional or
improved capabilities before effective use can be made of UPGRADE. Those
needs that are most pressing seem to be: batch production mode; an improved
Terse operational mode for the more familiar and constant user; the ability
to compute moving means; the ability to use more than one value per plot axis
or multiplot; and an internal data management for storing and revising data.
These requirements and less pressing needs are ranked and summarized in
Figures II-9 and 11-10. Those requirements ranked as Essential, except
Moving Means, are already funded by other UPGRADE users for incorporation in
early FY 79. All UPGRADE users share these benefits.
11-14
-------
DATA BASE COMMENTS
DATABASE
II)B
STOKKT Interface
SAROAD Interface
OMENTS
Excellent concept
Data must be kept current
File Structure/Storage
Needs More Data
User Created/Maintained data
Current Retrieval Capability Needs to be
Automated (Tape to N1H)
Current Retrieval Capability needs to be
Automated (Tape to NI11)
OTS
X
X/N
USERS/Kank
LAS VEGAS
X/N
OAQPS
/RTP
'
X/E
X/N
HERL
/CINN
Region
X
X/E
X/E
X/E
X/N
Region
III
X/D
X/N
X/N
ORD7
OMTS/
11Q
X
X/N
X/N
515K
Coat
Estimate
Another Retrieval
4-6 weeks for
compressions
$7.5K
Depends on User
needs
Available Docunien
tatlon not ready
3 months
515K
3 months
S15K
X •= Comment/Requirement
E = Essential - Cannot use UPGRADE without the Capability
N - Necessary - Would make the use of UPGRADE easier, should be
Incorporated as money and priority allow
D • Desirable - Nice to have; would enhance the system
Figure I1-7
-------
DATA BASE COMMENTS
DATA BASE CAPABILITIES
Cacegoiy
Data Extraction
Fluid Size for
varltibleti
liner defined
variables
Meaningful size
data sets
Available
Integrated Data Base —
predefined parameters
of all Data elements In
DB
SAROAD Programs -
User defined parameters
(Contractor extraction)
STOHET Programs -
User defined parameters
(Contractor extraction)
User Cieated Data -
User defined parameters
Fixed size field for
SO variables
Predefined variables
3.9M bytes per data act
(Because of Fixed size
field for SO variables -
data sets with 1, 2, or
3 variables are limited.
I.e., 1 site year of
continuous data)
Required Additions
User defined parameters and
more direct control
User manageable extraction
User manageable extraction
Ability to extract data
from tape storage
Variable size field depending
on the number of variables
used (compression)
User Creation of new
variables
Adjustment of variable field
size (would allow multlyear
continuous data for
multlsltes)
USERS/Rank
OTS
Las Vegas
X/E
X/E
X/E
OAQPS
Kit
X/E
X/E
X/N
X/E
I1ERL
CINN
Region
X/E
X/E
X/E
Region
III
X/N
X/N
HQ
X/D
X/N
Coat
Estimate
Local Program
can do It at
term site
See DB
See DB
2 weeks
S3K
Figure 11-8
-------
E. RANKED MAJOR OUTPUTS REQUIRED FROM UPGRADE
The additional user output requirements are summarized in Figures II-9
and 11-10 under Plot, Graphic Analysis, and Mapping Categories. Of the addi-
tions, Multiplot is ranked as Essential and has already been funded by
another UPGRADE user, and will be available early in FY 79. Contour plots
will be available in FY 79 and the remainder, not defined as Essential, will
be candidates for early implementation.
F. UPGRADE ACCEPTANCE CRITERIA
1. The acceptance criteria for EPA use of UPGRADE consists of 10 factors,
each rated on a 0-10 scale with "0" being the least significant. In addition,
each factor has a 0-10 weight assigned according to its relative importance
to EPA as a whole using the UPGRADE system. The 10 factors are as follows:
a. Number of EPA Users - The number of current and identified
users within EPA. This is the major determining factor of
whether or not there will be a minimum amount of use to
justify using or supporting UPGRADE. It also is the deter-
mining factor for level of EPA involvement with UPGRADE,
if it is to be used. Consequently, it carries a weight of
10. The rating will be proportional to the number of
identified users.
b. Satisfy User Needs - The ability of UPGRADE to fulfill the
ranked user requirements. The more needs that are satisfied,
the higher the rating. The relative weight of this factor
is 6.
c. Expansion Capability - The ability of the individual options
to accept growth, both in numbers of users and in numbers/
sizes of data sets. The easier it is to expand, the higher
the rating. The relative weight of this factor is 5.
d. Implementation Costs - In terms of the applications identified
in the evaluation process, this is the total one-time cost to
the user to initiate analysis. This includes data acquisition,
data loading, and special program requirements. The lower the
cost, the higher the rating. The relative weight of this
factor is 4.
e. Operating Costs - For each application, this is the cost to
perform UPGRADE analysis. The cost includes data storage,
computer time, support staff and terminal cost over a 5-year
life cycle. The lower the cost, the higher the rating. The
relative weight of this factor is 4.
f. Cost Savings - This is the overall life cycle cost savings
determined by comparing the cost of current analysis to
UPGRADE operating costs and implementation costs. Where
savings are intangible, a nominal figure will be estimated.
The higher the cost savings, the higher the rating. The
relative weight of this factor is 6.
11-17
-------
UPGRADE COMMENTS
UPGRADE CAPABILITIES
Category
Faster System
Analysis
Routines
Plot
Plot
Graphic Analysis
Display features
Available
Verbose and Terse Modes
SAS package
Sort and Rank
CEQ Air Quality Rollback
Model
Basic Statistics
Plot maximum = 400 data
points
Polygon Plot
Bar Chart
Scatter Plot
Multlslte Plotting
Linear Regression
Required Additions
Ruperterse Mode
Batch Production Mode
Improve Terse
Ability to add User Developed
analysis routines
Moving means
Additional averaging
transforms
Add Interactive SPSS
capability
Increase maximum
Overlying points on graph
3D Capability
Multlplot
Alpha Numeric Axis
description
Contour
Ability to add User defined
models
Quick EXIT from System
USERS/Rank
OTS
.
La Veeas
X/N
X/N
X/E
X/N
OAQPS
RTF
X/N
X/D
X/N
HERL
CINN
Region
X
X/E
X/E
Region
III
X/N
X/N
X/E
X/N
X/D
ORD
HQ
X/D
X/E
X/N
X/D
X/D
X/D
X/D
Cost
Estimate
6 months
S35K
Full-wlth S-terse
Llmlted-ln FY79
5-8 weeks
$10K
2-6 weeks
$2.5K - 15K
B weeks
$10K
Add SAS
Procedure
3 months
$15K
1. NC Increase
cores
2. Create Temp.
data set
6 weeks
$7.5K
2 months
$10K
Evaluate avail .
Pkg. & acquire
2 months - $10K
1 month - $5K
Available In FY7<
3-5 months
S15-25K
Available In FY7S
H
M
Figure 11-9
-------
UPGRADE COMMENTS
UPGRADE CAPABILITY
Standardized
Terminology
Standardized
Units
M tipping
Keutart
Capabilities
Founal access
to Ll.e UPGRADE
User Support
Group
Data Manipulation
Confusing terminology
llnlta used In Data Base
County Map
MASCjAH Map
Start Over
Per Contract
Data Filtering
Data 1'jitltlonlng
Can only handle numeric
ES USERS/Bank
Required Additions
Standardize
User controlled variable
transformations
More Maps on Screen
Data save capabilities
No cost access, possible an
In-llouse Croup
Add flags for data not
meeting report criteria
I/O for storing and
revising data
Alpha/Numeric Data
OTS
X/D
Las Vegaa
X/N
X/E
X/E
OAQPS
RTF
X/N
HERR
CINN
Region
X
X/N
X/N
X/E
Region
III
ORD
uq
X/N
X/D
Coat
Estimate
on going
maintenance
Need to be STD at
DB entry level
Part of DB
maintenance
Basic Demo. In
?Y79
4-6 months
$30K
Not feasible In
UPGRADE - Should
be done at DB
extraction
1 month
$5K and
logistics
Available om
FY79
Figure 11-10
-------
g. Time Benefits - The user benefits derived from being able to
respond quicker in terms of calendar time. The response
would be to ad hoc queries and analysis problems. The
shorter the response time, the higher the rating. This is
exclusive of data transfer time. The relative weight of this
factor is 8.
h. Analysis Benefits - For each application, the improvements in
analysis capabilities. The ease with which the system can be
used and the studies that would not be attempted without
UPGRADE. The greater the improvement, the higher the rating.
The relative weight of this factor is 7.
i. EPA Controllability - The degree of influence EPA has over
the development of the UPGRADE system to meet EPA needs.
The greater the control, the higher the rating. The relative
weight of this factor is 2.
j. Risk - The degree of identifiable risk associated with each
solution. This includes both economic and technical risks.
The smaller the degree of risk, the higher the rating. The
relative weight of this factor is 5, primarily because UPGRADE
is and will be an ongoing system supported by other govern-
ment agencies and not a new system awaiting development.
2. In the evaluation process, each factor will be rated within each
proposed solution. The rating will be multiplied by the weight factor to
determine a relative score for each factor and solution. The summation of the
weight scores for each proposed solution will be the System Acceptance Rating.
It will be correlated to recurring costs, non-recurring costs, and net cost
savings to determine a final recommendation to EPA in the use of UPGRADE.
3. These factors are given values in Section IV, Evaluation.
G. SUMMARY OF SAVINGS AND BENEFITS
From the EPA evaluations that have been conducted and are included in
Appendix D, most applications show an average 5-fold cost savings (when
qualified) and intangible benefits. The intangible benefits range from the
fact the UPGRADE'S rapid pictorial response allows EPA to either support or
disprove an allegation with substantiating figures to the fact that many
study applications would not have been attempted without UPGRADE.
11-20
-------
SECTION III
-------
III. Feasible Alternative Solutions
There are four options for EPA in the use of the UPGRADE system. They
are:
OPTION I - EPA Direction of UPGRADE
OPTION II - EPA Co-Sponsorship of UPGRADE
OPTION III - EPA Limited Use of UPGRADE
OPTION IV - No EPA Involvement/Use of UPGRADE
Option I carries a number of hardware alternatives for possibly moving UPGRADE
to another computer site. Each option and alternative is discussed in succeed-
ing paragraphs followed by a discussion of UPGRADE changes. Costs are discussed
in Section IV.
A. OPTION I - EPA Direction of UPGRADE
1. Under this option, EPA would become the main UPGRADE sponsor or would
take a version of UPGRADE for its own use and future development. This is the
most costly action that EPA could take and should only be considered if and
when the dedicated user base is greater than 15. EPA also incurs the cost of
maintaining an UPGRADE User Support staff to perform the following functions:
Technical Staff Function
2 Assist new/old users in realizing the
total analysis potential with UPGRADE,
including training, data base format-
ting and data acquisition.
4 Program maintenance, enhancements.
The User Support staff would be involved in planning future development,
improvements, and expansion of the system according to EPA needs on a long-
range basis. It would then become necessary for EPA to define its role rela-
tive to UPGRADE. EPA could either operate independently from the UPGRADE user
community (currently DOE and State governments), or it could, by agreement
with CEQ, become the lead agency in the development of UPGRADE. The cost of
maintaining an in-house staff would run $150,000 per year. However, it is
unlikely that EPA would be granted the additional billets for this purpose,
thus a comparable contractor staff cost of $300,000 has been used for analy-
sis.
With the heavy use envisioned under this option, UPGRADE may have to be
moved from the NIH-DCRT computer center in Bethesda, Md. A discussion of 4
alternatives for computer support follows:
2. Alternative 1 is to remain on the NIH-DCRT computer. It has the
cheapest operating costs and least user implementation costs; consequently, it
has the most cost savings. These are shown in Section IV. However, although
NIH sets no formal restriction on the amount of computer resources available
to a single user, users who build up charges in the range of $10-15,000 per
month or who use a great deal of on-line storage (approaching max of 880 data
sets) are encouraged to commence seeking alternative computer sources. These
numbers vary with the total loading on the NIH computer complex at any given
III-l
-------
point in time. CEQ's experience indicates that a dozen active user units will
reach the present NIH usage threshold. If and when this limit is reached,
UPGRADE will have to be moved to another computer installation such as those
described in Alternatives 2, 3, and 4.
3. Alternative 2 is use of EPA's computer installation at COMNET. This
computer is the same type IBM machine as that at NIH-DCRT, however the opera-
ting systems and terminal interface systems are very different. NIH-DCRT has
a virtual memory system (MVS) with a Time Sharing Option (TSO) for terminal
interface. COMNET has a fixed partition (MVT) operating system and an ALPHA
terminal interface system. Moving UPGRADE onto COMNET would require 20%
reprogramming at a cost of $82,000 as estimated by the CEQ UPGRADE User Support
Group. The nature of COMNET1s operating system is such that each user would
require UPGRADE'S maximum core requirements (500,000 bytes) available during
the entire analysis session. For example, 6 users on the system at one time
ties up 3,000,000 bytes of COMNET's core virtually stopping its computer from
being used by anyone else. This would not be allowed and UPGRADE users would
have to wait in line for computer time. A severe reduction of system use and
user satisfaction would result. See Appendix E for a detailed discussion on
UPGRADE conversions.
On a cost basis, the UPGRADE conversion costs would be added to the user
implementation costs, as shown in Section IV. Also at COMNET, interactive
sessions carry a cost multiplier of "6" which increases the annual computer
operating costs to 6 times that of NIH-DCRT. (Batch costs at the two installa-
tions are similar.)
4. Alternative 3 is the use of EPA's computer installation at Research
Triangle Park (RTP), Durham, North Carolina. This computer is non-IBM and, as
such, employs a different scheme for encoding data and programming instructions.
It also employs different methods to inputting and outputting data between
storage media and the computer core storage. Other significant differences
are that the statistical analysis subsystem (SAS) would have to be changed
along with the Sort/Merge Program, Assembly Language Code and FORTRAN exten-
sions. In essence, this would cause an 80% rewrite of UPGRADE at an estimated
cost of $164,000. See Appendix E for a detailed discussion of UPGRADE conver-
sions. Another point to consider is the terminal interface system. At present,
the computer installation at RTP has a finite limit of approximately 40 on-line
users at one time. This would present an indeterminate wait time for UPGRADE
users and a reduction of system use and user satisfaction. Operating costs at
this installation are comparable to COMNET charges, about 6 times that of
NIH-DCRT.
5. Alternative 4 is the use of a commercial computer service. For
purposes of this study, the GSA installation at Boeing Computer Services is
used. This installation is comparable to NIH-DCRT and the CEQ UPGRADE User
Support Group has successfully transferred UPGRADE to this system on a test
basis with no change in user capabilities or reprogramming. The use of this
computer installation would increase operating costs by a factor of 4 over
NIH-DCRT costs based upon the CEQ UPGRADE User Support Group test. At this
installation there are no known expansion limitations.
III-2
-------
B. OPTION II - EPA Co-Sponsorship of UPGRADE
1. Under this option, EPA would support UPGRADE at a level that would
entitle EPA to direct a portion of UPGRADE'S future development in conjunction
with the other supporting agencies. This option should be considered when the
dedicated EPA user base numbers between 5 and 15. With this option EPA should
maintain an UPGRADE User Support Staff of 1 to assist new/old users in realiz-
ing the total analysis potential with UPGRADE. This person will assist the
user with data base formatting and data acquisition, and be an EPA liaison to
the CEQ UPGRADE User Support Group. The person would also be EPA's representa-
tive in planning future UPGRADE development, improvement, and expansion to
ensure that EPA needs are satisfied to the extent possible. The contractor
cost of this one-person staff would be $50,000 per year. The system would
continue to operate on the N1H-DCRT, but when the expansion limitation, as
defined in Option I, is reached, the same hardware alternatives apply. As
such, the EPA liaison should use the hardware alternative analysis of Option I
to influence any relocation of UPGRADE.
C. OPTION III - EPA Limited Use of UPGRADE
1. Under this option EPA would be a user of UPGRADE. They would be a
customer, able to buy certain "goods" from available stock, but not able to
influence the long-range development. An EPA liaison would not be required as
individual users would interface directly with the CEQ UPGRADE User Support
Group. This option should be considered when the dedicated EPA user base
numbers less than 5. The costs of this option would depend upon the individual
user involvement with UPGRADE.
D. OPTION IV - No EPA Involvement/Use of UPGRADE
1. This option is presented to include the possibility that there are no
interested EPA users for UPGRADE. The User survey shows that this is not the
case.
E. UPGRADE Changes
1. Those requirements ranked as "Essential" are listed below, along with
requesting EPA office, and a CEQ User Support Group time and cost estimate:
CEQ Estimate
Requirements Requester Cost Time
Data compression or OAQPS/RTP and Region X $ 7,500 4-6 weeks
Restructured Data Base
Refine Terse Mode ORD/OMTS/Las Vegas 10,000 5-8 weeks
Internal Data ORD/OMTS/Las Vegas and 5,000 4-5 weeks
Management System Region X
Limited Batch Mode ORD/HQ Available in FY 79
III-3
-------
Requirements
Moving Means
Multiplot
Requester
Region X
Region III and ORD/HQ
CEQ Estimate
Cost Time
10,000
10,000
8 weeks
8 weeks
TOTAL $42,500
All Essential requirements except "Moving Means" have already been funded by
other sponsors of UPGRADE and are scheduled for implementation in the next few
months.
2. Those requirements termed "Necessary" are:
Requirements
Improved STORET
Interface
Improved SAROAD
Interface
Data Extraction from
Tape Storage
Super Terse Mode
Requester
ORD/OMTS/Las Vegas and
Region X
OAQPS/RTP and Region X
and III
ORD/HQ
ORD/OMTS/Las Vegas and
(Full Batch Capabilities) OAQPS/RTP, and ORD/HQ
Add User Analysis
Routines
3D Plots
Increase Plot Size
Add User Defined
Models
Contour Plots
Data Save
Capabilities
ORD/OMTS/Las Vegas and
OAQPS/RTP
Region III
OAQPS/RTP
ORD/OMTS/Las Vegas
Region III
ORD/HQ and
ORD/OMTS/Las Vegas and
Region III
CEQ
Cost
$15,000
15,000
3,000
35,000
9,000
Estimate
Time
12 weeks
12 weeks
2 weeks
24 weeks
4 weeks
Yr.
Planned
1
1
1
2
2
Evaluate Available
Packages
7,000
20,000
6 weeks
16 weeks
1
2
Available in FY 79
30,000 24 weeks
TOTALS Yr
Yr
TOTAL
1
2
$ 70,500
64,000
$134,500
III-4
-------
3. Those requirements termed "Desirable" are:
CEQ Estimate Yr.
Requirement Requestor Cost Time Planned
Overlying Points on ORD/OMTS/HQ $10,000 8 weeks 3
Graph
Improved Alphanumeric ORD/OMTS/HQ 5,000 4 weeks 3
Axis Description
Add Interactive SPSS ORD/OMTS/HQ 5,000 12 weeks 3
Quick Exit from System 0PM and Available in FY 79
Region III
TOTAL $20,000
F. Major Benefits
1. A significant number of benefits come to EPA with the use of UPGRADE.
The most significant is that it allows EPA to utilize its large environmental
data bases and the health data bases for analysis of the factors affecting the
length and quality of life of every area in the United States. It allows
studies to be conducted that would not previously be attempted because of the
time and cost required. For those studies that would be attempted, UPGRADE,
with proper use, reaps a 5-10 fold saving on cost and time. Evidence of these
benefits is also demonstrable by other agencies using UPGRADE. The total
accumulation of benefits to EPA is indeterminate due to the intangible nature
of the benefits derived from studies that would not have been undertaken
without UPGRADE. As can be seen from the evaluation report from OAWM at RTP,
not.all uses of UPGRADE produce a significant cost savings but most do and
bring large benefits.
2. Crosswalks between differing classes of source data provide a major
analytical tool. Initial EPA evaluation activities included correctional
analysis of air and water pollutants and NCHS mortality data. Possible analy-
sis products would be Atlases of mortality/morbidity air and water quality
after the model of NCI's Cancer Atlas. In-depth analysis in this area would
be responsive to EPA mandates and will alleviate criticisms of EPA's prior
lack of activities in this area.
G. Functional Advantages and Disadvantages
1. The functional advantage is that UPGRADE is a user-oriented system
geared to the non-computer analyst. It requires little training and allows
the cross-correlation of different media data. UPGRADE provides rapid visual
display of analysis results that are retainable in instant hard copy form.
The current UPGRADE data available to the user and typical outputs are shown
in Figure III-l. However, the system is a tool for analysis and, as with all
tools, should and can be enhanced to provide even greater capability. Those
additional capabilities and uses that are planned for the next fiscal year and
funded by other users are shown in Figure III-2.
III-5
-------
2. Additional advantages can be gained just by adding more data to the
UPGRADE system. Typical outputs/uses that can be achieved by adding data are
shown in Figure III-3. These outputs are significant steps in responding to
the criticisms EPA has received concerning analysis of collected data.
3. The functional disadvantage is that UPGRADE does require some modifi-
cation before some EPA offices can make effective use of the system.
III-6
-------
UPGRADE
CURRENT USES
DATA AVAILABLE
County level mortality data
County level drinking water data
County level demographic data
General water quality data
Aquatic pesticide data
National stream quality accounting (NASQAN)
data
General air quality data (limited user-defined
TYPICAL OUTPUTS (Con't.)
Ad hoc materials with quick turnaround time
Mortality rates vs. drinking water quality variables
Water quality variables vs. "hardness"
Intercorrelations of mortality variables
Cross-checking EPA & USGS water quality measure-
ments
Correlations with demographic variables
Multiple regression using both environmental and
demographic variables
Air Quality and health data
TYPICAL OUTPUTS
Maps showing geographic patterns of county-level
mortality/water quality
Relationship of cardiovascular disease mortality
rales and constituent levels in drinking water
Time trends and mean violation rates of drinking
water constituents in selected surface supplied
public drinking water systems
Water turbidity vs time (months)
Air quality data for trend analysis and summary
reporting
Cancer mortality in California
Trends analysis of CO and oxidant
Figure III-l
-------
FY
UPGRADE
79 FUNDED USES
CO
DATA/CAPABILITY ADDED
1. NE Region Drinking Water DB and Energy
Environmental DB on an industrial and
county basis.
2. Midwest Region Drinking Water DB and
Energy Environmental DB on an
industrial and county basis.
3. Statistical Tables from the Compendium
on Environmental Statistics (250 small
data sets)
4. Integrate National Meteorological Data
with existing IDB Air Data
(NO A A National Weather Service DB)
5. NEDS Air Quality Emissions data
6. International water quality data from
Canada (GEMS)
7. Biological Data (BENCHMARK) from
NASQAN stations
8. Complete GLIDE Interface with Age
Adjusted Mortality Rates,
100-200 demographic variables
TYPICAL OUTPUTS
1. DOE NE Region Health and Energy Facility siting
studies
2. DOE Midwest Region Health and Energy Facility
siting studies
3. CEQ studies and improved report generation
4. DOE and CEQ studies
5. CEQ studies
6. CEQ Annual Report and USGS studies
7. CEQ and USGS NASQAN studies
8. o More sophisticated transforms
o Apply NCHS comparable ratios for
1959-61 and 1968-71 mortality rates
Figure III-2
-------
DATA/CAPABILITY ADDED
9. Dat.a base management system for specific
mortality
Industrial concentrations and
Occupation variables
10. Mapping enhancements
11. Enhanced report generation
TYPICAL OUTPUTS
9. EPA/OTS studies
10. • State and regional levels
• Spot mapping of rare causes of death
• Batch Specification Maps
11. • Tabular displays
• %Distribution and statistical
significance testing
v£>
Figure III-2 (cont'd)
-------
UPGRADE
PROJECTED USES
DATA ADDED
1. Ambient Air Data
M
O
2. Multiple Causes of Death
3. Morbidity Statistics
4. Increase Drinking Water Data to Nationwide
5. Altitude
6. Source Data
7. Phytoplankton Data and National
Eutrophication Survey (NES)
Program Data
TYPICAL OUTPUTS
1. o Air Quality maps for different pollutants and
different aggregations
o Report on Air Quality correlations with mortality,
water quality, and other variables
2. o Maps of geographical variations in contributing
causes of death with drinking water, water
quality, and air quality
3. o Maps of morbidity
o Report on morbidity versus air and water
quality
4. o Drinking water versus cancer
5. o Altitude versus heart disease
6. o Report relation of point source trends to
ambient quality trends
7. o Geographical distributions and representations
(phytoplankton and environmental factors)
o Statistical evaluations of the environmental
requirements of well-known problem and
special interest algae
o Development of phytoplankton indices of trophic
state
(testing index level modification, and develop-
ment of species level indices to water quality)
Figure III-3
-------
DATA ADDED
8. Hanes II Data
9. Wisconsin's Industrial Effluent Data
9.
TYPICAL OUTPUTS
o Retrieval of baseline phytoplankton data within
geographically restricted areas
o Retrieval of data from like areas for making
preconstruction predictions relative to 208
planning
o Selection of lake subsets of special interest,
(e.g., high or low productivity), for com-
parison with community structure
o Provision of ambient water quality and/or
sensitive biological components for inclusion
in multiparameter models for land-use and
watershed management
o Opportunity to interface the water quality and
phytoplankton data with other information of
specific user interest
o CO exposure in the blood by geographic area
and/or community
o CO levels in breath samples
o Correlation of source and ambient water data
in the Wisconsin area
o Use as pilot study for extending efforts to other
areas
Figure III-3 (cont'd)
-------
SECTION IV
-------
IV.
A. Cost Assumptions
EVALUATION OF ALTERNATIVE SOLUTIONS
1. There are a number of assumptions that must be made to analyze costs.
They are as follows:
Category
Life Cycle
Amortization
Remote Terminal
Station
User Unit
Assumption
5 years
Data Extraction
per User Unit
Computer Usage
Fixed Costs will be spread on
straight line basis over 5 years
TEKTRONIX 4014 CRT Graphics Unit
TEKTRONIX 4631 Hard Copy device
1200 Baud Modem
15% of Total Analyst Time
@ $30K/yr
15% of Total Intern Time
@ $10K/yr
SAROAD = 1 per quarter
STORET = 1 per quarter
A user unit will average 2 runs/day
@ 35 min/session with 4 data sets.
It produces 40 plots/day and 10
maps/day.
B. One-Time Costs
1. Hardware Implementation
Based upon the evaluation reports, this analysis will assume 9 poten-
tial EPA users with only 4 requiring a full remote terminal station. The
other 5 either have terminals or access to one. The cost of a remote terminal
station is:
Terminal Hard Copy Printer Purchase:
a. TEKTRONIX 4014 Terminal with full graphics
b. TEKTRONIX 4631 Hard Copy Printer
c. 1200 bps Modem
11,600
4,400
750
16,750
The total cost of 4 stations is $67,000. This cost applies to OPTIONS I,
II, and III. For Option IV there is no cost.
IV-1
-------
2. Software Implementation
UPGRADE modifications ranked as essential by EPA users are: restruc-
turing the data base or data compression at a cost of $7,500; refinement of
UPGRADE'S terse mode at a cost of $10,000; the addition of multiplot at a cost
of $10,000; and the beginning of an internal data management system at a cost
of $5,000. These modifications have already been funded by other users. Only
the addition of moving means remains to be funded by EPA for software implemen-
tation. The total cost of software implementation is $10,000 - this cost
applies to Options I and II. There is no cost for Option III and IV.
3. Non-Recurring Operations - Fixed
This is the cost of program and data base enhancements ranked as
necessary and desirable by the EPA users.
Type
N Data Base
N Data Base
N Data Base
N UPGRADE
N
N
N
N
D
D
D "
Year
Enhancement Planned Cost
STORET Interface
SAROAD Interface
Data Extraction from Tape Storage
Superterse Mode
Add User Analysis Routines
Plot Size
Add User Defined Models
Data Save Capabilities
Overlying Point on Graph
Alphanumeric Axis Description
Add Interactive SPSS
1
1
1
2
2
1
2
1
3
3
3
$ 15,000
15,000
3,000
35,000
9,000
7,500
20,000
30,000
10,000
5,000
15,000
$164,500
Time
(Weeks)
12
12
2
24
4
6
16
24
8
4
12
124
The data base enhancements will be funded to completion in the first year
and the UPGRADE enhancements will be prioritized and spread out over the first
three years of the 5-year life cycle. This cost applies, in full, for Option
II and would be 20% for Option III because of the limited EPA involvement.
Option IV has no cost. For Option I, this cost will be replaced by the recur-
ring cost of maintaining a User Support Staff.
4. Non-Recurring Operations - Variable
This cost item is only applicable in Option I, if UPGRADE is moved
from the computer installation at NIH-DCRT to EPA-COMNET or EPA-RTP. Based
IV-2
-------
upon the CEQ User Support Group, the cost to develop UPGRADE amounts to
$411,000 and the reprogramming cost involved in moving it to EPA-COMNET is
$82,000. The cost to move UPGRADE to EPA-RTP is $164,000, as estimated by the
CEQ User Support Group.
5. Summary
Figure IV-1 summarizes the one-time costs that would be incurred for
each option in the use of UPGRADE.
C. Recurring Costs
1. Fixed
Recurring fixed costs are the same for Options I, II, and III and are
derived from a User Units' use of the system as defined in paragraph IV A.
The costs are based on an estimate of 9 potential units. The average annual
costs are:
Annual Cost Total Annual
Cost Item Per User Cost
Analyst Time $ 4,500 $40,500
Intern Time 1,500 13,500
Data Extraction and Load 1,000 9,000
Terminal Hardware Maintenance
Terminal 1,340 12,060
Printer 660 5,940
Materials 1,000 9,000
TOTAL $10,000 $90,000
2. Variable
The average annual costs that vary between and within options are
detailed herein using the use assumptions of paragraph IV A for 9 User Units.
a. Option I
Under Option I, EPA has its own version of UPGRADE and must incur
the cost of an EPA User Support Group consisting of 6 persons: two for Data
Base, two for the system, and two for user interfaces. This group would
perform all UPGRADE enhancement work after program conversion for COMNET and
RTP hardware alternatives. The average contractor cost per person for this
group is estimated at $50,000/year. Computer/Storage/Plotter charges were
acquired from the EPA evaluations, CEQ UPGRADE User Support Group test runs,
phone contacts with installation personnel, and charge algorithm analysis.
The costs under this option are:
IV-3
-------
Life Cycle One-Time Costs (in $1,000)
Option I
COST ITEM
Hardware Implementation
Software Implementation
Non-Recurring Operations-Fixed
Non-Recurring Operations-Variable
Years
1
NIH
67
10
0
0
COMNET
67
10
0
82
RTF
67
10
0
164
GSA
67
10
0
0
2
0
0
0
0
3
0
0
0
0
4
0
0
0
0
5
0
0
0
0
TOTAL
67
10
0
82/164
TOTALS
77 159
241
77
OPTION II
COST ITEM
Hardware Implementation
Software Implementation
Non-Recurring Operations-Fixed
Non -Recurring Operations-Variable
Years
1
67
10
70.5
0
2
0
0
64
0
3
0
0
30
0
0
0
0
0
5
0
0
0
0
TOTAL
67
10
164.5
0
TOTALS
147.5 64
30
241.5
OPTION III
COST ITEM
Hardware Implementation
Software Implementation
Non-Recurring Operations-Fixed
Non-Recurring Operations-Variable
Years
1
0
0
33
0
2
0
0
10
0
3
0
0
10
0
4
0
0
10
0
5
0
0
0
0
TOTAL
0
0
63
0
TOTALS
33
10
10
10
Figure IV-1
IV-4
63
-------
ITEM
Computer Usage
Data Storage
Plotter
User Support Group
TOTALS
(* in $,000)
Annual Cost Per User
NIH
12.5
4.32
1.25
-
18.07
COMNET
75
.864
0
-
75.864
RTP
52.5
5.6
0
-
58.1
GSA
50
10.8
1.25
-
62.05
Total Annual Cost
NIH
112.5
38.88
11.25
150
462.63
COMNET
675
7.77
0
150
982.77
RTP
472.5
50.4
0
150
822.9
GSA
450
97.2
11.25
150
858.45
b. Option II
Under Option II EPA, as a co-sponsor of UPGRADE, requires a
central UPGRADE co-ordinator and CEQ liaison. This would be one person at a
contractor cost of $50,000/year. The costs (in thousands) for this option
are:
Item
Computer Usage
Data Storage
Plotter
Coordinator
TOTAL
c. Option III
Annual Cost
Per User
12.5
4.32
1.25
18.07
Total Annual
Cost
112.5
38.88
11.25
25.
187.63
The costs under this option would vary with user involvement and
a central UPGRADE coordinator would not be necessary. The costs involved
would be the same as Option II's Annual User Cost of $18,070. For 4 users,
the average annual cost would be $72,280.
d. Option IV
No cost is involved.
D. Development Lead-Times
Approval of this Feasibility Study by MIDSD in the normal time frame
requires one month. Should approval be granted for EPA participation in the
development and use of UPGRADE, preparation of an appropriately worded IAG
between EPA and CEQ may be required. This IAG would take about one month to
be prepared and approved. The IAG may include provision for programming the
User requirements and necessary system conversions, depending on the option
IV-5
-------
approved. After IAG approval, under Option I, contract negotiations may be
required for acquisition of an EPA UPGRADE User Support Group. Under Option
II, an EPA UPGRADE Coordinator would be designated. Also at this time, poten-
tial users would request terminal equipment, if necessary. Those users with
equipment would begin using the system to the extent possible until the Essen-
tial User Requirements are operational - 3 months. Even in Option I where
conversion is required, limited use of the system could be made, until such
time as the alternate hardware installation is operational. This would be
approximately 6 months for COMNET and 12 months for RTP. Charts of these time
relationships are presented in Figures IV-2, 3, and 4. There are no charts
for Options III and IV. Option III requires only MIDSD approval, preparation
of individual user lAGs with CEQ, and use of the system. Option IV requires
no further action.
E. Cost/Benefit and Cash Flow Analysis
1. Cash Flow Analysis
Cash flow analyses were made for Option I (all alternatives) and for
Option II in order to provide a measure of the value of each option and the
alternatives within the options. Cash flow analyses of Options III and IV are
meaningless because of the number of potential UPGRADE users. All costs were
taken from paragraph IV-B, One Time Costs, and from paragraph IV-C, Recurring
Costs. Deviation of quantifiable and non-quantifiable benefits are described,
in paragraph 2 of this section.
Figures IV-5 thru 9 show the costs incurred for Development and Operation
of each of the Options and the results (total present values) in order of
precedence are as follows:
Total Present Value ($000)
Option II 1099.7
Option I, Alternative 1 210.8
11 4 (1439.7)
3 (1455.7)
2 (2040.5)
It is obvious that the computer usage charge is the dominant factor in
the analyses, and with this small volume of potential users (and thus limited
benefits), only use of the NIH computer facility allows a positive total
present value over the costed life cycle (5 years). An increase in the number
of users, or an increased judgment in non-quantifiable benefits would improve
the present value picture for other alternatives, but as long as NIH computer
charges are significantly less than other facilities, the comparative analyses
will remain the same.
IV-6
-------
Action Steps
Secure MIDSD Approval
Prepare IAG for Use
of UPGRADE
Program Essential
User Requirements
Contract for User
Support Services
1 mo. I
3 mos.
Development Lead-Times
Option I Alternatives 1 and 4
Continuous
3 mos.
/ ]\ Secure MIDSD Approval
/2\ Approve IAG
'3
Order additional User terminal
stations as requested
6 mos. 12 mos. 24 mos. 36 mos. 48 mos. 60 mos.
Issue User Support Group
contracts.
Install terminal equipment
upon arrival
Figure IV-2
-------
CO
Action Steps
Secure MIDSD Approval
Prepare IAG for Use
of UPGRADE
Convert UPGRADE for
New Installation
Program Essential
User Requirements
Contract for User
Support Services
1 mo.
A
Development Lead-Times
Option I Alternatives 2 and 3
6 itios •
.Alt. .3...
3 mos.
Continuous
-4-
0 3 mos.
/1\ Secure MIDSD Approval
^2\ Approval IAG
Order additional User terminal
stations as requested
6 mos. 9 mos. 12 mos. 2k mos. 36 mos. 48 mos. 60 mos.
Issue User Support Group
contracts
Install terminal equipment
upon arrival
Figure IV-3
-------
Action Slepa
Secure Ml USD Approval
Prepare IAU for use
of Ul
Program Essential
User Kc(|iiliuuiunLu
Program Necessary
DU Requirements
Program Nucetiaary
llacr ut'CUADU
Uiiijulruiuentti
fru^rum Den 1 rattle
Ituer III'CKADU
Kct|ul ruumnca
3 moa.
3 noa.
3.5 mos.
Development Lead-Timea
for Option II
I B.6
n>OB.
-l-
6 moa.
12 mos.
24 moa. 36 moa. 48 mos. 60
moa.
Secure MIDSU approval
Deulgnace EPA UPGRADE Coordinator
Order additional User terminal
stations aa requested
rA\ Install terminal equipment
upon arrival
Train Users aa requested
Figure IV-4
-------
2. Benefit Derivations
Benefits include quicker response, better accuracy, a wider range of
problem solving capabilities, a decrease in time consuming repetitious manual
calculating, an increased ability to analyze crossmedia data, etc. With all
these improvements comes better visibility and greater respect from other
agencies. However, many of these benefits are not directly quantifiable so
most of the benefits calculations shown in Figures IV-5-9 will be in the form of
cost avoidance or cost savings.
Users have estimated that during their testing of UPGRADE, analyses and
plots could be produced 5-10 times faster using UPGRADE as could be produced
manually, if they could even be produced manually. At least 40% of the tests
were considered too complex to have been done manually at all.
Approximately 10% of the users doing testing already had access to
some automated system with which they could minimally perform the test func-
tions. These users estimated that use of these existing systems takes approxi-
mately twice the cost of UPGRADE to produce the same results.
Therefore, benefits were calculated from the cost avoidances in not
having to perform the analyses as at present i.e., 10% by inferior automated
means, 50% by manual means, and 40% too complex to quantify completely. Thus,
the quantifiable and non-quantifiable benefits have been calculated as follows:
a. Quantifiable Benefits.
(1) Replace present automated procedures (10% of users)
Assume costs are the same as UPGRADE system.
Operating Costs (exclude all Development costs. Use factor
of 2, and 10% of users. 277.6 X 2 X .10 = 55.5
(2) Replace present manual procedures (50% of users)
Assume User Personnel, Data Extraction and Plotting cost 5-10
times more than UPGRADE when done manually. Data Storage is
the same and other costs are not involved. Also assume that
the UPGRADE enhancements improve efficiency so that during
the first year, UPGRADE can be used 5 times more efficiently
than manual means, the second year 7 times, the third year 9
times, and 10 times more efficiently thereafter.
1st year (54+0+11.25)( 5)(.50)+(38.9)(.50) = 205.1
2nd year (54+9+11.25)( 7)(.50)+(38.9)(.50) = 279.3
3rd year (54+9+11.25)( 9)(.50)+(38.9)(.50) = 353.6
4th year (54+9+11.25)(10)(.50)+(38.9)(.50) = 390.7
b. Non-Quantifiable Benefits
(1) Worth equivalent to manual procedures (40% of users)
Assume the complexity of manual procedures described above
for those 40% of analyses which would probably not now be
attempted.
1st year (74.25)( 5)(.4)+(38.9)(.4) = 164.1
2nd year (74.25)( 7)(.4)+(38.9)(.4) = 223.4
3rd year (74.25)( 9)(.4)+(38.9)(.4) = 282.9
4th year (74.25)(10)(.4)+(38.9)(.4) = 312.6
IV-10
-------
OPTION I ($000)
Alternative 1 (NIH)
Development Costs
Remote Terminals Purchase
Software Implementation
Software Enhancement
Computer Site Change
User Support Personnel
Operating Costs
Remote Terminal Maint. & Materials
User Personnel (Analyst & Intern)
Data Extraction
Data Storage
Computer & Plotter Usage
User Liaison Personnel
Total Costs
Quantifiable Benefits
Non-Quantifiable Benefits
Net Savings
Present Value Factors
(10% Discount Rate)
Year
N
67.0
10.0
0
0
300.0
27.0
54. 0
9.0
38.9
123.7
0
629.6
260.6
164.1
(204.9)
1.00
N + 1
300.0
27.0
54.0
9.0
38.9
123.7
0
552.6
334.8
223.4
5.6
0.91
N + 2
300.0
27.0
54.0
9.0
38.9
123.7
0
552.6
409.1
282.9
139.4
0.83
N + 3
300.0
27.0
54.0
9.0
38.9
123.7
0
552.6
446.2
312.6
206.2
0.75
N + 4
300.0
27.0
54.0
9.0
38.9
123.7
0
552.6
446.2
312.6
206.2
0.61
Discounted Savings
(204.9) 5.1 115.7 154.7 140.2
Total Present Value
210.8
Figure IV-5
IV-11
-------
OPTION I ($000)
Alternative 2 (COMNET)
Year
Development Costs
Remote Terminals Purchase
Software Implementation
Software Enhancement
Computer Site Change
User Support Personnel
Operating Costs
Remote Terminal Maint. & Materials
User Personnel (Analyst & Intern)
Data Extraction
Data Storage
Computer & Plotter Usage
User Liaison Personnel
Total Costs
Quantifiable Benefits
Non-Quantifiable Benefits
N
67.0
10.0
0
82.0
300.0
27.0
54.0
9.0
7.8
675.0
0
1231.8
260.6
164.1
N + 1
300.0
27.0
54.0
9.0
7.8
675.0
0
1072.8
334.8
223.4
N + 2
300.0
27.0
54.0
9.0
7.8
675.0
0
1072.8
409.1
282.9
N + 3
300.0
27.0
54.0
9.0
7.8
675.0
0
1072.8
446.2
312.6
N + 4
300.0
27.0
54.0
9.0
7.8
675.0
0
1072.8
446.2
312.6
Net Savings
Present Value Factors
(10% Discount Rate)
C807.1) (514.6) (380.8) (314.0) (314.0)
1.00 0.91 0.83 0.75 0.68
Discounted Savings
(807.1) (468.3) (316.1) (235.5) (213.5)
Total Present Value
(2040.5)
Figure IV-6
IV-12
-------
OPTION I ($000)
Alternative 3 (RTF)
Year
Development Costs
Remote Terminals Purchase
Software Implementation
Software Enhancement
Computer Site Change
User Support Personnel
N
N+l N+2 N + 3 N + 4
67.0
10.0
0
164.0
300.0 300.0 300.0 300.0 300.0
Operating Costs
Remote Terminal Ma int. & Materials
User Personnel (Analyst & Intern)
Data Extraction
Data Storage
Computer & Plotter Usage
User Liaison Personnel
27.0
54.0
9.0
50.4
472.5
0
27.0
54.0
9.0
50.4
472.5
0
27.0
54.0
9.0
50.4
472.5
0
27.0
54.0
9.0
50.4
472.5
0
27.0
54.0
9.0
50.4
472.5
0
Total Costs
Quantifiable Benefits
Non-Quantifiable Benefits
1153.9 912.9 912.9 912.9 912.9
260.6 334.8 409.1 446.2 446.2
164.1 223.4 282.9 312.6 312.6
Net Savings
Present Value Factors
(10% Discount Rate)
(729.2) (354.7) (220.9) (154.1) (154.1)
1.00 0.91 0.83 0.75 0.68
Discounted Savings
(729.2) (322.8) (183.3) (115.6) (104.8)
Total Present Value
C1455.7)
Figure IV-7
IV-13
-------
OPTION I ($000)
Alternative 4 (GSA)
Year
Development Costs
Remote Terminals Purchase
Software Implementation
Software Enhancement
Computer Site Change
User Support Personnel
N N+l N + 2 N+3 N + 4
67.0
10.0
0
0
300.0 300.0 300.0 300.0 300.0
Operating Costs
Remote Terminal Maint. & Materials
User Personnel (Analyst & Intern)
Data Extraction
Data Storage
Computer & Plotter Usage
User Liaison Personnel
Total Costs
Quantifiable Benefits
Non-Quantifiable Benefits
27.0
54.0
9.0
97.2
461.2
0
1025.4
260.6
164.1
27.0
54.0
9.0
97.2
461.2
0
948.4
334.8
223.4
27.0
54.0
9.0
97.2
461.2
0
948.4
409.1
282.9
27.0
54.0
9.0
97.2
461.2
0
948.4
446.2
312.6
27.0
54.0
9.0
97.2
461.2
0
948.4
446.2
312.6
Net Savings
Present Value Factors
(10% Discount Rate)
(600.7) (390.2) (256.4) (189.6) (189.6)
1.00 0.91 0.83 0.75 0.68
Discounted Savings
(600.7) (355.1) (212.8)(142.2) (128.9)
Total Present Value
(1439.7)
Figure IV-8
IV-14
-------
OPTION II ($000)
Year
Development Costs
Remote Terminals (4) Purchase
Software Implementation
Software Enhancement*
Computer Site Change (Reprogramming)
User Support Personnel
N N+l N + 2 N + 3 N + 4
67.0
10.0
70.5 64.0 30.0
0
0
Operating Costs
Remote Terminals Maint. & Materials
User Personnel (Analyst & Intern)
Data Extraction
Data Storage
Computer & Plotter Usage
User Liaison Personnel (Coordinator)
Total Costs
Quantifiable Benefits
Non-Quantifiable Benefits
Net Savings
Present Value Factors
(10% Discount Rate)
27.0
54.0
9.0
38. 9
123.7
50.0
450.1
260.6
164.1
(25.4)
1.00
27.0
54.0
9.0
38.9
123.7
50.0
366.6
334.8
223.4
191.6
0.91
27.0
54.0
9.0
38.9
123.7
50.0
332.6
409.1
282.9
359.4
0.83
27.0
54.0
9.0
38.9
123.7
50.0
302.6
446.2
312.6
456.2
0.75
27.0
54.0
9.0
38.9
123.7
50.0
302.6
446.2
312.6
456.2
0.6i
Discounted Savings
(25.4) 174.4 298.3 342.2 310.2
Total Present Value
1099.7
*As a cosponsor of UPGRADE this cost is a worse case analysis. Some of
these costs will probably not be incurred by EPA but by other users.
Figure IV-9
IV-15
-------
F. System Acceptance Ratings
Determination of System Acceptance Ratings for EPA using UPGRADE has been
measured against the Acceptance Criteria and presented in Figure IV-10. The
analysis for each rating is discussed in the following paragraphs.
1. Number of EPA Users - The number of identified users with defined
requirements is 9. The user level for each option is as follows:
OPTION User Level
I >15
II 5-15
III <5
IV 0
The rating assigned to Option I is "2" because there are less than those
required. This applies to all alternatives of Option I. The number of users
fits within the limits of Option II so it is assigned a rating of "10". It
also exceeds the user limits of Option III and is also assigned a rating of
"10". Under Option IV, a rating of "0" is assigned because there are identi-
fied users and Option IV would not satisfy any of their needs.
2. Satisfy User Needs - Option I will satisfy the user needs in varying
degrees. On NIH's computer, user needs will be completely satisfied, this
system has good response and little downtime. It is assigned a rating of
"10". At EPA's COMNET installation, UPGRADE would have to operate in a non-
virtual memory mode. Some users will have to wait until others are finished
because of the number of UPGRADE size memory partitions available. It is
assigned a rating of "9", because the wait will depend on the number of users
on the system at one time. Users of EPA's computer at RTP, N.C. already have
delays getting on the machine. Installation of UPGRADE on this computer would
foster further delays. It is assigned a rating of "8". A commercial computer
facility such as GSA's Boeing Computer Services will provide good user service
to stay in business. It has the same computer equipment as NIH, thus it is
assigned a rating of "10".
Option II implies that EPA, although it will have great influence, will
not be able to completely control UPGRADE'S development and satisfaction of
EPA User needs. Thus, it is assigned a rating of "9".
Option III gives EPA little influence or satisfaction of EPA user needs.
For the most part, EPA must use what is available. It is, therefore, assigned
a rating of "4".
Option IV satisfies none of the user's needs and is assigned a rating of
"0".
3. Expansion Capacity - Under Option I, the NIH installation has data
base expansion constraints due to the limitations imposed for on-line disk
storage. It is assigned a rating of "2". EPA's COMNET computer has greater
data expansion capability but has severe user expansion contraints by the
IV-16
-------
current operating system. Effectively, each user requires the maximum amount
of UPGRADE core for the entire analysis session. Four users require 2 million
bytes of core dedicated. It is, therefore, assigned a rating of "7". If the
operating system changes in the future, this rating would change. The compu-
ter at RTF has a finite capacity for on-line users; the limit is 40. There
are other users of this computer. Therefore, a rating of "6" is assigned.
The GSA installation will add capability to support users with money. It is,
therefore, assigned a rating of "10".
Option II has limited expansion capability, again constrained by NIH's
data storage limitations. However, under this option it is assigned a rating
of "5" because other agencies will also be underwriting system expansion.
Option III is assigned a rating of "1" because EPA will have little
influence over the nature of future system expansion.
Option IV is assigned a rating of "0" because where there is no involve-
ment, there is no expansion.
4. Implementation Cost - For Option I at NIH, the costs would be nominal
because the system is running there. At COMNET, the cost would increase
greatly because major program modifications are required. It is, therefore,
assigned a rating of "4". Before the computer is used at RTP, the system must
be rewritten, thus a rating of "1" is assigned. At GSA no reprogramming is
required. The computer is compatible with NIH, a rating of "9" is assigned.
Option II is assigned a rating of "9" because only nominal costs are
involved.
Option III is assigned a rating of "10" because the cost of co-sponsor-
ship is removed.
Option IV is rated "10" because no costs are involved for this option.
5. Operating Cost - The recurring operating costs over the 5-year life
cycle are low at NIH, but under Option I the total cost of a User Support
Group must be added. The NIH alternative, under Option I, is thus assigned a
rating of "8". At COMNET, computer charges are significantly higher, thus a
rating of "6". At RTP, the computer charges are comparable to COMNET, thus
RTP is also assigned a rating of "6". At GSA the computer charges are higher,
thus a rating of "5" is assigned.
Option II is assigned a rating of "9" due to the cost of co-sponsorship
for the User Support Group.
Option III is assigned a rating of "10" because it has the lowest opera-
ting costs.
Option IV has no operating costs; it also is assigned a rating of "10".
6. Cost Savings - Identified cost savings are significant and should
increase with use of the system. However, they must be offset by Implementa-
tion costs amortized over the 5-year life cycle along with annual operating
costs. Relative cost savings are high in Option I. On the NIH computer
IV-17
-------
alternative, the cost savings are reduced by the total cost of the User Support
Group, thus a rating of "9" is assigned. At COMNET operating costs are higher
and some reprogramming is involved to offset savings, thus a rating of "7".
At RTF, considerably more reprogramming is involved, reducing cost savings
again, thus a rating of "5". At GSA, the cost savings are only reduced by the
higher operating costs. A rating of "8" is assigned.
Option II produces the most cost savings, thus a rating of "10".
Option III has reduced capabilities, thus reduced analysis and reduced
cost savings. A rating of "5" is assigned.
Option IV is assigned a rating of "0". "No use" implies no savings per
application.
7. Time Benefits - Increases in response time range from 5-10 times in
some applications to immeasurable in other. The ability to respond to ad hoc
queries is one of UPGRADE'S most useful features. In Option I, Alternative I,
NIH, the response time benefits are rated "10". Alternative 2 is rated "9"
because of the possibility of delay for a user getting on the system. Alter-
native 3 is rated "8" for the same reason. Alternative 4, GSA, will have'the
same response time benefits as NIH, thus a rating of "10".
Option II response time will be slightly lower, a rating of "9", because
EPA does not have complete control over satisfying user needs.
Option III control is even less, thus a rating of "8".
Option IV has no time benefits, thus a rating of "0".
8. Analysis Benefits - The analysis improvements are high with UPGRADE.
On a relative basis, all alternatives of Option I rate a "10" because EPA can
fully satisfy all user needs. Under Option II, EPA is a co-sponsor and has
reduced control and a rating of "9". Under Option III, EPA has very little
control so the rating is reduced to "4". And under Option IV, EPA has no
control, thus a rating of "0".
9. EPA Controlability - Under Option I, EPA controls the system, thus a
rating of "10" for all alternatives. Under Option II that control is reduced
and a rating of "7" is assigned. Option III entails very little control, thus
a rating of "2". Option IV has a rating of "0".
10. Risks - There are both economic and technical risks involved in using
any system. Fortunately, UPGRADE is a system that has been in operation for
approximately 3 years, so the overall risks are significantly less than with a
new system.
Under Option I, the NIH alternative is assigned a rating of "7" because
of EPA divorcing itself from CEQ and supporting its own version of UPGRADE.
The risk is even greater with the COMNET alternative, a rating of "4", due to
the program modification required. At RTP, the risk increase is even greater
because of complete reprogramraing. A rating of "1" is assigned here. At GSA,
the risk is comparable to NIH, thus a rating of "7".
IV-18
-------
Option II has a rating of "9" strictly because of the co-responsibilities
of sponsoring a User Support Group.
Option IV has considerable risk by not responding to outside mandates and
criticism. A rating of "0" is assigned.
11. Totals - The weighted rating is derived by multiplying the factor
weights by the raw ratings. The complete picture is contained in Figure
IV-10, p. IV-20 . The System Acceptance Ratings are:
OPTION Weighted Rating Rank
I - 1 417 3
- 2 373 5
- 3 315 6
-4 429 2
II - 505 1
III - 385 4
IV - 80 7
IV-19
-------
UPGRADE ACCEPTANCE CRITERIA
(Rating is from 0-10)
(Weight is from 0-10)
Max.
Score Criteria
100 1. Number of EPA Users
60 2. Satisfy User Needs
60 3. Expansion Capability
40 4. Implementation Cost
40 5. Operating Cost
60 6. Cost Savings
80 7. Time Benefits
80 8. Analysis Benefits
20 9. EPA Controlability
60 10. Risk
Weight
10
6
5
4
4
6
8
7
2
5
.System Acceptance
Racing
Rank
OPTION
I
ALTERNATIVES
1
NIH
R
2
10
2
9
8
9
10
10
10
7
W
20
60
10
36
32
54
80
70
20
35
77
417
3
2
COMNET
R
2
9
7
4
6
7
9
10
10
4
W
20
54
35
16
24
42
72
70
20
20
68
373
5
3 1
Rr
R
2
8
6
1
6
5
8
10
10
1
pp
W
20
48
30
4
24
36
64
70
20
5
57
315
6
GSA/Boeing
R
2
10
8
9
5
8
10
10
10
7
W
20
60
40
36
20
48
80
70
20
35
79
429
2
OPTION
II
Cosponsor
R
10
9
5
9
9
10
9
9
7
9
W
100
54
25
36
36
60
72
63
14
45
86
505
1
OPTION
III
Limited
Use
R
10
4
1
10
10
5
8
4
2
10
W
100
24
5
40
40
30
64
28
4
50
64
385
4
OPTION
III
NO
Invol\
R
0
0
0
10
10
0
0
0
0
0
ement
W
0
0
0
40
40
0
0
0
0
0
20
80
7
<1
o
Figure IV-10
-------
SECTION V
-------
V. RECOMMENDATIONS
A. Initial Recommendation
Based upon the evaluation of Section IV, the most cost effective solution
for implementation of the UPGRADE system is Option II. It has the highest
System Acceptance Rating and the greatest net cost savings. Its cost savings
are at least $350,000 more than the next option. It presents an additional
analysis capability for EPA that has been successfully utilized by CEQ, DOE,
and some of EPA's evaluators. Other options may be selected at EPA discretion
should the number of users significantly change. UPGRADE enhancements termed
"Essential" by the user evaluations and not currently funded by other UPGRADE
users have been reduced to one. It is the addition of "Moving Means". EPA,
being a co-sponsor of UPGRADE, may wish to foster an increase in user training
for UPGRADE analysis to obtain maximum capabilities and benefits.
B. Subsequent Recommendation
During this study, a very detailed program analysis was conducted of
UPGRADE. From this analysis, a number of ways were found to improve program
operation and reduce core requirements (see Appendix F). EPA, as a co-sponsor
of UPGRADE, may wish to foster maintenance and software configuration control
programs that fulfill the users defined requests for standardization of termi-
nology and follow the recommendations in Appendix F.
In the event the combined user base of UPGRADE should ever exceed the
capabilities of the NIH-DCRT installation, EPA may wish to consider the alter-
nate site analysis within Option I of this study. That is, UPGRADE can be
moved, when necessary, to a commercial installation similar to the GSA-Boeing
computer center and the NIH-DCRT installation could be retained as a develop-
ment installation for cost purposes.
C. Time Phasing of Enhancements
Accomplishment of EPA user requested enhancements must be on an UPGRADE
user-wide priority basis. However, it is recommended that EPA provide funds
and exert influence to ensure the enhancements defined in Section II and
scheduled for accomplishment in Section IV be programmed in the time frame
recommended.
D. Project Plan
The next step upon MIDSD's approval of this study, is for EPA to prepare
the appropriate IAG with CEQ containing program requirements and schedules, as
provided in Figures IV-2, 3, and 4 and Figure V-l.
The project plan for programming enhancements is based upon the practical
limitation that, only a programmer experienced in UPGRADE coding can make the
modifications and enhancements required. The programmer must also be capable
of operating within the CEQ Support Staff conventions for handling UPGRADE
Software configuration control. Thus, a minimum of parallel efforts are
scheduled.
V-l
-------
The sequence of tasks is determined by relative priority of the feature.
Each task can be broken down into the standard subtasks of maintenance/
enhancement programming:
review design vs. existing code and files
verify interfacing of new routines into system
technical review of approach
prepare detailed program (routine) specification
prepare test specifications
code
check code in test version of UPGRADE
compile program documentation
update system documentation to reflect new features
review
system regression test
install code in production version of UPGRADE
Management tracking by these subtasks (or a subset thereof for small enhance-
ments) will provide a management visibility to the the project as detailed as
management chooses to make it.
V-2
-------
ACTION STEPS
f
CO
MIDSD APPROVAL
JAG
ESSENTIAL RQMTS
MOVING MEAN
NECESSARY D.B. RQMTS
Improved STORET Interface
Improved SAROAD Interface
Data Extraction from
tape storage
NECESSARY UPGRADE RQMTS
SUPERTERSE Mode
Add Users Analysis
Routines
Plot size
Add User-defined Models
DESIRABLE UPGRADE RQMTS
Data Save Capabilities
Overlying Points on Graph
Improved Alphanumeric
Axis Description
Add Interactive SPSS
PROJECT PLAN
ALL USER NEEDS
PROGRAMMING
(PERSON TIMES IN MONTHS)
\
12
I
24
36
CALENDAR MONTHS
Figure V-l
-------
APPENDIX A
-------
APPENDIX A
HISTORY OF UPGRADE
-------
Section I - History
Under NEPA, the CEQ was established to (among other requirements):
• gather timely and authoritative information concerning the conditions
and trends in the quality of the environment..., to analyze and inter-
pret such information...
• review and appraise various (Federal) programs and activities in light
of NEPA... make recommendations to the President...
• develop and recommend... national policies to foster and promote the
improvement of environmental quality...
• conduct investigations, studies, surveys, research, and analyses
relating to ecological systems and environmental quality...
• document and define changes in the natural environment... accumulate
necessary data and other information for a continuing analysis of
these changes or trends and an interpretation of their underlying
causes...
Early on, CEQ recognized that to be fully responsive to the requirements
of its charter, it would need some type of cross-disciplinary analysis tool
which would assist in correlating environmental, natural resources, public
health and related data. Because environmental policies, programs, and deci-
sions at all levels had to be guided by the best analytical information avail-
able, the required analysis tool - and its resultant data - would have to be
readily accessible to both policy makers, or managers, and scientists.
With a clear concept of what is needed, CEQ developed the UPGRADE system
to:
• provide easier access to computerized environmental data
• facilitate more efficient and convenient environmental assessments
• foster increased uses for available environmental data
• provide better capabilities for identifying correlations between
factors represented in different computerized data banks, and
• improve environmental research and data collection programs through
the insight and feedback provided by the users of the UPGRADE system.
Sine 1975, CEQ has been actively collaborating with other federal agencies
in screening and selecting data for use in the UPGRADE system. From the time
it was first implemented, UPGRADE has been used to support various federal and
state government projects including:
CEQ Annual Reports
Other CEQ analyses for reports to the White House and/or OMB
State of New Jersey environmental analyses
The Environmental Protection Agency
The Department of Energy, and
'Mapping of NCHS cancer death rate data.
(Several case histories of water and air quality studies, using UPGRADE,
are presented in Section II of this appendix).
-------
Support for the above projects has come in the form of: straight data
analyses; maps (for NASQAN, NCHS, and county and population-weighted air,
water and health data); and graphs, charts, and bar-graphs relative to air
and water quality and health, oil spills and coal data. In addition, UPGRADE
has been used to support various as-needed projects such as:
• Pollution funding grants vs. city population
• Energy use in the U.S. (from 1850 - 1975), and
• ORDIS program elements vs. funding.
Thus far, most of UPGRADES's data have been selected from data banks
maintained by the Environmental Protection Agency (primarily SAROAD and STORET),
the National Institutes of Health, the National Oceanic and Atmospheric Admin-
istration, and the U.S. Geological Survey.
With the basic system development and extensive testing of UPGRADE com-
pleted, CEQ is planning to increase the available access and user support for
UPGRADE to serve other federal, state, academic, and private organizations.
A-2
-------
Section II - Case Histories
Twelve Rivers Study
425 monitoring stations were selected on the mainstreams of 12 rivers.
These stations were chosen from STORET, EPA's computerized water quality data
system, on the bases of sampling location, monitoring methods, period of
record, sampling frequency, and other criteria. In general, the data base
chosen for this evaluation includes the best ambient trend monitoring stations
on the 12 rivers under consideration.
Of the several kinds of-statistical analyses and indicators used, the
most informative trend indicator seems to be the composite violation rate.
This indicator represents the proportion of all measurements of a specific
water quality variable which exceeds the "benchmark" value for that variable.
Benchmark values were determined on the basis of published water quality
criteria, standards, or in some cases, arbitrarily chosen reference points.
Fecal coliforra bacteria was the main variable chosen to illustrate trends
in sanitary water quality.
Done by - GKY
Dave Tucker - Principal Investigator
Background - EPA did a study on water quality trends - as called for in section
305A of the FWPCAA - report to Congress - CEQ studied 12 rivers as a follow-up
to the EPA report.
25 Cities Study - Water Quality Trends
The CEQ-EPA analysis of sanitary water quality and sewage treatment in 25
U.S. municipal areas from 1968 - 1976. In general, the trend was toward
better sanitary water quality. Areas chosen are a representative cross section
of the nation's municipal areas. An important factor in the selection was the
adequacy of water quality data in EPA's STORET data bank. Water quality data
from a total of 100 monitoring sites were analyzed (average of 4 sites per
city). Monitoring sites and data were screened in consultation with EPA
regional personnel for adequacy of the variables measured, measurement frequen-
cy, period of record, geographic location, and other criteria. The UPGRADE
system was used to analyze the data.
Coliform bacteria, DO and BOD were considered in the water quality anal-
yses for the study areas.
Investigators: Jim Reisa, Steve Fullerton, Mimi Hayman, Bill Chapman, Ed
Pechen, Alec McBride (EPA)
Background: CEQ looked at top 150 cities (population-wise) which they anal-
yzed for available data - came up with 25.
Pesticides
Between 1972 and 1975, EPA cancelled the sale of and prohibited most uses
of the persistent insecticides DDT, aldrin, dieldrin, chlordane, and heptachlor.
A-3
-------
To determine how the ban of these chemicals and trends in pesticide use affect
aquatic contamination, CEQ studied pesticide residues in the water and sedi-
ments of streams and other water bodies in Texas, Louisiana, and Oklahoma.
The study area was chosen on the basis of a history of heavy pesticide use and
availability of adequate stream monitoring data. Sixty stream monitoring
stations operated by the U.S.G.S., in cooperation with state agencies, pro-
vided data for the study.
Investigators: Reisa, Fullerton, Hayman and Dale Bottrell
Phenols
CEQ analyzed phenol pollution of the Ohio River and its major tributaries;
measured at 22 monitoring stations by the Ohio River Sanitation Commission.
Two benchmarks were used: (1) the 1 microgram-per-liter water quality criterion
(the standard used by Ohio and Illinois and recommended by EPA for the protec-
tion of public water supplies) and (2) the 5-microgram-per-liter level.
Investigators: Reisa, Fullerton, Hayman
Coverage: 1968 - 1976
Mahoning River Study
CEQ analyzed water quality in the Mahoning River Basin. The emphasis of
the study was on water quality impacts of 'industrial effluents, including iron
and steel producers, blast furnaces, and coking operations.
The Mahoning River Basin has been the subject of intensive analysis by
EPA because of the fear that imposing strict pollution control requirements on
the antiquated iron and steel production facilities would lead to widespread
social and economic dislocation.
The study was done for the Administrator of EPA as a backup for making
policy decisions regarding the 1976 lawsuit filed by the state of Pennsylvania,
the Sierra Club, and others.
Investigators: Reisa and Fullerton
Time frame: ('68) 1972 - 1976 (note: data available only from 1972)
NASQAN (National Stream Quality Accounting Network) Studies
This is an on-going study - reported annually - with different variables
selected each year, depending on priorities. The (network) program was started
during the 1975 water year (i.e., Oct 74 - Oct 75). It is a mapping project,
as opposed to other studies.
NASQAN collects uniform data at the downstream ends of 334 subregional
drainage basins that collectively cover the entire surface of the nation.
Each of the NASQAN monitoring stations measures the same water quality vari-
ables, with the same frequency, using the same methods.
A-4
-------
Investigators: Reisa and Fullerton
Geographic Area: Nation, including Alaska and Hawaii and Puerto Rico
Urban Air Quality Impacts of the National Energy Plan; An Assessment of Six
Cities
Principal Investigators: K.H. Jones, T. Chapman, Randi Ferrari, John
Walker (ERCO)
EPA's Office of Air Quality Planning and Standards and DOE's Office of
Planning, Analysis and Evaluation have made estimates of future emissions down
to the country and city level, respectively. EPA projected emissions for 902
counties while DOE did the same for all of the 243 air quality control regions
(AQCR's). The results of these two studies were used to generate the emission
projections for this study. All three studies, as far as increased coal utili-
zation was concerned, assumed that all provisions of the NEP would be fulfilled.
The EPA study projected that 126 counties could have air quality viola-
tions in 1985 if the NEP were instituted based on single station data in each
county. For this study, six counties were selected which represented the
worst air quality (with respect to three criteria pollutants) in 1975 of the
counties cited by EPA. The air quality data for TSP, SO. and NO. were collated
using the CEQ UPGRADE capability. Some restrictions were placed on the air
quality data base. Several monitoring methods are known to be unreliable and
therefore excluded from the analysis. These data were then analyzed in accord-
ance with the averaging times of the primary health related standards, e.g.,
annual and 24 hourly intervals for TSP and SO. and annually for NO.. A short-
term hourly distribution of NO. was also run to simulate the potential problems
which might be associated with a short-term standard for this pollutant. In
order to reflect the population impacts, the monitoring sites where the criti-
cal 24 hour data were taken were located on urban population density maps.
A simple proportional rollback/rollup model was used to relate future
emissions to future air quality at all of the stations in a county. This is,
in most cases, a conservative estimating technique. Generally, most emissions
do not impinge equitably on all monitoring sites in an urban region. If we
assume that all of the emissions increase the air quality at all of the sta-
tions we are projecting a worst case from a population exposure point of view.
The only cases where this would not be true would occur where (1) present tall
stack sources would become low level sources of emissions or (2) all of the
emissions growth was allocated to a very few point sources in heavily popula-
ted areas. The use of more sophisticated diffusion models are needed to
resolve such geometry errors in proportional air quality projections. Models
and their necessary special and temporal inventories have yet to be developed
for most cities.
A.-5
-------
A Methodology for Estimating Differential Populations Impacts Under Various
Ozone Standard Scenarios
Principal Investigators: Kay Jones (CEQ), Tim Chapman, Mike Airey, Mark
Feldman
EPA is reevaluating the national ambient air quality standard (NAAQS) for
ozone based on a reexamination of the health related data. The question
arises as to what are the possible differential health risks between the
current standard and alternatives which may be more lenient. In order to
perform such an evaluation one needs to be able to estimate actual population
exposure under various projected control scenarios. CEQ, over the past year,
has been developing a risk analysis capability as part of its overall environ-
mental data analysis system i.e., UPGRADE.
An ozone air quality data base was extracted from CEQ's UPGRADE data bank
and examined for quality and completeness. The 6 Denver ozone monitoring
sites were then located on population density maps so that sectorial popula-
tions could be assigned. Using a transformation routine in UPGRADE, base year
distributions for 1975 were generated and distributions assuming different
control levels were projected. These distributions were then analyzed in
terms of various ozone standard scenarios and population exposure statistics.
Air Section CEQ "Annual Report" 77-78
Description:
Trends were/will be assessed using CEQ's air quality data as retrieved from
EPA's SAROAD and manipulated by UPGRADE into frequency distribution.
Project Principal Investigator: Kay Jones, Ph.D.
Study Coverage (Variables, Geographical Area, Time Period): Selected sites
(good data - complete) from nation. Years 73-76
Internal (draft) work on implications of proposed short-term NO^ standard.
Description:
UPGRADE analytical procedures were/will be used to compile "Laundry List" of
cities in violation of proposed new standard as compared to baseline cities.
Project Principal Investigator: Kay Jones, Ph.D.
Study Coverage (Variables, Geographical Area, Time Period): Selected sites
(continuous monitors) for regions in violation of current standard. N0_,
1973 - 76
A-6
-------
Air Section CEQ "Annual Report" 77 - 78
Description:
Risk - frequency distributions for sites in selected SMSA's were/will be
generated to evaluate exposure parameters for risk assessment.
Project Principal Investigator: Kay Jones, Ph.D.
Study Coverage (Variables, Geographical Area, Time Period): Selected SMSA's
(good data - complete) CO and Ozone. 1975 base year.
A-7
-------
APPENDIX B
-------
APPENDIX B
COMPARISON OF UPGRADE TO OTHER EPA SYSTEMS
-------
COMPARISON OF UPGRADE TO OTHER EPA SYSTEMS
The available documentation on the EPA STORET and SAROAD systems has been
studied to determine the extent of graphics capability built into each system
and to compare their capabilities to the graphics available in UPGRADE. (See
Table B-l.)
The comparison reduces to STORET vs. UPGRADE, since, according to avail-
able SAROAD documentation, there is no graphics capability in SAROAD. Although
the SAROAD documentation is from 1971 (User's Manual), 1973 (Terminal User's
Manual), and 1974 (SAROAD Interactive Access System), there has been no ad-
dition of graphics to SAROAD in more recent years. The SAROAD system is ori-
ented primarily toward the storage and retrieval of aerometric data and some
rudimentary statistical operations on the data, such as grouping, means and
standard deviations, are available. These are also available during UPGRADE
processing of SAROAD data.
According to the June 1977 STORET manual, the following graphics programs
are available: PLOT, LOG, MSP, and REG. Each of these will be discussed in
relation to the graphics capabilities of UPGRADE, and also the STORET program
STAND, which is not a graphics program but does produce output similar to the
bar-charting capability in UPGRADE. Copies of the relevant pages of the STORET
manual are included in Appendix F.
I. STORET PLOT Program
The PLOT program produces scattergram or polygon-type plots of vari-
ables versus time.
PLOT UPGRADE
only time on x-axis time or any other variable on
x-axis
stream loadings not currently available (requires
data loaded to UPGRADE)
log values on y-axis log values either axis, also
probability axes
limits for variable values full data filtering capability
batch mode interactive mode
produces CALCOMP plot (one TEKRONIX hard copy or CALCOMP, or
plot per tape) line printer, etc.
Scaling can be made uniform can be done with slight difficulty
over series of plots
II. STORET LOG Program
The LOG program is the mapping program for STORET. It serves mainly
to show the locations of the water monitoring stations and does not plot
B-l
-------
00
I
to
Requirements
Data
Access
For:
Air Quality
Water Quality
Demographic
Health
Other Fnvir.
User's Daca
DATA Listing
Data Manipulation
Basic Statistics
PolvKon Pines
Dar Charts
Reqrassion
Percent Jliis
Happing
Correlation
Interactive vs. Batch
User background rcq.
Dace of information
STORE!
NO
Main EPA DB
NO
NO
NO
ONLY IF IN STORET
YFS. BATCH JOB
FILTERING
BATCH JOB
STRAIGHT LINE
NO
REG PROGRAM
DFGREE=1
STAND program
LOG program
REC program
B.itch
SOME TEXT EDITOR &
FOLIOl' MANUAL
June '77
ADP01T
NO. bur could
SET UP TOR IMMEDIATE
COULD HE ADDED
COULD Bi: ADDED
COULD BE ADDED
YES, THROUGH UTS
TO TERMINAL OR OTHER
I/O PFVirF.
FULL TRANSFORM
CAPABILITY
YES; PLUS AGGREGATION
AND OTIII:R STATISTICS
STRAIGHT OR DASH
SHADED WIDTH & NUMBER
TONTROI
DECRTE-1 ro 9
FULL USLR PROGRAMMED
NO
NO as standard
CAN BE USER PROGRAMMED
INTFRACTIVE
FORTRAN or BASIC helpful
not rcq'd; learn manual
June '75
SAROAD
Main EPA DB
NO
NO
NO
NO
ONLY IF IN SAROAD
YES, BATCH JOB
FILTERING
BATCH JOB
NO
NO
NO
NO
llarrh
Some text editor &
follow manual
"71, <71, «74
UPGRADE
BY TAPE FROM SAROAH
BY TAPE FROM STORET
VERY LITTLE
BY TAPE FROM NCHS
SOME: oil spill, etc.
YES, GENERAL INTERFACE
TO TERMINAL OR DISK
FILTERING TRANSFORM PLANNED
YES; PLUS SAS
STRAIGHT OR DASH
SHADED
NUMBER CONTROL
DFfiRFF-l rn A
PARTITIONING SUBROUTINES
NASQAN, COUNTY,
DFMnrP «nuio
AS PART OF REGRESSION
TNTFRAfTTVF
No computer's data analysis
•3 It 1 lie ran'H (no fir il 1 \
Aiipucr '7ft
Table B-l. Graphics Capabilities vs. Other EPA Systems
-------
any indication of values of variables; in fact, it does not retrieve any
data. Plots can include county lines, city outlines, lakes, rivers, etc.
The scale of the map is controlled by the user.
The UPGRADE capability most similar to this, currently covers county
level and NASQAN basin maps. UPGRADE produces the appropriate data set
which is then mapped using VITRO software and plotters. At present, the
data bases for the NASQAN and the demographic county maps are being put
directly on UPGRADE.
The UPGRADE mapping capability produces maps with up to five shading
levels (cross-hatching, etc.) giving a visual indication of the geographic
and/or demographic-geographic distributions of pollutants, morbidity/
mortality, and other variables available on UPGRADE.
III. STORET MSP Program
The MSP (Multiple Station Plot) program produces plots of statistical
values of variables observed over a specified time period; y-values are
plotted against an x-axis scale of distance along a stream - thus, mul-
tiple stations on the stream. Again, this is batch mode and produces
CALCOMP or line printer output. UPGRADE does not now have this exact
capability. The statistical capabilities are there; stations can be
grouped, and plots produced for any two variables. One technique that has
been used on UPGRADE for similar analysis is to use automatic sequencing
to produce a series of plots, each one showing time series or violation
percent at one station. Studied in sequence, these plots provide a similar
picture of the situation as a function of the station's position on the
stream.
Also in UPGRADE, users of the IDB (Integrated Data Base) have avail-
able to them an option of plotting data versus a "geographical profile" on
the x-axis. Since this data is at the county level, the interpretation is
not as straightforward as for stations on a stream. Some water, health,
and other variables are available on the IDB and more are being added.
IV. STORET REG Program
The REG program produces a scattergram plot of two variables, or a
variable and time, along with a fitted regression straight line.
REG
UPGRADE
Batch mode
Interactive mode
Output on line printer; user
must draw in line
Only first order polynomial
Can plot two different (or
same) variables at two
different sites
Plot on TEKTRONIX or CALCOMP or
line-printer; much higher reso-
lution available
Up to 6th order polynomial fitted
to data
Can only plot data at one site, or
grouped stations, but not one state
versus another. (IDB can plot one
point for each of many stations.)
B-3
-------
V. STORET STAND Program
This is a non-graphics program that can compare data values to a
standard and compute various violations summaries. Through the use of the
partitioning of data in UPGRADE, a user can achieve these same statistics
and then go on to graphically present the results (typically with a bar
chart showing percentage or number of violations on y-axis versus time on
the x-axis).
Other UPGRADE graphics capabilities not found in STORET programs in-
clude the use of shaded bar charts to represent data, user control of grid
lines, tick marks, axis annotations and other "plot modifications", SAS
statistical procedures applied to data to be graphed, data partitioning on
graphed data, etc.
One of the big differences is that UPGRADE is interactive and STORET
is batch mode. Thus, the UPGRADE user can produce many different graphs
in an hour, each one influencing the next, while the STORET User must wait
hours or days for each plot (and is cautioned not to do many in a short
time).
B-A
-------
Comparison of Data Analysis
and
Graphics Features
ADROIT
vs
UPGRADE
B-5
-------
ADROIT (Automated Data Retrieval and Operations Involving Timeseries) was
developed by UNIDATA, Inc. under contract to the state of Michigan for use in
research on the effectiveness of water quality control procedures. It is
oriented toward the analysis of large amounts of data extracted from STORET and
(apparently) stored in a manner similar to STORET1s, i.e., using parameter
numbers, STORET's station codes, etc.
ADROIT is an interactive system for analysis of water quality data by
rapid retrieval, statistical processing, and graphic display. It is basically
an interpreter for a special purpose, problem-oriented programming language,
designed to produce retrospective statistical analysis of this data and report-
ready graphs of user-selected results.
ADROIT had two major subsystems: the ADROIT Computation Subsystem (ACS)
and the ADROIT Display Subsystem (ADS). An additional related program (COMPOSE)
is available to further process, combine, and replot any of the graphs produced
by ADS. (Detailed documentation for all programs is contained in the June 1975
ADROIT manual.)
ADROIT operates under MTS* in a manner similar to the way UPGRADE operates
under TSO. Thus, a complete evaluation of the ADROIT system would require
study of the MTS capabilities.
The ADROIT special purpose interpretive programming language was designed
specifically for the analysis of timeseries data types. In addition, two new
data types - obs and timeint - were invented. Obs is a four-tuple of values,
comprised of the mean, sample variance, sample weight, and time associated with
the data. Timeint holds beginning and ending time along with interval width,
thus allowing easy time period restriction and data aggregation.
Functions available to operate on variables in ADROIT include: Type
Conversion (obs to numeric, numeric to obs, timeint to scalar, etc.); Statis-
tical (inverse normal, Chi-square, Student's t test, Inverse Fisher's F);
Summation (sum vector); Informational (minimum, maximum, length of vector);
Numeric Computational (ABS, SQRT, EXP, LOG, etc.); Time Series (aggregation,
'simultaneous' observations, extract 'simultaneous1 observations, restrict time
range, etc.), and Hydrological (dissolved oxygen saturation, Water Quality In-
dices [on temperature, turbidity, DO, BOD, pH, etc.]).
One very important feature of ADROIT is that it permits the user to set up
a library of user defined procedures. The basic purpose is to allow the user
to store a set of commands and execute all of them by typing in the name of the
procedure (with any passed parameters). Also, control flow statements are
fully available for use in procedures, thus giving full programming flexibility.
This current capability of ADROIT goes at least as far as the planned "superterse"
capability in UPGRADE (and probably well beyond it in terms of "subroutine"
nesting, access to auxiliary files, etc.).
*MTS: Michigan Terminal System - the time-sharing operating system of the
University of Michigan Computing Center.
B-6
-------
The following is a sample procedure which uses control flow in checking
end of file on the station number file:
PROCEDURE WQIPHOS.(STRING,OBS,TIMEINT)
OBS WQINDX, EPAPAR
OPEN &1
READSTA
WHILE .NOT. EOF
WQINDX = WQI.(&3)
EPAPAR = RESTRICT.(&2,&3)
PRINT CURSTA, WQINDX, EPAPAR
READSTA
ENDWHILE
CLOSE
RETURN
ENDPROC
This procedure could be invoked by typing WQIPHOS.('HURSTA1,P665,TIME
70 THRU 74)
The above sample points out the fact that ADROIT uses non-English language
as compared to the English prompting questions of UPGRADE. Use of ADROIT would
therefore require a thorough reading of the manual to learn the system's capa-
bilities, and would probably require referencing the manual for a while to
learn the exact forms of the commands.
Once learned, however, and especially with use of the user procedure lib-
rary provision, ADROIT could respond more easily and quickly to the require-
ments of the user. In this regard, UPGRADE favors the casual or occasional
user. (Although a programming background would reduce learning time for a new
ADROIT user more than for a new UPGRADE user, such a background is not con-
sidered to be a requirement for learning either system.)
These ADROIT user procedures can also contain graphics control commands of
the ADS. An example of a more complex procedure that would produce either
graphs or data printout (depending on value of 'LOGICAL') follows:
PROCEDURE WQIPHOS.(STRING,OBS,TIMEINT,STRING,LOGICAL)
OBS WQINDX, EPAPAR
OPEN &1
READSTA
WHILE .NOT. EOF
WQINDX = WQI.(&3)
EPAPAR = RESTRICT.(&2,&3)
IF &5
%BEGIN DISPLAY COMMANDS
GRAPH WQINDX,'10000'
EXEC
PCHR = +
LINE
CRVE =1, 2, DASH
HOLD
EXIT
GRAPH EPAPAR, &4
B-7
-------
AUTO
PCHR = +
LINE
CRVF - 1, 2, DASH
EXIT
%END DISPLAY COMMANDS
ENDIF
IF .NOT. &5
PRINT CURSTA, WQINDX, EPAPAR
ENDIF
READSTA
ENDWHILE
CLOSE
RETURN
ENDPROC
A sample graph as might be produced by this procedure is shown in Figure
B-l.
In the ADS, command keywords are available to modify the "structure" of
the graph. A distinction is made between "background" elements of the graph
and "data" elements of the graph. Here is a list of the two sets of elements:
Background Data Group
a) the axis system a) plotting characters
(x-horizontal,y-vertical) b) solid line
b) x tick marks c) dashed line
c) y tick marks d) smooth curve
d) x grid lines e) least squares curve
e) y grid lines f) bar
f) x tick mark labels g) general text
g) y tick mark labels
h) x axis title
i) y axis title
Figure B-2 shows samples of background elements; figure B-3 shows data
elements.
After specifying many graphs by automatic or manual modes, the ADROIT user
may then use the stand-alone program COMPOSE to format several graphs on one
page of final CALCOMP output, and also to add other text, arrows, boxes, etc.
Figure B-4 shows addition of text to two graphs combined on one page. Figure
B-5 shows addition of graphics (circles, lines, boxes) to these graphs. Figure
B-6 shows possible composition layouts (a maximum of six graphs per page is
allowed).
The following table shows some differences between ADROIT and UPGRADE that
are not mentioned in the preceding discussions.
B-8
-------
ADROIT
UPGRADE
• axes:1inear,log
• provision for 3 colors
on Calcomp
• curve smoothing (connect
points with smooth curve)
• "explain" facility for
commands
• multiple plots per graph
frame
• up to 6 graph frames per
page
• regression: degree =
1 through 9
• draw confidence limits or
standard deviation bars
around plotted data
aggregation points
• Make changes to graph without
redrawing on screen
• Produce plots on CALCOMP or
display screen
• water quality and user data
No map interface
SUMMARY
• axes:linear,log,probability
• only one color
• not available
• "help" explanations of
questions
• only one plot per graph
frame
• only one frame per page
regression: degree
1 through 6
not available
• Redraw for any change
plots on CALCOMP, ZETA, line
printer or display screen
water, air, health, and user
data
NASQAN, county, demographic
maps
The main difference between ADROIT and UPGRADE is the mode of user inter-
action. UPGRADE asks English questions; the user answers "yes", "no", "help",
or "01", "13", and so on, as appropriate. The user doesn't have to really
understand where he or she is going, though the results may be questionable if
this understanding is lacking. On the other hand, ADROIT simply waits for the
user to issue a specific command instruction, executes it, then waits for
another command (via keyword commands). More is required of the user, but
more analytic capability is readily available (especially through the procedure
library) as a result.
Other differences are relatively small or cosmetic since the basic motiva-
tion of the two systems is remarkably similar i.e., providing an interactive
graphics system for environmental data.
B-9
-------
STN = 810002
0.05
71 72 73 74
TIME OF OBSERVATION
75
Figure B-l. Total Phosphates at Huron River Station
-------
7
Y-AXIS TITLE
14.-s
CD
X
o
Y-TICK MARK
LABEL —^
Y-TICK MARKS
AXES
+
X-GRID LINE
Y-GRID LINE
X-TICK MARKS
+
10. 15. 20.
TEMPER ATUREr DEG. C
X-AXIS TITLE
25.
30,
X-TICK MARK
LABEL
Figure B-2. Graph Background Elements
-------
GENERAL TEXT
"
SMOOTH CURVE
THIS IS
GENERAL
TEXT
DASHED
LINE
SOLID
LINE
LEAST
SQUARES
FIT
PLOTTING
CHARACTERS
BAR
Figure B-3. Graph Data Elements
-------
WATER QUALITY INDICATORS
100.T
UJ 80.
z
so.
40.
< 20
0.
!
70
71 72 73
TIME OF OBSERVATION
75
STATION 580047
HURON RIVER
BERLIN TWP.
0.05
70
71 72 73 74
TIME OF OBSERVATION
Figure B-4. Original Graphs with Textual Annotation
B-13
-------
TER QUALITY INDICA TORS
100.T
X
8 80-
Z
60."
a «•
fg
I-
o.
• ^ J
70
0.05
LOW
_L
71 72 73 74
TIME OF OBSERVATION
75
STATION 580047
HURON RIVER
BERLIN TWP.
71 72 73 74
TIME OF OBSERVATION
75
Figure B-5. Examples of Graphical Annotation
B-14
-------
1
1
1
1
-2:
Figure B-6. Possible
Composition Layouts
B-15
-------
PGM=PLOT,
The PLOT program plots the values of each selected parameter (y-
axis) for each selected station for the specified time period (x-
axis). Values plotted may take the form of raw concentrations
(e.g. mg/1) or loadings (e.g. Ibs/day). Options include scale
control and plotting of symbols. Plots are produced on EPA's
digital plotting equipment (CALCOKP) and disseminated to users.
ITMCT
cnnom LNGL
«C !« 91.2 0*9 Zl »«.C 1
IMG IIWE 10 111 c
WOSS (iicnicm
I (WE Sift* I Or OTO7U
HI OHTOMICBN HIVtK
IMMFM
eooo fen BWTM ttws oo
tint wrrs
This PLOT program output plots the values of parameter 010«5
(Iron, Total, ug/1) stored at station 070009 for the years 1973
through the first half of 1976.
B-16
-------
PGM=LOC,
The LOG program plots a map of a user defined area, and plots a
symbol to denote the locations of all stations within the area,
using CALCOMP plotting routines. Printed output from the mapping
routine includes a listing of all stations and their associated
latitude and longitude coordinates. Plots can optionally include
outlines of cities, lakes, reservoirs (where available), and
county lines. Plotted stations can be tagged with coded
identifiers which cross-reference to the printed output.
. '"O'fCTiaw OCCMCT
STORET SYSTEM
tt*l.i '• IMMI It > H »LIIM«*
This LOG program output plots the locations of stations along the
Flint River in Michigan, along with the outline of the counties
through which the river flows. The stations are tagged for easy
cross reference to a list of station locations.
B-17
-------
PGM=STAND,
The STAND program compares the observed values of selected water
quality parameters with a set of values (criteria) specified
within the retrieval request. These criteria could be the state
or Federal standards currently in force for a particular stream
segment of interest. Stored parameter values which do not
satisfy the criteria comparison are flagged with an asterisk (*)
in the program output. The program provides for various output
formats, including violations lists and violations summaries.
VIOLATIONS KITH
AKB NT/LAKE
DATE TIHC
73/06/04 0*00
71/09/11 0915
14/07/19 0110
SUPPORTING PARAMETERS
00011 00300 00400 00(10
MATER DO Pd NB3-N
TEMP
rAHH
(1.
((.
78.
TOTAL
HC/L SI) HC/L
9. 7.
1. (.
e. 7.
070D09 LONCL
46 14 11.2 0(9 21 4B 0 1
LONG LAKE 10 41 E HATERSMEET
J4053 HICBIGAN
LAKE SUPERIOR 070793
MS OHTONACOH RIVER
I4ACNFS9
0000 PEET DEPTH CLASS 00
01045 31J01
I ROM TOT COLI
PE.TOT HP I NENDO
UG/L /lOOKL
1500 0>
•00.0*
(20.0* 10 0
AM ITT/LAKE
SUMMARY OP VIOLATIONS ON
•0 OF VALUES
MEAN
MEDIAN
NO or VIOLS
PERCENT VIOL
HINIHUR VIOL
MEAN VIOL
MAXIMUM VIOL
00011
MATER
TEMP
PAHN
32
eg.
(2.
0
0
0.
0.
0.
00300
00
MG/L
22
10.
9.
0
0.
0.
0.
0.
004CJ
PU
SU
32
(.
7.
0
0.
0.
0.
0.
00(10
MH3-N
TOTAL
MC/L
S
0.
0.
0
0.
0.
0.
0.
070009 LOkGL
4( 14 31.2 019 21 4B 0 1
LONG LAKE 10 MI E HATERSMEET
24053 MICHIGAN
LAKE SUPERIOR 070793
MB ONTONAGON RIVER
14ACNPS9
0000 PEET DEPTH CLASS 00
SAMPLES COLLLCTED PROM 73/01/0] TO 74/07/13
0104S
IRON
PE.TOT
UC/L
15
1S6.I
120.0
1
7.
(20.0
(20.0
(20.0
500.0
J1501
TOT COLI
HF1MENDO
/100HL
10
294. S
ts.o
2
20.
UO.O
1150.0
1SOO.O
(00.0
Two formats of the STAND program are shown above: Violations with
Supporting Parameters, and a Violations Summary. ati™s wxtn
B-18
-------
PGM=MSP,
The MSP (Multiple Station Plot) program performs a number of
statistical computations on the values of selected parameters,
and plots the resulting values as a function of the stations
selected. The program allows the user rather extensive control
over the format of the resulting plots. Parameters to be
plotted, scaling and axes control, statistical values to be
plotted, stations to be grouped, and line printer or digital
plotter output are all user-optional specifications.
stontt
o
KILCJ
to
5THTICN PlOT IKSfl
run IUI2I 10 isioa?
SO 120
ISO
110
net no. i
>IO "IKS
-------
PGM=REG,
The REG program allows the computation of the best-fit straight
line relationship between a parameter and time, between two
different parameters at the same site, or between the same or
different parameters at two different sites. The program permits
specification of features of the linear regression analysis to be
performed, including the specification of time periods, and
maximum abscissa and ordinate values.
SUMMARY t>ACC
CORRCLATIO* t »£G»r.SS!0« ANALrSIS HO DUALITY l-ARAHETERS SAKE SITE.
STATION! 070009 LONG LAKE 1C MI C (UTERSHECT
LATITUDE I LONGITUDE: 46 II 31.2 08i 21 46.0 1
ABJCISSA PARAMETER: 00011
OROINATt PARAKETEK: OC300
HATER
DO
TAHN
MG/L
•EOUtSTtO
ANALYSIS rROK: 1»73/ 1
TOi l»7i/ 7
KCtlVU
ANALYSIS tun, l»7}/ I/ J TOi 1«7»/ 7/1J
REGRESSION LINE:
ORIGIN IS 0.0
Y. I7.J43 .
CORRELATION COEFriCIENT: -0.»5
CUErnciENT or DETEIMINATION: o.*i
STANDARD ERROR OF ESTIMATE: O.S4}>!
STANDARD ERROR Or IHTCKCEPTl O.S^OJO
STANDARD LRROR OF bLOPc: U.00906
T VALUE TOR 1NTCRCCPT: 31.334)2
T VALUE rOR SLOPE: 14.11313
4I.2&00 ii.bOOO il.7100
»TER TEMP rum
71.0000
This run of the REG program displays the best-fit straight line
relationship between two parameters (water temperature and
dissolved oxygen) at a single station location (station 070009).
A summary of the statistical computations performed is provided
on a summary page which accompanies the REG print plot. The
asterisks appearing in the plot margin denote the intercepts of
the regression line with the plot axes. Users may draw a line
between these points to show the regression line.
B-20
-------
PGM=PLOT,
PROGRAM DESCRIPTION:
The PLOT program retrieves data from the WQF and plots
the values of each selected parameter for each selected
station (Y-axis) for the specified time period (X-
axis). Values plotted may take the form of raw
concentrations (mg/1), loadings (Ibs/dy), or
logarithms. The program allows for the control of the
plot format including size control and plot symbols
used.
GENERAL KEYWORD APPLICABILITY:
All general keywords described in Sections H and 5 are
valid with the PLCT program. The parameter keyword P
may be specified up to 10 times.
OTHER NOTES AND COMMENTS:
The program generates the plot data onto a magnetic
tape which is forwarded to EPA headquarters for
plotting and dissemination to the user.
Users should exercise constraint in producing a large
number of plots in a relatively short period of time.
Each specification of fPGM=PLOTr' generates a separate
plot tape which is sent to EPA by the computer services
vendor. Requests for a large number of plots in a
short period of time may result in a backlog of tapes
to be plotted at EPA and a depletion of the number of
tapes available for use as plot tapes. If you have any
questions relating to your particular plotting
requirements, please call STORET User Assistance.
The following keywords described in the MEAN program
may be used with the PLOT program to plot calculated
values:
LOAD allows computation of stream loadings
LOG calculates logarithmic values
LL1... specifies acceptable ranges for parameter
values
LV establishes lower limit for a parameter
value
HV establishes upper limit for a parameter
value
calculation of dissolved oxygen saturation
calculation of un-ionized ammonia
B-21
-------
PROGRAM OUTPUT:
Plot values of iron (P=10<»5)
The starting date for the
x-axis (time) is to be
January 1, 1973.
PGM=PLOT, PURP=305B/STA,
A=1UAGNFS9,5=070009,
P=10U5,
BD=730101,
PRT=NO,
1
STONfl
070009 LOHCL
If IV 31.2 085 21 UB.O
LO«C LBKE 10 HI t HOT
26053 KlrnlCIlN
L>I«E SUPEMO" 070
KB ONTONRGOK hIVER
IMOUFSI
0000 fEt' OEP'H CLRSS 00
too m
TJME OPT?
B-22
-------
CLASSIFICATION: Program Associated Keywords
FACT
SYM
PGM=PLOT,
USE:
These keywords may be used with the PLOT program to
specify the size of each plot and the plotting symbol
to be used to represent the plotted data points.
KEYWORD FORMATS AND VALUES:
FACT=n, where n is any numerical value, including a decimal
value, between 0.1 and 5 inclusive. The specified
value is the multiple by which the basic plot size of
5 1/2" x € 1/2" (FACT=1.0) is increased or decreased.
SYM=mss, where ss is any two digit whole number that
equates to one of the plotting symbols shown
below, and where m is an optional minus
sign which specifies that the plotted symbols
are not to be connected by straight lines.
+ X 0
00
01
02
03
04
05
Z Y X
08
09
10
11
12
1«
06
-5"
20
X
07
The value of 20 specifies that no symbol is to
be plotted.
Leading zeroes must be specified.
DEFAULT VALUES: FACT=1.40, (7.75" x 9.0")
SYM=02, (plot using the symbol
straight lines.)
NOTES ON USAGE:
,connected by
These keywords may be specified only once in a
retrieval request.
B-23
-------
EXAMPLE (S) :
Plot values of water temperature PGM=PLOT,PURP=305B/STAr
and DO as a function of time . A=1UAGNFS9,5=070009,
using an "X" to mark data points. P=11rP=300,
The size of the plot produced BD=730101,
is to be approximately 11" by 13". FACT=2,SYM=ur
PRT=NO,
B-24
-------
sc
PGM=PLOT,
CLASSIFICATION: Program Associated Keyword
USE: This keyword may be used with the PLOT program to
specify that the scales of the axes of the plots are to
be uniform throughout the program. The keyword SC
specifies that the scales of all plots associated with
stations and parameters are to be identical.
KEYWORD FORMATS AND VALUES:
SC=A, examines all of the data retrieved and
then sets scales according to the maximum
value for each parameter retrieved and the
maximum number of days. This causes the
scales for the various plots to be the
same.
DEFAULT VALUES: None.
NOTES ON USAGE:
If *SC=A,f is not specified, the x- and y-axes will be
set to the maximum and minimum sampling dates and
maximum parameter values for each parameter at each
station.
B-25
-------
EXAMPLE(S) :
Plot values of DO for the three
stations specified (i.e. three
plots will be produced) . The.
sampling characteristics of the
three stations are as follows:
PGM=PLOT,PURP=305B/STA,
A=1UAGNFS9,5=070002,
8=070006,5=070009,
P=300,
BD=730101,
SC=A,
PRT=NO,
station
070002
070006
070009
date
730108/760713
730108/740516
730103/760713
max, value
11.0
16.U
13.9
Since *SC=A,f is specified the axes
will be set as follows for all three
plots.
origin point
730103
0
end point
760713
16.U
B-26
-------
NOPLOT,
PGM=PLOT,
CLASSIFICATION: Program Associated Keyword
USE: This keyword may be used with the PLOT program to
eliminate the plotting cf specified parameters.
KEYWORD FORMAT AND VALUE:
NOPLOT, specifies that the values of the immediately
preceding parameter are not to be plotted.
There is no value associated with this keyword.
DEFAULT VALUE: Not applicable.
NOTES ON USAGE:
If not specified, all requested parameters will be
plotted.
NOPLOT applies to each parameter keyword it follows,
and can be specified as many times as is required
within a retrieval request, up to a maximum of 10
times.
SOME REPRESENTATIVE USES:
For calculating loadings, a flow parameter must be
retrieved, but need not be plotted.
EXAMPLE (S)
Plot loadings for parameter
650 at the specified station
but do not plot values for the
flow parameter (60) .
PGM=PLOT,PURP=305B/STA,
A=11 15D050,S=255U20,
P=60, NOPLOT. P=650, LOAD,
3-2?
-------
PGM=LOC,
KEYWORDS SPECIAL TO THIS PROGRAM:
The following keywords apply only to this program:
SCALE selects the desired scale for the area to
be plotted
NOCOUN suppresses plotting of county boundary lines
NOPOLPLT suppresses plotting of the polygon
TAGS tags stations with a cross reference number
STREAMS plots streams
CLR plots outlines of cities, lakes, and reservoirs
See Section 7, Advanced Retrieval Programs and their
Special Keywords, for additional capabilities of this
program.
B-28
-------
PGM=LOCC
PROGRAM DESCRIPTION:
The LOG program plots a map of the area defined with
any station selection method, and plots a symbol to
denote the locations of all stations within that area,
the state boundaries, and optionally, the county
boundaries and polygon vertices. The map will be a
maximum of 2U" high (north/south), and 49" wide
(east/west) to include the area plotted. Included with
the map is a listing of all stations and their
associated latitude and longitude coordinates.
GENERAL KEYWORD APPLICABILITY:
All station selection keywords described in Section u
are valid with this program. Data selection keywords
are not valid since the LOG program retrieves no data.
The HEAD keyword may be specified and will appear in
the title block of the map. (When used with LOG, the
text' of the HEAD keyword may not exceed 35 characters.)
SHIFT, PRT and PRMI are not valid for LOG.
All advanced general retrieval keywords described in
Section 5 are valid except the PM keyword.
OTHER NOTES AND COMMENTS:
This program utilizes a relatively large amount of
computer resources and conseguently can be rather
expensive to run.
Although all station selection methods are valid, it is
recommended that polygon selection (LT,L keywords) be
used to ensure the most accurate locations of stations
plotted.
B-29
-------
PROGRAM OUTPUT:
Plot a map of the area
described by the specified
polygon, with symbols plotted
to denote stations belonging
to agency 21MICH.
PGM=LOC,PURP=305B/STAr
TAGS, STREAMS, CLR,
SCALE=250000,
L=U320,L=8UOU,L=U315f
L=833730,L=4330,L=8315,
L=U330,L=830730,L=U30730,
L=830U,L=U32230,L=825230,
L=U33730,L=825230,L=U33730,
L=82U7,L=U3a5,L=82U7,L=43U5,
L=8315,L=«»33730,L=8315,
L=a330,L=83U5,L=U330,L=8«405,
U= 21 MICH,
B-30
-------
CLASSIFICATION:
USE:
Program Associated Keyword
SCALE
PGM=LOC,
The SCALE keyword may be used with the LOG program to
specify the desired scale for the map to be plotted.
KEYWORD FORMAT AND VALUES:
SCALE=scale,
where scale is any numerical valqg
indicating the scale desired. No
commas are to be embedded within
the value. Units are in real-
distance/map-distance, so
SCALE=500000, will produce a
map at scale 1:500,000.
DEFAULT VALUE: None.
NOTES ON USAGE:
If this keyword is not specified, the system will
maximize the scale used, based upon the size of the
area to be plotted.
If the scale specified would result in a map over 2U"
high (north-south direction) or
-------
NOCOUN,
PGM=LOC,
CLASSIFICATION:
USE:
Program Associated Keyword
The NOCOUN keyword may be used with the LOG program to
suppress the mapping of county boundaries.
KEYWORD FORMAT AND VALUES:
NOCOUN, There is no value associated with this
keyword.
DEFAULT VALUE: Not applicable.
NOTES ON USAGE:
On maps covering large geographical areas, the presence
of county boundaries on the map may hinder the study
and interpretation of the map. Specifying NOCOUN can
help alleviate this condition, as well as reduce the
cost of the plot.
EXAMPLE (S) :
The map specified will be
plotted with county lines
suppresed.
PGM=LOC,PURP= 3 05 B/STA,
SCALE=250000, NOCOUN,
LT=I,
L=1320,L=8UOU,L="315,
L=833730,L=4330,L=8315,
L=U330,L=830730,L=U30730,
L=830U,L=U32230,L=825230,
L=U33730,L=825230,L=U33730f
L=82U7,L=a345,L=8247,L=<*3<»5,
L=8315,L=U33730,L=8315,
L=U330,L=8345,L=«330,L=8105,
U=21MICH,
-------
NOPOLPLT,
PGM=LOC,
CLASSIFICATION: Program Associated keyword
USE:
The NOPOLPLT keyword may be used with the LOG program
(when the area whose stations are to be plotted is
defined by a polygon) to suppress mapping of the
polygon.
KEYWORD FORMAT AND VALUES;
NOPOLPLT,
There is no value associated with this
keyword.
DEFAULT VALUE: Not applicable.
EXAMPLE (S) :
The map specified will be
plotted without the
polygon outline.
,PURP=305B/STA,
50000, NOPOLPLT,
PGM=LOC,PURP=
SCALE=250000,
LT=I,
B-33
-------
TAGS,
STREAMS,
CLR,
PGM=LOC,
CLASSIFICATION: Program Associated Keywords
USE: These keywords may be used with the LOC program to tag
stations with a cross reference number, and to plot,
where available, streams, cities, lakes, and
reservoirs.
KEYWORD FOFMAT AND VALUES:
TAGS, specifies that each station plotted on the map is
to be tagged with a coded identifier which relates to
a listing of descriptive information for the
stations.
STREAMS, specifies that stream traces are to be plotted,
if the data are available. (Areas which have
such data include the State of Michigan and
the Southeast Region.)
CLR, specifies that the outlines of cities, lakes,
and reservoirs are to be plotted, if the data
are available. (Areas which have such data
include the State of Michigan and the
Southeast Region.)
There are no values associated with any of these keywords.
DEFAULT VALUES: Not applicable.
NOTES ON USAGE:
If TAGS is specified, only 300 stations may be
retrieved/plotted. STORET will produce as many maps as
necessary to avoid overprinting any of the tags. Users
should not use this keyword when plotting many stations
located in a relatively small area.
-------
EXAMPLE (S) :
For the previous map include PGM=LOCf PURP=305B/STA,
cross reference tags for the TAGS, STREAMS ,CLR ,
stations, stream traces, and SCALE=250000,
outlines of cities, lakes LT=I,
and reservoirs. L=U320,L=8UOU,L=U315,
L=833730,L=U330,L=8315,
L=4330,L=830730,L=430730,
L=830U,L=a32230,L=825230,
L=U33730,L=825230.L=433730,
L=8315,L=tt33730,L=8315,
L=a330,L=83U5,L=4330fL=8U05,
U=21MICH,
B-35
-------
APPENDIX C
-------
APPENDIX C
INTERAGENCY AGREEMENT
BETWEEN THE
COUNCIL ON ENVIRONMENTAL QUALITY (CEQ)
AND THE
ENVIRONMENTAL PROTECTION AGENCY (EPA)
-------
EPA-IAG D7-01226
INTERAGENCY AGREEMENT
BETWEEN
COUNCIL ON ENVI30&&SNIAL QUALITY
AND THE
PROTECTION AGENCY
I. PURPOSE: The Office of Air, Land, and Water Use (QALNU), Office
of Research and Development, wishes to enter into an agreement
with the Council on Environmental» Quality (CEQ), Executive Office
of the President, 722 Jackson Place, N.W., Washington, D.C. 20006,
to carry out a project, to evaluate and prepare the CEQ UPGRADE
environmental data analysis system for use and possible co-sponsor-
ship by EPA. The interagency agreement will be" coordinated with
the Office of Monitoring and Technical Support, which will provide
support and technical direction. The duration of the agreement
is one year, October 1, 1977 through September 30, 1978.
The UPGRADE system has been developed over the period of the
last three years by CEQ. The system has the capability of inter-
facing with the irajor EPA data systems. The systerrs ease of access,
analytical capability, and sophisticated output formats would irake
it a valuable tool for research, and environmental po.: -.cy and manage-
ment decisions at the national and regional level of £PA. As a
result of this agreement EPA will have directly available for its
use this powerful analytical and decision making tool.
II. SCOPS OF WOPK; The work will be performed by a combination of two
CEQ sole-source contracts plus CEQ staff efforr. The work to be
carried out is comprised of the following 5 tasks:
1. Compilation of a system survey for possible installation.
of UPGRADE into the EPA system.
2. Completion of a user needs survey including demonstrations
t!o potential users.
3. Completion of a system design analysis to identify possible
need for redesign or reconfiguration.
4. Completion of a system management requirements analysis to
estimate resource requirements for installs, tico-and
operation.
5. Completion, of a user's documentation package for the UPGRADE
system and its data bases. This manual will be widely
circulated in draft form among EPA offices and regions, other
Federal agencies, State and local users, and academic insti-
tutions, and will be revised to reflect the comments received.
C-l
-------
In addition, the documentation will include complete
explanations of the criteria used to select data for inclusion
in each data base. It will be written in clear, easy-to-
understand language.
The results of these tasks will be presented to EPA in the
form of a final report due to EPA no later than September 30, 1978.
CEQ will provide EPA with the original plus ten copies of the
final report.
III. PROVISIONS: Changes in the work schedule or in the terras of the
agreement rray be made by irutual consent of the Project Officers
representing the respective Agencies, provided that no irajor change
in the scope of the project or in the cost to the funding agencies
is involved. Any irejor changes in scope or in cost shall require
the approval of the Authorizing Officials.
IV. DURATION OF AGREEMENT: This agreement is from October 1, 1977
through September 30, 1978. No extension of this agreement is
contemplated.
V. REPORTS:
a. Notice of Research Project; Within 20 days from the effective
date of this agreernent the Council on Environmental Quality
shall submit an executed copy of EPA Form 5760.1, Notice of
Research Project, to: Technical Infonration Division (RD-680),
Office of Monitoring and Technical Support, Office of Research
and Development, U.S. Environmental Protection Agency,
Washington, D.C. 20460.
b. Final Report: The report resulting from this agreement for
delivery" to EPA will be prepared in accordance with current
ORD publication requirements. Detailed instructions are
provided in the attached "Handbook for Preparing ORD Reports,
^5ay 1S76." Because the effort described in this interagency
agreement is part of an overall Agency program, the final
report, if published, will be assigned an EPA number and have
a standard EPA cover. The title page and content of the
report will clearly recognize the source of the described
results to credir or identifyt the CEQ.
VI. PROJECT Or'FICCR:
a. For the CEQ
Dr. James J. Reisa 202/633-7107
Council on Environmental Quality
Executive Office of the President
722 Jackson Place, N.W.
Washington, D.C. 20006
C-2
-------
b. For the EPA
Dr. Lance A. Wallace 202/426-4153
Office of tonitorir.g & Technical Support
Office of Research and Development
U.S. Environmental Protection Agency
Washington, D.C. 20460
VTI. FUNDS:
a. The total cost of the Interagency Agreement in FY'77 is
estimated to be $100,000, all of which is to be paid for
by EPA. EPA1 s~'funds will be provided in approved FY'77
funds from Program Element 1KC519. It is anticipated that
funds will be advanced in a single block. Request for pay-
ment should be made to the Accounting Operations Branch,
Financial Management Division (PM-226), U.S. Environmental
Protection Agency, Washington, D.C. 20460.
b. Appropriate accounting data follows:
Appropriation No: 687/80107
Account No: 761926WOA2
Document Control No: W10011
Object Class: 25.70
Amount: $100,000
VIII. AUTHORITY: This agreement is entered into pursuant to the Provisions
of the Clean Air Act and the Federal Water Pollution Control Act, as
amended, and the Safe Drinking Water Act.
IX. APPROVALS;
Environmental Protection Agency Council on Environmental Quality
Thcraas A. i-Xirphy Edwin H. Clark, II
Deputy Assistant Administrator Acting Executive Director
Office of Air, Land and Water Use
Office of Research and Develocment
B / ^ ?-
DATE
C-3
-------
APPENDIX D
-------
APPENDIX D
UPGRADE EVALUATION REPORTS
-------
APPENDIX D
UPGRADE EVALUATION REPORTS
1. This appendix contains the EPA procedure to follow in evaluating UPGRADE
and the subsequent reports generated by EPA. Each also contains quantification
data, if available. They are in the following order.
1. Procedures to follow in Evaluating UPGRADE
2. Office of Toxic Substances
3. Office of Air and Waste Management, OAQPS/RTP.
4. Office of Research and Development, OMTS/HQ
5. Office of Research and Development, EMSL/Las Vegas
6. Office of Research and Development, EMSL/Las Vegas
7. Office of Research and Development, OHEE/HERL/CINN.
8. Office of Planning and Management
9. Office of Enforcement
10. Region III
11. Region X
12. Transaction Data from CEQ User Support Group Test.
Note: Those EPA reports not included will be inserted when available.
D-l
-------
"-= \ LiNi^ED STATUS ENVIRONMENTAL. °RO7ZCTiON AGENCY
'1 '.'-
-.. '„...• WASHINGTON. DC 2C--.SJ
'"
REI-EARd- •^(•JC. L'^ViTI-O"'!': r-1'
to Follow ?.E. i:v*-lu?-:::-*s 'JI-G^VH
irRCv: Project Off deer frr "PGUAflS Evaluation
Monitoring Technology Division
Office of Monitoring & Technical Support (RD-680)
TO: UPGRADE Coordinators
See Bel 01:
All five UPGr.iDE Coordinators from the Environmental Protection.
Agency Program Offices have no\: been selected (see enclosure A). As I
hi. -/a aibcusscil pej.-nona.lly with yov, I am transriitting a fornaJ npr-j
t.-s sf.~.ci.f>-v in pjL-f-.auer detail, the o"r>jecci\es ai.d y ocedui.es tn be fcllowt-I
in the Asai.cy-Wic!s LPGK/J)!; £\£.laatiou.
We will need from the UPGPADE Coordinator for each offica a list of
people within his office vho ray reasonably be expected f:o have an interest
in using or inspecting the UI'Gi-JiDE S3'stein. (Sone of you have already
supplied this list.) Tnece individuals \rill form a nucleus for whom we
will describe the evcilnacicn project, arnnge de^on&trationE, trair-^ng
sessions, and whatever else is required ^o allow an objective evaluation
of the system.
Any individuals who wish to learn more about UPGPJU5E or to evaluate
the sysren will be given the opportunity to do so over the next three
months. (That is, a terminal will be made available and user's support
will be forthcoioing.) At the end of their evaluation, we would appreciate
receiving a report on their experiences in a memorandum to their UPGRADE
coordinator (with a copy to me) containing at least the elements listed
in enclosure B (i.e. , a description of their work with UPGPADE. the problems
they found, and their reconmendaticns regarding modifications in UPGRADE
and their expected level of use) . The evaluators may be contacted again
to discuss the memorandum. The conclusions from every memorandum will be
included in the final evaluative report which I will prepare.
D-2
-------
We need to have a Summary Memorandum from each coordinator, summa-
rizing the experiences of his office in the evaluation of UPGRADE and
including the documentation from individual evaluators. We will need
these reports by June 1, .1978, so we can prepare a final summary report
describing" the Agency wide evaluation and making recommendations about
UPGRADE.
Lance Wallace
Addressees:
Ray Smith (AW-443)
Bruce Rothrock (EN-320)
Warren Muir (WH-557)
Phil Taylor (WH553)
Elijah Poole (PM-218)
Attachments
cc: J. Reisa (CEQ)
K. Jones (CEQ)
L. Milask (CEQ)
M. Dorlester (VITRO)
bcc: ORD-CRU
Dr. Gage (RD-672)
Mr. Trakowski (RD-680)
Mr. Brunot (RD-680)
Dr. Wallace (RD-680)
Mrs. Warner (RD-680)
MTD (RD-680/chrono)
prepared by: RD-680/LWallace/ep/3809 WSM/ 426-2177/2-16-78
D-3
-------
Enclosure "A"
UPGRADE COORDINATORS
OFFICE UPGRADE COORDINATOR TELEPHONE
ORD Lance Wallace (Project Officer) 426-2175
OAWM Ray Smith 755-0470
OE Bruce Rothrock 755-0724
OTS Warren Muir 755-4871
OWHM Phil Taylor 755-1567
0PM Elijah Poole 755-0916
D-4
-------
Enclosure "6"
ELEMENTS TO INCLUDE IN WRITTEN REPORT EVALUATING UPGRADE
I. Introduction
1. Identification of the evaluator
Name, telephone number, Office and Division, mail drop
2. Brief description of evaluator's function as it relates to UPGRADE
Kind of monitoring data normally dealt with (source/ambient,
air/water/food, etc.), uses of data (annual reports, one-
time research studies, support to regions/states)
II. Description of Experience
3. Extent and nature of evaluator's experience using UPGRADE
Number of people and total man-hours spent
a) Familiarizing self with system from documents &
demonstrations
b) Using the system
4. Description of the tasks or goals that the evaluator set for
UPGRADE
Major needs of the evaluator, (other data bases, rapid
analyses, graphics capabilities, etc.)
5. Evaluation of UPGRADE performance in meeting those objectives
UPGRADE features that satisfied requirements;
UPGRADE features that did not satisfy requirements
III. Recommendations
6. Evaluator's recommendations
What UPGRADE features need to be modified? What new data
bases should be added? Would his office find UPGRADE
useful for any purpose? If so, estimated amount of use
UPGRADE would receive per month. What level of user's
support would be adequate for his office?
D-5
-------
ENVIRONMENTAL PROTECTION AGENCY
2nd ANNUAL ADP CONFERENCE
LANCE WALLACE, ORD
Introduction
The Office of Research and Development has recently entered into
an Interagency Agreement with the President's. Council on
Environmental Quality (CEQ) to carry out an evaluation of UPGRADE,
an automated, interactive graphic and statistical analysis system. UPGRADE
(User Prompted GRAphic and Data Evaluation) has been developed by CEQ to
analyze information from a variety of environmental and related economic
and demographic data sources. The system includes an integrated database
developed by the Council to study the relationships between environmental
pollutants and health.
The objectives of the IAG are to determine how UPGRADE might
best be implemented within the EPA. It is important that all potential
users whether in the regions, laboratories, or headquarters be notified
of the existance of the IAG and be given the opportunity to evaluate
the usefulness of the system within individual program areas.
The intent of the presentation and the following on-line
demonstration of UPGRADE (give locational information and time) is
to introduce you to the nature and scope of the UPGRADE system and its
related databases.
Background
UPGRADE has been developed during the past 2% years at an
approximate cost of $500,000 CEQ's initial design requirements
dictated a system that could provide advanced statistical analysis
D-6
-------
and graphic display of environmental trends and interdisciplinary
relationships, particulary between environmental pollutants and national
health. The system also had to be completely accessible to the Council's
staff who did not possess specialized computer training.
UPGRADE contains the following features:
.Interdisciplinary analysis of w.-ter (STORE! and USGS/
WATSTORE), air (SAROAD), health, demograaric, and
related data. Any type of digitized data can be used on
the system.
.English language prompted — knowledge of syntax
structures or computer systems and languages are not
required to use the system making it accessible to
scientists and managers without specialized training.
Interactive and graphic orientated — UPGRADE gives the
user immediate feedback through on-line computing which
allows the user to efficiently evaluate and manipulate
analytical results. This capability allows instant production
of bar charts, scatter plotting, regression analysis and
plotting, and off-line production of maps.
.Access to advanced statistics — the statistical analysis
system(SAS) is being interfaced to the system. SAS includes
/
a wide variety of advanced statistical procedures which
can be used through UPGRADE
UPGRADE has been used by the Council on Environmental Quality
for a variety of environmental and health studies during the past
year as well as the 1977 annual report to Congress.
D-7
-------
EPA's interest in UPGRADE, stems from the AgencvJ.s perception of several
basic needs:
• The Need for Linking Health and Environmental Data
Recent reports, including the 11-volume, 5-mi11ion-dollar
NAS study of the EPA, have remarked on the inability of the Agency
to relate ambient environmental quality to health effects. UPGRADE
cara provide a rapid determination of correlation between mortality
rates and environmental variables. These correlations can then be
investigated further to see if cause-effect relations are involved.
Thus in this relatively mechanical way, the system can act as a
hypothesis-generator in much the same way as the Cancer Atlas does.
• The Heed for Increased Analysis of Environmental Data
The same reports have pointed out the imbalance between the
Agency's collection of environmental data and its analysis of that
data. By making such analysis available to a wide audience of
users, including those without computer training, it is possible
that much more useful analysis will result. (There is, of course,
the danger that untrained or naive users will conclude more than
the data can bear; combating this danger will be a challenge to
the people participating in the present evaluation.)
• The Need for Rapid Investigation of Available Monitoring Data on
a Given Pollutant
The "Pollutant of the Month" syndrome is likely to continue
in the foreseeable future, with the attendant Congressional
inquiries and other requests for fast-turnaround analyses and
naps, charts, graphs, or other information aids. The rapid graphics
D-8
-------
capabilities of the system can help in this regard,
although at present the environmental variables represented
are drawn largely from the most familiar air and water
pollutants.
Since UPGRADE gives promise of satisfying several important needs for
the agency, Dr. Stephen Gage of the Office of Research and Development
has authorized the present evaluation. Dr. Lance Wallace of EPA's
Office of Monitoring and Technical Support is the Project Officer for
the evaluation, and can be reached at 426-4657. Dr. James J. Reisa is
the CEQ representative. The subcontractors are SIGMA DATA CORP and
VITRO Automation Industries, represented by Larry Milask of SIGMA Data
Computing Corp. (202) 633-7074\ and Marc Dorlester of VITRO Laboratories
(301) 871-2512.
The IAG prescribes five tasks:
1. System Installation Survey
Presently UPGRADE is supported on the NIH System.
Can it be supported either on COMNET or the UNIVAC
installations at EPA?
2. User Needs Survey
Prospective users must be found, given time to
familiarize themselves with UPGRADE, and allowed
to arrive at recommendations concerning its present
utility in satisfying their needs, its potential
utility if the proper modifications are made, etc.
This will require an information campaign, including
demonstrations at the Regions and at the laboratories.
D-9
-------
A questionnaire will be developed and circulated
to all prospective users.
3. Systems Design Analysis
The systems design analysis will depend on the
findings of the installation survey and the user needs
survey. Detailed design specifications will be developed
tailored to the special installation requirements and
the most relevant user needs.
A. Management Requirements
Requirements for optimum maintenance and operation of
the system, including a data base management system to
be developed for UPGRADE, will be determined based on
the findings of the first three phases of the evaluation
and the recommendations of the prospective users and the
affected members of MIDSD.
5. Documentation
A user's manual will be developed
D-10
-------
22 August 1978
MEMORANDUM
To: Lance Wallace
From: Charlie Foole
Subj: UPGRADE Evaluation
Attached you will find my evaluation of the UPGRADE system. I
have broken it down by illustrative projects and studies I have per-
formed using the system and its data bases.
The estimates for resources expended are best guesses only. I
have tried to keep the evaluation limited solely to projects and
studies which made use of data sets and analysis/display capabilities,
excluding the actual building of data bases and development of
capabilities.
If the evaluation team needs more information, I will be glad to
provide it.
D-ll
-------
PROJECT:
DATA BASE:
CAPABILITIES USED:
RESOURCES:
COMPARABLE RESOURCES:
COMMENTS:
NEEDS:
Product maps to study the geographic patterns of
County-level mortality in relation to water quality
across the U. S., 1968-1972.
IDE
A. Mortality files
B. NASQAN files
A. SORT/RANK procedure to obtain percentiles for
shading intervals
B. Mapping capability (interactive) - Regular and
population adjusted maps
A. Approximately 100 mortality maps
1. Approximately $4,000 computer time, plotter
time, etc.
2. Approximately 1-2 weeks analyst's time
(obtaining percentiles, specifying maps
interactively)
B. Approximately 15 NASQAN maps
1. Approximately $600 computer time, plotter
time, etc.
2. Approximately 3-5 days analyst's time
Producing maps of these data would have been impossi-
ble without this system. Manual production of the maps
would have been unthinkable.
These maps have been and continue to be used for a
number of studies and other projects. Several are
mentioned and under other Projects. One which is not
is the brief discussion of respiratory disease
mortality distributions found at the end of the 1977
CEQ Annual Report's Environmental Health Section.
Slides, poster-size enlargements and "quick and dirty"
Xerox copies of these maps are on hand and are used
for demonstration, discussions, talks, etc.
The interactive, "always ready" nature of the data
and the system enable the production of special maps
"to order."
A. Batch-specification of maps
B. State and regional maps (possibly on-screen)
C. Capability for mapping time-trends, age-specific
rates, and additional geographic areas (e.g.,
State Economic Areas).
D-12
-------
PROJECT:
DATA BASE:
CAPABILITIES USED:
RESOURCES:
COMPARABLE RESOURCES:
COMMENTS:
NEEDS:
Study the relationship between cardiovascular disease
mortality rates and constituent levels in drinking
water.
IDB
A. Extraction software
B. Data listing
C. Ist-order regression/correlation
A. Approximately 2-3 months analyst's time
B. Approximately $1,000 - $2,000 computer time
For production of approximately 2,000 plots of
selected water quality variables and demographic
variables vs. selected cardiovascular disease
mortality rates on a race/sex-specific basis.
At least 5-10 times the above to perform these
analyses by hand.
This work appeared in the 1977 CEQ Annual Report
Environmental Health Section. Only a subset of the
correlatives were chosen to report. The speed and
ease of UPGRADE analysis enabled the production of
many plots from which we could choose in presenting
results.
Related studies (e.g., altitude vs. heart disease,
drinking water vs. cancer, etc.) are possible and
some have begun.
A. Nationwide drinking water data (current IDB has
data for 300-400 counties only), environmental
data (e.g., altitude), and demographic data.
B. More sophisticated and quicker extraction pro-
cedures
C. Batch processing
D-13
-------
DATA BASE:
CAPABILITIES USED:
PROJECT; Study time trends and mean violation rates of drinking
water constituents in selected surface supplies of
public drinking water systems.
STORET stations for drinking water supplies (specially
prepared tapes of STORET raw data retrievals)
SOTRET interface
A. • Station combining/automatic sequencing
B. Regression/correlation plot production
C. Batch processing
A. Approximately 4 weeks intern's time
B. Approximately 2 weeks analyst's time
C. Approximately $2,000 computer time for production
of approximately 3-5,000 plots
COMPARABLE RESOURCES: At least 5-10 times for manual production
RESOURCES:
COMMENTS:
NEEDS:
Preliminary results from the mean violation rate
analysis presented in Environmental Health Section
of 1977 CEQ Annual Report. Both studies (violation
rates and time trends) are currently waiting for
personnel to complete in-depth investigation (both
look promising).
UPGRADE "as-is" handled those analyses fine. More
sophisticated batch processing would have been
helpful. Biggest need now is someone to follow up
in the 2 studies in progress.
D-14
-------
PROJECT:
DATA BASE:
CAPABILITIES USED:
RESOURCES:
COMPARABLE RESOURCES;
COMMENTS:
NEEDS:
Prepare ad hoc materials with quick turnaround time
for administrators, Congressional interest, etc.
A. IDB almost always
B. STORET drinking water tapes occasionally
A. Extract software and data listing usually
B. Occasionally special statistical analyses or
special maps are prepared.
Usually 1-2 days (including midnight oil) analyst's
time. Usually $50-150 computer time.
These special analyses would not be possible in the
required time frame without the system in place and
operational.
Those needing the information would otherwise have
to rely on existing information without it (see
example under "COMMENTS" below).
These ad hoc reports are of UPGRADE'S most useful
features, yet they are the hardest to document.
Examples are the two packages of briefing materials
on cancer mortality I prepared for Steve Jellinek
(AA for Toxic Substances) while I was still at CEQ.
The first was a package on cancer mortality in
California. Mr. Jellinek was on his way to a meeting
with officials in that state when he learned that
one of them had recently been quoted describing
California as a cancer "hot spot". In less then 24
hours (literally overnight) I provided a briefing
document, including maps, which discussed in a fair
amount of detail California's absolute and compara-
tive cancer mortality status. The overall conclusion
was that the situation was not as bad there as the
official's statements had indicated. About a month
later, Environmental Health Letter reported statements
by independent scientists in California disagreeing
with the official and reaching essentially the same
conclusions we had provided in 24 hours.
Mr. Jellinek subsequently requested briefing materials
in cancer for the U. S. as a whole. Within 48 hours
(some overtime again) he had a package which included
several poster-size enlargements of maps to use in
presentations. These materials included a few points
on which our maps revealed different geographic pat-
terns than those displayed by the NCI Cancer Atlas.
A. Quicker, more sophisticated mechanism for gener-
ating mortality "status reports" as specified by
the user.
B. More publicity for this capability of the system.
C. More data bases (especially mortality rates cover-
ing different time periods and age-specific mor-
tality rates).
D-15
-------
UPGRADE USER EVALUATION
USER
OTS
USE
Produce maps
to study the
geographic
patterns of
County-level
mortality In
relation to
water qunlit;
across the
U. S.
1968-1972
Study the
relationship
between
cardiovascu-
lar disease
mortality
rates and
constituent
Levels In
drinking
water
DATA RASE CAPABII.ll IES
CATEGORY
IDE: Mortality
files
NASQAN
tiles
IDB
AVAILABLE
Yea
Drinking water
300-400
counties
NEED
.
Nation-
wide
RANK
.
N
UPCRADh CAPABILITIES
CATEGORY
SORT/RANK
mapping
Data Extraction
AVAILABLE
Ye-i
Interactive
neutral text
to CALCOMP
Yes
NLCD
_
State &
RcRJonal
maps on
screen
Quicker
RANK
_
D
D
a
-------
UPGRADE USER EVALUATION
USER
OTS
USE
Time Trends
& Mean Vio-
lation races
of drinking
water con-
stituents In
selected
surface-
supplied
public drink
Ing water
system
Prepare
Ad hoc
materials
with quick
turnaround
time for
administra-
tors.
Congression-
al Interest.
etc.
DATA BAKE CAPABILITIES
CATEGORY
STORE!
IDB
STORET
AVAILABLE
Manual
Interface
Yes
Manual
Interface
NEED
_
lore data
_
RANK
_
N
_
UPGRADE CAPABILITIES
CATEGORY
Station combining/
Automatic sequencing
Regression/Correla-
tion plot production
Batch processing
Data Extraction
Data Listing
Statistical Analysis
Mapping
AVAILABLE
Yes
Yes
Yes
Yes
Yes
Yes
Yes
NEED
_
_
_
_
_
HANK
_
_
_
_
_
I
M
-J
-------
TRANSACTION DATA
USER
(TASK)
OTS (water/morta-
lity)
OTS (Cardiovascular
mortality vs drink-
ing water)
OTS (mean violation
rates of drinking
water)
CTS (Adhoc/average
per request! abort-
time f rame)
Extract from DB
STORET
X
X
SAROADS
NIU
IDE
X
X
X
Store
In
IDB
Analysis
& Terminal
PLOT
Regression/
correlation
2000 Plots
3-5000
Plots
Various
members
OFF-
LINE
PLOT
100
morta-
lity
naps
IS NA-
SQAN
maps
NO
NO
Number of
Terminal
Sessions &
Time Per
Session
1-2 weeks
10 maps/
1*5 hrs
3-5 days
2.5 months
10-20% on
machine
W Intern
80Z on
tern
2U analyst
5% on mach
1-2 days
COST
Per
SESSION
$4.000.
Total
$600.
Total
$1-2000.
Total
$2000.
Total
$50-150.
COST
Per non-IDB
Extraction
$50.
$50.
Could
Not
Do.
X
X
X
MANUAL
TIME
Indeterminati
Indeterminatf
19 months
3OW Intern
15W Analyst
Indetermlnat
-------
UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
SUBJKCT: UPGRADE Evaluation DATE: 5/1/78
/S /$•* ft L/ -^^f
FROM: Jon B. Clark and Neil H. Frank x,->«-> *-> LX"^—
Monitoring and Reports Branch (J
TO: Lance Wallace
Kay Jones
Council on Environmental. Quality
Scope of Evaluation
The UPGRADE evaluation was done in conjunction with the analysis
performed for the CEQ Annual Report. The evaluation of UPGRADE was
based on the system on the NIH compute*1 available to EPA at that time.
Many of our recommended changes may have been made or are currently
being made. Due to the changing nature of the system, no attempt
v/as made to determine which, if any, of the recommendations had been
completed.
The UPGRADE system was evaluated only in regard to its ability
to handle air quality data. While the use of each UPGRADE procedure
was investigated, emphasis was placed on (1) the procedures used in
the analysis performed for the CEQ Annual Report (ref. 4-3-78 memo
from Neil Frank to Kay Jones), and (2) the ability to handle the
Ozone Study (ref. 2-9-78 memo from Jon Clark to Lance Wallace). As
a result of these applications, we concluded that the system is easy
to use because of its English language and conversational prompting
mode, but we also observed that the system is inefficient at the
present time and performed the analyses in a laborious manner.
Discussion
The UPGRADE system is a general purpose interactive analysis
package developed for unsophisticated computer users. The package
presently operates on an IBM computer. There is no similar package
available on the UMIVAC computer. Th:.- most effective package for
OAQPS usage would, of course, be a package for the UfllVAC computer
which could easily access the AEROS data bases. Because the UMIVAC
has no such system, our analysis of air quality data is accomplished
through the use of a collection of computer programs. Although soira
of these programs can be used in an interactive mode, like UPGRADE,
they are usually intended for the more sophisticated computer user.
The UPGRADE system contains the following procedures: 3 plot
procedures, a regression procedure, and some data manipulation oro-
cedures. In addition to these general analysis techniques, UPGRADE
has one procedure that was specifically designed to meet the needs
D-19
-------
of air data analysis—rollback of a frequency distribution of data.
Although this procedure could benefit from sorr.a further refinement,
it is, nevertheless, useful in its present form and highlights one
of the unique features of the UPGRADE system.
Cost of Using System
All of the costs for the OAQPS1 use of the UPGRADE system were
charged to CEQ. Requests have been made for accounting information
regarding run charges, data loading charges, and interactive session
charges. This information has not been made available to us. How-
ever, partial accounting information which was available indicates
that the system can handle small to moderate data bases in a cost
effective manner.
Findings
The system has potential as an aid to personnel with no data pro-
cessing knowledge who wish to perform certain types of air analyses.
Prior to spending money for additional developmental work on UPGRADE,
all alternative general purpose analysis software should be studies.
Because OAQPS does not have the personnel to perform such a comparative
evaluation, it should be performed by a central group such as MIDSD.
They have more expertise in the area. In addition, any general soft-
ware package would undoubtedly be useful to all EPA users.
The remainder of the report deals with recommendations for chang-
ing UPGRADE as well as problems noted in the system. The next section
discusses the three major areas of concern with the UPGRADE system.
The last section discusses suggested enhancements, minor errors, and
minutia concerning the system.
Three Major Problem Areas
The first area of concern for any user of air quality data is
how will that data be made available to the UPGRADE system. Currently,
this can only be accomplished by a couple of conttactor personnel
working for CEQ. The process of loading any air data set for use by
UPGRADE consists of 8 processes. Any user is now totally dependent
on the contractor performing these necessary steps before UPGRADE can
be used. It is necessary that some method be established whereby
any user ca.n easily build his own data sets quickly and efficiently.
The second area of concern is the limitation on the amount of
data that can be handled by UPGRADE. The original ozone problem
specified continuous data for 105 cities. This requirement was later
pared to 10 cities. Upon arrival in Washington, it was discovered
that UPGRADE could not handle 4 years of continuous data for 1 site.
Only a fraction of the analysis was finally performed using 1 site
D-20
-------
year of continuous data. The data handling problems are caused by
limitations at MIH as well as problems in the UPGRADE file design.
Unfortunately, one feature of any interactive system is that it can-
not handle extremely large data sets. Hov/ever, a few years of con-
tinuous data for a couple of pollutants at a few sites is a reasonable
requirement for any general purpose software system.
The third major concern for any user of UPGRADE is the plodding
required by the system to accomplish any repetitive task or any fairly
large task. The UPGRADE system is supposed to have 3 conversational
modes. The "Terse" mode and the "Verbose" mode are currently avail-
able. Both are very similar. They are easy for beginners; however,
there is a very flat learning curve. The user cannot make the system
perform significantly faster even after he has had many hours of
experience with the system. This causes the system to be much too
slow and frustrating for most users. The "Super Terse" mode which
is to be developed should provide for a much higher learning curve
and provide the capability to perform repetitive tasks much more
efficiently.
Other Findings
1. A generalized transformation capability should be developed
to allow tho user to create new variables for analysis.
2. Better capabilities are needed to merge certain data, i.e.,
combine sunder data from many years.
•
3. The UPGRADE system should allow for the incorporation of
user developed routines.
4. The terminology used by the system is often confusing and
should be standardized.
5. The plot procedure currently displays at most 400. data points.
Since this can limit the analysis of air quality data, users should
be allowed to determine the amount of data displayed.
6. The prediction procedure should be modified so that it can
be better utilized for air quality analysis. The modification should
include an improved choice of plotting positions so that the maximum
observed concentrations can be included in the analysis.
7. The following specific minor problems were noticed while using
the system:
D-21
-------
(a) At one point in the partitioning procedure, the system gives
plot options for the Y axis. The options are 1 to 8 with 8 represent-
ing summation. Uhen 8 is requested, the system responds that this is
an invalid entry.
(b) In the partitioning procedure, there is an error in the
routine that sets interval widths.
(c) In the bar chart plotting procedure, the titles for the vari-
ables should be changed to match those in the summary table of the
partitioning procedure.
(d) In the graphics procedures, the annotation for end year on
the X axis is not correct.
cc: W. Barber
R. Campbell
R. Neligan
J. Padgett
D. Goodwin
R. Rhoads
W. Cox
J. Reisa
W. Ott
D-22
-------
UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
DATE. April 3, 1978
SUBJECT: Accomplishments During Detail to the Council on Environmental Quality
T0:
Monitoring and Reports Branch
Kay Jones
Council on Environmental Quality
The purpose of this memo is to summarize my accomplishments during
the CEQ detail of January 23 to March 15, 1978. The primary arras of
accomplishment are (1) familiarity with CEQ's UPGRADE system, and (2)
data reduction of CO and oxidant air quality for trends analysis. These
items satisfy requirements outlined in Kay Jones' memorandum of February
23, 1978.
Concerning the evaluation of UPGRADE, a formal discussion of UPGRADE'S
analysis capabilities and suggested areas for improvement will be contained
in an overall OAQPS evaluation. Concerning CO and oxidant trends analysis,
the following discussion will summarize this activity.
TRENDS ANALYSIS OF CO AND OXIDANT FOR CEQ
Analysis Plan
I selected a list of trend stations (Attachment 1) for the analysis
based on CEQ's requirement that data be available for each year during
1973-1976. Twenty-five oxidant stations in 11 AQCR's and 39 CO stations
in 18 AQCR's were selected. An annual data completeness criterion of 75;J
(6570 hours) for CO and 67% (5840 hours) for oxidant was utilized. The
trends analysis was to be based on the daily maximum hour oxidant and daily
maximum 8-hour CO. The analysis was to be performed on UPGRADE with data
summaries prepared in format compatible to the 1977 CEQ Report. Data
retrievals from SAROAD were to be obtained by CEQ/SIGMA personnel in order
to create UPGRADE data sets.
During this time, I obtained available NEDS emission data from the
National Aerometric Data Branch for possible comparison to the air quality
data.
Data Availability
The air quality data bases covering the time period 1974-1976 were
accessible by UPGRADE on February 29, 1978. The 1973 data has apparently
been retrieved from SAROAD but are not currently available on UPGRADE.
Based on available emissions data, 1973-1975 estimates wore tabulated for
the 22 AQCR's covered by the air quality data. Preliminary 1976 data
expected August 1978 (reference 3/2/78 memo from Chuck Mann).
arc
EPA r...... 1171 t if,.
D-23
-------
Data Summaries
The air quality summaries for 1974-1976 daily maximum 8-hour
CO were prepared using UPGRADE. These are shown in Attachmsnt 2.
Based on modifications to the UPGRADE data sets, 1976 su-merics for 9
sites could not be prepared. The format Tor data suii-nsrios were frequency
distributions and bar charts, in agreement with presentations in the
1977 CF.Q Report. Because of the lerge amount of man-hours required to
extract data from UPGRADE, a flRB data retrieval capability for extracting^
daily maximum hour data was employed to generate the required datr. su.-paries
for oxidant using the EPA UNIVAC. These are shown in Attachment 3.
Appropriate graphical presentations can be easily prepared from this
information.
Emissions summaries for hydrocarbons and carbon monoxide covering the
time period 1973-1975 for 22 AQCR's are shown in Attachment 4.
Discussion
The information presented in this report provides basic information
for an analysis of CO and oxidant trends. The use of UPGRADE to extract
the CO summaries was hampered by the small number of sites per data set;
four UPGRADE data sets were required. The efficiency of this type of
analysis can be improved if each data set only contained the data for
a single year, thereby allowing more sites on a single data set. Further
efficiency would be possible if the UPGRADE data sets did not allocate
space for as many as 50 variables regardless of the number of variables
needed.
The presentation of AQCR trends in a format compatible to the 1977
CEQ Report can be accomplished by representing the AQCR with the data
for a "typical" station or an area-wide composite. A series of four
frequency diagrams could be presented, one for each year. In order to
improve upon the graphical presentation, a series of trend lines showing
cumulative frequency of occurence of selected concentration levels can
be employed. Superimposed on this type of graph, emission levels could
also be displayed. These two methods of presentation are shown for oxidant
trends at the Azusa station in Los Angeles (Attachment 5).
Attachments 5
cc: Robert E. Neligan, HDAD
Lance Wallace, ORD
-------
! FACHMEfil 1. AQCR'S WITH TRLNU STATIONS FOR CO AND OX1DAFITS
NUMBER OF SITES BY POLLUTANT
AQCR AQCR IIAME OXIDAMT/03 CARBON KO.'IOXIDE
052 West Central Florida
(Tarnpa-St. Petersburg) 1
050 Southeast Florida
(Miami) 1 1
079 Metropolitan Cincinnati 1
119 Metropolitan Boston 1 1
173 Dayton, Ohio 1
178 Northwest Pennsylvania-
Youngs town 1
024 Metropolitan Los Angeles 10 11
031 San Joaquin Valley
(Fresno) 1 2
028 Saramento Valley 2 1
030 San Francisco Bay Area 4 6
220 Uasatch Front
(Salt Lake City) 2 2
009 Northern Alaska
(Fairbanks) 1
D-25
-------
AQCR'S WITH TREND STATIONS FOR CO AMD OXIDAIITS
NUMBER OF SITES BY POLLUTANT
AQCR AQCR NAME OXIDAHT/03 CARBON MONOXIDE
036 Metropolitan Denver 1
094 Metropolitan Kansas
City 2
099 South Central Kansas 1
078 Louisville, Kentucky 1
042 Hartford-New Haven-
Springfield 1
131 Minneapolis-St. Paul 2
085 Metropolitan Omaha-Council
Bluffs 1
148 Northwest Nevada (Reno) 1
193 Portland, Oregon 2
229 Pugct Sound (Seattle) 2
D-26
-------
"?;~:H '.3,197fl I LIST Or STATIONS FOX TJE RETJJIEUftL CflLLEDl
C£Q.£A.-<0.':3.£OcJ.CC.03.T»;i;.l.!;J.U02
CCDI
1
2
3
4
5
7
S
3
11
STATION
COLE
c- ic
SA.WO
AGFIiCV
CODE
03*505113
03s|5j-l?/
•i-isC "2C3
448C42C3
74837523
S7A7ICH DESCRIPTION
1.1
i.ae3 esf.sziojai^i AZUSA
1.ZC3 CS3G5CC31U1 Lft H.-i3Rfl
1.434 253SC.13.J1121 LEMMOX_
licSS $3-!!SC23lI31 LOS'"
1.337 851183833131 I.CS
1.2.18 es-izeaeoiiei LOS ANGCLES co
1.303 »5Sl26e3lI61 HEUHALL
I.eiC 25£cb8CS4I81 SA/I DIEGO
l.eil C57223C84F&1 SA.-fTA t.VDARA
1.012 es87t"aooiiai UHITTIER
6j
CO
' • •* I
TKE SCxECJi TO Pr.ESERVE THIS LISTING AN9 THEN DEPRESS THE RETUSH KEV TO CONTINUE
t>
-------
-V.-CH £1.1073 : LIST Or STATIONS FOR THE RETRIEVAL CALLED:
CEO. 5A.!?C-iO.E37S. CO. 03. TRENDS. VI 1
ftGEHCY
CODE
STATION DESCRIPTJCM
2
3
S
7
8
5
10
11
12
:C?V THE
nn
STATION
CODE:
CG5C-2C1COO!3~01 CSCCCiEO 1 CD
I . *• !
i
i y.'. x i
CONTINUE
10
00
-------
O
ro
SO
r'.VICH 22, ID'S : LIST OF STATIONS FOB THE Kt
CEG . tV,RC,'-.- . £073 . CO . 03 . 1 KuKDS . VI 3
yftL CALLED:
rooz
•j
3
6
7
S
"3
13
11
DOPY
lJl
COD
S72505-O
E'i-iC 13-50
71530053
76CC&9S9
7G030D80
STATION DESCRIPTIOM
C01
CJ2
033
!.C)3S
lS37eC001G01
SPRINGFIELD
Mri.'-.EAf-OLIS
H!ll/i£a?OLJS
Giu'nfi
337
038 381-!G337Er-31
009 ^£05^^>^Gi^a=
-I918-!e251f01
-4513-<&053Ffil
POSTLft.'ID
FCRTLfytD
01C
SEflTTLE
SEflTTLE
03 Ox O)
Xx
II
!<*
THE SCREEN TO PRESERVE THIS LISTING WiD THEN DEPRESS THE RETURN KEY TO CONTINUE
-------
tl
o
W.
CEO. fj
-------
4.
OiISSU),'lS Of HYDROCARBONS AMD CARBON I-'ONOXIDr., 1973-197$ FOB 22
SCLLCTCD AHCP.'S
The following Table contains Hydrocarbon and Carbon Monoxide pnir.sicns
for 22 selected AQCR's in the years 1973 - 1975. The 1973 data uerc
derived from the 1973 National Emissions Report (EPA-450/2-76-007),
and the 1974 and 1975 data were obtained fron National Air Data Branch.
All Figures For each pollutant were found by taking the "Grand Total"
of pollutant emissions and subtracting the totals for the "Industrial
Process-Point" and the "Miscellaneous-Area" categories. This method
was used in order to record the most stable and consistent categories
and exclude those whose methodologies have been changed during the
period From 1973 - 1975.
D-31
-------
Emissions of Hydrocarbons and Ccirbon !'/ji!Oxic!<>
AQCR £ Major City 1973 1974 H'7B
OG2 Tampa-St. Petersburg,
Florida
Hydrocarbons 172 * 124
Carbon Monoxide 913 * 803
050 Miami, Florida
Hydrocarbons 151 * 179
Carbon Monoxide 813 * 1,173
079 Cincinnati, Ohio
Hydrocarbons 113 106 99
Carbon Monoxide 600 618 582
119 Boston, Massachusetts
Hydrocarbons 178 163 151
Carbon Monoxide 1,012 1,049 980
173 Dayton, Ohio
Hydrocarbons 78 77 71
Carbon Monoxide 431 46 428
178 Youngstown, Ohio
Hydrocarbons 112 104 98
Carbon Monoxide 583 595 555
024 Los Angeles, California
Hydrocarbons 732 761 715
Carbon Monoxide 4,001 4,840 4,603
031 Fresno, California
Hydrocarbons 164 157 148
Carbon Monoxide 953 1,012 968
D-32
-------
Emissions of Hydrocarbons and Carbon Monoxide
AQCR f/ Major City 1973 1974 1975
028 Sarnmento, California
Hydrocarbons 132 118 111
Carbon Monoxide 711 710 673
030 San Francisco, California
Hydrocarbons 351 357 337
Carbon Monoxide 1,936 2,212 2,102
220 Salt Lake City, Utah
Hydrocarbons 52 65 51
Carbon Monoxide 259 411 234
009 Fairbanks, Alaska
Hydrocarbons 877
Carbon Monoxide 45 28 31
036 Denver, Colorado
Hydrocarbons 104 108 104
Carbon Monoxide 583 728 703
094 Kansas City, Missouri
Hydrocarbons 101 102 96
Carbon Monoxide 545 671 637
099 Wichita, Kansas
Hydrocarbons 51 55 50
Carbon Monoxide 242 301 277
078 Louisville, Kentucky
Hydrocarbons 46 45 44
Carbon Monoxide 251 2G8 2!i7
D-33
-------
Emissions of Hycirocdrbotis and Carbon F'.onoxick1
AQCR /.'
042
131
085
148
193
229
Major Ci Ly
Hartford-Hew Haven, Conn.
Hydrocarbons
Carbon Monoxide
Minneapolis - St. Paul
Minnesota
Hydrocarbons
Carbon Monoxide
Omaha, Nebraska
Hydrocarbons
Carbon Monoxide
Reno, Nevada
Hydrocarbons
Carbon Monoxide
Portland, Oregon
Hydrocarbons
Carbon I'.onoxide
Seattle, Washington
Hydrocarbons
Carbon Monoxide
1973
197",
1975
806
120
664
43
232
14
61
149
772
115
582
124
812
41
260
18
103
147
894
122
751
139
843
118
776
39
243
16
87
142
861
121
751
* Data not currently available.
-------
Or
O'*''2fa7 AT
— - ..;
I
i
i
i.
^tt'
te-
$L-
-S-
1 •!
•$'
0-
3Z
f" J
-S-
i
1
5
!
jl
s
m3
_J2»*\
-fe
^4'- 303
J
' i
P i» Cx» 80 ICO IZO ify 1(
L i
H
1
— — . — J
t
•.
(
i
x
i
_l:
"i
1
124!
1(0.1
—
if.
1
1 " 1
1 j
j
w
A
,
!
:
I
J
• i
ij
li
20 ^0 fcb go iCb Uo f'f'j lk> il
i 1 ! i 1- l i
i
1
^J8-
Ho
ti
-H-
t»-
-«-
1-8
-1o-
.7/1
--s
—
._..
*!_
1
— .
T
|
1
r
a
i
i
i
.3
1
i
i
»j»j
"
i
i
i
i
— _.f ... .
i
I
t
-, —
1
T
1
5'-
i
!
"
,.4._
i
.. j_ ....
i !
nk\
VT
i
j
1 ' !
1
" " T
I
•
-™. _«
1
—
•—•-I——
]-...,
i
4
j
i
1
i
i — , — „ j — . — . — i '
20 ^o {,0 3/j )fl« |7i) hto '.!
1 . 1 1 i .' : | '. *
i —
—
-
—
i
i'
if
[
J
3
0 i
D-35
-------
" 9
/.T
TT
-J0o-
•£uj
.S-J-JfcO-
ti-i'O
JK2 I _.
u)
u
I rvJ<
.JzoJ
[4"pf^
i i . r_v L
Et:&ri?riJ
—M
-U-U
iT~tI2-
i I i ii i •>' i i
•~.—i \~~ .iii
|
-------
UPGRADE USER EVALUATION
USER
OAUM
(QAQl'S)
(RTP)
USE
Data Reduc-
tion of Air
Quality
Data for
Trend
Analysis
DATA BASE CAPABILITIES
CATEGORY""
SAROAD
Interface
AVAILABLE
Manual
Interface
NEED
Auto
RANK
N
UPGRADE CAPABILITIES
CATEGORY
Data Extraction
Data Manullpatlon
Meaningful size data
Bets
Variable size for
variables
Faster system
operation
Additional
Variables
Additional
Analysis
Routines
AVAILABLE
Yes, but too
much tine
required
Contractor
Interface
1 site year of
cont Inuoua
data
Fixed size
field for 50
variables
Verbose and
terse modes
Predefined
Variables
Predefined
Analysis
Routines
NEED
User
manageabl
SAROAD
extrac-
tion
User
Control
Multi-
year con-
tinuous
data for
multi-
sites
Variable
sized
field
depending
on no. ol
variable!
Super
Terse
Mode
User
creation
of New
Variables
Ability
to add
user
developed
routines
RANK
i E
E
E
E
N
N
D
o
-------
UPGRADE USER EVALUATION
USER
OAUM
(QAQPS)
(RTP)
USE
Data Reduc-
tion of Air
Quality
Data for
Trend
Analysis
DATA BASE CAPABILITIES
CATEGORY |
AVAILABLE
NEED
RANK
UPGRADE CAPABILITIES
CATEGORY
Plotting of a large
number of data
points
Standardized
terminology
AVAILABLE
Plot max. 400
data points
Confusing
terminology
NEED
User
specify
(1 of
data
points
Standard
Ize
RANK
E
N
o
u>
CD
-------
TRANSACTION DATA
o
CO
VO
USER
(TASK)
OAWM/OAQPS/RTP
(Trend Analysis for
Mt Quality)
Extract from UB
STORE!
SAROAOS
2 extracts
NI1I
IDS
Store
In
IDB
438K By tea
($50)
Analysis
& Terminal
PLOT
100 Plots
OFF-
LINE
PLOT
0
Number of
Terminal
Sessions &
Tine Per
Session
5-10/2 hi.
COST
Per
SESSION
COST
Per non-IDD
Extraction
$50/per
($100)
Could
Not
Do-
MANUAL
TIME
Existing
Systems are
comparable
for this
task
-------
ORD-HQ Evaluation
Main evaluator: L. Wallace
Total time spent learning system: 30-40 hours.
Types of analysis or graphics routines used heavily:
Data listing; basic statistics; polygon plots; bar charts; regression;
partitioning; the corr routine in SAS; filtering; plot mods, including
windowing and axis changes (linear to logarithmic or probability axes);
Stepwise in SAS; GLM in SAS.
Types of analysis or graphics routines not used heavily:
The Autoregression routine in SAS; neutral text; mapping;
shading and other plot mods.
Total time spent using system: 150-250 hours.
Data interfaces used heavily:
General purpose; IDB; Air
Data interfaces not used heavily:
STORET; NASQAN
Main effort: Correlations between mortality rates (1968-72) and
drinking water quality variables. About 800 correlations run;
200 linear regression maps produced. Each graph cost about lOc
(estimated cost of paper—$50/roll @ 500 sheets per roll.) in
paper. Computer costs for the 200 graphs appeared to be about $100.
Another 600 correlations were run without making graphs (using the
CORR routing); here the costs appeared to run about $40.
Advantages: Instant access to 191 health, demographic, and
environmental variables, in compatible form for running
correlations.
Other Uses of UPGRADE
INTERCORRELATIONS OF ENVIRONMENTAL VARIABLES.
Colinearity of the independent variables causes problems in
attributing the variance of the dependent variable to either
one or the other. The UPGRADE format allows easy tests of
relationships among the independent variables. For example,
a series of water quality variables were tested against "hardness",
a somewhat ill-defined quantity, to see in fact which variables
correlated most closely with "hardness". About 100 correlations
were run for the drinking water variables.
-------
INTERCORRELATIONS OF MORTALITY VARIABLES.
Many causes of death are more or less closely related. Finding
the exact relationships allows the analyst to group those causes
that vary similarly to achieve greater numbers and more stable
rates. For example, several types of digestive system cancer
(stomach, rectum, and intestine) correlate fairly well and can
be grouped; however, two main urinary system cancers do not
correlate well across the nation (cancer of the kidney and cancer
of the bladder) and should not, therefore, be grouped. (In
fact, many previous studies do so wrongly according to the
1968-72 data). About 300 such intercorrelations were run.
CROSS-CHECKING EPA AND USGS WATER QUALITY MEASUREMENTS.
Many variables, such as hardness, fecal coliform, and certain
heavy metals, are measured both by the stations operated by EPA
personnel and those forming part of the USGS national system
known as NASQAN. Comparing these variables indicates how well
or poorly these independent measurements agree. About 50 such
correlations have been run.
CORRELATIONS WITH DEMOGRAPHIC VARIBLES.
Often demographic variables such as urbanization or employment
in manufacturing are far stronger predictors of a changed mortality
rate than environmental variables. Learning- which diseases are
so related is a necessary fist step in determining environmental
influences. About 300 such correlations have been run. Graphs
of 75-100 have been made, at about the same costs as indicated
previously.
MULTIPLE REGRESSION USING BOTH ENVIRONMENTAL AND DEMOGRAPHIC VARIABLES.
Once the diseases affected by both environmental and demographic variables
have been determined, the relative power of each variable can be
estimated by running a multiple regression, either the stepwise
variety or one of five other types contained in the SAS GLM
(General Line Model) Procedure. Several multiple regressions
were run.
D-41
-------
UPGRADE USER EVALUATION
USER
ORD/OMTS/
I1Q
USE
Correlate
mortality
rates and
drinking
water
DATA BASE CAPABILITIES
CATEGORY
IDB
User Created
Data
AVAILABLE
Yes
Yes, no
documentation
NEED
More
data
Ability
to ex-
tract
data
from
Tape
Storage
RANK
H
N
N
UPGRADE CAPABILITIES
CATEGORY
Data manipulation
Graphic Analysis
Batch Mode
AVAILABLE
numeric data
4 type plots
Neutral Text
NEED
alpha-
numeric
multi-
plot
A/N axis
descrip-
tion
Batch
Produc-
tion
Mode
RANK
D
D
D
E
TRANSACT ION
VOLUME
COMPARABLE
MANUAL
TIME
Could not do.
-------
TRANSACTION DATA
USER
(TASK)
ORD/OWTS/HQ
(Correlate mortality
races (1968-72) and
drinking water
quality variables
Extract from DB
STORET
SAROAOS
NIH
IDE
X
X
Store
In
IDB
Analysis
& Terminal
PLOT
800 correla-
tion runs £
200 linear
Regression
maps
600 correla-
tions (no
graphs)
OFF-
LINE
PLOT
Number of
Terminal
Sessions &
Tine Per
Session
2 days
6 hra/day
III hrs
total time
COST
Per
SESSION
$100 total
+ $20 for
paper
$40.
COST
Per non-IDD
Extraction
Could
Not
Da.
NOT
NOT
MANUAL
TIME
J>-
OJ
-------
I. Introduction
The purpose of this document is to describe areas where UPGRADE capa-
bilities can be applied to the analysis, interpretation and graphic display of
the EPA's National Phytoplankton Data Base.
A major effort in the classification and enumeration of phytoplankton
algae was initiated at EMSL-Las Vegas in 1972, in support of the National
Eutrophication Survey Program (NES). Since 1973 these activities have been
performed at the EMSL-Las Vegas facilities by the University of Nevada, as in-
house contractors, and monitored by resident experts in the Water and Land
Quality Branch, MOD. During this period, a phytoplankton data base of over
44,000 entries has been developed which represents information from nearly 600
lakes in 37 states. Work is being continued on sample material from New England,
New York, Michigan, Wisconsin, and Minnesota which will complete the only com-
prehensive nationwide (48 contiguous states) phytoplankton data base in exist-
ence.
The sampling, preservation, classification, and enumeration techniques
used were consistent throughout, proving unique opportunities for comparisons
and contrasts, both within (geographic distributions, and community assoc-
iations) and between the phytoplankton data base and the other physical,
chemical, and biological information gathered concurrently (over 2 1/2 million
data points from the NES alone). These combined data bases provide a unique
opportunity to identify specific water quality indicators, develop trophic
classification methods, relate factors in lake problem conditions, and to
predict and/or control algal forms associated with taste, odor, and/or toxicity
in potential drinking water sources.
II. Data input requirements
Several levels of data input are identified below which progressively
increase the utility, and flexibility of UPGRADE in meeting our needs, and the
actual or potential needs of other users. Optimally the entire phytoplankton
data base would be transferred to UPGRADE from COMNET along with the corres-
ponding lake-date mean physical/chemical parameter values calculated for each
of the 800 lakes in the NES program.
A. Access, through UPGRADE, to our phytoplankton data files in COMNET
(without permanent transfer and storage), would enable us to examine geographic
distributions and community structure. This approach would considerably reduce
the interactive advantages of UPGRADE by increasing job times during data
transfer. Perhaps the most significant loss would be to other users who would
not otherwise have access to the National Phytoplankton Data Base.
B. Transfer of the entire phytoplankton data base to UPGRADE would be
optimal; of lesser value would be the transfer of phytoplankton data for select
species. At this level, geographic distributions and community structure could
be analyzed on a national or regional basis with minimal time loss due to
initial data transfer from one computer system to another at the beginning of
each user session. This level of data transfer would give all UPGRADE users
access to the National Phytoplankton Data Base.
D-44
-------
C. The mass of physical and chemical data collected by the NES program is
being reduced to lake-date mean parameter values (1/2 complete at this time).
Since about 800 lakes were sampled at least 3 times each (spring, summer, and
fall) there will be nearly 2400 sets of mean values. This entire data set
should be transferred from COMNET to UPGRADE for several reasons.
1. It, in conjunction with the phytoplankton data, would enable users to
examine, statistically and graphically, the relationships between phytoplankton
species (genera) and the environmental conditions associated with their occur-
rence. Conditions associated with the growth of special interest and problem
algae could be analyzed in more depth and with greater flexibility than is
feasible with the present system.
2. Increased effectiveness of 208 planning could be achieved through
geographic selection of data for use in preconstruction prediction of phyto-
plankton response to proposed environmental modification.
3. The physical and chemical data is in itself unique, and when married
to UPGRADE would provide the means for users to characterize and analyze lake
subsets of special interest.
III. Report needs
Computer programs have been written and tested which perform the routine
compilation of environmental data associated with the occurrences of the var-
ious phytoplankton forms. The execution of these programs has resulted in a
wealth of information leading to the initiation of several reports. Each of
the reports, in order to satisfactorily explain the data, requires extensive
use of various plots and histograms. UPGRADE would enable us to fulfill these
needs as well as initiate investigations into areas previously untouched due to
the size of the data base and the inadequacy of our present system to handle
geographic problems. The following are specific areas where UPGRADE would be
useful to our programs, and suggested uses of our data by other organizations
with access to UPGRADE.
A. Geographic distributions and representations (phytoplankton and
environmental factors).
B. Statistical evaluations of the environmental requirements of well-
known problem and special-interest algae.
C. Development of phytoplankton indices of trophic state (already well
under way at the genus level). UPGRADE would be useful for testing, index
modification, and development of species level indices to water quality.
D. UPGRADE, with its interactive capabilities, could be used for econo-
mical screening of data-analysis approach techniques on small sub-sets of data
before applying them to entire data sets.
E. Retrieval of baseline phytoplankton data within geographically re-
stricted areas.
F. Retrieval of baseline phytoplankton data within geographically pre-
dictions relative to 208 planning.
D-45
-------
G. Selection of lake subsets of special interest, Ce-g-» high or low
productivity), for comparison with community structure.
H. Provision of ambient water quality and/or sensitive biological com-
ponents for inclusion in multiparameter models for land-use and watershed
management.
I. Opportunity to interface the water quality and phytoplankton data with
other information of specific user interest.
D-46
-------
U.S. ENVIRONMENTAL PROTECTION AGENCY
OFFICE OF RESEARCH AND DEVELOPMENT
ENVIRONMENTAL MONITORING AND SUPPORT LABORATORY - LAS VEGAS
P.O. BOX 15027. LAS VEGAS. NEVADA 89114 • 702/736-2969 (FTS 595-2969)
Date: April 6, 1978
Reply to
Attn of: MSA
Subject: Establishing a SAROD Subset Data Base in the UPGRADE System
TO: Dr^-Wayne Ott
RD-680
This is to confirm my telephone request to establish two SAROD data sets:
one for Phoenix, Arizona, and the other for Tampa/St. Petersburg,- Florida,
covering the period from 1974 to 1976.
We would like the hourly average for NOx, 03, and hydrocarbons included
in the data set for each station, if available.
Larry Mylask indicated they are having some problem storing large amounts
of data but they would have a solution to that problem within two months.
, -,-,
E. A". Schuck, Chief
Monitoring Systems Design
and Analysis Staff
Monitoring Systems Research
and Development Division
-------
"Robust" Correlations of Mortality Rates (br*!j races & sexes) with Drinking Wntcr Constituent1-
1
Cancer, Total
" Esophagus
Stomach
11 Intestine
Liver
11 Kidney
Leukemia
I
00
Major
Cardiovascular
Disease
Hypertension
Chronic Ischemlc
Heart Disease
Cerebrovascular
Disease
Arteriosclerosis
— '
+
4-
—
—
—
4-
—
4-
-h
—
•
• ni
—
-1-
+
— .
4-
+
•
—
— ,
4-
|
•
4-
—
—
4-
4-
4-
-------
UPGRADE USER EVALUATION
USER
ORU
(KMSL -
Las Vegas
USE
Correlate
Drinking
Water
Concentra-
tions and
mortality
DATA BASE CAPABILITIES
tAiTCORT
IDB
AVAILABLE
DB Is avail-
able but very
limited data
NEED
Larger
Data Set
RANK
E
UPGRADE CAPABILITIES
CATEGORY AVAILABLE
Data manipulation
capabilities
Analysis and plot 1
variable to 1 or more
stations
Correlate two vari-
ables after both
variables arc plotted
to stations
Restart capabilities
after computer mal-
function
Additional
Analysis
Routines
Formal access to
Upgrade User Support
Group
Faster System
operation
Limited
Yes
Can plot two
variables
across groups
of stations
Start over
SAS
Per Contract
Verbose and
terse modes
NEED
Increase
Specific
Lnstruc-
tlon
)ato Save
Capabili-
ties
Added
lou tines
Jo cost
iccesa
Super
terse
RANK
N
D
N
N
D
TRANSACTION
VOLUME
COMPARABLE
MANUAL
TIKE
o
VO
-------
TRANSACTION DATA
USER
(TASK)
OUTS/Las Vegas Lab
(water/mortality)
Extract from DB
STORET
SAROADS
NIH
IDB
50K
data
point!
Store
In
IDB
Analysis
& Terminal
PLOT
Regression
IK plots G
20-200 pta
OFF-
LINE
PLOT
0
Number of
Terminal
Sessions &
Time Per
Session
15/2 hr.
COST
Per
SESSION
COST
Per non-IDB
Extraction
Could
Not
Do •
MANUAL
TIME/COS t
3lirs/plot
(S51/plot)
(Jl
o
-------
UPGRADE USER EVALUATION
USER
OlEE/HERI
CINN
•
USE
Correlate
Cardiovas-
cular vs.
rid 10 1
hcirdncss
DATA BASE CAPABILITIES
CAtTCOKY
User data base
AVAILABLE
CEQ user
support group
interface
NEED
RANK
UPGRADE CAPABILITIES
CATEGORY
SAS
Plotting
AVAILABLE
Yes
Yes
NEED
RANK
TRANSACTION
VOLUME
5K Records
COMPARABLE
MANUAL
TIME
Should save
10 times over
use of COHNET-
lo Graphics
at COMNET.
o
Ul
-------
UNITCU S1ATLS ENVIRONMENTAL PROTECTION AGENCY
WASHINGTON D I 20460
SEP 1 1 1978
on .n 0'
PLANNING AND
SURJECT: Evaluation of UPGRADE System
FROM: Ten Gardenier, Statistician /— -^-__ ,, ^
Statistical Evaluation Staff,^-PM-22'j-
TO: Lance Wallace, Environmental Scientist
Monitoring Technology Division, RD-680
CC:
Marcia Williams
Thanh you for getting me started on the UPGRADE system. I had a chance
to have sane "hands on" experience with both the extract and upgrade
analysis options. Since I promised I would provide some feedback to you
on the use of the system, let me give you my preliminary impressions. I
already talked with Joe Higgins and I understand Marcia routed over to you
comments from our staff. We had a chance to discuss it briefly last week.
1. First, let me say that I think the idea of the system is very good
because it gives us the flexibility to access actual data which, as
statisticians, we are able to analyze. This is a very strong point, since
data already complied in tabular form in other reports may not correspond
to sumraries that we may want to generate. For example, maximum concen-
trations at point, sources may have been used as input; but we may want to
have an index of variability (standard deviation) around the point source.
Unless access to time-series data is available this would not be possible.
It would take personal contacts and administrative time to make actual
data available to statisticians to analyze and perhaps reanalyze; thus the
system orovides a capability and diversification in our job function
strengtheningthe versatility of our results.
2. Second, environmental indicates leading to a specific health hazards
arc many and they involve a multitude of pollution sources. Consider, the
example of blood lead concentration which could be transported through
ambient air or drinking water. If one is interested in the relationship
between morbidity and/or mortality of a specific health condition and
environmental variables, more than one possible source of contribution is
far more superior to use as input parameters than a single parameter.
D-52
-------
Data collection efforts in the form of a "matrix" are often tedious
and occasionally impossible if one is interested in a particular location
and a specific time interval. I am impressed with capability to search
and rctrive for several health indicators (such as cancer of various
types, kidney infections, various types of heart and respiratory diseases)
as well as air and water quality variables. With this capability the user
is able to create a subfile for correlations, cross-tabulations, or other
analyses.
3. Tliore are a fe,v points about the system which emerged while I was
working with it which probably need attention. Again, there may be features
within the system which I am not familiar with to bypass these issues. If so,
I will know in time.
a. When in the extract mode, a "matrix is created out of the data
containing a full set of observations on all variables selected. That is,
one receives a matrix the size of which is dependent on the number of
variables sleeted and the number of observations in the variable with
minimun common data points. While I was searching within water quality
data I also searched for the variable dealing with "population/sq. mile"—so
I received as few observations as a couple of hundred. This was because
the population data is scanty and in the process of update. Perhaps it
might be useful to receive a "response" indicating how many observations
exist in each variable searched.
b. There is a state and county code identification in the main data
base, however unless the user asks for it as a "variable" the subject extract
datafile does not have any I. D. associated with each record. The matrix
includes a generated code of I to n as a sequential keycode for each record
and the specific observation on the variables. This delimits the use of
specific lists. I will now always include the inquiry for state and county
code. Yet, it might be worthwhile to automatically write into extract files
a specific reference I. D. for the record.
4. An idea related to the previous point in 3b might be to provide sunmary
data grouped by AQCR (Air Quality Control Region) for air quality related
variables. I do not know if a corresponding type of categorization exists
for water quality data, on whether the locations comprise the same territory.
Tins might be an idea for generating "national sanplc" data with a
feasible number of observations in the matrix. Otherwise, the more than
3000 entries for all countries appear in the extract database.
D-53
-------
7-.iothcr idea for creating national subsets would be to "randomize" on.
selected demographic parameters within state and county records. I can
visualize a 1/k stratified sample generating subroutine which could
create 3 subset national samples on one or more selected demographic
parameters.
5. The new modifications to UPGRADE evidently needs more disk space; so
when logging in, the user needs to specify Region (500). Perhaps until
automatic logon procedures are implemented, there could be a message to the
user with the request for logon, requesting the additional statement.
6. I also tried the system on an Andersen Jacobson printer rather than
the Tcktroniy wo were using. I found it convenient for listing large
data sets, since it takes a while to obtain a copy of the screen on the
Tektronix. At the time I did not want any graphics but tried the option
anyway. I received no documentation showing how one queues the output.
If the "no" option is given everything seemed to be clear-cut.
7. I also followed up on your comment on skewness and kurtosis. As I
mentioned, there is an approximation to skewness using the median.
This is: 3(X - median)
standard deviation
There is also a similar approximation to kurtosis using the semi- inter-
quartile range and 10th and 90th percentiles which is ^(Q., - Q,)
P - P
90 10
I hope these comnents were helpful. I will keep you posted as I get
more experience with the system. Do let me know if I can provide any
assistance.
-------
UPGRADE USER EVALUATION
USER
or&M
USE
Air quality
and health
data
DATA BASE CAPABILITIES
CAiTCoHir
County ID and
descriptive
data reference
AVAILABLE
Good aelectloi
of variables
NEED
Total
popula-
tion
RANK
X/N
UPGRADE CAPABILITIES
CATEGORY
Quick exit capability
Documentation of In-
terface between up-
grade and extract and
where the extract
file la GLIDE
AVAILABLE
None
Not vet avail-
able
NEED
Posaiblll-
bility to
trite dati
into cardi
ind/or
Jlsk tape.
RANK
X/N
TRANSACTION
VOLUME
Three a week
estimated use
for approx 12
variables and
5 sites.
COMPARABLE
MANUAL
TIME
light take
too long to
locate data.
so statistics
may not be
lone on actual
data.
-------
EVALUATION OF UPGRADE
WATER ENFORCEMENT DATA SYSTEM INTERFACE OPTIONS
By
William C. Blackman, Jr.
June 1978
The President's Council on Environmental Quality
Washington, D.C.
D-56
-------
EVALUATIOfl OF UPGRADE
WATER ENFORCEMENT DATA SYSTEM INTERFACE OPTIONS
by
William C. Blackman, Jr.*
INTRODUCTION
The Council on Environmental Quality (CEQ) has, since 1975,
sponsored and principally utilized a highly versatile, computerized
data system known as UPGRADE (User-Prompted Graphic Data Evaluation).
The system is interactive, employing ordinary English language in-
structions, stap-by-step analyses, graphic display and hard copy, and
line printing. This system is designed to enable managers, scientists
and angir=ers with no computer training to accass and analyze a wide
range of environmental natural resources, public health and related
data. The software package presently provides for a variety of graphic,
statistical and procedural options. These options are detailed in
Appendix A.
As directed by President. Ca.rter in his May 23, 1977 Me??=r;<3 to
Congress, CEQ has convened an interagency task force " to review
present environmental monitoring and data programs, and to recommend
improvements that would make these programs more effective." Accord-
ingly, CEQ is examining possibilities for interfacing the UPGRADE
system, with systems having environmental data bases, in order to:
1. Assist the agencies by making the capabilities of the
versatile UPGRADE system available where applicable, and
2. Achieve the access to existing data bases that is necessary
in order to carry out the Presidential Directive cited.
While on temporary assignment to CEQ.
D-57
-------
OBJECTIVES
The specific objectives of this analysis are:
1. To provide an assessment of various options for interfacing
UPGRADE with existing or planned automated data systems
operated by the Office of Water Enforcement (OWE) and Enforce-
ment Divisions in the Environmental Protection Agency (EPA);
2. To identify the adaptations that are necessary to enable
EPA Enforcement entities in Headquarters, Regional Offices,
and Field Offices to utilize the analytic, graphic and map-
ping capabilities of the UPGRADE system; and
3. To illustrate to CEQ the options believed available to re-
trieve information from a broad environmental data base
(including parametric data not now accessible) for use by
CEQ in achieving the general requirements of its mission,
and the specific requirements of the President's May 23,
1977 message.
This evaluation is in no way intended to conflict with either
studies of UPGRADE capabilities by the EPA Office of Research and
Development presently in progress, or the work of the Interagency
Task Force on Environmental Data and Monitoring. Rather, it is in-
tended to enhance and assist boui of these efforts by indicating and
ranking several options that could initiate interfacing activities in
a specific area of mutual interest to all parties.
DATA SYSTEMS PRESENTLY SERVING EPA WATER ENFORCEMENT ACTIVITIES
Automatic data processing (ADP) in EPA's enforcement operations
is fragmented, parochial, and largely housekeeping in nature. It is
unnecessary here to reconstruct the history of a series of contractors
and their respective proprietary systems, except to say that the extant
situation grows from a chronic absence of an operational centralized
D-58
-------
system. The most recent episode Involved the inadvertent destruction
of the General Point Source File (GPSF), which at one time was to
have been the centralized system for storage, retrieval and analysis
of National Pollutant Discharge Elimination System (NPDES) data.
After years of frustration, the regional, field office and Head-
quarters program managers have opted for local systems that meet
their immediate needs, or in some cases, make maximum use of limited
resources.
More recently, a trend toward centralized data processing in
regions! offices has limited the accessibility of data processing
facilities by technical program managers and their staff members.
Although the approach varies between Regional orfices, the present
trend is to locate terminals in the Management Division where the
technical program manager must send data processing requests. There,
the ADP manager mai.-.^ain^ -he discretion as to when and if the request
is to be filled. This leaves the requesting technical program with
little opportunity to "tinker" with the data or to experiment with
various analytical procedures. This "management" phenomenon may be
appealing from a budgetary standpoint but it is not effective in
terms of utility of data.
All Regional NPDES compliance operations and the Office of
Enforcement in Headquarters utilize a bookkeeping system known as
Permit Compliance System (PCS) for tracking compliance schedules and
permit renewal dates, forecasting events such as due dates for in-
spections, preprinting Discharge Monitoring Reports (DMRs) and similar
functions. The system is up on COMNET, the present EPA contractor
facility, and can be accessed by a local terminal in each Region.
The system is not designed to store or process raw parametric data.
It is the single data system that is used at least to some extent by
Water Enforcement operations in all Regions and the Headquarters.
D-59
-------
STORET, a well known automatic data base used by EPA for storage
and retrieval of stream quality data, has recently been adapted for
storage of effluent data by some users. Regional enforcement data
processors were canvassed (see Appendix B) to determine the extent of
this use. It appears that at least three Regional Surveillance and
Analysis Divisions are now routinely storing effluent data generated
by their inspections, and that one additional S and A Division intends
to do so. Although there is some doubt about such use by a few states,
Regional contacts indicate that nine state agencies are presently storing
effluent data in STORET and that three others are preparing or planning
to do so. The generalities but not the specifics of this information
are confirmed by analysis of a retrieval from STORET.
To enable a more accurate understanding of the present use of
STORET for storage and retrieval of effluent data, a retrieval was
designed to indicate recent pertinent activity. The retrieval included
only stations having 1976 and lazer data, and only stations at which
any of five effluent parameters (BOD, TSS, pH, fecal coliforms, and
flow) had been stored. This retrieval indicated that twenty-two states
have stored at least some measurements of the specified parameters.
These states are:
NEW JERSEY* CONNECTICUT
ALABAMA MASSACHUSETTS
MARYLAND ARIZONA
KENTUCKY PUERTO RICO
INDIANA SOUTH CAROLINA*
NEW YORK* NORTH CAROLINA*
FLORIDA* WEST VIRGINIA
TENNESSEE OHIO
NEW MEXICO TENNESSEE
VIRGIN ISLANDS WYOMING
MISSOURI UTAH
PENNSYLVANIA
Indicates significant amounts of data. Among these states only
Maryland, Florida, South Carolina and North Carolina have stored
appreciable amounts of effluent data since January 1, 1976.
D-60
-------
These findings indicate that a very sma-11 portion of the ef-
fluent data being generated by the NPDES Permit Program se'lf-monitor-
ing requirements is currently accessible through STORET.
Three of the regional Surveillance and Analysis Divisions (II,
VIII, X) indicate that parametric data from monitoring being con-
ducted in conjunction with compliance inspections by S & A Divisions
is being stored in STORET. A representative of Region IV indicates
that preparations are being made to use STORET similarly. Although
the present commitment of EPA inspection data to STORET is limited,
the data so generated should be of higher quality than the self-
monitoring data. Numerous quality control evaluations by EPA's
National Enforcement Investigations Center (NEIC) have shown that the
reliability of the self-monitoring data varies widely.
Several of the Regional operations have developed local systems
to meet water enfcrcGrce.-it needs, including the capability to store,
retrieve and manipulate effluent data. Salient features of these
systems are as follows:
REGION I - A local system, Region One (I) Enforcement Data base
(ROEDS) consists cf three components that function as cheir-titles
indicate: 1) Compliance Schedule File (COMP); 2) Self-Monitoring
Report File (SMON); and 3) Managment Information and Control System
File (MICS). The system does not receive nor store parametric data.
COMP and SMON perform the PCS functions that are similar in most
regions.
REGION II - A five-component system which is resident on COMNET
includes: 1) Status of Permit Development (SPD); 2) Status of Permit
Compliance (SPC); 3) Local Effluent Data System (LEDS); 4) Regional
Industrial Contributors System (RICS); 5) Non-Filer System (NOFS).
SPD and SPC are similar to PCS, and Region II interfaces the two systems
with PCS. LEDS is the system of major interest to CEQ in that it uses
D-61
-------
a data base that includes parametric effluent data. RICS contains
information that will be of central importance in achievement of
pre-treatment requirements. NOFS is of diminishing importance as
non-filers are identified and brought into compliance.
REGION V - Supplementing the normal PCS, Region V employs a sub-
system dubbed ENF-V to track enforcement actions such as response
dates, orders, etc. (See later discussion of Region V proposed inter-
face with state systems.)
REGION X - The Surveillance and Analysis Division operates a Point
Source File that stores permit and effluent data on about 450 sources.
This number includes all major dischargers and what are termed "signifi-
cant" minors.
Some state agencies have developed ADP systems for storage and
retrieval of effluent data. Those known to have ADP capability include:
Pennsylvania, Ohio, Indiana, Michigan and Kansas. State officials in
Dreg.-.-,, Washington and Idaho are Et various stages of development of
ADP systems. The nature, extent, and requirements for interfacing
these systems is not known, however, the associated EPA Regional En-
forcement Divisions are, in some cases, planning to interface them.
DATA SYSTEMS UNDER DEVELOPMENT
Two systems are in development by EPA contractors at this time
that are of immediate interest to CEQ. These are: 1) Compliance
Analysis System (CAS) which is to be developed for the Office of Water
Enforcement; and 2) an interfacing system as yet unnamed, that is
intended to enable the Region V enforcement staff to interface four
state-operated systems with CAS.
D-62
-------
CAS is designed primarily as a national permit tracking system
that will store permit conditions, pre-print DMR's, detect'viola-
tions, compare performance of dischargers, etc. It is supposedly
designed to accept raw parametric data. The concept has progressed
through an initial feasibility study by Arthur Young and Co., was
deemed unaccep tably costly by EPA, and a simplified approach is in
the process of a second feasibility study by Young and Co. OWE
expects to have that study in hand within the month. EPA staff
estimates that if a decision to develop CAS is made, operational
status will trail that decision by at least 27 months.
OPTIONS AVAILABLE
It appears that four options several sub-options are available
for interfacing the UPGRADE system with water enforcement-related
data systems. None of the options hold promise for immediate access
to a nationwide base of parametric data on water pollution sources.
OPTTON I - Interface UPGRADE with the ex+e*?t PCr
sy&te..; uu.ci a>.*cu.;: cperazic^i
the proposed CAS system.
cf
PRO - This approach would give to CEQ early
access to a nationwide base of non-
parametric data that would permit studies
of the administrative aspects of the
NPDES program. For example, numbers
of sources due to achieve compliance in a
particular month or quarter, numbers of
major sources not in compliance, geog-
raphical locations of major sources, etc.
This approach would initiate the working
level contacts and cause development of
the communications channels that will be
necessary to establish any kind of inter-
facing activity.
D-63
-------
EPA staff would gain the opportunity to
work with UPGRADE, experience its im-
pressive capabilities, and examine its po-
tential for a wide range of applications
by EPA Headquarters and Regional
Offices.
CON - As indicated above, a system that is
limited to analysis of the housekeeping
type data that is stored in PCS is of
questionable utility. Apparently, some
two to three years of development time
lie ahead for CAS if development is
authorized. Thus, the opportunity for
timely interface with a parametric data
base is not inherent in this option.
OPTION II(a) - Interface UPGRADE with STORET.
FRO - CEQ would, in a shirt p-oficd, be
enabled to access a parametric data base
of limited scope.
CON -
OPTION II (b) -
The base would be data-rich for only
five states. Most of the accessible data
grows from self-monitoring (as differen-
tiate fie::: C2ir.pjiei:ce n?ofc.hc,. hȣ ay the
regulatory agencies), thereby adding an
element of questionable quality control.
There is no uniformity of format between
contributors of data to STORET, thus
causing the effluent data stored therein
to be difficult to use for comparisons or
statistical analyses.
Interface UPGRADE with STORET in con-
junction with and in cooperation with, an
EPA program of greatly expanded stor-
age of effluent data. This option would
require a modest commitment by Water
D-64
-------
Enforcement personnel in the Headquar-
ters and/or Regions.*
PRO - EPA would gain the ability at an early
date to perform a wide range of com-
parisons, statistical evaluations, re-
gressions, maps, automated calcula-
tions, etc.
CEQ would gain access to a meaningful
data base that would grow steadily in
scope, richness and utility.
CON - In the absence of a switch to another
ADP contractor or the improvement in
responsiveness of COMNET the chronic
difficulties of working with STORST
could be expected to plague tne op-
eration .
It is unrealistic to expect that all EPA
Regions and delegated state agencies
would voluntarily participate in the pro-
gram. If not, the gaps would deprive
CEQ of a nationwide data base, and the
extent or seriousness of that deprivation
would be directly proportional to the
numbers of non-participants.
OPTION III - Interface UPGRADE with individual sys-
tems in EPA Regional Offices. This ap-
proach does not hold promise with re-
spect to the ROEDS system in Region I
since the accessible data would be essen-
tially PCS data from one Region. It
holds marginal promise in terms of the
LEDS component of the Region II system
Details would require negotiation with EPA but the proposition
is that EPA Regional and delegated states be shown the capabili-
ties and flexibilities that would be made available by adopting
UPGRADE. Participants would be asked to store effluent data gen-
erated by compliance monitoring inspections and enforcement
evaluations. Some provision for separate retrievals should be
provided in the event Regional and State program directors wish
self-monitoring data to be included.
D-65
-------
in that LEDS contains parametric effluent
data. However, the data consists of
pre-calculated loads, and is generated
by self-monitoring operations in
(presently) only one state. It is thus a
very limited data base, having the qual-
ity control questions associated with
self-monitoring data. It limits demon-
stration of UPGRADE capabilities in that
flow measurement-raw concentration
manipulations, regressions, calculations,
and other "data-tinkering" type opera-
tions are not possible. Stated another
way, LEDS requires the user to manually
(or in some other external system) do
the calculations that the interfaced
UPGRADE should do, if its full capa-
bilities are to be used. The Region X
system appears ideal for interfacing,
number of sources (450) represented by
the data.
The most favorable of the regional sys-
tem/ UPGRADE interface options is with
th-i system being developed by a con-
tractor for Region V. Regional staff has
no access to DMR data. All states of
Region V have been delegated the NPDES
program and are maintaining DMR data in
state files of which three (and soon a
fourth) are automated storage and re-
trieval systems. The system presently
under development is intended to inter-
face the state systems with the Head-
quarter's CAS system, which again is at
least 27 months from operational status.
PRO - Interface with UPGRADE could enable
Region V to achieve access to state
agency files at a much earlier date than
will be possible with CAS (assuming that
CAS is brought to operational status).
Interface with UPGRADE would provide
CEQ with an early opportunity to work
with a significant data base, and to
demonstrate UPGRADE potential.
0-66
-------
CON - The data base is thought to be con-
stituted primarily of self-monitoring data
and thereby embodies the quality control
doubts. There are two related consid-
erations: 1) though quality of the data
may be questionable, the interface would
provide the opportunity to test, improve
as appropriate, and demonstrate the
utility of UPGRADE; 2) Region V rep-
resentatives might find it possible to
negotiate provision for identification of
compliance monitoring data generated by
the regulatory agencies. The latter
would enable Region V and CEQ to ac-
complish meaningful data analyses.
OPTION IV -Although this analysis was to have been
confined to considerations of suitability
for interface with Water Enforcement data
systsT.G, it is necessary *o include con-
siderations that lie outside of that realm.
Various EPA operations are presently
contracting with consultants (have re-
cently done so, or planning to do so) to
develop separate data systems. There
are now or will be, data systems devel-
oped for each major EPA media program
in addition to the operational and admin-
istrative support systems.
The independent development efforts
have both negative and positive aspects.
Certainly the needs of the media pro-
grams differ with each other and with
those of the housekeeping operations.
Moreover, individual program managers
upon finding that no data system that
meets the program needs is extant within
the Agency, cannot be faulted for taking
the steps necessary to meet those needs.
However, economics-of-scale, intermedia
analyses, and technology transfer would
be enhanced if a single system could be
developed.
Exploratory discussions between CEQ and
EPA could ascertain the possibility of
structuring a single system, incorpora-
ting or interfaced with UPGRADE so that
D-67
-------
the Council staff would be enabled to
access the full range of environmental
data. The trade-off for EPA would be
the acquisition of capability by Head-
quarters and Regional staffs to exploit
the capabilities of the UPGRADE system
by returning environmental data analysis
to the technical and scientific personnel
who should conduct such analyses.
Hopefully, the present study by the EPA
Office of Research and Development will
reach that conclusion and provide the
details of the process which would be
necessary to initiate that approach.
SUMMARY AND CONCLUSIONS
UPGRADE is a versatile, interactive, user-prompted data analysis
system that could be exceptionally useful to Water Enforcement operations
in EPA if interfaced with appropriate data bases. The system can be
adapted to perform a wide variety of calculations and statistical
analyses, and to produce hard copy print-outs, plots, maps, regres-
sion curves, etc. It is designed to enable the evaluation of en-
vironmentally related factors against others, for example, morbidity
statistics can be analyzed in terms of the concentrations of a par-
ticular air pollutant, or first-through sixth-order regressions can
be performed for a water quality parameter on flow.
It is deemed highly desirable by CEQ, the sponsor and presently
the principal user of UPGRADE, that the system be interfaced with EPA
data storage and retrieval systems, such that CEQ could access Water
Enforcement data bases. Such access would be very helpful to CEQ in
the accomplishment of its mission and objectives, but the present
options to gain the desired access are limited. There are no exist-
ing systems that contain a nationwide base of parametric data on wa-
ter pollution sources. Such a system may shortly go into development
D-68
-------
with operational status approximately 27 months later. One system
(PCS) contains "housekeeping" data such as compliance dated, Dis-
charge Monitoring Report pre-print data, etc. Relatively small
amounts of source data have been stored in STORET, and if all or most
compliance monitoring data were placed in STORET it could be made to
provide a satisfactory data base with which to interface. Various
EPA Regional Offices operate Water Enforcement-related data bases.
Of these, only the Region X system contains parametric data. Region
V is developing a system that will enable accessing of pollution
source data in the State files.
In view of the absence of a uniform nationwide storage system
for pollution source data, and the present trend toward development
of separate systems for each media program, CEQ might render an im-
portant service and at once achieve its accessing goals by proposing
discussions with EPA to consider joint development of a comprehensive
data system that: a) would be designed to interface with UPGRADE;
b) meet the needs of all media programs in EPA; and c) thereby
reverse the trend of concentrating technical data processing in the
administrative elements of EPA, and return that function to the
technical and scientific operations where it can best be employed.
It is recommended that .CEQ consider the latter options to be
most favorable and the former to be least favorable, i.e., if it is
possible to join EPA in a comprehensive data management system with-
out undue delay of media program objectives, that approach embodies
the greatest potential effectiveness. Failing that possibility, CEQ
should negotiate an interface agreement with Region V: a) to adapt
UPGRADE to Water Enforcement needs and fully develop the system's
inherent capabilities and, b) demonstrate those capabilities to the
other EPA entities.
D-69
-------
The STORE! interface option is. workable and would increase in
effectiveness as the number of State agencies and Surveillance and
Analysis Divisions that could be persuaded to participate. An under*
standing of the utility of UPGRADE to Regional and State technical
staff will be the key to gaining participation by those entities.
The PCS interface option is recommended only as the last resort.
The non-parametric data therein is of limited value and the wait for
operational status of CAS is a major inhibitor.
D-70
-------
APPENDIX A
UPGRADE ANALYSIS OPTIONS
(Reprinted from the pamphlet "The UPGRADE System User's Overview"
- President's Council on Environmental Quality - August 1977)
D-71
-------
UPGRADE ANALYSIS OPTION'S
UPGRASS Graphics Options
1. Scatter plotting -- '3. plot of data points arid their
distribution
2. Polygon (point-to-point) plot — data point plot with a
straight line connecting successive points
3. Bar chart — each data value is represented by a bar.
(Numerous shading and density options are available)
4. Polynomial fit — scatter plot with up to 6th order
least-squares fitted line and table of m, r, t, and
f values.
5. Multi-plotting (FY 1978) -- to allow up to 5 y-axis
variables on same graph.
6. Multiple-site plotting (FY 1978) — to allow the
plotting of sites (instead of a variable) on the
x-axis.
Plot Modifications
A large variety of analytical and cosmetic options are
available, allowing the user to "tailor" graphic output
froai the UPGRADE systea. Future development will include
the addition of even more plot-aod options.
Interchange axes -- x becones y and vice-versa.
Reverse axis scaling -- scale from max. to ciin.
instead of vice-vers2.
D-72
-------
Change scale factors — allows user to
specify scale ranges. This mod can be
used for "windowing" a plot so only
datapoints within a specified range are
plotted.
Change number of axes annotations — to
modify precision of scale divisions.
Change number of axes tick marks — to
modify precision of scale divisions.
Add or delete grid lines — to divide a
plot into quadrants.
Change letter size of axes annotation —
to nodify legability of scale annotations.
Change s7=1*001 and symbol size of datapoiats-
will also be used to differentiate between
different variables when multi-plotting
becomes available.
Change graph title.
Add 2nd line to graph title.
Eliminate or restore plotted datapoints —
if a regression line without the plotted
points is desired, for instance.
Change line type — will also be used to
distinguish between lines for different
variables when multi-plotting becomes
available.
D-?3
-------
Change axes length -- to modify dimensions
of entire graph.
Change axes scale type -- to allow use of
log and probability scales.
Change degree of polynomial fit — to use
up to a 6th order fit for regression
analysis.
Eliminate outlying datapoints from a fitted
plot — to "window" a regression line to a.
selected range of data values.
Eliminate or restore current date printout
that appears on every plot.
Change number of decimal places used for
axes annotation — to modify precision of
annotations.
Change number of decimal places used for
bar annotation — to modify precision of
bar chart annotations.
Suppress statistics printout on fitted plot —
if Eit r, f, t, ndp values are not needed.
Change bar density and shading specifications
for a barchart — see graphics 8 and- 9 for
examples.
-------
Statistical Options
1. Sort and rank -- to obtain a table of median, quartiles,
tertiles and 15th and 85th percentiles for any one variabl
or complete sort arid/or rank for each data point.
2. Data filtering and listing — to eliminate, outliers , selec
a range of data values, or obtain a listing of data point
valves.
3. Linear regression — to produce a table of coefficients o:
variance for regression analysis.
4. Data partitioning — to group data values for the x-axis
into class intervals and plot against a partitioned y-axi:
variable. Statistics for class intervals and partitions
can be produced.
5. Basic statistical srr-naries of selected data,
including minimum, maximum, mean, standard
deviati'on, number of data points, and historical
period of record.
6. Data transformation FY 1973 — to allow user to perform
arithmetic operations on variables to obtain, ratios, etc.
User will also have capability of using a variable for
exclusion purposes to obtain a selected range of data
values for that variable.
D-75
-------
7. SAS FY 1978 -- an integrated system for data management
and statistical analysis.
Highlighting SAS's statistical capabilities are its
versatile least-squares procedures, which produce a
wide variety of linear and non-linear regression
analyses, analyses of variance and covariance, and
multivariate analyses of variance. One can produce
highly specialized analyses vith comprehensive matrix
manipulation procedures.
SAS can also produce multiple and partial correlation
coefficients, Spearman's and Kendall's correlation
coefficients and contingency table chi squares.
It has several procedures for analyzing time-
series data. One can calculate summary statistics
and print them or use them directly for further analysis
One can obtain frequency and cross-tabulation tables
and analyze them as well as. perform, discriminant
analyses, factor analyses, and cluster analyses. One
can construct and evaluate Guttman scales, and can
perform t-tests or tests of goodness-of-fit or probit
analyses.
D-?6
-------
APPENDIX B
CANVASS OF WATER ENFORCEMENT DATA
SYSTEMS IN EPA REGIONS AND DELEGATED STATED
-------
fr ~\ -z
A i A /'« •"«•"«/) ?*
x J WATER ENFORCEMENT DATA HANDLING QUESTIONAIRE
1. Region 3— . person Interviewed jiff' •?•*-&' IWC' ******* . Phone 2 ^ ' *3 - ^ ^ 7 &
fft.ii°"A ' <^f/«" ^i,/«v*/«~~' '
2. General Dcscripbion of Region's system RQ&P$ — ? )•*>"
ce>**6j~'- •£*"*-•**'(•**
3.1s industrial effluent data being stored up ? Majors **° 7 Minors 7 Self-
MonihorJng ? EPA monitoring 7 State/local agency monitoring 7
1. Is municipal effluent data being stored ** 7 Self-monitor ing 7 EPA
monitoring 7 State/local monitoring 7
„
5. Is data from NPDES States being stored **C? 7 What states
7 Self-monitor ing 7 State/local monitor lng_
p
? Wat states
7 Self-monitor ing 7 State/local monitoring
EPA monitoring
7. Does system store and retrieve compliance schedule data_
B. Ooes system store and retrieve receiving water quality data CjgS *" *«»
"7^
9. What hinds of analyses are performed
10. What hardware components are in use
C.twJf. ftaM^ -Qt>r*o> • «••• I f
-------
WATER ENFORCEMENT DATA HANDLING QUESTIONAIRE
* Xr.
1. tteglon_JLlL_. person Interviewed y*-^**** /r.\ _ . Phone
I / '/ 7 / / ? / A -«:.
2. General Description of Region's system Uvl^^J^'-^i £ ******* f***-/^.^..* /k <• /**«.,«.
r\ /•-'- f -r (3>^/OS '•«••/ /'//£• -f /*/> <•.»/-» & A *)*- J
- /J*^ - t". Ls Jf*..: /i — f^ /..'/. /-A
J.la inclutiLrictl effluent data being stored ^j«*> ? Majors LI** ? Minors ^''^ ? Self-
• ^^^^^^^^^^^^^^^^^
lL ••»"')
HoniL Airing tj ^ •.• _ ? EPA monitoring^- ' ^//C' __? State/local agency monitoring _ ?
J '
A. Is municipal effluent data being stored i*f*. 7 Self-monitoring ^^v ? EPA
i i t ' r
monitoring _y>/ 'f> ? State/local monitoring c4£-«v ?
«
b. Is dnta from NPDES States being stored Ufe- ? What statos
? Self-monitor ing ^^^ ? State/local monitoring
EPA monitoring
6. Is dutn from non-NPDES States being stored tV'A ? Wat states
? Self-monitor ing 4^> ? State/local monitoring
EPA mouitoring_
7. Doeb system store and retrieve compliance schedule data
0. Ooes system store and retrieve receiving water quality data J&'rt f~ 3k rt t+fp
9. What kinds of analyses are performed ^V-^ ]£.**?•* n*-*v -•&/* 6«* £'<>+.. 4e"-*+j /H A>
«I^_^MM^M«*iM«MBMM-M^WM«a^B*^BH^«M«^B^B«Ha^HM«BB^^MI^HM^^^MM«M«^^aB*^"^^H">IH
'-'£' -•»***+*-
10. What hardware components are in use
-------
WATER ENFORCEMENT DATA HANDLING QUESTIONAIRE
1. Region HE . Person Interviewed J?f£ A^'S _ . Phone
2. General Description of Region's system b?~" '•*>*+,' t> $e-lv& '«"_<> -»/ 'y .(-Ol Lt
3.1s industrial effluent data being stored **£> _ ? Majors _ 1 Minors _ ? Self-
Monitor iny _ ? EPA monitoring _ ? State/local agency monitoring _ ?
4. Is municipal effluent data being stored (fiO _ ? Self -monitor ing _ ? EPA
monitoring _ ? State/local monitoring _ ? ^
"
5. Is dnta from NPDES States being stored U£> ? What states
Self-monitor ing ? State/local monltoring_
£, EPA monitoring
0
.
6. Is data from non-NPDES States being stored (AC> _ ? Wat states
_? Self-monitor ing ? State/local monitor ing_
EPA monitoring_
7. Does system store and retrieve compliance schedule data
8. Xoes system store and retrieve receiving water quality data HO
9* What kinds of analyses are performed.
r— 3-
10. What hardware components are in use
,.,-. A -, f,..,., -./w/f^l-'. /.. ' .'j'f'&JCf-' 7'-
-------
WATER ENFORCEMENT DATA HANDLING QUBSTIONAIRE
Person Interviewed ^
1. Reg iep_
2. General Description of Region's system
VfcJ f-
3 . la industrial effluent data being stored
Monitoring __ ? EPA monitoring
we?
_? Majors
? Minors
4. Is municipal effluent data being stored
monitoring _ ? State/local monitoring
? State/local agency monitoring^
? Se If -roon itor ing
5. Is dnt.i from Nl>DES States being stored V\D
? Self-monitoring_
? What states
_? Self-
?
? EPA
? State/local monitoring
l CPA monitor ing
Ou
l mo
tl^
6. la data from non-NPDCS States being stored
? Wat states
_? Self-monitoring_
? State/local monitoring^
EPA monitoi:ing_
7. Doeu system store and retrieve compliance schedule data_
U. ':-oes system store and retrieve receiving water quality data
9. What kinds of analyses are performed 6*4£: $70*& /
I ' C*^> £*\5
-4.if I*
10. What, hardware components are in use
f ctrllv 12 /
-------
WATER ENFORCEMENT DATA HANDLING QUESTIONAIRE
A '0 t fr
/Jif«S'(r Let**-1'
1. Region . Person Interviewed if«S'r et**-1' _ . Phone
2. General Description of Region's system ft-*£ - 7£*< :*"% »>^ c*««f '•**,* c •'• L,,l,.f*t f?ff''ss.i f^^
<>/ <• r.. .v-
3.1s industrial effluent data being stored />ft _ ? Majors _ 7 Minors _ 7 Self-
Monitoring _ ? EPA monitoring _ ? State/local agency monitoring _ ?
4. Is municipal effluent data being stored ./"*£* _ ? Self-monitoring _ ? EPA
monitoring _ ? State/local monitoring _ ?
5. Is data from NPDES States being stored h& ? what sttes itjts L*t.e /Jfff&S /•fj*"'",, ^ /
jf/U
jfr-J^ i,a.'t > jf < A**-*- ? Self-monitor ing ? State/local monitoring
F
J5 EPA monitoring ?
6. Is data from non-NPDES States being stored ? Wat states
_? Self-monitor ing ? State/local monitoring_
EPA monitoring_
7. Does system store and retrieve compliance schedule data
8. .Ooes system store and retrieve receiving water quality data _
** / * f f\ff/ \ fi r *
9. What kinds of analyses are performed CASisc&*v&• */ ^IXK^ ^t^-^^ {stri**e+~>T ff&r*- •**.
10. What hardware components are in use^
-------
WATER ENFORCEMENT DATA HANDLING QUEST ION A I RE
1. Region"VIZ . Person Interviewed LJle**, />i-*l~, _ . Phone "72. ^ ""
2. General Description of Region's system fs*elc ' */»•*«/ /-tf«~«*- £tt+&/*i m t»^^*-^ *r**t>S*
5
3.1s induiiLrial effluent data being stored V\t> _ ? Majors _ ? Minors _ ? Self-
Monitor Jmj _ ? EPA monitoring _ ? State/local agency monitoring^ _ ?
4. Is municipal effluent data being stored (/\& 7 Self-monitoring _ ? EPA
monitoring _ ? State/local monitoring _ ?
5. Is data from NPDES States being stored Mf> ? What states
? Self -monitor ing ? State/local monitoring
EPA monitoring_
6. Is data from non-NPDES States being stored HO ? ' Wat states^' ' *^<
? Self-monitor ing ? State/local monitoring /X
EPA monitoring_
7. Does system store and retrieve compliance schedule data
Q. Ooes system store and retrieve receiving water quality data
0. What kinds of analyses are performed <*£>*» ^
10. What hardware components are in use_
-------
cln-h*. 1ft
1. negion • Person Interviewed 6*' ? . phono
ENFORCEMENT DATA HANDLING QUESTIONAinB
'^ Ah I
2. General Description o£ Region's sysbem I £S ^ -//ticuT^y ci» }*ve(g> f-~>~y
4-
3.1s industrial effluent data being stored . ytb ? Majors 7 Self-
Monitoring i/tt? 7 EPA monitoring ,» M 7 State/local agency monitoring 7
4. IS municipal effluent data being stored U/-S ^" 7 Self-monitoring ^^ 7 EPA
monitoring £> 7 State/local monitoring 7
5. Is data frort ttPDEB States bein^ stored 1 What states
_? Self-monitoring 7 State/local monitoring £7. A*;*,
EPA monltoting
2 .
6. is data from hon-tfPDBS States being stored _ 7 Wat states1^ "
_7 Self-monitoring 7 State/local monitorlng_
EPA monitoring_
* boed systeiii storfl and retrieve compliance schedule data
toes system store and retrieve receiving water quality data V\O
9. What kinds bf analyses are performed "&$ - \^.t,{4^^ 5>TPKt«=T /isK~H5 f .
oT
7
10. What hardware components are in use (tf*** will (_QfiF*i?(
-------
WATER ENFORCEMENT DATA HANDLING QUEST ION A I RE
K.?
1. HegAonTTTTT' . Person Interviewed K.?y*f 'e.i+t' _ . Phone 33 7 -
2. General Description of Region's system '//«•**. »-w»~( k-t>n ^ -» A^ / —
3.1s industrial effluent data being stored *"£ _ ? Majors f»^-* ? Minors M ? Self-
MoniLr-rintj /j4-£ _ ? EPA monitoring p*'* __? State/local agency monitoring ti P
4. la municipal effluent data being stored U/*s _ ? Bel f -mon itor ing pt'- > _ ? EPA
luoiiiborinr; __H>£4 _ 7 State/local monitor ing *O 1 ^.
7 ut$ CD
ti. Js . la datn from non-NPDES States being stored **& ? Wat states
_? Self-monitoring i**> ? State/local monitor ing *»p
EPA
I A
7. lioea uyateiu sLore and retrieve compliance schedule data y**» — lJ--^t P*"-y /v»~*T**'••( fa
0. «:-oea ayaLem store and retrieve receiving water quality data .M^ — f^ yiPffrf - 0" *
'J. Wliak kinds of analyses are performed^
&-L. M
&--f /^tf-*-."-*-^ £^-\*.Sl> « ff-(
10. what hardware compononta are in use
»fi» 0
-------
WATER ENFORCEMENT DATA HANDLING QUESTIONAIRE
1. RegionJL£-_. Person Interviewed M-/\f /K^cV- _ . Phone *&C?
2. General Description of Region's system /"uf-" — ~ f^* &.•'••< / 1 *:> -^ */<"•• !•-
3.1s industrial effluent data being stored V& _ ? Majors **'? 7 Minors _ ? Self-
Monitor ing____*J£ _ ? EPA monitoring t*-i"* _____ 7 State/local agency monitoring __ ?
4. Is municipal effluent data being stored /ll?_ _ ? Self -monitor ing H.& _ ? EPA
monitoring /ig» _ ? State/local monitoring _ ?
5. Is dntn from NPDES States being stored tft*-f _ ? What states
r _ ? Self -monitor ing b\c* _ ? State/local monitorlng_
OD — — — ^— — — — — — ^-^— — ^— — __^___— __ _
O\
EPA monitoring
6. Is data from non-NPDES States being stored _ ? 'Wat states
? Self-monitoring ^^ ? State/local monitoring_
EPA monitoring AC7 ?
7. Does system store and retrieve compliance schedule data IJS1> - WT nc> t'H*e'V*li &>•/ t^.*f-
8. Ooes system store and retrieve receiving water quality data
9. What kinds of analyses are performed
10. What hardware components are in use
'? ^'^•h£^^-l ft.vir' **•'*•
ft
-------
1. Region
WATER ENFORCEMENT DATA HANDLING QUESTIONAIRE> -
. Person Interviewed Jc*^£" f>i k OS *L '.' . Phone T?^ *9 - / 2 c.,
2. General Description of Region 'a^sy at em5
'
.//, s.'j «,«.,. '/xi'«
'
3. la industrial effluent data being stored n <* <* ? Majors
y
A*'"'
Minor s_£f^
Monit-jring
? EPA monitoring
? State/local agency monitoring
4. Is municipal effluent data being stored ty £* _ ? So It-monitor ing tj£ *
y /
Of
monitor JIKJ _ ^ t? •* ? State/local monitoring ?
5. la dntn from NPDES States being store
__ ? Self-monitoring
Self-
7
? EPA
EPA monitoring_
6. Is data from non-NPDES States being
? State/local monitoring
? State/local monitoring
7. nooj, oy»Lom storo and retrieve compliance schedule data
0. ':f-'/l
-------
UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
Region III - 6th & Walnut Sts.
Philadelphia, Pa. 19106
SUBJECT: Region III Evaluation of UPGRADE
FROM.
TO:
DATE:
Dee Ortner
Information Systems Branch, 3MA20
Dr. Lance Wallace
Office of Monitoring and Technical Support, RD-680
The approach taken in our evaluation focused on the comparison of
UPGRADE with two graphics packages presently supported by the Agency.
Experience of our staff in the use and capabilities of graphics
software and output devices was extremely limited prior to this
analysis and thus results reflect an unbiased assessment.
Evaluation results are structured in accordance with Enclosure "B"
of your memorandum dated February 17, 1978. Additionally, I have
been in contact with Mr. Joe Higgins of VITRO Laboratories to provide
him with as much data as available from our analysis.
If Region III may be of further assistance do not hesitate to contact
me or either of the evaluators indicated on the following pages.
Attachments
EPA-III.013.73.T
D-88
-------
REGION III EVALUATION OF UPGRADE
Identification of the Evaluation
Principal Investigator for the evaluation was
Maria Gilbert 597-9769 (FTS)
Information Systems Branch, 3MA20
Management Division
EPA - Region III
Ms. Gilbert was assisted by
Steven Belczyk 597-9964 (FTS)
Management Division
EPA, Region III
Evaluator's Function as it Relates to UPGRADE
Both Ms. Gilbert and Mr. Belczyk perform systems analyses and provide
programming assistance as support to Regional personnel. Presently the
evaluators are investigating the use of UPGRADE as a means of supporting
the Region's efforts in developing Environmental Profiles and in dis-
playing data for public information/awareness. Other probable applications
of UPGRADE would be in the areas of 208 planning and the drinking water
and new source programs where public health effects are an integral part
of impact assessment.
Evaluator's Experience Using UPGRADE
Neither evaluator was familiar with UPGRADE prior to the evaluation.
Ms. Gilbert had overall responsibility for reviewing the documentation
and directing Mr. Belczyk in system use. Time spent in documentation
review approached 50 hours and time on the system totaled 10 hours. Both
evaluators found UPGRADE easy to use once data was stored. However more
detailed documentation with examples for both the general user's guide
and the GLIDE manual would be helpful.
Goals Set for UPGRADE
The approach taken in the evaluation focused on the comparison*of
UPGRADE with two graphics packages presently supported by the Agency:
CALCOMP (California Computer Products)
IGP - Interactive Graphing Package (TEKTRONIX, Inc).
Criteria by which all systems were evaluated are shown in Figure 1. The
intent of the evaluation was to rate each system independently and not in
relation to either or both of the other systems.
D-89
-------
Figure 1
Evaluation Criteria
1. Input
a. Ability to read data from tape and disk
b. Ability to input interactively
c. Ability to easily modify data
d. Ability to store data for later use
e. Ability to use all types of data, STORE!, AEROS, etc
f. Ease of Data preparation and entry
2. Output
a. Types of graphs; barcharts, linegraphs, 3D plots, isovariate plots, maps
b. Display features
c. Devices: calcomp, textronics, line printer
d. Ability to save output
e. Ease of output structuring
f. Overall aesthetic appeal
3. Special Features
a. Statistical capability
b. Ability to interface with EPA Headquarters and Regional
software
c. Transferability between NCC and WCC
d. Ease of operation for non-ADP user
4. Quality of Installation
a. Completeness of documentation
b. Performance
5. Costs of Test Graph
a. Sign-on
-time
-cost
b. Execution cost
c. Programming time
D-90
-------
Performance
Each of the systems was rated according to points allocated for
each of the following five categories of evaluation criteria:
CATEGORY POINT SCORE RANGE
Input 0-5
Output - 0-5
Special Features 0-5
Quality of Installation 0-5
Costs of Test Graph 0-3
*range goes from low (0) to high (5) performance rating
Performance rating for each system is the summation of category scores.
Results of UPGRADE'S performance and that for CALCOMP, and IGP are shown
in Table 1.
Recommendations
From a Regional perspective it would be difficult to estimate
the amount of use UPGRADE would receive or to anticipate the necessary
level of user's support without first making the following modifications
to the system:
-include the capability for analyzing functions and the plotting of
more than one variable (Y-axis) per graph;
-include the capability for generating 3D and contour graphs;
-include the capability to interface with Headquarters/Regional
software (extend STORET, MSIS, PCS);
-expand data modification capabilities;
-enhance the "data save" capabilities;
-prepare more detailed documentation on system use and the update
of user manuals.
Presently the Region has access to a great number of graphics packages.
The extent to which we would use UPGRADE, consequently, would be dependent
on the number of modifications or enhancements made to the existing
system. The Agency should consider this perspective before authorizing
the "sponsorship" or "co-sponsorship" of the UPGRADE system.
D-91
-------
page 1 of 4
Evaluation
Criteria
UPGRADE
CALCOMP
IGP
INPUT:
- Ability to read
data from tape
and disk
- Ability to
input inter-
actively
Ability to
easily modify
data
Ability to
store data for
later use
Ability to use
all types of
data; STORE!,
AEROS, etc
G
\o
ro
SCORE: 3
Data must be stored in one of the
UPGRADE interfaces for accessing.
Up to 50 variables of a user's
data can be input*interactively
for analysis.
Modification before graphing and
between successive graphs is
limited to selecting a range of
data values and/or eliminating
specified data points.
Data entered interactively can be
stored; data selected from UPGRADE
interfaces cannot be stored, the
selection process must be re-
peated for subsequent analysis.
Any data can be stored.
SCORE: 4
Data can be read from tape and
disk.
CALCOMP is not an interactive
system; however, conversational
jobs can be run for which data is
inputted interactively.
Input data can be modified, either
by editing the data or modifying
the program; modifying data
becomes difficult if a data set
cannot be accessed over the
terminal.
Any input data can be stored.
Any data can be stored.
SCORE: 2
This capability does not exist.
All data is input interactively.
Data can be inserted, deleted and
changed before and after each
graph is produced.
This capability does not exist;
data values are lost after
terminal session is ended.
No data can be used unless
entered interactively.
-------
page 2 of 4
Evaluation
Criteria
UPGRADE
CALCOMP
IGP
- Ease of data
preparation and
entry
OUTPUT:
- Types of graphs;
barcharts, line
graphs, 30
plots, iso-
variate plots,
maps
- Display
features
Devices: CAL-
COMP, tektronix,
line printer
Ability to save
output
Data inputted interactively is
easily entered. Stored data is
prepared very easily for analysis.
SAROAD data is entered by CEQ
through user request; STORET data
must be transferred to tape and
mailed to CEQ with a copy of the
data layout.
SCORE: 3
3D and contour graphs cannot be
produced. Only one y-axis
variable can be plotted per graph.
Functions cannot be plotted.
Only one graph can be displayed on
the screen. Text cannot be shown
within the graph.
Plots can be produced on tektronix,
line printers, CALCOMP printers
and microfiche.
-Could not be determined-
Some data may be preprocessed
before use. For STORET, a'
retrieval would first be run then
accessed by CALCOMP. For some
Regional (other user) data, it Is
feasible to write extraction
programs to feed data to CALCOMP.
SCORE: 4
Any type of graph exclusive of maps
can be produced.
Data is easily entered.
All display features determined as
Regional needs are available.
Plots can be produced on tektronix
and CALCOMP printers, but not a
line printer.
Output can be stored on tape for
plotting on the CALCOMP printer.
Output cannot be stored for plot-
ting on the tektronix.
SCORE: 3
3D, coutour, maps or curve fitting
graphics cannot be produced. Bar-
charts without variable shading
can be graphed.
All display features needed for
Regional use are available.
Tektronix printers only can be
used.
Up to ten (10) data sets con-
taining the instructions for
graphics display can be stored.
-------
page 3 of 4
Evaluation
Criteria
UPGRADE
CALCOMP
IGP
Ease of output
structuring
- Overall
aesthetic
appeal
SPECIAL FEATURES:
- Statistical
capability
Ability to
interface with
EPA Head-
quarters and
Regional soft-
ware
Ease of opera-
tion for non-
ADP user
,VO
Display features cannot be altered
without restarting the Data
Selection or Analysis section.
Graphics look acceptable for
public distribution (see Attach-
ment 1).
SCORE: 2
UPGRADE can perform basic sta-
tistics and can interface with the
Statistical Analysis System (SAS)
to perform more complicated
analyses.
No direct data access capability
is available, therefore data must
first be stored. Updating of data
must be maintained.
Special screen displayed prompts
assist the user during terminal
sessions. Aid of an ADP person
may be required for transferring
data to CEQ for storage in
UPGRADE.
Any structure is possible but some
display controls cannot be'changed
Graphics look professional (see
Attachment 1).
SCORE: 3
No statistics are available.
Extraction programs or retrievals
of data may first be needed.
Knowledge of FORTRAN, EXEC 8,
and ALPHA is needed by user.
JCL
An exhaustive list of display
options are available. Problems
were encountered using some of
these options.
Quality of graphics adequate for
in-house use (see Attachment 1).
SCORE: 2
No statistics are available.
Interface is irrelevant as data
only can be inputted inter-
actively.
Easily operated by a non-ADP
person, IGP has a "HELP" command
to aid the user during a terminal
session.
-------
page 4 of 4
Evaluation
Criteria
QUALITY OF
INSTALLATION:
- Completeness of
documentation
- Performance
COSTS OF TEST
GRAPH (see
Attachment):
- Sign-on:
time
cost
- Execution
- Programming
time
TOTAL SCORES
CJ
i
NO
V.n
UPGRADE
SCORE: 3
The User's Overview and GLIDE
Manual did not provide sufficient
information to learn specifics
of UPGRADE. Screen instructions
(prompts) during terminal sessions
were good.
Problems encountered using some
display features, the neutral text
option and mapping.
SCORE: 3
35 minutes
$6.52
N/A
N/A
SCORE: 14
CALCOMP
SCORE: 4
Subroutines were well documented
and examples provided. Variables
used by each subroutine were
adequately explained. Instructions
for the 3D and contour plots were
difficult to understand. In-
structions for using CALCOMP at
NCC and WCC were good.
A problem with the THREED sub-
routine was encountered.
SCORE: 1
20 minutes
$9.00
$8.28
15 minutes
SCORE: 16
IGP
SCORE: 3
User's Guide was not available
at time of evaluation. More
detailed explanation and examples
are required.
HELP command did not function;
commands for storing/retrieving
data sets did not work. Some
problems were encountered with
axis range, bar size and tic marks.
SCORE: 2
27 minutes
$9.88
$1.59
N/A
SCORE: 12
-------
UPGRADE
T
U
R
B
1
D
I
T
V
15.90
13.59
12.00
9.04
7.5*
3.W
i.se
0>
o
I"
3
i J. .v
-------
CALCOMP
) 00 2.00
YflKIMfl RIVER BflS IN
RT CLELLUh RlVf.n
1969
u.oq
6.00 8. UP
MONTH
lu
1^.00
IM.UO 16.00
-------
IGP
•-
oo
T
U
R
B
I
D
I
T
Y
15-
10
t
I
»»
•",
-------
UPGRADE USER EVALUATION
\SS£.fl
Region
III
USE
Correlate
water
turbidity
vs. time
(months)
DATA BASE CAPABILITIES
CATEGORY"
Custom data
set
Data save
capabilities
Data Extraction
AVAILABLE
Limited data
modification
Manual
NEED
Expanded
Automa-
tic
RANK
D
N
N
UPGRADE CAPABILITIES
CATEGORY
Plot
Display features
AVAILABLE
NEED
3D
Contour
fuiction
idd user
irmlysls
•ou tines.
nultlplal
Quick
exit
RANK
N
N
N
E
D
TRANSACTION
VOLUME
COMPARABLE
MANUAL
TIME
-------
TRANSACTION DATA
USER
(TASK)
Region III
(Correlate water
turbidity versus
time (months of
year) monthly
values)
Extract from DB
STORZT
X
SAROADS
NIH
IDE
Store
In
IDS
X
Analysis
& Terminal
PLOT
No analysis
1 plot
OFF-
LINE
PLOT
Number of
Terminal
Sessions &
Tine Per
Session
1 session
35 nln.
COST
Per
SESSION
6.52
COST
Per non-IDD
Extraction
Could
Not
Do .
MANUAL
TIME
Calcomp
tlpe-20mln
slgnon cost-
S9.00 term -
$8.28 exec
Prog=15 rain.
1CP=27 nln
slgnon $9.88
=1 . 59 exec
tek. plot
o
I-"
o
o
-------
U.S. ENVIRONMENTAL PROTECTION AGENCY
,«••*>, REGION X
*r 120° SIXTH AVENUE
t SEATTLE. WASHINGTON 98101
3*5
September 7, 1978
Mr. Lance Wallace
Environmental Protection Agency
Office of Research & Development
Washington, D.C. 20460
Dear Lance:
As requested I have summarized Region 10 '3 evaluation of the UPGRADE
system. Please bear in mind that our review may not be as comprehensive
as desired since we were unable to devote more time with this task due to
other priority commitments. The following paragraphs contain our
summarization .
X. Introduction
1. Identification of evaluator
•Bruce Cleland & Shirley Towns
EPA, Surveillance & Analysis Division
M/S 345 (FTStf 399-1193 or -1106)
2. Brief description of ''valuator's function
•Both ambient and source data for air and water.
•Data uses varied from annual reports, one-time quick
responses, and support to states.
'•'•. Description of Experience
3- Extent and nature of evaluator 's experience using UPGRADE
•Two people and approximately 25 man-hours were spent.
a) 10 hours gaining familiarization with system from
documents and demonstrations
b) 15 hours using the system
D-101
-------
U. Description of tasks or goals that the evaluator set for UPGRADE
•Major needs of the' evaluator centered on graphical and
statistical analyses of only one data base (e.g. SAROAD or
STORET, etc.) to generate environmental profiles, to determine
trends, and to define source/receptor relationships.
5. Evaluation of UPGRADE performance in meeting those objectives
•Portions of UPGRADE did provide good graphics display and
statistical analyses with respect to some of the more general
purpose items (scattergrams, means standard deviations, etc.).
•Since the data's initial input will be from the National
Aerometric Data Bank the most current data may not be available.
•Overall, UPGRADE, when needed, was too general to be specific
for most of the regions analytical needs. Simple "in-house"
programs exist which accomplish the same things UPGRADE provides
at a considerably lower cost to the user. In addition, it is
much easier to "customize" these programs to satisfy specific
needs at the regional level.
Ill. Recommendations
6. Eva'luator's recommendations
•About 75$ of the time spent using the system was devoted to
getting through the sometimes confusing prompts confronting the
user before the desired analysis could be performed. As a
whole, it was felt that UPGRADE was too general purpose to be
useful at this time. With present budget cuts and increasing
work loads, the time needed to sit down and run UPGRADE simply
does not exist, particularly when "in house" programs give most
of the same outputs.
I hope this information wilL.be of some assistance to you in preparing
the final summary report. If you have any other questions regarding this
summarization, please contact Shirley Towns.
Sincerely,
William B. Schmidt, Chief
Air Surveillance & Investigation Section
D-102
-------
•DATE-
"IBJECT-
UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
April 11, 1978
UPGRADE System
FROM.
TO:
Ben Eusebiol'Chief
Surveillance Branch
Lance Wallace
EPA RD-680
401 M Street S.W.
Washington, D.C. 20460
On March 29 & 30 we discussed the possibilities of utilizing the UPGRADE
system for Region X's air data analysis. In order for the region to
evaluate this system, we request that the following air monitoring
stations be included:
Pollutant/Method
TSP 11101 91
0, 44201 11
S02 42401 14
CO 42101 11
Station ID
All Region X Sites
38
38
38
38
49
49
49
49
49
49
13
13
13
13
13
13
13
02
02
02
02
02
49
49
49
49
49
49
1200
0560
1460
1580
0960
0980
1560
2140
2220
1840
0840
0840
0840
1420
1420
1420
0220
0040
0160
0160
0160
0160
1840
1840
1840
1840
2040
2040
001
008
002
Oil
001
010
002
001
007
058
002
Oil
012
021
026
001
007
013
002
012
013
014
051
059
062
063
012
013
F01
F01
F01
F01
101
F05
F01
F01
F01
101
F02
J02
JO 2
F02
JO 2
F02
F01
F01
G01
G01
F01
G01
F01
F01
F02
F02
F01
F01
Time Period
69-77
74-77
74-77
70-77
73-77
74-77
77
74-77
74-77
74-77
76-77
76-77
75-77
75-77
75-77
75-77
77
75-77
74-77
72-77
73-77
73-77
77
72-77*
72-76*
74-77
74-76
71-76
76-77
* A portion of the time frame for these sites already exists in UPGRADE.
EPA FORM 1320-6 (REV. 3-76)
D-103
-------
If you are unable Co retreive 1977 SAROAD data from NADB, please contact
Ray Nye from our Data Systems Branch (399-1580). He should be able to
assist you in retrieving this information.
As you are aware, we have used the 1-year, 2-year, or 3-year (depending
on the amount of available data) moving means to depict trends. We would
like to see UPGRADE set up a program that would select the appropriate
trend and automatically produce the chart(s) and/or graphs(s) with the
1-year, 2-year, 3-year or x-year moving means. The enclosed data chart
shows a calculated 3-year moving means from one of our particulate
sites. Please note that the quarterly geometric means are used in
determining the moving means as opposed to the monthly arithmetic means
used for other pollutants (continuous data).
Also, when analyzing data for profiles, we projected the estimated number
of days in violation for sites with less than 365 monitoring days by
using the following formula:
(365)
where: r = number of actual violation days
n = number of actual sampled days
(Reference Guideline: A Mathematical Model for Relating
Air Quality Measurements to Air Quality Standards AP-89-
Ralph I. Larsen, Ph.D.)
We would like this calculation to be provided as an option.
Some of the statistics most often used in Region X are not currently
summarized in the UPGRADE system. Would it be possible to program
additional summary reports? Basically, Region X's summary reports need
to have the following conditions:
1. Flexibility to retrieve the data by various parameters such as:
geographical boundaries (Region, State, AQCR, county, city),
number of violations, and second max, etc.
2. Conformity of units to standards:
Micrograms per cubic meter
Milligrams per cubic meter (carbon monoxide)
3. One averaging period per report:
1-hr
3-hr
8-hr
24-hr
-------
4. Averages for the first hours of a month should include values
for the last hours of the previous month.
5. Excursions are counted on the day of the ending hour period in
which the violation occurs.
6. Compute, but flag data which does not meet NADB reporting
criteria.
7. The Monthly Summary Report should be updated each quarter as a
continuous report for one calendar year. On completion of up-
dating the fourth quarter data, the data should be computed to
give annual totals on the Monthly Summary Report. These totals
will also be used to comprise the Annual Summary Report.
8. An option to retrieve data for one or several calendar years
should be available from the Annual Summary Report.
The outlines for the summary reports and suggested printout format are
attached.
If you have any questions or need further clarification, please contact
Shirley Towns (399-1106) of my staff.
Enclosures
cc: Ray Nye
D-105
-------
TRENDS El TSP OF AQCR 062 3/77
CLARKSTO:.' — CITY HALL
ASOTIN COMITY — 4903R0001 F01
DATA INPUT FOR 3 YEAR Rir^JIIIG AVERAGE
MEASURED HI IIICROGRAMS PER CIIBIG METER
YEAR QTR -JOBS GEO MEAN RUN MEAN 1 YEAR 2:iD HIGH
1971
210
1
2
3
4
3
19
117.64
103.26
0.00
0.00
0.00
0.00
1972
1
2
3
4
20
21
13
20
115.13
90.01
108.26
68.54
0.00
0.00
0.00
0.00
0.00
104.22
104.41
92.32
177
1973
1974
1975
1976
1
2
3
4
2
12
1
3
174.35
83.91
147.00
32.14
0.00
0.00
0.00
0.00
87. 28
85.22
79.20
80.01
1
2
3
4
13
10
8
8
92.04
89.65
119. fil
116.54
0.00
92.89
93.76
93.05
80.74
82.51
88.56
101.29
1
2
3
4
14
13
15
13
85.02
126.84
130.63
78.62
89.03
92.42
94. SQ
98.39
98.28
108.45
112.87
103.14
1
2
3
4
17
15
14
13
101.97
104.18
127.97
115.60
97. 9R
100.10
102.26
105.70
107.70
103. 00
102.18
111.24
204
276
224
219
D-106
-------
T3P 7 KEND3
U
G
X
o n
160 -T-
IB0 --
100 -L-
CLARK3TQN -- CITV HAUL
A3OTIN COUNTY 4-30380001 F01
-I—I—I—f—f
1371
1372
1373
1374-
137B
-------
MONTHLY SUMMARY REPORT FOR CONTINUOUS DATA
(i.e. CO, 0 )
SITE INFORMATION
1. Pollutant: Name & Code (include Method Code)
2. Calendar Year
3. Site Identification
AQCR Number
County Name & Code
Site Code Number
Site Name & Address (include city)
OBSERVATIONS
1. Number of samples observed for each month
2. Number of days sampled for each month
Minimum of 18 hours of a 24 hour period (within one
calendar day) constitutes a valid day (includes zeros)
3. Maximum Value for each month
Nonoverlapping
Includes midnight
Reported to nearst tenth
4. Second Maximum Value for each month
Nonoverlapping
Includes midnight
Reported to nearest tenth
EXCURSION OF STANDARDS
1. Number of actual excursions (non-overlapping, including midnight) for
each month
2. Number of days exceeding standard (non-overlapping, including
midnight for each month.
3. Number of days exceed secondary but less than primary standard
4. Number of days exceed primary but less than alert level for
each month
5. Number of days exceed alert level for each month
MEANS AND STANDARD DEVIATIONS
1. Arithmethic Mean for each month
Report to nearest tenth
2. Arithmetic Standard Deviation for each month
Report to nearest hundredth
3. Geometric Mean to each month
Report to nearest tenth
4. Geometric Standard Deviation for each month
Report to nearest hundredth
D-108
-------
MONTHLY SUMMARY REPORT FOR NON-CONTINUOUS DATA
(i.e. TSP)
SITE INFORMATION
Same as in Continuous Data Report
OBSERVATIONS
1. Number of days sampled for each month
2. Maximum Value for each month
Report to nearest whole integer
3. Second Maximum Value for each month
Report to nearest whole integer
EXCURSION OF STANDARDS
1. Number of days exceeding standards for each month
2. Number greater than secondary standard but less that primary standard
3. Number greater than primary standard but less than alert level
4. Number greater than alert level
MEANS & STANDARD DEVIATIONS
1. Quarterly Arithmetic Mean — report to nearest tenth
2. Quarterly Arithmetic Standard Deviation — report to nearest hundredth
3. Quarterly Geometric Mean — report to nearest tenth
4. Quarterly Geometric Standard Deviation — report to nearest hundredth
D-109
-------
MONTHLY SUMMARY REPORT FOR CARBON MONOXTDB--8IIOUR Calendar Yrar 11 Id !>tate: ureRon
AQCR. 193 County! rluUnomh Site fade: 38 1460 075 101 Sice Address: Hollywood Arcade, Sandy Blvd.. Portland Method; NDIR Site Typei NA»)TS
1UMBI R OF
f
AQCR: ( 193 Countyi Hultnoniih
.Site Code: J8K6flOJ5Kol JAH '
[Site Aildrein: llnllywood ArcndH'R',
<; ', Snnily Hid HARi'
' " I'ortlnnd —'I
:Melhod- MDIR Al'R 1|
;Slt« Type: NAQTS M\Y
I JUN
JUL
AUC
" SI I'
•i
:r
i
6S7.
7*7
737
Of ] PrHCrS'T OS-} MAXIMUM
11-VALID DAYS n MTA .MI'.CUNO --VAJ.UE'
i| SAftmD || ,-CUTERLA Ii
OCT
SOV
nrr
If
II
J/
30
I
SECnin
MJMRl'R
HUMnrR OF
,_-._-d.. DA»S ---.
VALUE
j 7f
L-/P.7-
i; -1--
._.!'?/*_
M.L.. \JJ£ j—^Z-
..//?.._LL/i«_J_ jy__ _
i i i
. ^
!',..,
KXCURSIOHS
UO
7.1
. j_ ;/
1 P
15
y
/
0
II
!i
d
'd
o
ia
/i*
_^_
jl SEVERITY i ARITIWITIC CI-OHI'TRIC
IOF! I.XCURSIONG , -, = i. , ,
'Pel. Alert1'Hrin Scd Dov Mran.Scd D<»
«.'
; f+s-t-
! i1/xx
*XAX
IOJ3
itJrt
j M/IXf
I //x/
!Foc Cc^ot,'r\B^OXid«.-^.^ HOL*R. .
i ' ! j ' J ;'"i 1 -1 ' ; ' '' ' !l ' '
,frt.- j-i *rt. U.»,x
! >"x | -y/.j
I mt ,:: t^
33^ *! w ij
/
/?'/
x/c
'f'f
f.<-s
{»•
v 'I!",".
JC/A !! i i^V
t?i,
S < /»
ffi "*<*?
y;1// j /n"\,v
l\*f i ^.6^/
1 x-' '
VA\
viixiv; x/'W
x/^.
ff
*•."..•.••"
'i .
/.
^\«
>
-------
RtPofvr OF PrxiTic.yi.flTe OAT/S Fop, C.V
«••=-- rs. - .-i— i
Number o( i
«y» , ••
i1
193 1.IIIN
• ]« 00:0 002 rni
i
Councv Cburtboit^r
Hmrborn Si.
I! Albuny
'j Ill-Vol
-VO4 .•
SLAMS
£
j :_
HaKinum
Ji-hr V»luc
"5
5/.
1
-ir
Second"""* I] Nuitber ol~ l"! Murobor of'F.licuriionii DyTsavcrlty f Arithmetic ' r.m<^
Highlit.—.--|l E«cur«lon_ - -. n II 111 |i - | — .-u- - t —-. tftfn Std Diy Mrc» y^ F*v
-210-
1 il
• i
I !
o
'
:; i f>
).°° \ o
^&—
I
Jj.
!• i
i- i
i *
, o
-!>-
o
! i
_o
o
9
O
«
ii fi?x? .a?.»7
i i!
A'?
.!_..
;/, //i •
-t
i i.
. n i
j ;
c
-------
UPGRADE USER EVALUATION
USER
Region
V
USL
Air Quality
data for
Trend
Analysis and
Summary
Reporting
DATA BASE CAPAHIMllES
CATEGORY
SAROAD
Interface
AVAII.AH1.L
Manual
Interface
NI'.l.D
Auto
Current
Data
HANK
N
E
UPGRADE CAPAUII.111LS
CATEGORY
Data Extraction by
User specified
Parameters
Standardized
Units
Statistic aummary
Routines
Various Mean and
Standard Deviation
Analysis Routines
Extraction from more
than one data set and
combining data sets
Ability to summarize
data
One averaging period
per report
Meaningful size data
sets
Compute and Flag
Data not meeting
report criteria
Varying parameters
for plotting
AVAI1.A1ILL
Extraction
levels: 1. SA-
ROAD programs
2. Data filter
Ing
Whatever units
are used In
data base
Partitioning
statistics, SAS
Partitioning
statistics.
SAS
CLIDE. IDB
techniques
Partitioning
statistics
Dependent on
data available
3.9H bytes per
data set
Filtering
Any pair of
parameters nay
be plotted
Nlll)
Addition-
al such
aa 2nd
max
User con-
trolled
variable
transfor-
mations
loving
neans
More di-
rect user
control
I/O for
atorlng&
reusing
Addition-
al avera-
ging tran
s forms
nultlyear
ilr data
Idd
flagging
RANK
E
N
E
E
E
E
N
TRANSACTION
VOI.IIMT
COWFAHAuLli
MANUAL
TIM
-------
TRANSACTION DATA
USER
(TASK)
CEQ user
Support Group
(Teat-Average
Session)
Extract from DB
STORET
SAROADS
NIK
IDS
X
Score
In
IDB
400 data
points
Analysis
& Terminal
PLOT
65AS analy-
sis 12 Std
Tables 20
Plots
OFF-
LINE
PLOT
Number of
Terminal
Sessions &
Time Per
Session
1 Session
35 oln.
COST
Per
SESSION
S25.52
+ 2.00
paper
COST
Per non-IDD
Extraction
Could
Mot
Do
MANUAL
TIME
-------
APPENDIX E
-------
APPENDIX E
CONVERTING UPGRADE TO OTHER SYSTEMS
-------
APPENDIX E
CONVERTING UPGRADE TO OTHER SYSTEMS
I. Introduction
Portability, transferability and convertability of computer software from
like-to-like and like-to-unlike computers remains one of the thornier problems
of system development and use. When this is combined, as is the case with UP-
GRADE, with a system developed on a specific computer configuration for a
single user whose objective was the rapid demonstration of the basic capabili-
ties of the system, problems for both conversion and software configuration
control are to be expected. UPGRADE has been developed piecemeal to meet the
original development goals; it has been further developed to an initial level
of a controlled, production-oriented software package, as evidenced by its
successful portability to like-configured computers.
This Appendix discusses the technical factors involved in the conversion
to unlike computers and the transfer to similiar computers with differing
utility software (especially operating systems). The nature of UPGRADE, its
software environment, and the technical characteristics of the target computers
are discussed. These considerations form the basis for the estimates for
conversion in the feasibility study document.
II. Discussion
UPGRADE has been co-evolving with the NIH-DCRT computer installation since
early 1975. As DCRT made larger TSO regions available and, in a one-year con-
version process, shifted from MVT to MVS operation, UPGRADE has been changing
and growing to use these and other additional resources.
This discussion will focus on those aspects of the current version of
UPGRADE which would present problems in the process of transferring UPGRADE to
another computer center. Of course, no one will have a complete list of con-
version complications and problems until after any given conversion is com-
pleted. The discussion is broken into four parts to correspond to four levels
of potential differences between NIH-DCRT and the target computer center.
These four levels are computer type, operating system, interactive system, and
specific installation. For example, UPGRADE currently operates on IBM-370/MVS/
TSO/ NIH-DCRT. The comparable description of COMNET's EPA system would be
IBM-370/MVT/ALPHA/COMNET-EPA.
The use of IBM computers from the very start of UPGRADE development to-
gether with the demands of users for a more powerful and flexible system has
resulted in the use of certain programs, programming languages, and techniques
which are unique to IBM computers. These would present problems in trans-
porting UPGRADE to any non-IBM system, e.g., the Univac at Research Triangle
Park (RTF), North Carolina.
The most basic aspect of this area is the computer architecture as it has
affected UPGRADE architecture. IBM computers use a four byte, thirty-two bit
word as the basic unit of information for data program instruction storage.
Other computers use a different scheme for encoding data and program instruc-
tions, employ differing methods of input and output of data between storage
E-l
-------
media and the computer core storage, and differ from IBM computers in other
underlying ways. To the extent that any of these differences have become in-
corporated in the structure of UPGRADE and the way that it handles data, changes
will have to be made to adapt UPGRADE to a different make of computer.
Some more specific areas are currently identifiable where a move to a
non-IBM machine would require changes. These include the SAS subsystem, Sort/
Merge Program, Assembly language, and FORTRAN extensions.
The Statistical Analysis System (SAS) is sold and maintained by the SAS
Institute. They currently maintain and support SAS only for use on IBM com-
puters and have no plans to expand this coverage to other types of computers.
Some version of SAS has been made available on the RTP Univac; this would make
conversion problems to that computer slightly less than for any other computer.
The source coding for SAS is approximately 35% IBM Assembly language, 60% PL/I,
and 5% FORTRAN.
The IBM Sort/Merge program is used in several parts of UPGRADE for sorting
of data. Since this is a proprietary IBM product, a move to any other computer
would require modification of UPGRADE to use the available sort program. Most
computer manufacturers supply such software; however, if none were available,
one would have to be written to meet the sorting requirements in UPGRADE.
IBM Assembly language is currently used for about 5% to 10% of UPGRADE
coding (about 1000 lines). The functions performed by this code include ter-
minal input/output, SAS subtasking, linkage to IBM Sort/Merge, manipulation of
the "help" libraries (which are stored in a partitioned data set) , dynamic
allocation, RHB and IRB routines. In a conversion, all of these subroutines
would have to be rewritten in the appropriate language of the target computer.
This could be a major problem area, since, at this level of programming, source
code statements correspond to specific machine dependent instruction set opera-
tions.
The remaining area of computer type dependence again refers to differences
in the software available on the target machine as compared to the IBM software.
The IBM FORTRAN Gl compiler has been used in UPGRADE development from the very
start. As a result, there has been a tendency to use all features of the IBM
compiler rather than restricting the programmers to the contents of ANS FORTRAN.
Some of the IBM features will also be available in other compilers; other fea-
tures will not be available. A specific determination could be made for any
given target compiler.
Some features would be relatively easy to change to ANS FORTRAN coding;
for example, replacement of literals delimited by apostrophes with the ANS
character count format would be a simple task. Conversion of certain other
features (such as direct access files, END= in READ statements, etc.) would
require varying amounts of reprogramming or even restructuring of the UPGRADE
system.
The second general area of potential conversion problems concerns the
operating system of a potential target IBM 360/370 computer (any other model of
large IBM computer is now very rare and could be considered to present similar
problems as going to a different manufacturer's computer). Other operating
E-2
-------
systems in use on IBM 360/370 computers include MVT and MFT. Any of the areas
discussed in this section would also be of concern in any move to a non-IBM
computer.
Perhaps the most important problem would be the overall size of UPGRADE.
Currently the UPGRADE system as run on the NIH-DCRT MVS system requires approxi-
mately one-half million bytes of core storage. On the MVS operating system,
some of this is actual core, some is "virtual," i.e., residing on a direct
access device (typically a magnetic drum storage unit). On an MVT or MFT
system, all 500,000 bytes would have to be actual core. At most such installa-
tions, this size region of core is either never available or available only
late at night or weekends. Also, costs of using such a large region on MVT or
MFT would usually be high. UPGRADE might have to be restructured to fit into a
smaller core size. The costs and difficulty of this reprogramming would be
dependent on the reduction in core size required.
Subtasking is currently used in UPGRADE for the operation of the SAS rou-
tines. Since subtasking in the same manner is not available in MVT or MFT,
changes in UPGRADE would be required. Two other areas that would require
changes are the DAIR (Dynamic Allocation Interface Routine) used to allow
changes in the dataset - I/O channel assignments during a single UPGRADE session,
and certain TSO MACRO'S which are somewhat different in MVT or MFT.
The third general area to be considered is the interactive or timesharing
system to be used. UPGRADE now uses TSO. A change to a different processor
(e.g., ALPHA on EPA's COMNET IBM system) would require some reprogramming;
again, a change to a non-IBM computer would probably require even more changes.
The attached pages list in some detail those parts of the current version
of UPGRADE which may cause problems in moving UPGRADE to some installation
other than NIH/DCRT. The list is based on my analysis of UPGRADE program
listing, plus several discussions with Sigma Data. Of course, no one will have
a complete list of conversion complications until after any given conversion is
completed.
The list is broken into four sections to separate, as far as possible,
those parts specific to:
• MVS
• TSO
• IBM
• NIH/DCRT
This makes it easier to see the potential complications of any given con-
version.
E-3
-------
Current parts of UPGRADE
which are specific to MVS*
• OVERALL SIZE - approx. 500K - this much core is not usually available
except on a virtual memory system.
• ISO MACROS - certain ones (e.g., TGET & TPUT) are said to be different
under MVS.
• DAIR - Dynamic Allocation Interface Routine
• SUBTASKING - used with ATTACH macro for operation of SAS.
*These present difficulties in moving UPGRADE to a non-MVS IBM computer
(e.g., Vitro of COMNET), and even more problems in going to non-IBM.
E-4
-------
Current parts of UPGRADE
specific to IBM*
• FORTRAN EXTENSIONS - IBM extensions to ANSI FORTRAN are in use, e.g.,
list directed I/O, subroutine entries, END = in
read statements, direct access files, object time
dimensioning of arrays, >3 dimensions in array,
etc. Some of these may be found in other FORTRAN
compilers.
• ASSEMBLY LANGUAGE - about 1,000 lines (currently) - about 5-10% of the
UPGRADE code. Performs terminal I/O, SAS sub-
tasking, linkage to IBM sort, manipulation of
"HELP" libraries (PDS), dynamic allocation, RHB &
IRB routines.
• SORT/MERGE Program
• SAS (Statistical Analysis System) - SAS Institute support SAS only for
IBM machines.
*These present problems in going to a non-IBM machine (e.g., RTP Univac).
E-5
-------
Current parts of UPGRADE
which are specific to IBM TSO*
• CLISTS - Various CLISTS are used to allocate data sets, set terminal
characteristics etc.
• DAIR - Dynamic Allocation Interface Routine
• TSO MACROS - used in some ALC subroutines.
*These present difficulties in moving U/G to a non-TSO environment (e.g.,
COMNET), and even more in going to non-IBM.
E-6
-------
Current parts of UPGRADE
specific to NIH/DCRT
• RHB and IRB routines - ALC code written at NIH. A few are
available at COMNET
• IFF (Integrated Plotting Package) - used to produce graphs
• IPP-Tektronix Resident Processor - converts IPP neutral text to commands
for Tektronix graphics software .
• NIH/WYLBUR - used for setting up jobs auxiliary to
UPGRADE and maintaining file records,
etc. Other versions of WYLBUR exist at
some other computers.
E-7
-------
APPENDIX F
-------
APPENDIX F
POSSIBLE CORE SAVINGS IN UPGRADE
-------
APPENDIX F
POSSIBLE CORE SAVINGS IN UPGRADE
A number of the UPGRADE evaluators found themselves limited by present
internal storage constraints in the system in terms of the kinds and amounts
of data they need to handle.
A number of these constraints exist because of the manner in which UPGRADE
was developed. The system has only in the past year reached a level of software
stability permitting production packaging. It was developed in the context of
rapid proving of basic capabilities, followed by the rapid sequential add-on
features found needed by its initial CEQ user community. Rapid response to
frequent increases in requirements provides the developer with a severe problem
in software configuration control and system optimization. Software configura-
tion control has been achieved, with separated test and production versions of
UPGRADE, with a controlled and orderly procedure for building new features into
the system. With CEQ the only user, these controls had not previously been
required. Thus, at its present stage of development, UPGRADE can be improved
considerably in performance by retuning the internal design of the system.
As part of the analysis of UPGRADE capabilities, the overlay, file alloca-
tion, and coding conventions were reviewed in the context of the now larger set
of system requirements identified by EPA's potential user community. This
Appendix lists the results of the analysis, referencing the UPGRADE Version II
program listing. It is recommended that EPA support the review of the internal
software structure of UPGRADE, and its rebuilding along the lines suggested
by this Appendix, for subsequent production versions. The result will be an
increase in efficiency (and consequent reduction in cost) of UPGRADE processing,
and more importantly, increase in the size of the data base processable by
UPGRADE. Refer to Figures F-l and F-5 for a pictorial presentation of the
program structure.
F-l
-------
TOTAL
UPGRADE;
LENGTH <
247.816 I
LENGTH 136-080
OVERLAY A
OVERLAY C
OVERLAY D
LENGTH 8.224
S LENGTH 16.872
LENGTH 86.640
Figure F-l. Overlay Structure
-------
START
Move C.O. Save '1FC'
•n
1*1
Move to A.C.D. - Save '6FC'
Move C.O. -Save-11 A'
HPDCB
X~N
x^\
/^\
s~\
HPINIT
HPENO
HPLIST
TOTAL LENGTH HPACCS
136,080 (Decimal) PPPTEK
/-v PPPTK
Type (A) TK|N|T
* TKSIZE
TKCHR
NTKSYM
TGETER
GTALPH
(A) MAIN
HDCOPY
TABDSP
DIREC
ANALYS
SASDCB
(A) SASSTT
^ AiunnnnF
AOUTST
A10UT
DRWABS
HOME
IHOECOMH
V
;
IHOCOMH2
MOVABS
NEWPAG
IHOSATN2
IHOSSCN
CSIZE
IN ITT
RHB230
IHOSSQRT
IOW AIT
RHB240
PPPBUF
PPPOPN
PPPBFL
PPPSIZ
PPPDRW
SSPECS
PPPSPC
TOUTST
IRB229
IHOLDFIO
I
I
IHOLI02
RHB201
RHB213
RHB218
IHOFOXPI
IHOSLOG
APROB
IHOSEXP
IHOFRXPR
IHOFRXPI
ALFMOD
ANSTR
CWSEND
KAM2AS
KA12AS
TSEND
IHOFCVTH
IHOEFNTH
IHOEFIOS
V i
;
IHOFIOS2
IHOUOPT
VECMOD
XYCNVT
BUFFPK
IHOERRM
IHOUATBL
NEWLIN
RESET
RHB241
RHB242
SASA
TOUTPT
AOEOUT
CARTN
IHOFCONI
IHOFCONO
IHOETRCH
LINEF
PLTCHR
IHOFTEN
\ i
;
RASA
CONMON
ANSERS
TEXT
ANAL
STATS
NAMES
ACCESS
n W I* k «JW
STASEQ
PCOM
CNCHS
NCHSDA
SASCOM
IPSKAT
HAROCY
CACCS4
BARS
LIMITS
SVSTAT
CFILT
CCLIST
XTB
STASL
PARTTT
PLOT
VARSEL
CTRANS
ANOTA
TKTRNX
PPPZET
Figure F-2. Root Segment
-------
Length 8224 (Decimal)
r
1 1 1
ACCES4 ACNCHS IPSCAT
1 1
XPART SAINIT
SAENO
1 1
SAATCH ACCMAN
SAWAIT
SASDTH
SASOUT
CORESZ
SASINT
SASGET
KEYCHK
RHB206
1 1
FILTER DATLST
RRANGE
Figure F-3. Overlay A
-------
length 16.872 IDccinull
1
IVSAVE
IV01GP
IV03GP
IV02GP
IVINIT
IVCALC
1 1
XTABLM XTABGR
INTLIM
1
STNSL
1 1
STEPWS AUTO
1
GLM
1
FRED
1
OBCT
1
IVEW
EOF02
1
WKSORT
RHB233
1
VAHOSP
DISP1
BASIC
1
MAPN
STITQ
STITH
1 1
NCHSTA NCHSCO
1 1
IDBRO PLOTIT
1
BARPLT
BLOK
ABIOK
SHADEX
GLINE
SHAID
1
AXISPT
ANOTAT
1
AUTOST
1
PLTMOD
IPSU61
Figure F-4. Overlay C
-------
length 86640 (Decimal)
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
REGR STPMOD SORT SASMON XTABCT PLM001 PLM002 PLM003 PLMOD4 PLMOOS BARSHA OAT APT OATASH HCXCO HR6HT
POLY GLMMOO FVAR WINDO HXOUIV MCCSVH
POLSUM VARDEP ASMSRT FRANCE STMR01 IMXC1 SYHOFF
MINVRS VARIND BPAS1 IMXC2 UOVECH
MATMOL FRQMOD DATARD INXC3 OSPEGS
TRANS1 VAHTBL DYNAM HXTFRU
TRANS2 PPPEMO
BPROB SIGHOH
ZZOMES
FABLGX
FABLIX
„ AXPRPR
1 P5LGLI
"^ PSLILG
PSllll
PSLIPR
GOLGLG
GDLGLI
GDLILG
GOLILI
GDLIPR
PSLGLG
PSLGPR
PSPRLG
PSPRLI
PSPRPR
OLLILG
DLLILI
DLLIPR
GDLGPR
GDPRLG
GDPRLI
I
r *
GOPRPR
SLLIL6
SLLILI
SLLIPR
DLLGLG
DLLGLI
DLLGPR
DLPRIG
OLPRLI
SLLGLG
SLLGLI
PRSPEC
FABLGY
FABIIY
FABPRX
FABFHY
UOVPEN
SAXLGB
SAXLGL
SAXLIB
SAXLIL
SAXPRL
AXLILG
AXLILI
AXLIPR
NODLIB
MODLIL
NOLGB
NDLGL
MOPRB
NOPRL
SAXPRB
AXLGLG
(.
r ;
AXLGLI
AXLGPR
AXPRLG
AXPRLI
SLLGPR
SLPRLG
SLPRLI
DECVAL
DIVLG
DIVLI
DIVPR
OLPRPR
GTITLE
SLPRPR
TITLES
TITLET
BLGSCL
BLISCL
BPRSCL
DECBCD
DIGITS
TITLE L
AXSMRK
ORWDL
DRWSL
GRDURK
LGLG
LGLI
LGPR
LILG
LIU
LINLAB
LIPR
I
\
LOGLAB
MATCH
NCHARS
NLOGB
PRBLAB
PRLG
PRLI
PRPR
PSNLIN
SIMPNO
SLNLIN
TITLE
Figure F-5. Overlay D
-------
Possibilities for saving core in UPGRADE
I. Certain arrays could share core storage by use of equivalence. If data in
array "A" needed to be kept while array "B" used the same core space, array
"A" could be written onto disk to be save'd.
Representative Examples:
ACCMAN p. 40 may be STATS and NAMES commons
STNR01 p. 57 PEARZ (500) seems of little use - reordering
of the "STATS""arrays and taking first 50
entries - maybe shows "STATSVL .arrays are too big.
ACCSB2 p. 62 & 87-89 does sort using tore storage, could
use disk
FILTER p. 122 1 24 common SV STAT
HISTO (and others) p. 131 1 26 common XTB
IDBRD (and others) p. 139 1 22 common CNCHS
•
others exist
II. Some arrays are simply bigger than they need to be (space allowed for future
expansion, or for ease of programming). Reduce these to minimum.
Representative Examples;
Several arrays associated with data variables are dimensioned 100 (e.g.,
all those in NAMES and STATS commons - see ACCMAN p.40 for example).
I don't think all of these are now used - maybe 65 are used. At any
rate, number of variables could be < 100 and thus save core.
Representative Examples: (continued)
Several (most) of the large arrays in XTB common are 20% to 40% larger
than need be (unused space) - (//intervals = 60, 50 used) (# STATS = 10,
8 used)
- Others -
III. Some subroutines could be broken up and called selectively or sequentially
to reduce region sizes.
Representative Examples:
ACCMAN - is already broken up but is all in one region of the overlay
ACCES4
ACNCHS
F-7
-------
IP SCAT
SASINT
ACCMAN
PLTMOD
1PSUB1
XTABLM
INTLIM
XTABGR
STNSL
and most all others.
The first ones to be hacked up should be those which are making parti-
cular regions long.
IV. Some groups of subroutines (in one segment), and entry point areas, could
be moved to lower region, thus decreasing size of upper segment.
Of course this would increase the size of the lower region, unless
put in as separate entries in the overlay structure.
For examples, see list under Item III, as many of the small sections
that could be taken out of these segments could be moved (or would have
to be moved lower to retain logical calling sequence and proper overlay
structure.
V. Where possible, use I *2 in place of I *4 integer variables.
ACCMAN p. 40 1 25-26
IPSCAT p. 144 1 36
SAS1 p. 339 1 25-26
and most other areas where integer variables are actually used for
arithmetic, not text, arrays.
VI. Rewrite "ANSER" handling to use less space - perhaps with set of logical
variables held in common to TGETER
Or use computed GOTO
F-8
-------
Examples;
ACCMAN p.40 1 51-54
ACCMAN p.41 1 109-114
ACCMAN p.42 1 166-171
p.46 1 378-386; 1 394-401
others
•
•
AUTOST p.98 1 129-137, 147-154
and nearly every place that TGETER is used to bring in an
"ANSER" - most of these could be handled by returning an INTEGER
value corresponding to "YES," "NO," "HELP," etc. and using GOTO
(100, 200, 300, xxxx), INTEGER
VII. Put test of questions into partitioned data set library and call similarly
to "HELP" library.
Examples;
ACCMAN p.40 1 41-49
ACCMAN p.42 1 159-164
•
examples in nearly all subroutines
Also, nearly all places that a question prompt is given to user, core
would be saved by having the text of the question in a. -BDS and" using CALL
QUES (nn,m) to put the text on the screen.
VIII. Several routines have unchanging text stored in arrays - could be put
on disk data set, with keyed or direct access -to--bring inappropriate text
(or PO library)
IVINIT (ACCMAN) p. 50 1 617-632
ACCS4 p. 68 1 36-52
MAPN p. 170 1 39-56
F-9
-------
Several other subroutines have lesser amounts or text arrays could
be initialized by BLOCK DATA - would save core in some cases.
IX. Use LOGICAL *1 to replace L *4
Variables (a 75% core savings!)
ACCMAN p. 40 1 27
ACCS4 p. 68 1 54
1PSCAT p.145 1 61-63
XTABGR p.302 1 55-56
X. Reduce capabilities of UPGRADE (e.g., drop SAS, reduce // of variables
allowed, etc.)
SASINT and other SAS subroutines could be dropped, etc.
XI. Move certain subroutines out of root segment by making multiple copies of
appropriate lower region segments
DATCON
HDCOPY
TASDSP
NTKSYM
RHB routines
and many others could be determined by more analysis.
XII. Determine if any of the FORTRAN, IPP, or TEKTRONIX subroutines added to
root segment are never used (e.g., may be trigonometric functions).
Perhaps some type of program monitor could be used to determine which
subroutines are not used in a full exercise of UPGRADE.
The trigonometric functions for tangent, sin, cosine are in the root
segments, and I would bet are never used.
XIII. Replace HPLIST (7200 bytes of "HELP" xxxx number lists) with code to
generate needed number.
XIV. Some areas of the detailed code could be made smaller by writing more
complex (therefore harder to maintain or change) code.
¥-10
-------
Examples:
AXISPT p.107-109, 1 349-480
DATCON p.166, 1 299-353
NCXCO p.178-181
PLOTIT p.204-207
SORT p.271 1 80-95
SAS1 p.354 - duplicate code - could be subroutine.
also SASWT p.371-2
and probably many more areas.
XV. Conversion table for NCHS to VITRO mapping codes is-.now in. a common -
could be put on disk.
This conversion table is now about 25K bytes (INXC1 through INXC3). With
the current dynamic allocation, it could now easily be put on disk and
thus release this core.
F-ll
-------
APPENDIX G
-------
APPENDIX G
UPGRADE REPORTS
-------
APPENDIX G
UPGRADE REPORTS
The following is a list of EPA reports contained in this appendix.
1. Office of Research and Development, OPR
2. Office of Research and Development, OEMI/IERL/CINN
3. Office of Planning and Management
4. Statistics and Data Management Office
5. Epidemiology Branch, FSD, HERL
G-l
-------
U.S. ENVIRONMENTAL PflQTECTIOI AGENCY
OFFICE OF RESHflCH AND DWsLOP.Ic.17
MOHiTOaW& AND SUPPORT LABORATORY - LAS VESAS
P.O. BOX 15027. LAS VESAS. NeVAOA 59114 * 702/7:5^69 iFTS 555-£cS)
JUN 22 1978
Date
"?piy to
Attn ofc MSD
Subject Evaluation -of OTGSSDE-
To: Dennis A. Tirpak
Office of Planning and Review, BD-675
Per your telephone request, we have used the UPGRADE system to examine
certain data bases in order to become familiar with the usefulness of
the system.
Summarizing Our experience suggests the following:
1. UPGRADE is a good tool for displaying graphical functions and
for interacting, directly with data.
2. The system requires considerable additional software development
before it becomes a useful tool.
3. UPGRADE could ba developed into a useful management tool, partic-
ularly at the national level. It might also have usefulness as a research
screening technique.
4. Beyond the developmental phase, it will be necessary to maintain
a central computer staff for interaction between data bases and users.
5. The amount and kinds of data available on the system at the present
time are very limited.
Our experience shows the accessible data are basically trend data, i.e.,
yearly or monthly averages on the geographical scale of counties. Of
course, special data bases can be constructed and interfaced with the
system. We have, in fact, had the contractor generate such a data base
from air quality data. This was accocplished in a reasonable tine franse
as have our requests for interface of new data manipulation procedures.
However, note that we are not privy to information regarding the contractor
costs incurred as a result of these requests and, therefore, have no basis
for judging the cost effectiveness.
G-2
-------
Vhether UPGRADE should ba developed %r.t,o a ti.auagamer.t taal depends
to such questions a^»:
1. What organizations wouJJEi mak's. iise* of Chi^y'sCSrtl.t'in i/'r>at manner,
and how much use?
2. What is the investment required in order to stake ithe system usefsl?
3. What is the maintenance cost of the systen?
Obviously, these are interrelated questions, i.e., the required- investment
will be a function of who, for what, and how much usa? This, of course,
requires a detailed survey of the potential users and their needs. For
example, would individual Regions make use of raationvPida- take 'Into account potentia.i rtsl-:-= frc:?.
a variety of disease states when deciding' how' ouch "hardness, etc.* is healthful.
Also note the suggestion' that " the higher' tKs.BOD the be'ttcr tVie waiter may b«
G-3
-------
for human consumption. Reflection would suggest that BOD determinations
are carried out on water supplies beiora treatnent. High 300 requires
better or different treatment before being deemed satisfactory for
human consumption. Thus,, this particular finding could be indicating
not that high BOD is good, but that the better or different treatment
given high BOD waters does result in a more healthful water supply. In
any event, the foregoing is speculation*and as stated, merely points
out areas for further investigation.
TABLE 1
Number of Correlations with Various Disease State
Positive Negative Total
Water Quality Variables Correlations Correlations Correlations
Dissolved Oxygen 459
Hardness 279
Sulfate 246
BOD-5 Day 066
Chloride 325
Calcium 145
la point of fact, we cannot claim by such a preliminary examination that
the 921 correlations showing less than a 952 confidence level are unim-
portant. We have at this time no knowledge of the quality assurance of
the data used nor- knowledge of the concentration distribution of the water
quality data. These factors can have a decided effect on whether or not
potential correlations are indicated.
This effort required 120 man-hours of work of which one-third was required
to become familiar with the system. However, we understand that the con-
tractor is modifying the program so that this type of survey involving
approximately 1000 correlations could be accomplished in about 30 hours.
What this exercise did do is give us enough hands-on experience to be able
to forward our evaluations of UPGRADE as given in the opening paragraph.
This evaluation is admittedly from a restricted viewpoint, yet our experi-
ence indicates UPGRADE, in its present state, is not ready for general use.
Decisions regarding its further development should be based as indicated
on a survey of users and the development and maintenance costs.
Edward A. S chuck.
Acting Deputy Director
Monitoring Systems Research
and Development Division
3 Enclosures
cc: "w/o -enclosures
A. 'C. Trakowski, .RD-680
•H<. M. Bills, RD-680
G-4
-------
INDUSTRIAL ENVIRONMENTAL RESEARCH LABORATORY
UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
CINCINNATI, OHIO 45268
DATE- June 28, 1978
SUBJECT- UPGRADE Evaluation
FROM: David R. Watkins
OCPB
TO: Lance Mai lace
USEPA, ORD/OMTS
401 M Street, SW
Washington, DC 20460
I would like to thank you for the computer demonstration that
you conducted on 5/25/78 in Cincinnati. We have utilized some of the
health data that was generated during the demonstration.
The UPGRADE data was found to be of greater value than .that data
generated by a contractor of the'IERL-Ci. The reason for this.was the
more current mortality entries in UPGRADE. These results were compared
with those of a system which had been in operation for a longer period
and found to be of greater value fop our oarticu-lar case,
I found the UPGRADf.system very impressive and of obvious benefits.
However, we do not have the personnel nor Allotted time to conduct this
type of in-house searches. Contractors for lERL-Ci normally conduct
these types of investigations on their own systems or subcontracted
systems.
cc: E. E. Berkau
G-5
-------
23
8BBJEC?; UPGJVUDE Styt*
FROK: Mareia Villian, Chief
Statistical Evaluation Staff, IW-223
TO? Jjanee WaUae*,. Eovirpnnental Scientist
Monitoring Technilpgy filvision, RD-660
Hir staff and * have- r^vdav^ the UPGRADE system "Users Overview" docucent and
I v»ld like to pass along tvo coananta. First, ve think it vculd be useful
to expand and clarify the section of the manual dealing vith the description
of included data. In particular, it vould be useful to itemize each variable
vhlch la contained In. each data set-vlthin UPGRADE.
Second,- VB haye a nunfcer of auggestions about djata vhich ve v>uld lite to have
added to the UPGRADE ayeten* Sone of these data 'should be readily available
vhlle other data vlll need to be searched out:
1* Itanber of eaployeea by SIC codes by county (Ed Brooks did this for
'1959, 1967, 19T3);
2. Itan&er of etaployeee by SQC codes by county (this should be available
Iron the 1980 census) J-
3* Demographic- and cllaate' ajeaeurea by county (Ed Brooks did 30 such
measures);
4» Additional age adjusted abrtality 'ratea (Ed Brooks did 56 cause of
dealth categories and these could, be used to supplement existing categories);
5« Transportation related data by county such as nunber of vehicles,
nuaibcr of VMTS, number of ailes of roads (DOT hex this);
6* Stationary and aobile source emission data, by county, (tons per year)
from HED6,;
T. Certain census type information, by county, such as type of vater system,
type of , hoae beating eyaten, existence of hose A/C system, etc*;
Ac other thoughts, occur to me, I will pass them am. Call me if you vent tc
discuss further.
by: FM-223/MWalliaaa/a«/$/21/T3/1J72-5CAO
G-6
-------
UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
DATE. October 12, 1977
SUBJECT: The UPGRADE System.
FROM: William C. Nelson, Ph.D.
Chief, Statistics and Data Management Office
T0: Lance Wallace (RD-680)
Office of Monitoring and Technical Support
Thanks for sending us the draft User 's-OvsfvleW'' and otfier mateVla-ts
for the graphics and data base system UPGRADE and for the opportunity
to give you our comments.
The UPGRADE system obviously- has an elaborate display and analysis
capability. However, we do/hay?. sev*£fal,jreservatlons 'cfbottf Hs usefulness
to EPA in general and to HERk.r:RT{MrL* parti culajr*.
1 . Concern over conversioni.probTe
------- |