FEASIBILITY STUDY:  USE OF USER PROMPTED GRAPHICS

     DATA EVALUATION (UPGRADE) SYSTEM BY EPA
                  Prepared for:

         Environmental Protection Agency

                       and

  President's Council on Environmental Quality

                      under

                Contract EQ7AC021
                  Prepared by:

           Automation Industries, Inc.
           Vitro Laboratories Division
              14000 Georgia Avenue
         Silver Spring, Maryland  20910
                  November 1978

-------
                             FOREWORD
     The User Prompted GRAphic Data Evaluation (UPGRADE) System provides
an on-line, graphic, statistical analysis tool for a wide spectrum of
multi-media environmental data.  This feasibility study of its potential
use in EPA defines a user community, its experience in utilizing UPGRADE
to assist in data analysis, summarizes the findings and conclusions, and
recommends a specific course of action.

     The Management Information and Data Systems Division (MIDSD) has
reviewed the study in draft form, and has concurred with the recommen-
dation that UPGRADE be co-sponsored by EPA.  Furthermore, MIDSD
recommends that UPGRADE be used by the present set of interested users
on the NIH-DCRT computer complex, until EPA's requirements either exceed
the usage threshold at NIH or the graphics analysis support capabilities
and requirements evolve in other directions.  A review and update of this
feasibility study by August 1, 1979, to determine alternatives for use
of UPGRADE in FY80, was recommended.

-------
                         ACKNOWLEDGEMENTS
     Key contributors to this feasibility study include:  from EPA,
Dr. Lance Wallace, Dr. Wayne Ott, Dr. James Reisa, Mr. Elijah Poole,
and Mr. Charlie Poole; from CEQ, Dr. Douglas Buffington; from Vitro
Laboratories, Messrs. Mark Dorlester, Joseph Higgins, Hartley Holte,
Alan Rundquist, Gary Sitek, Char-1-es Wellander, and John Terebey; and,
Mr. Larry Milask and staff from Sigma Data Computing Corporation.

-------
                              TABLE OF CONTENTS
I.              MANAGEMENT SUMMARY                                       1-1

               A.   Background and Objective                            1-1
               B.   UPGRADE                                             1-1
               C.   Comparison to EPA Systems                           1-1
               D.   User Needs                                          1-2
               E.   Usage Summary                                       1-5
               F.   Alternative Solutions                               1-6
               G.   Evaluation of Alternative Solutions                 1-6
               H.   Recommendation                                      1-7
               I.   Project Plan Outline                                1-7

II.             NEEDS ANALYSIS AND EVALUATION CRITERIA                  II-l

               A.   Mandates                                           II-l
               B.   UPGRADE                                            II-3
               C.   Program and Computer Environments                  II-7
               D.   User Requirements                                  II-8
               E.   Ranked Major Outputs Required from UPGRADE         11-17
               F.   UPGRADE Acceptance Criteria                        11-17
               G.   Summary of Savings and Benefits                    11-20

III.           FEASIBLE ALTERNATIVE SOLUTIONS                         III-l

               A.   OPTION I - EPA Direction of UPGRADE               III-l
               B.   OPTION II - EPA Co-Sponsorship of UPGRADE         III-3
               C.   OPTION.Ill - EPA Limited Use of UPGRADE           III-3
               D.   OPTION IV - No EPA Involvement/Use of UPGRADE     III-3
               E.   UPGRADE Changes                                   III-3
               F.   Major Benefits                                    III-5
               G.   Functional Advantages and Disadvantages           III-5

IV.             EVALUATION OF ALTERNATIVE SOLUTIONS                     IV-1

               A.   Cost Assumptions                                   IV-1
               B.   One-Time Costs                                     IV-1
               C.   Recurring Costs                                    IV-3
               D.   Development Lead-Times                             IV-5
               E.   Cost/Benefit and Cash Flow Analysis                IV-6
               F.   System Acceptance Ratings                          IV-6

V.              RECOMMENDATIONS                                          V-l

               A.   Initial Recommendation                              V-l
               B.   Subsequent Recommendation                           V-l
               C.   Time Phasing of Enhancements                        V-l
               D.   Project Plan                                        V-l

-------
                      LIST OF APPENDICES


APPENDIX                                                  PAGE

   A           HISTORY OF UPGRADE                         A-l

   B           COMPARISON OF UPGRADE TO OTHER
                 EPA SYSTEMS                              B-l

   C           INTERAGENCY AGREEMENT BETWEEN
                 THE COUNCIL ON ENVIRONMENTAL
                 QUALITY CCEQ) AND THE ENVIRONMENTAL
                 PROTECTION AGENCY CEPA)                  C-l

   D           UPGRADE EVALUATION REPORTS                 D-l

   E           CONVERTING UPGRADE TO OTHER SYSTEMS        E-l

   F           POSSIBLE CORE SAVINGS IN UPGRADE           F-L

   G           UPGRADE REPORTS                            G-l
                             ii

-------
                               LIST OF FIGURES









NO.                                  TITLE                               PAGE




  1-1          SUMMARY                                                   1-4




 II-l          COMPARISON OF EXISTING EPA SYSTEMS                       II-6




 II-2          USER SUMMARY                                             II-9




 II-3          USER SUMMARY                                             11-10




 II-4          USER SUMMARY                                             11-11




 II-5          INTERFACING STORET DATA TO UPGRADE                       11-12




 II-6          INTERFACING STORET DATA TO UPGRADE                       11-13




 II-7          DATA BASE COMMENTS                                       11-15




 II-8          DATA BASE COMMENTS                                       11-16




 II-9          UPGRADE COMMENTS                                         11-18




 11-10         UPGRADE COMMENTS                                         11-19




III-l          UPGRADE CURRENT USES                                    III-7




III-2          UPGRADE FY 79 FUNDED USES                               III-8




III-3          UPGRADE PROJECTED USES                                  111-10




 IV-1          LIFE CYCLE ONE-TIME COSTS (IN $1,000)                    IV-4




 IV-2          DEVELOPMENT LEAD-TIMES                                   IV-7




 IV-3          DEVELOPMENT LEAD-TIMES                                   IV-8




 IV-4          DEVELOPMENT LEAD-TIMES                                   IV-9




 IV-5          OPTION I ($000) ALTERNATIVE 1 (NIH)                      IV-11




 IV-6          OPTION I ($000) ALTERNATIVE 2 (COMNET)                   IV-12




 IV-7          OPTION I ($000) ALTERNATIVE 3 (RTF)                      IV-13




 IV-8          OPTION I ($000) ALTERNATIVE 4 (GSA)                      IV-14




 IV-9          OPTION II ($000)                                         IV-15




 IV-10         UPGRADE ACCEPTANCE CRITERIA                              IV-20




  V-l          PROJECT PLAN ALL USER NEEDS PROGRAMMING                   V-3
                                       ill

-------
                        LIST OF FIGURES (Continued)









NO.                               TITLE                                   PAGE




B-l            TOTAL PHOSPHATES AT HURON RIVER STATION                    B-10




B-2            GRAPH BACKGROUND ELEMENTS                                  B-ll




B-3            GRAPH DATA ELEMENTS                                        B-12




B-4            ORIGINAL GRAPHS WITH TEXTUAL ANNOTATION                    B-13




B-5            EXAMPLES OF GRAPHICAL ANNOTATION                           B-14




B-6            POSSIBLE COMPOSITION LAYOUTS                               B-15




F-l            OVERLAY STRUCTURE                                          F-2




F-2            ROOT SEGMENT                                               F-3




F-3            OVERLAY A                                                  F-4




F-4            OVERLAY C                                                  F-5




F-5            OVERLAY D                                                  F-6
                                   iv

-------
SECTION I

-------
                            I.  MANAGEMENT SUMMARY

A.  Background and Objective

     Since its establishment in 1970, EPA has placed priority on the develop-
ment and acquisition of environmental data bases to provide a basis for
analysis of environmental status and trends.  Much data has thus been compiled
to date, and the current emphasis is to develop tools to support analysis of
these data.  In particular, tools are needed to support technical and policy-
level analyses and involving multi-media (e.g., air vs. specific health effects)
Such tools must meet a wide range of criteria - from management access and
confidence to rapid response and cost-effectiveness to proven analytical
accuracy - if their application to EPA needs is to be both successful and
practical.

     The objective of this study is to determine the feasibility of adapting
CEQ's User Prompted GRAphic Data Evaluation (UPGRADE) system for EPA use to
satisfy these needs.  UPGRADE has been successfully used by CEQ since 1975,
and has been adapted for use in support of environmental analysis needs of
other federal and state agencies.  An Interagency Agreement between EPA and
CEQ provides specific guidelines for this study (see Appendix C).

B.  UPGRADE

     UPGRADE (User Prompted GRAphic Data Evaluation) is a versatile system for
analyzing computer stored data on the environment, natural resources, public
health, and related topics.  Employing ordinary English language instructions,
step-by-step analysis, and graphic display, UPGRADE is designed for effective,
immediate use by managers and scientists with little or no computer training.
UPGRADE attempts to interactively guide and prompt the user by English lan-
guage instructions and responses through all phases of data selection, proces-
sing, and display.  Most UPGRADE users have been able to prepare their own
"presentation-ready" copy of analysis results after less than an hour's demon-
stration of the system.  UPGRADE is presently being used for environmental and
health studies at CEQ and has been planned for future use by other Federal,
State, academic, and private organizations.

C.  Comparison to EPA Systems

     The major EPA data gathering and data base systems consist of SAROAD and
STORET.  EPA is also planning to install an interactive graphic system called
ADROIT to work in conjunction with the STORET data base.  A detailed compari-
son of these systems with UPGRADE revealed the following:

     SAROAD - A batch-data entry, storage, and retrieval system for air data

     STORET - A batch system with some graphics and analysis capabilities, for
           water data

     ADROIT - An interactive graphic system geared for water data that could
           add other media data for cross-correlation.  It has no mapping
           capability and is more oriented to a computer trained user.  This
           system is presently being rewritten so that it can run on the
           COMNET computer.
                                     1-1

-------
     UPGRADE - An interactive graphic system designed for cross-correlation of
           various media data.   It has mapping capabilities and is geared to
           the non-computer trained EPA researcher.

     UPGRADE was found to provide the most capability and required the least
amount of training for the non-computer trained EPA researcher.  However,
UPGRADE would not replace these existing systems;  it would supplement EPA's
analysis capabilities by adding the ability to correlate environmental data to
health data (i.e., multi-media capability).  Cost-effectiveness is highly
favorable with respect to alternatives.

D.  User Needs

     EPA's evaluation of UPGRADE was actively conducted by six individual
users.  Two other potential users have identified  requirements and another is
using UPGRADE on a trial basis.  One of the six evaluators is now actively
using UPGRADE under a separate IAG with CEQ.  Each evaluation report has been
included in Appendix D.  Other EPA evaluators have either expressed no inter-
est or their evaluations were not sufficiently detailed to permit quantitative
analysis.  These comments are included in Appendix G.  The evaluations brought
a variety of results, notably that UPGRADE increases the user's analysis
capabilities at equal or significantly reduced cost.  However, each user has
indicated a need or desire for some form of improvement or enhancement to the
UPGRADE system as a whole.  These requirements were categorized as:

     Essential - User cannot effectively use UPGRADE without the capability

     Necessary - Would make the use of UPGRADE easier, should be incorporated
                 as money and priority allows

     Desirable - Nice to have, would enhance the system
                                   1-2

-------
     Those requirements termed "Essential" are listed below, along with re-
questing EPA office, and a CEQ User Support Group time and cost estimate:

                                                         CEQ Estimate
Requirements
Data compression or
restructured Data Base
Refine Terse Mode
Internal Data
Management System
Limited Batch Mode
Moving Means
Multiplot
All Essential requirements
other sponsors of UPGRADE
months .
Requester
OAQPS/RTP and Region X
ORD/OMTS/Las Vegas
ORD/OMTS/Las Vegas and
Region X
ORD/HQ
Region X
Region III and ORD/HQ
TOTAL
Cost Time
$ 7,500 4-6 weeks
10,000 5-8 weeks
5,000 4-5 weeks
Available in FY 79
10,000 8 weeks
10,000 8 weeks
$42,500
except "Moving Means" have already been funded by
and are scheduled for implementation in the next few
Those requirements termed "Necessary" are:
Requirements
Improved STORET
Interface
Improved SAROAD
Interface
Data Extraction from
Tape Storage
Super Terse Mode
(Full Batch Capabilities)
Add User Analysis
Requester
ORD/OMTS/Las Vegas and
Region X
OAQPS/RTP and Region X
and III
ORD/HQ
ORD/OMTS/Las Vegas and
OAQPS/RTP, and ORD/HQ
ORD/OMTS/Las Vegas and
CEQ Estimate Yr.
Cost Time Planned
$15,000 12 weeks 1
15,000 12 weeks 1
3,000 2 weeks 1
35,000 24 weeks 2
9,000 4 weeks 2
 Routines
3D Plots
 OAQPS/RTP

Region III
Evaluate Available
 Packages
                                     1-3

-------
 SUMMARY
wr
System Acceptance Rating
(0-600)
Total Present Value Net Savings
($000) (LIFE CYCLE)
Total Present Value Costs
($000) (LIFE CYCLE)
RECOMMENDATION
OPTIONS
I
EPA DIRECTION OF UPGRADE
ALTERNATIVES
1
NIH
417
210.8
2381.3

2
COMNET
373
(2040.5)
4632.6

3
RTP
315
(1455.7)
4047.8

4
GSA
429
(1439.7)
4031.8

II
EPA CO-
SPONSORSHIE
OF UPGRADE
505
1099.7
1492.5
X
III
EPA LIMITED
USE OF
UPGRADE
385
Not Applicable
User Dependent
(30.8/User/Yr)

IV
NO EPA IN-
VOLVEMENT/USE
OF UPGRADE
80
Not Applicable
Not Applicable

Figure 1-1

-------
   Requirement

Increase Plot Size

Add User Defined
 Models

Contour Plots

Data Save
 Capabilities
        Requester

OAQPS/RTP

ORD/OMTS/Las Vegas


Region 111

ORD/HQ and
 ORD/OMTS/Las Vegas and
 Region III
              CEQ Estimate
            Cost       Time
            7,500

           20,000
 6 weeks

16 weeks
  Yr.
Planned

   1

   2
            Available in FY79

           30,000    24 weeks
                        TOTALS  Yr
                                Yr
          1
          2
$ 70,500
  64,000
                         TOTAL            $134,500

    Those requirements termed "Desirable" are:
   Requirement

Overlying Points on
 Graph

Improved Alphanumeric
 Axis Description

Interactive SPSS
E.  Usage Summary
        Requester

         ORD/HQ


         ORD/HQ


         ORD/HQ
CEQ
Cost
$10,000
5,000
15,000
Estimate
Time
8 weeks
4 weeks
12 weeks
Yr.
Planned
3
3
3
                                            TOTAL   $30,000
     UPGRADE, being an operational system, is being used now by the CEQ, DOE,
and EPA to correlate aspects of water and air quality to mortality rates, to
produce demographic maps, and to accomplish other related studies.  During FY
79, the planned and funded UPGRADE uses include regional Health and Energy
facility siting studies, CEQ air quality emission data analysis, USGS NASQAN
biological data analysis, and EPA/OTS mortality studies.

     The addition of more and varied data for prospective users would invite
many more studies and outputs.  Typical data additions would include: Ambient
Air, Multiple causes of death, Morbidity, Drinking water, Altitude, Phytoplank-
ton, Hanes II, and Wisconsin's Industrial Effluent data.  These data added to
other data accessible by UPGRADE could produce:  maps on air quality for dif-
ferent pollutants and aggregations, variations in contributing causes of death
with water and air quality, and morbidity; studies on drinking water versus
cancer, altitude versus heart disease, and phytoplankton distributions and
indices of trophic state; analysis of CO exposure in the blood and CO levels
in breath samples; and correlations of source and ambient water data in the
                                    1-5

-------
Wisconsin area.  Cost estimates have been made for the first three of the
above data additions.  The cost averages $55,000 per data type.

F.  Alternative Solutions

     There are four options for EPA related to the use of the UPGRADE system.
They are:

     1.  OPTION I - EPA Direction of UPGRADE

           EPA would be the main UPGRADE sponsor or would take a version of
UPGRADE for its own use.  For this option, EPA requires a minimal User Support
Staff of six.   There are hardward alternatives associated with this option.
They are:  to remain on the NIH-DCRT computer; move to the EPA-COMNET computer
(requiring 20% reprogramming of UPGRADE); move to the EPA-RTP computer
(requiring 80% reprogramming of UPGRADE); or move to a commercial computer
(requiring no reprogramming).  This option would best serve more than 15
dedicated EPA users.

     2.  'OPTION II'- EPA Co-Sponsorship of UPGRADE

           EPA would co-sponsor UPGRADE at a level that would entitle EPA to
heavily influence the development to meet EPA requirements.  For this option,
EPA requires a dedicated UPGRADE coordinator and CEQ liaison attuned to EPA
requirements.   The system would continue to run at NIH-DCRT.  This option
would best serve a dedicated EPA user base of 5 to 15.

     3.  OPTION III - EPA Limited Use of UPGRADE

           EPA would be a user of UPGRADE, only able to buy certain "goods"
from available stock, but not able to influence the development of UPGRADE.
This option would best serve a user base of 1 to 4.

     4.  OPTION IV - No EPA Involvement/Use of UPGRADE

           EPA should exercise this option if no users are found for UPGRADE.

G.  Evaluation of Alternative Solutions

     Each option for satisfying EPA's requirements was evaluated on a cost
basis over a 5-year life cycle and against a set of 10 system acceptance
criteria.  To perform the cost evaluation, assumptions were made from discus-
sions with EPA evaluators and the tests performed by the CEQ User Support
Group.  The assumptions concerned percent of system use, average computer/an-
alysis time, average data requirements and outputs.  Estimates were provided
by the CEQ User Support Group for system enhancements and conversions.  The
costs are summarized by non-Recurring, Recurring and Net Cost Savings.  The
quantification of benefits was difficult because a number of studies that were
performed as part of the individual EPA evaluations would not have been attemp-
ted without UPGRADE.  This is due to the separate physical locations of the
data and the laborious process to read listings, extract values, perform
statistical analysis and plot the results.  This intangible benefit has been
given a nominal cost value for comparison purposes in the Cash Flow Analysis.
The anticipated dedicated EPA user base was set at "9" based on the Evaluation
                                     1-6

-------
reports.  More less-dedicated users or less more-dedicated users will not
affect the cost analysis.  The resulting total life cycle costs and savings
are summarized on Figure 1-1.  As can be seen, Option 11 provides the greatest
Net Cost Savings for the given user base.

     To generate a System Acceptance Rating, each criteria was defined and
weighted according to relative importance to EPA as a whole using the UPGRADE
system.  Each option was then rated as to overall satisfaction of the criteria,
taking into account functional advantages and disadvantages, and the incor-
poration of user defined requirements.  Figure 1-1 shows that Option II has
the highest System Acceptance Rating.  The sensitivity of the rating is great-
est for the number of estimated EPA users, followed by Response Time Benefits
and Analysis Benefits.

H.  Recommendation

     Based upon the foregoing analysis, the most cost effective solution for
implementation of the UPGRADE system is Option II.  It has the highest System
Acceptance Rating and the greatest Net Cost Savings for the presently anti-
cipated volume of users.  Its cost savings exceed other options by at least
$350,000 over the five-year life cycle.  Other options may be selected at EPA
descretion should the number of users significantly change.  EPA, as a co-
sponsor of UPGRADE, may wish to foster a maintenance program as defined in
Appendix F, software configuration control, and exert influence to accomplish
EPA user requirements termed "Necessary" and "Desirable" in accordance with
the schedule in Section IV.  In the event that UPGRADE must be moved off the
NIH-DCRT computer, EPA may wish to consider the alternate site analysis of
Option I and locate UPGRADE on a commercial service bureau such as the GSA-
Boeing computer site.  A version may be retained on the NIH-DCRT computer as a
development system because of its relatively low cost.

I.  Project Plan Outline

     •   Designate EPA Coordinator
     •   Put Initial Users on the System
     •   CEQ User Support Group accomplish Program Modifications
     •   EPA/CEQ prepare detail schedule for programming "Necessary"
          and "Desirable" System Enhancements
     •   Write IAG for execution
                                   1-7

-------
SECTION II

-------
                II.  NEEDS ANALYSIS AND EVALUATION CRITERIA
A.   MANDATES

     The logical follow-up to the National Environmental Policy Act (NEPA)
was the creation of a single agency to oversee the mandates of the Act.
Thus, the Environmental Protection Agency was established on December 2,
1970, through an executive reorganization plan, to consolidate certain
federal environmental activities set forth by NEPA.  Since its inception,
EPA's mission has been defined, and re-defined, through enactment of the
following laws:

     •  The Clean Air Act Amendments of August 1977
     •  The Federal Water Pollution Control Act Amendments of 1972
     •  The Federal Insecticide, Fungicide, and Rodenticide Act
        Amendments of 1972
     •  The Marine Protection, Research and Sanctuaries Act of 1972
        ("Ocean Dumping")
     •  The Noise Control Act of 1972
     •  The Toxic Substances Control Act of 1977
     •  The Resource Conservation and Recovery Act of 1977
     •  The Safe Drinking Water Act Amendments of November 1977, and
     •  The Clean Water Act Amendments of 1977

     EPA's priority mandate for the first years of operation has been the
development and acquisition of environmental data bases to provide a base for
analysis of the environmental characteristics and trends.  Tools to develop
analyses of these data, especially in areas of cross-correlation, such as
ambient environmental quality data with health effects data, have received
secondary attention.  The need to examine many environmental variables to
identify potential cause-effect relationships is increasing.

     The National Academy of Sciences has reported that most EPA environ-
mental data are not analyzed adequately and, when published, contain no
interpretation of the data.  It further states that although EPA maintains
large environmental data banks, "Little effort has been made to relate these
data to data on health and ecological effects although opportunities exist
to do so, for example, with the following:  data on morbidity and mortality
from the National Center for Health Statistics (NCHS); data on cancer from
the National Cancer Institute (NCI); ...."

     Several additional reports have been written by the General Accounting
Office criticizing EPA's performance in and response to specific program
areas.  Typical critiques include:

     •  AIR

        "Federal Programs for Research on the Effects of Air Pollutants"

        "Pollution from Cars on the Road-Problems in Monitoring Emission
        Controls"
                                     II-l

-------
     •  WATER

        "Implementing the National Water Pollution Control Permit Program:
        Progress and Problems"

        "Better Data Collection and Planning Is  Needed  to Justify Advanced
        Waste Treatment Construction"

        "Problems and Progress in Regulating Ocean Dumping of Sewage
        Sludge and Industrial Wastes"

        "Continuing Need for Improved  Operation  and Maintenance of Municipal
        Waste Treatment Plants"

        "National Water Quality Goals  Cannot Be  Attained Without More
        Attention to Pollution from Diffused or  'Nonpoint' Sources"

     •  RADIATION

        "The Environmental Protection  Agency Needs Congressional Guidance
        and Support to Guard the Public in a Period of  Radiation Prolifer-
        ation"

        "Efforts by the Environmental  Protection Agency to Protect the
        Public from Environmental Nonionizing Radiation Exposures."

     •  PESTICIDES

        "Federal Pesticide Registration Program:  Is It Protecting the
        Public and the Environment Adequately from Pesticide Hazards?"

        "Special Pesticide Registration by the Environmental Protection
        Agency Should Be Improved"

     •  NOISE

        "Noise Pollution - Federal Program to Control It Has Been Slow and
        Ineffective"

     •  GENERAL

        "Environmental Protection Issues Facing  the Nation"

        "GAO Reviews of Federal Environmental Research  and Development"


     The addition of appropriate data  from these program areas to UPGRADE
could Increase the quantity and meaningfulness of data  analysis that could
be performed by EPA's research staff by reducing the effort presently re-
quired to locate, peruse, and identify significant interactions, and lucidly
display the results.

     The National Academy of Sciences, in the previously mentioned report,
has cited CEQ's use of UPGRADE as an effective method of responding to these
cross media analysis requirements.


                                    II-2

-------
     A need for rapid Investigation of available monitoring data on any
given pollutant has also persisted.  High interest in a given pollutant by
Congress and the public and other requests for fast turn-around analyses
and maps, charts, graphics and other information aids will predictably
continue in the future.

     An on-line interactive system with user prompting and rapid graphics
output would support these requirements.


B.   UPGRADE

     1.  Definition

     The CEQ undertook the responsibility of designing a new, versatile
computer system which would analyze computerized data relating to the environ-
ment, natural resources, public health, and related topics.  Their end-product
is UPGRADE - User Prompted GRAphic Data Evaluation.  One of the prime criteria
in the design of UPGRADE was flexibility.  The system can be used with data
from any field of scientific endeavor that requires graphic display and/or
statistical analysis.  The system was developed to:

     •  provide easier access to computerized environmental data

     •  make these data available to a larger portion of the nation's
        scientists and managers

     •  facilitate more efficient and convenient environmental assessments

     •  foster increased uses for available environmental data

     •  provide better capabilities for identifying correlations between
        factors represented in different computerized data (e.g., air
        quality and health), and

     •  improve environmental research and data collection programs through
        the insights and feedback provided by users of the UPGRADE system.


     Although UPGRADE is designed to run on an IBM 360/370 computer equipped
with TSO CIBM's acronym for "Time Sharing Option"), other computer manufacturers
also have systems which perform the same functions as TSO, and UPGRADE can be
adapted to run under these systems.

     The data presently available on the UPGRADE system have been selected from
several major Federal data banks which contain information relating to the
environment, natural resources, public health, and related topics.

     The analytical and graphic features presently available on UPGRADE include
basic statistical summaries, data sorting and ranking, data transformations,
data partitioning, scatter plots, least squares regression with Nth-order
polynominal curve-fitting, polygon (point-to-point) plots, shaded bar graphs,
and shaded maps.
                                      II-3

-------
     UPGRADE also provides extensive plot-modification capabilities.  The
user can interactively tailor any graphic display by altering the statistics
plotted, the axis scales or annotations, the plotted symbols, the ranges
and levels of shading, or a variety of other specifications.  These features
were developed because it is not always possible to anticipate at the outset
what format or results will be most informative, and varying analytical
demands required flexible capabilities for visual presentation.

     The graphic display and statistical results of UPGRADE analysis can be
produced immediately on an appropriate remote terminal video screen (with
optional hard copy capability) or a standard typewriter terminal.  Automatic
sequencing options are also available, permitting the user to specify that
similar graphs be made for a selected series of variables or data sets.

     UPGRADE data bases are carefully screened for adequacy of the variables
measured, measurement frequency, period of record, geographic location,
quality assurance, and other criteria, depending on the type and intended
use of each data.  Thus far, most of UPGRADE'S data have been selected from
data banks maintained by the Environmental Protection Agency (primarily
SAROAD and STORET), the National Institutes of Health, the U.S. Geological
Survey, and various state agencies.

     The designers of UPGRADE envision a community of users from a variety
of disciplines representing Federal Government, state and local agencies,
academic institutions, and private interests.  As more data sets are inte-
grated into the system, they can be made available to all the users.


     2.   Use

     CEQ and collaborating Federal agencies have been screening and selecting
data for use in UPGRADE since 1975.  Since that time, UPGRADE has been used
to support:

     •  CEQ Annual Reports

     •  Other CEQ analysis for reports to the White House and/or OMB

     •  State of New Jersey environmental analysis

     •  CEQ reports supplemented to the Annual Report

     •  Department of Energy Atlas

     •  Mapping of NCHS cancer death rate data


     Specific examples of its use are contained in Appendix A.


     3.   EPA Systems vs. UPGRADE Capabilities

     In terms of EPA requirements for increased analytical capabilities,
existing EPA systems and a proposed EPA system were compared to UPGRADE.
                                      II-4

-------
Figure II-l lists the requirements and the relative ability of each system to
satisfy those requirements.  A detailed comparison of each system's capabil-
ities is contained in Appendix B.

     The salient points of the comparison are that SAROAD is basically a
batch-data entry, storage, and retrieval system for air data.  It has some
batch statistical capabilities.  STORE! is a batch data entry, storage and
retrieval system for water data.  It has some graphics and some batch analysis
capabilities.  ADROIT is an interactive graphic system currently being re-
programmed by an EPA contractor to operate on the EPA-COMNET computer
primarily with water data.  It has the ability to correlate water quality
data with other data.  However, there are no computer run time statistics
available at this time for comparison, and it is geared for a computer-
trained user.   UPGRADE has the capability to correlate data from various
media (including STORET and SAROAD), is fully interactive, has mapping capa-
bilities and is geared for the non-computer trained user.  UPGRADE is
operational now, accessible to EPA users and contains a significant EPA-
applicable data base.  UPGRADE does not supplant existing systems; it adds
more analysis capability to EPA researchers.  STORET and SAROAD are EPA water
and air data bases.  ADROIT adds water analysis capabilities and UPGRADE adds
the capability to correlate environmental data with health information.

     There are two other interactive graphics systems, one by CALCOMP and the
other by Tektronix.  These are EPA systems and are being compared to UPGRADE
by Region III personnel.  Judged on the graphic features alone, the CALCOMP
system was rated higher, with UPGRADE ranked second.  However, the results
indicate that UPGRADE is easier to use and costs about one-half of the other
systems.  The evaluation report generated by Region III is included in
Appendix D.

     An Interagency Agreement was signed by CEQ and EPA to analyze UPGRADE
capabilities and its ability to satisfy EPA needs.  The remainder of this
study, therefore, is a cost benefit analysis on the use of UPGRADE.


     4.   Interagency Agreement

     Executed in September 1977, the Interagency Agreement between the
Council on Environmental Quality and the Environmental Protection Agency
(See Appendix C) provides that EPA will join CEQ and DOE as a co-sponsor in
the development of the UPGRADE system, sharing all rights to the use and
sponsorship of the system.

     The LAG prescribes five tasks:

          a.  System Installation Survey

              Presently, UPGRADE is supported by the NIH  Computer Center.
              Identify the possible EPA installations which might support
              UPGRADE and determine the effects.  These include:

              •  Government  IBM TSO environment

              •  Government  UNIVAC 100  (RTF) environment
                                     II-5

-------
Requirements
Data
Access
For:



Air Quail LV
U.icor Quality
Demographic
Health
Other Envir.
User's Daca
UVIA Ll.sLlin-
D.it.i Manipulation
Basic Statistics
Polygon Plots
Bur Charts
Regress! oa
I'crcoiul hjs
Moping
Co isolation
Interactive vs. Batch
User background rcq.
Date of information
STORET
NO
Main EPA UB
NO
NO
NO
ONLY IF IN STORET
YFS. BATCH JOB
FILTERING
BATCH JOB
STRAIGHT I INC
NO
REG PROGRAM
l>rGKCE=l
STAND program
LOG program
RUG program
Batch
SOME TEXT EDITOR &
KOLI.OW MANUAL
June '77
ADROIT
NO. bur could
SET UP FOR IMMEDIATE
COULD BE ADDED
COULD BE ADDED
COULD BE ADDED
YES, THROUGH MTS
TO TERMINAL OR OTHER
I/O DFVKL"
FULL TRANSFORM
CAPABILITY
YES; PLUS AGGREGATION
AND OTHER STATISTICS
STRAIGHT OR HASH
SHADED WIDTH & NUMBER
CONTROL
DrfiREE=l to 9
FULL USER PROGRAMMED
NO
NO as standard
CAN BL USER PROGKAMMED
INTERACTIVE
FORTRAN or BASIC helpful
not req'd; learn manual
June '75*
SAROAD
Main EPA nfl
NO
NO
NO
NO
ONLY IF IN SAROAD
YES, BATCH JOB
FILTERING
BATCH JOB
NO
NO
NO

NO

Bn f rh
•Some text editor &
follow manual
'71, '73 '74
UPGRADE
liY TAPE KltOri SAMOA!)
BY TAPE FROM STORET
VERY LI1TLL
BY TAPE FROM NCIIS
SOME: oil spill, etc.
YES, CFNERAL INTERFACE
TO TER1I1NAL OR DISK
FILTERING : TRANSFORM PLANNED
YES; PLUS SAS
STRAIGHT OR DASH
SHADED
NUMDFR CONTROL
DFRRFF-1 r-n fi
PARTITIONING SUItROHTTNES
NASQAN, COUNTY,
AS PART OF REGRESSION
TNTFRAPTTVr
No computer's data analysis

-------
               •  COMNET/ALPHA environment

               •  Commercial IBM TSO environment

           b.   User Needs Survey

               This task has two elements.  The first is the identification
               of offices which have a high and immediate interest in UPGRADE,
               and identification of their requirements not directly met by
               UPGRADE.   The second element is the survey of potential users
               to determine overall EPA requirements for the system.  This
               effort was redirected towards working with UPGRADE evaluators
               in key offices.  The EPA Project Officer determined that EPA
               Staff Coordinators in each major office would be used to
               survey the technical staffs, on an informal basis, to identify
               potential users.  As the basis for evaluation, they would also
               support these users in the practical use of UPGRADE in solving
               a current problem.  This provides a more effective base than
               the approach originally envisioned.

           c.   System Design Analysis

               The user requirements will be analysed in terms of the imple-
               mentation times and costs to permit evaluation of their cost
               effectiveness by EPA management.  A program plan for the added
               software requirements will be prepared so as to show the time
               frames in which they could be expected to be added to the
               production system, adding that factor for management consider-
               ation in setting priorities.

           d.   System Management Requirements

               The CEQ User Support Staff is developing a production soft-
               ware configuration control system to assure the orderly
               meeting of the many user community requirements for software
               extensions to the system without degrading current production
               system performance by the errors commonly introduced with
               software modification/extension.  The specification for
               software changes to meet (1) new user requirements and (2)
               transport UPGRADE to a new computer environment if necessary
               will be designed to keep the control and maintenance of
               UPGRADE software at a production level.

           e.   Documentation

               The CEQ User Support Group will prepare a user manual specifi-
               cally addressing the requirements of the EPA user community.

C.   PROGRAM AND COMPUTER ENVIRONMENTS

     The potential users of UPGRADE in the EPA are non-computer trained
scientists and analysts doing environmental research, problem solving, air
and water quality reports, trend analysis and correlation studies.  Since
                                     II-7

-------
all potential users of this system cannot conduct an individual evaluation,
there are designated coordinators and potential users doing an active evalu-
ation of UPGRADE while others have indicated their intent to use the system
if acquired.  Six active evaluations have been conducted by:  the Office of
Air and Waste Management (OAWM) (1); the Office of Research and Development
(ORD) (2); the Office of Toxic Substances (OTS) (1); Region III (1), and
Region X (1).  Evaluations are also under way in the Office of Planning and
Management (0PM) (1) and the ORD (1).  ORD also has another potential user
with defined requirements.  Each evaluation report is included in Appendix D.
Other EPA evaluations have either expressed no interest or their evaluations
were not detailed enough to be included in this analysis.  These comments
are included in Appendix G.  Additional statistics were gathered via per-
sonal communications, but some evaluators/users were unable to quantify
their evaluation.  A summary of each user's evaluation (use) objective,
current capabilities, data volumes and costs (if available) is contained in
Figures II - 2, 3 and 4.  There are a variety of results, notably that UPGRADE
increases the user's analysis capabilities at equal or significantly reduced
cost.  However, each user has indicated a need for some form of improvement
or enhancement to the UPGRADE system as a whole.  These are addressed in the
remaining paragraphs of this section and are addressed as Data Base require-
ments and UPGRADE requirements.

D.   USER REQUIREMENTS

     1.  Current Data Base Capabilities

     Data available to UPGRADE users resides in the Integrated Data Base (IDB)
or UPGRADE Data Base (DB).

     This currently consists of:

           - County Level Health Data
           - County Level Drinking Water Data
           - County Level Demographic Data
           - General Water Quality Data
           - Aquatic Pesticide Data
           - National Stream Quality Accounting Network
           - General Air Quality Data (limited user-defined subset)

Programs currently exist to extract user designated data from EPA's Water
Quality Data Base (STORET) and Air Quality Data Base (SAROAD) residing in
the computers at COMNET and Research Triangle Park (RTP).  Once extracted,
these data sets can be loaded on the computer for access by UPGRADE.  There
is also the capability of adding user-supplied data sets that can be kept
current or modified as desired.  However, the additions to the IDB require
the intervention of the CEQ UPGRADE User Support Group and sometimes can re-
quire as long as 1-2 weeks for data extraction, format, and load.  The inter-
face processes are shown in chart form on Figures II-5 and 6.


     2.  Additional Data Base Requirements

     The evaluation reports have indicated several areas of concern by
potential users in regard to data for UPGRADE.  The primary area of concern
for users of air data is the data base structure which severely limits the

                                    II-8

-------
       USER SUMMARY
CATEGORIES
Program objectives and prob-
lem definition
Current support systems and
data bases
Interfacing activities re-
quired to reach UPGRADE
Data volumes and special
handling requirements
Current coats vs. UPGRADE
coses
Deficiencies/strengths In
current systems
Projection of future needs
expannlon
OAWM/OAQPS
RTF
Reduction of Air Quality
Data for Trend Analysis
(CEQ Annual Report)
SARD AD
Data extraction and Listing
Extract SAROAD data and
load on UPGRADE. FTS
call. MODEM and terminal
1 yr hourly data - 8,760 pti
required data extraction
and loading on UPGRADE by
CEQ User Support Croup
Mould be comparable for
this application but did
not make use of Inter-
active capability
Current system has no In-
teractive graphic
capabilities or on-line
analysis capabilities
UPGRADE Data Base Improve-
ments and expanded plot
capabilities
REGION
X
Graphs for Regional Profiles
Trend analysis and summary
reporting of Air data
Current Regional SAROAD Data
Ixical mini-computer programs
No Interactive Graphics
Extract SAROAD data and load
on UPGRADE. FTS call,
MODEM and terminal
Not available
No current data on SAROAD
Not available
Current system does not have
programs comparable to
UPGRADE, but can be added.
Region Data Base Is more
current than SAROAD
Add and maintain current
Region data In custom data
set
ORD/OMTS
Las Vegas Lab
Correlate Phytoplankton data
with Water Quality data
Waiting for UPGRADE
Load data on UPGRADE.
Require access to terminal
and MODEM
2,400 lines of STORET data
70.000 lines of Phytoplan-
ton data. Require CEQ
User Support Group to load
data
Potential User
Potential Vast

ORD/OMTS
Las Vegas Lab
Correlate mortality rates
with drinking water con-
stituents
Unknown
FTS call, MODEM and terminal
250 counties, 40 Water
Quality and 20 Disease
variables per county, about
SDK points
3000 hrs versus 120 hours
(o Graphics
JPCRADE Data Base Improve-
ments and additional
analysis routines
Figure II-2

-------
USER SUMMARY
CATEGORIES
Program objectives and prob-
lem definitions
Current support systems und
data bases
Interfacing activities re-
quired to reach UPGRADE
Data volumes and special
hand] inn requirements
Current costs vs. UPGRADE
coats
Deficiencies/strengths in
current systems
Projection of future needs
expansion
ORD/OHEE/HERL
CINN
Correlate cardiovascular
mortality to water hard-
ness
STORET
FTS call, MODEM, and term-
inal
5,000 records, requires CEQ
User Support Croup to
load data
Evaluation In process.
Current system would be
10 times expected coats
No Graphic analysis and
plotting capabilities
Unknown
REGION
III

versus time
(monthly readings)
CALCOMP and Interactive
Graphing Package (IGF)
(Tektronix) STORET
FTS call
12 points on plot
1 parameter. Required
CEQ User Support Croup to
load data S(10n rornrHa
CALCOHP-$17.28 machine +
3.75 Programmer
IGP -$11.47
UPGRADE-} 6.52
UPGRADE - Easier to use
- Output looks good
Environmental Profiles
Public Info. Hedlth Related
Effects 208 Planning
OTS

mortality. Trend analysis.
Quick Response analysis
Access to STORET and SAROAD
files
FTS call, MODEM and terminal
Depends on study if not in
the IDE, extract from
STORET
Some studies could not be
accomplished, others would
cost 5-10 times more
without UPGRADE
No Graphic analysis and
plotting capabilities. No
correlation capabilities,
More data for more
studies
ORD/OHTS
I1Q
Correlate causes of death
tilth environmental varlablei
also intercorrelations of
health, demographic A en.va,
Access to SAROAD and STORET
FTS call. No 1200 Baud
MODEMS as yet, so operating
at 300 baud
Often work with 3082 counties
Cannot do health analysis
without UPGRADE
No access to health variables.
Plotting not available to
More data for more studies
  Figure II-3

-------
                                                                 USER  SUMMARY
         CATEGORIES
                                          0PM
                                                                     OE
Program objectives and problem
  definitions
Analysis of ambient
   levels for lead
Use ADP for storing and
analyzing pollution source
data
Current support systems  and
  data bases
Manual Investigation
   access to SAROAD
PCS, ROEDS,  LEDS  and CAS -
little access to  parametlc
data	
Interfacing activities
  required to reach UPGRADE
ITS call. MODEM and
   terminal
UPGRADE tested while  the
user was on detail at CEQ
Data volumes and special
  handling requirements
S-10K records require
   CEQ User Support  Group
   to lead data
Not supplied
Current costs vs.  UPGRADE
  coats
                                Evaluation In process
                           Not applicable - no
                           alternative exists
Deficiencies/strengths In
  current systems
Availability of data  on
   UPGRADE
Projection of future needs
  expansion
                               Additional analysis
 No centralized  parametric
 data for air  and water
                           Centralized data on
                           emissions, effluent
                                                                  Figure  II-4

-------
     INTERFACING STORET DATA
           TO UPGRADE

           FLOW CHART
                                          NOTES:
   LOGON TO EPA/COMNET
    COMPUTER FACILITY
                              ALPHA TEXT EDITOR TO SET
                              UP  JOB
  RETRIEVE  DESIRED DATA
  USING STORET  RET PGM
  W/MORE=4  OPTION
 COPY RETRIEVED DATASET
 TO  COMNET TAPE
  TRANSFER TAPE FROM
  COMNET TO NIH/DCRT
  INSTALLATION
    COPY TO NIH
    LIBRARY TAPE
PROCESS DATA THROUGH
UPGRADE PRE-PROCESSOR
   RUN DATA TO DISK
• CATALOG DATA FILE
  IN UPGRADE DYNAMIC
  ALLOCATION CATALOG

• CATALOG FILE ON ISO
   EXEC UPGRADE
   AND ENTER WATER
   QUALITY INTERFACE

         Figure II-5
                              REFER  TO  STORET MANUAL
                              RET- RAW  DATA
                              MOREA=COPYABLE TO  TAPE
                              IBM UTILITY  IEBGENER
                              COMNET  SCRATCH TAPE
                              BY COURIER SERVICE USUALLY
                              WYLBUR TEXT EDITOR
                              IEBGENER
                               REFORMATS STORET DATA
                               AND COMPUTES OVERVIEW
                               STATISTICS
                               IEBGENER
                              TSO CAT COMMAND
             11-12

-------
                   Interfacing SAROAD Data to UPGRADE
    Flow Chart
                   Notes
SAROAD run at RTF

(NADB & INTRFAC2)
      SARTRAN
   UPGRADE
   Preprocessor
      UPGRADE
         \/
   Vitro Mapping
   Processor
                                  Using:
                                  get:
          AEROS Manuals
          Inventory by Site
          Inventory by Pollutant
           Special listings
                                          Run Totals
                                          Tape for next Step (CEQ.SAROAD,
                                                 PSI Format)
 add Site Information

 get Tempfile in STORET More=4 format
                                 produces:
          basic statistics
          UPGRADE Data Set (CEQ.UPGRADE.
                  SAROAD)
                                 produces:
                                          Graphic and Statistical Analyses
                                          Map tape
produces;
                                         CALCOMP Plotter Maps
                              Figure II-6
                                 11-13

-------
amount of useful data per data set.  For example,  only one year of hourly
data can be stored on a data set (8,760 points in  time X 50 variables per
point) because of the fixed space reserved for 50  variables per point when
there may be only one variable required.  Reduction of this fixed space to
one dependent upon the number of variables specified is essential to the
storing of usable air data on the IDS.   Current data in the IDB is a concern
of users who are doing work with current year data.  There seems to be a time
delay in the loop wherein data comes from the regions into STORET or SAROAD
and is extracted by User request for UPGRADE.  This could be overcome by the
regions directly storing their current  year data as a "custom" data set and
keeping it current.  However, they are  dealing with hourly data and required
more than one year's worth of data per  data set.  This is a current restriction
as previously discussed.  The STORET and SAROAD extraction process is another
area of concern.  Users feel that the automation of this process or much
quicker access to these Data Bases is required. Automation of this process
would allow the user to directly initiate the extraction process and eliminate
the time required by the CEQ UPGRADE User Support  Group to accomplish this
task.  If EPA uses UPGRADE heavily, ORD/OMTS/HQ recommends that the ability
to extract data from tape storage be added to reduce the amount of on-line
storage required.  A ranked summary of  the user's  data base comments and
additional requirements are included in Figures II-7 and 8.  A time and cost
estimate is also included where applicable.  It should be noted that the only
essential requirement is the data base  restructuring or data compression
requirement.  This requirement is already funded by other UPGRADE users and
is scheduled for completion early in FY 79.  The ranking is:

           E = Essential - Cannot use UPGRADE without the capability.

           N = Necessary - Would make the use of UPGRADE easier, should be
                           incorporated as money and priority allow.

           D = Desirable - Nice to have; would enhance the system.


     3.  Additional UPGRADE Requirements

     Current capabilities of UPGRADE are described in Paragraph II.B.I.
These evaluations have brought out some pertinent  points; (1) UPGRADE is an
easy system for those without computer  training because of its English language
and conversational prompting; (2) It has the capability to allow the user to
correlate data from almost any accessible data base with other data, such as
air and/or water with mortality rates;  (3) The correlations are readily
evident through the use of graphical analysis on a Video terminal with instant
copy available; (4) It invites initiation of studies previously shunned be-
cause of the laborious data extraction  and manual  correlation required.

     However, the evaluations have also indicated  the need for additional or
improved capabilities before effective  use can be  made of UPGRADE.  Those
needs that are most pressing seem to be:  batch production mode; an improved
Terse operational mode for the more familiar and constant user; the ability
to compute moving means; the ability to use more than one value per plot axis
or multiplot; and an internal data management for  storing and revising data.
These requirements and less pressing needs are ranked and summarized in
Figures II-9 and 11-10.  Those requirements ranked as Essential, except
Moving Means, are already funded by other UPGRADE  users for incorporation in
early FY 79.  All UPGRADE users share these benefits.


                                    11-14

-------
                               DATA BASE COMMENTS
DATABASE
II)B
STOKKT Interface
SAROAD Interface
OMENTS
Excellent concept
Data must be kept current
File Structure/Storage
Needs More Data
User Created/Maintained data
Current Retrieval Capability Needs to be
Automated (Tape to N1H)
Current Retrieval Capability needs to be
Automated (Tape to NI11)

OTS
X


X/N



USERS/Kank
LAS VEGAS





X/N








OAQPS
/RTP

'
X/E



X/N
HERL
/CINN







Region
X

X/E
X/E

X/E

X/N
Region
III




X/D
X/N
X/N
ORD7
OMTS/
11Q
X


X/N
X/N

515K

Coat
Estimate

Another Retrieval
4-6 weeks for
compressions
$7.5K
Depends on User
needs
Available Docunien
tatlon not ready
3 months
515K
3 months
S15K
X •= Comment/Requirement
E = Essential - Cannot use UPGRADE without the Capability
N - Necessary - Would make the use of UPGRADE easier, should be
                Incorporated as money and priority allow
D • Desirable - Nice to have; would enhance the system
                                  Figure I1-7

-------
DATA BASE COMMENTS
DATA BASE CAPABILITIES
Cacegoiy
Data Extraction











Fluid Size for
varltibleti

liner defined
variables
Meaningful size
data sets





Available
Integrated Data Base —
predefined parameters
of all Data elements In
DB
SAROAD Programs -
User defined parameters
(Contractor extraction)
STOHET Programs -
User defined parameters
(Contractor extraction)
User Cieated Data -
User defined parameters
Fixed size field for
SO variables

Predefined variables

3.9M bytes per data act
(Because of Fixed size
field for SO variables -
data sets with 1, 2, or
3 variables are limited.
I.e., 1 site year of
continuous data)
Required Additions
User defined parameters and
more direct control



User manageable extraction


User manageable extraction

Ability to extract data
from tape storage
Variable size field depending
on the number of variables
used (compression)
User Creation of new
variables
Adjustment of variable field
size (would allow multlyear
continuous data for
multlsltes)



USERS/Rank
OTS
























Las Vegas

X/E






X/E




































X/E


OAQPS
Kit





X/E








X/E

X/N




X/E


I1ERL
CINN
























Region

X/E



X/E















X/E


Region
III





X/N


X/N















HQ

X/D









X/N













Coat
Estimate
Local Program
can do It at
term site


See DB


See DB

2 weeks
S3K












     Figure 11-8

-------
E.   RANKED MAJOR OUTPUTS REQUIRED FROM UPGRADE

     The additional user output requirements are summarized in Figures II-9
and 11-10 under Plot, Graphic Analysis, and Mapping Categories.  Of the addi-
tions, Multiplot is ranked as Essential and has already been funded by
another UPGRADE user, and will be available early in FY 79.  Contour plots
will be available in FY 79 and the remainder, not defined as Essential, will
be candidates for early implementation.


F.   UPGRADE ACCEPTANCE CRITERIA
     1.  The acceptance criteria for EPA use of UPGRADE consists of 10 factors,
each rated on a 0-10 scale with "0" being the least significant.  In addition,
each factor has a 0-10 weight assigned according to its relative importance
to EPA as a whole using the UPGRADE system.  The 10 factors are as follows:

           a.  Number of EPA Users - The number of current and identified
               users within EPA.  This is the major determining factor of
               whether or not there will be a minimum amount of use to
               justify using or supporting UPGRADE.  It also is the deter-
               mining factor for level of EPA involvement with UPGRADE,
               if it is to be used.  Consequently, it carries a weight of
               10.  The rating will be proportional to the number of
               identified users.

           b.  Satisfy User Needs - The ability of UPGRADE to fulfill the
               ranked user requirements.  The more needs that are satisfied,
               the higher the rating.  The relative weight of this factor
               is 6.

           c.  Expansion Capability - The ability of the individual options
               to accept growth, both in numbers of users and in numbers/
               sizes of data sets.  The easier it is to expand, the higher
               the rating.  The relative weight of this factor is 5.

           d.  Implementation Costs - In terms of the applications identified
               in the evaluation process, this is the total one-time cost to
               the user to initiate analysis.  This includes data acquisition,
               data loading, and special program requirements.  The lower the
               cost, the higher the rating.  The relative weight of this
               factor is 4.

          e.   Operating Costs - For each application, this is the cost to
               perform UPGRADE analysis.  The cost includes data storage,
               computer time, support staff and terminal cost over a 5-year
               life cycle.  The lower the cost, the higher the rating.  The
               relative weight of this factor is 4.

           f.  Cost Savings - This is the overall life cycle cost savings
               determined by comparing the cost of current analysis to
               UPGRADE operating costs and implementation costs.  Where
               savings are intangible, a nominal figure will be estimated.
               The higher the cost savings, the higher the rating.  The
               relative weight of this factor is 6.


                                    11-17

-------
                                                                             UPGRADE COMMENTS
UPGRADE CAPABILITIES
Category
Faster System





Analysis
Routines






Plot





Plot


Graphic Analysis






Display features
Available
Verbose and Terse Modes





SAS package

Sort and Rank

CEQ Air Quality Rollback
Model
Basic Statistics

Plot maximum = 400 data
points







Polygon Plot
Bar Chart
Scatter Plot
Multlslte Plotting

Linear Regression


Required Additions
Ruperterse Mode

Batch Production Mode

Improve Terse

Ability to add User Developed
analysis routines
Moving means

Additional averaging
transforms
Add Interactive SPSS
capability
Increase maximum





Overlying points on graph


3D Capability
Multlplot
Alpha Numeric Axis
description
Contour
Ability to add User defined
models
Quick EXIT from System
USERS/Rank
OTS



























.



La Veeas





























X/N

X/N



X/E


X/N























OAQPS
RTF
X/N






X/D






X/N
















HERL
CINN































Region
X








X/E


X/E



















Region
III






X/N
















X/N
X/E


X/N


X/D
ORD
HQ
X/D

X/E

X/N









X/D





X/D



X/D

X/D





Cost
Estimate
6 months
S35K
Full-wlth S-terse
Llmlted-ln FY79
5-8 weeks
$10K
2-6 weeks
$2.5K - 15K
B weeks
$10K
Add SAS
Procedure
3 months
$15K
1. NC Increase
cores
2. Create Temp.
data set
6 weeks
$7.5K
2 months
$10K
Evaluate avail .
Pkg. & acquire
2 months - $10K

1 month - $5K
Available In FY7<
3-5 months
S15-25K
Available In FY7S
H
M
                                                                               Figure  11-9

-------
UPGRADE COMMENTS
UPGRADE CAPABILITY

Standardized
Terminology
Standardized
Units
M tipping
Keutart
Capabilities
Founal access
to Ll.e UPGRADE
User Support
Group
Data Manipulation



Confusing terminology
llnlta used In Data Base
County Map
MASCjAH Map
Start Over
Per Contract
Data Filtering
Data 1'jitltlonlng
Can only handle numeric
ES USERS/Bank
Required Additions
Standardize
User controlled variable
transformations
More Maps on Screen
Data save capabilities
No cost access, possible an
In-llouse Croup
Add flags for data not
meeting report criteria
I/O for storing and
revising data
Alpha/Numeric Data
OTS


X/D





Las Vegaa











X/N
X/E

X/E

OAQPS
RTF
X/N







HERR
CINN








Region
X

X/N



X/N
X/E

Region
III








ORD
uq



X/N



X/D

Coat
Estimate
on going
maintenance
Need to be STD at
DB entry level
Part of DB
maintenance
Basic Demo. In
?Y79
4-6 months
$30K

Not feasible In
UPGRADE - Should
be done at DB
extraction
1 month
$5K and
logistics
Available om
FY79
    Figure 11-10

-------
           g.  Time Benefits - The user benefits derived from being able to
               respond quicker in terms of calendar time.  The response
               would be to ad hoc queries and analysis problems.  The
               shorter the response time, the higher the rating.  This is
               exclusive of data transfer time.  The relative weight of this
               factor is 8.

           h.  Analysis Benefits - For each application, the improvements in
               analysis capabilities.  The ease with which the system can be
               used and the studies that would not be attempted without
               UPGRADE.  The greater the improvement, the higher the rating.
               The relative weight of this factor is 7.

          i.   EPA Controllability - The degree of influence EPA has over
               the development of the UPGRADE system to meet EPA needs.
               The greater the control, the higher the rating.  The relative
               weight of this factor is 2.

           j.  Risk - The degree of identifiable risk associated with each
               solution.  This includes both economic and technical risks.
               The smaller the degree of risk, the higher the rating.  The
               relative weight of this factor is 5, primarily because UPGRADE
               is  and will be an ongoing system supported by other govern-
               ment agencies and not a new system awaiting development.


     2.   In the evaluation process, each factor will be rated within each
proposed solution.  The rating will be multiplied by the weight factor to
determine a relative score for each factor and solution.  The summation of the
weight scores for each proposed solution will be the System Acceptance Rating.
It will be correlated to recurring costs, non-recurring costs, and net cost
savings to determine a final recommendation to EPA in the use of UPGRADE.
     3.   These factors are given values in Section IV, Evaluation.
G.   SUMMARY OF SAVINGS AND BENEFITS

     From the EPA evaluations that have been conducted and are included in
Appendix D, most applications show an average 5-fold cost savings (when
qualified) and intangible benefits.  The intangible benefits range from the
fact the UPGRADE'S rapid pictorial response allows EPA to either support or
disprove an allegation with substantiating figures to the fact that many
study applications would not have been attempted without UPGRADE.
                                    11-20

-------
SECTION III

-------
                      III.  Feasible Alternative Solutions

     There are four options for EPA in the use of the UPGRADE system.  They
are:
           OPTION   I - EPA Direction of UPGRADE
           OPTION  II - EPA Co-Sponsorship of UPGRADE
           OPTION III - EPA Limited Use of UPGRADE
           OPTION  IV - No EPA Involvement/Use of UPGRADE

Option I carries a number of hardware alternatives for possibly moving UPGRADE
to another computer site.  Each option and alternative is discussed in succeed-
ing paragraphs followed by a discussion of UPGRADE changes.  Costs are discussed
in Section IV.

A.  OPTION I - EPA Direction of UPGRADE

     1.  Under this option, EPA would become the main UPGRADE sponsor or would
take a version of UPGRADE for its own use and future development.  This is the
most costly action that EPA could take and should only be considered if and
when the dedicated user base is greater than 15.  EPA also incurs the cost of
maintaining an UPGRADE User Support staff to perform the following functions:

     Technical Staff                                   Function

            2                          Assist new/old users in realizing the
                                       total analysis potential with UPGRADE,
                                       including training, data base format-
                                       ting and data acquisition.

            4                          Program maintenance, enhancements.

     The User Support staff would be involved in planning future development,
improvements, and expansion of the system according to EPA needs on a long-
range basis.  It would then become necessary for EPA to define its role rela-
tive to UPGRADE.  EPA could either operate independently from the UPGRADE user
community (currently DOE and State governments), or it could, by agreement
with CEQ, become the lead agency in the development of UPGRADE.  The cost of
maintaining an in-house staff would run $150,000 per year.  However, it is
unlikely that EPA would be granted the additional billets for this purpose,
thus a comparable contractor staff cost of $300,000 has been used for analy-
sis.

     With the heavy use envisioned under this option, UPGRADE may have to be
moved from the NIH-DCRT computer center in Bethesda, Md.  A discussion of 4
alternatives for computer support follows:

     2.  Alternative 1 is to remain on the NIH-DCRT computer.  It has the
cheapest operating costs and least user implementation costs; consequently, it
has the most cost savings.  These are shown in Section IV.  However, although
NIH sets no formal restriction on the amount of computer resources available
to a single user, users who build up charges in the range of $10-15,000 per
month or who use a great deal of on-line storage (approaching max of 880 data
sets) are encouraged to commence seeking alternative computer sources.  These
numbers vary with the total loading on the NIH computer complex at any given


                                     III-l

-------
point in time.  CEQ's experience indicates that a dozen active user units will
reach the present NIH usage threshold.   If and when this limit is reached,
UPGRADE will have to be moved to another computer installation such as those
described in Alternatives 2, 3,  and 4.

     3.  Alternative 2 is use of EPA's  computer installation at COMNET.  This
computer is the same type IBM machine as that at NIH-DCRT,  however the opera-
ting systems and terminal interface systems are very different.  NIH-DCRT has
a virtual memory system (MVS) with a Time Sharing Option (TSO) for terminal
interface.  COMNET has a fixed partition (MVT) operating system and an ALPHA
terminal interface system.   Moving UPGRADE onto COMNET would require 20%
reprogramming at a cost of $82,000 as estimated by the CEQ  UPGRADE User Support
Group.  The nature of COMNET1s operating system is such that each user would
require UPGRADE'S maximum core requirements (500,000 bytes) available during
the entire analysis session.  For example, 6 users on the system at one time
ties up 3,000,000 bytes of COMNET's core virtually stopping its computer from
being used by anyone else.   This would  not be allowed and UPGRADE users would
have to wait in line for computer time.   A severe reduction of system use and
user satisfaction would result.   See Appendix E for a detailed discussion on
UPGRADE conversions.

     On a cost basis, the UPGRADE conversion costs would be added to the user
implementation costs, as shown in Section IV.  Also at COMNET, interactive
sessions carry a cost multiplier of "6" which increases the annual computer
operating costs to 6 times that of NIH-DCRT.  (Batch costs  at the two installa-
tions are similar.)

     4.  Alternative 3 is the use of EPA's computer installation at Research
Triangle Park (RTP), Durham, North Carolina.  This computer is non-IBM and, as
such, employs a different scheme for encoding data and programming instructions.
It also employs different methods to inputting and outputting data between
storage media and the computer core storage.  Other significant differences
are that the statistical analysis subsystem (SAS) would have to be changed
along with the Sort/Merge Program, Assembly Language Code and FORTRAN exten-
sions.  In essence, this would cause an 80% rewrite of UPGRADE at an estimated
cost of $164,000.  See Appendix E for a detailed discussion of UPGRADE conver-
sions.  Another point to consider is the terminal interface system.  At present,
the computer installation at RTP has a  finite limit of approximately 40 on-line
users at one time.  This would present  an indeterminate wait time for UPGRADE
users and a reduction of system use and user satisfaction.   Operating costs at
this installation are comparable to COMNET charges, about 6 times that of
NIH-DCRT.

     5.  Alternative 4 is the use of a  commercial computer  service.  For
purposes of this study, the GSA installation at Boeing Computer Services is
used.  This installation is comparable  to NIH-DCRT and the  CEQ UPGRADE User
Support Group has successfully transferred UPGRADE to this  system on a test
basis with no change in user capabilities or reprogramming.  The use of this
computer installation would increase operating costs by a factor of 4 over
NIH-DCRT costs based upon the CEQ UPGRADE User Support Group test.  At this
installation there are no known expansion limitations.
                                     III-2

-------
B.  OPTION II - EPA Co-Sponsorship of UPGRADE

     1.  Under this option, EPA would support UPGRADE at a level that would
entitle EPA to direct a portion of UPGRADE'S future development in conjunction
with the other supporting agencies.  This option should be considered when the
dedicated EPA user base numbers between 5 and 15.  With this option EPA should
maintain an UPGRADE User Support Staff of 1 to assist new/old users in realiz-
ing the total analysis potential with UPGRADE.  This person will assist the
user with data base formatting and data acquisition, and be an EPA liaison to
the CEQ UPGRADE User Support Group.  The person would also be EPA's representa-
tive in planning future UPGRADE development, improvement, and expansion to
ensure that EPA needs are satisfied to the extent possible.  The contractor
cost of this one-person staff would be $50,000 per year.  The system would
continue to operate on the N1H-DCRT, but when the expansion limitation, as
defined in Option I, is reached, the same hardware alternatives apply.  As
such, the EPA liaison should use the hardware alternative analysis of Option I
to influence any relocation of UPGRADE.

C.  OPTION III - EPA Limited Use of UPGRADE

     1.  Under this option EPA would be a user of UPGRADE.  They would be a
customer, able to buy certain "goods" from available stock, but not able to
influence the long-range development.  An EPA liaison would not be required as
individual users would interface directly with the CEQ UPGRADE User Support
Group.  This option should be considered when the dedicated EPA user base
numbers less than 5.  The costs of this option would depend upon the individual
user involvement with UPGRADE.

D.  OPTION IV - No EPA Involvement/Use of UPGRADE

     1.  This option is presented to include the possibility that there are no
interested EPA users for UPGRADE.  The User survey shows that this is not the
case.

E.  UPGRADE Changes

     1.  Those requirements ranked as "Essential" are listed below, along with
requesting EPA office, and a CEQ User Support Group time and cost estimate:

                                                              CEQ Estimate
    Requirements                    Requester             Cost         Time

Data compression or         OAQPS/RTP and Region X   $ 7,500        4-6 weeks
 Restructured Data Base

Refine Terse Mode           ORD/OMTS/Las Vegas        10,000        5-8 weeks

Internal Data               ORD/OMTS/Las Vegas and     5,000        4-5 weeks
 Management System           Region X

Limited Batch Mode          ORD/HQ                     Available in FY 79
                                       III-3

-------
    Requirements

Moving Means

Multiplot
        Requester

Region X

Region III and ORD/HQ
                 CEQ Estimate
             Cost         Time
         10,000

         10,000
8 weeks

8 weeks
                                               TOTAL $42,500
All Essential requirements except "Moving Means" have already been funded by
other sponsors of UPGRADE and are scheduled for implementation in the next few
months.

     2.   Those requirements termed "Necessary" are:
    Requirements

Improved STORET
 Interface

Improved SAROAD
 Interface

Data Extraction from
 Tape Storage

Super Terse Mode
        Requester

ORD/OMTS/Las Vegas and
 Region X

OAQPS/RTP and Region X
 and III

ORD/HQ
ORD/OMTS/Las Vegas and
 (Full Batch Capabilities)   OAQPS/RTP,  and ORD/HQ
Add User Analysis
 Routines

3D Plots
Increase Plot Size

Add User Defined
 Models

Contour Plots

Data Save
 Capabilities
ORD/OMTS/Las Vegas and
 OAQPS/RTP

Region III
OAQPS/RTP

ORD/OMTS/Las Vegas


Region III

ORD/HQ and
 ORD/OMTS/Las Vegas and
 Region III
CEQ
Cost
$15,000
15,000
3,000
35,000
9,000
Estimate
Time
12 weeks
12 weeks
2 weeks
24 weeks
4 weeks
Yr.
Planned
1
1
1
2
2
Evaluate Available
Packages
7,000
20,000
6 weeks
16 weeks
1
2
         Available in FY 79

        30,000         24 weeks
                           TOTALS  Yr
                           	Yr

                            TOTAL
            1
            2
$ 70,500
  64,000

$134,500
                                     III-4

-------
     3.  Those requirements termed "Desirable" are:

                                                             CEQ Estimate      Yr.
   Requirement                     Requestor             Cost         Time   Planned

Overlying Points on              ORD/OMTS/HQ           $10,000      8 weeks     3
 Graph

Improved Alphanumeric            ORD/OMTS/HQ             5,000      4 weeks     3
 Axis Description

Add Interactive SPSS             ORD/OMTS/HQ              5,000    12 weeks     3

Quick Exit from System           0PM and                   Available in FY 79
                                  Region III
                                               TOTAL   $20,000

F.  Major Benefits

     1.  A significant number of benefits come to EPA with the use of UPGRADE.
The most significant is that it allows EPA to utilize its large environmental
data bases and the health data bases for analysis of the factors affecting the
length and quality of life of every area in the United States.  It allows
studies to be conducted that would not previously be attempted because of the
time and cost required.  For those studies that would be attempted, UPGRADE,
with proper use, reaps a 5-10 fold saving on cost and time.  Evidence of these
benefits is also demonstrable by other agencies using UPGRADE.  The total
accumulation of benefits to EPA is indeterminate due to the intangible nature
of the benefits derived from studies that would not have been undertaken
without UPGRADE.  As can be seen from the evaluation report from OAWM at RTP,
not.all uses of UPGRADE produce a significant cost savings but most do and
bring large benefits.

     2.  Crosswalks between differing classes of source data provide a major
analytical tool.  Initial EPA evaluation activities included correctional
analysis of air and water pollutants and NCHS mortality data.  Possible analy-
sis products would be Atlases of mortality/morbidity air and water quality
after the model of NCI's Cancer Atlas.  In-depth analysis in this area would
be responsive to EPA mandates and will alleviate criticisms of EPA's prior
lack of activities in this area.

G.  Functional Advantages and Disadvantages

     1.  The functional advantage is that UPGRADE is a user-oriented system
geared to the non-computer analyst.  It requires little training and allows
the cross-correlation of different media data.  UPGRADE provides rapid visual
display of analysis results that are retainable in instant hard copy form.
The current UPGRADE data available to the user and typical outputs are shown
in Figure III-l.  However, the system is a tool for analysis and, as with all
tools, should and can be enhanced to provide even greater capability.  Those
additional capabilities and uses that are planned for the next fiscal year and
funded by other users are shown in Figure III-2.
                                   III-5

-------
     2.  Additional advantages can be gained just by adding more data to the
UPGRADE system.  Typical outputs/uses that can be achieved by adding data are
shown in Figure III-3.   These outputs are significant steps in responding to
the criticisms EPA has  received concerning analysis of collected data.

     3.  The functional disadvantage is that UPGRADE does require some modifi-
cation before some EPA  offices can make effective use of the system.
                                   III-6

-------
                                                 UPGRADE
                                              CURRENT USES
                 DATA  AVAILABLE

County level mortality data

County level drinking water data

County level demographic data

General water quality data

Aquatic pesticide  data
National stream quality accounting (NASQAN)
  data

General air quality data (limited user-defined
               TYPICAL OUTPUTS  (Con't.)

Ad hoc materials with quick turnaround time

Mortality rates vs. drinking water quality variables

Water quality  variables vs. "hardness"

Intercorrelations of mortality  variables

Cross-checking EPA & USGS  water  quality measure-
 ments

Correlations with demographic variables

Multiple regression using both environmental and
  demographic  variables

Air Quality and health data
                 TYPICAL OUTPUTS

Maps  showing geographic  patterns of county-level
  mortality/water  quality

Relationship  of cardiovascular disease mortality
  rales and constituent levels in drinking water

Time  trends  and mean  violation  rates of drinking
  water constituents  in selected surface supplied
  public drinking water systems
Water turbidity vs  time (months)

Air quality data for trend analysis and  summary
 reporting

Cancer mortality in California

Trends analysis of  CO and oxidant
                                                Figure III-l

-------
                                                FY
                                                UPGRADE
                                              79 FUNDED USES
CO
               DATA/CAPABILITY ADDED

1.  NE Region Drinking Water DB and Energy
      Environmental DB on  an industrial and
      county basis.

2.   Midwest Region Drinking Water DB and
      Energy Environmental DB on an
      industrial and county basis.

3.   Statistical Tables from the Compendium
      on Environmental Statistics (250 small
      data sets)

4.   Integrate National Meteorological Data
      with existing IDB Air Data
      (NO A A National Weather Service DB)

5.   NEDS Air Quality  Emissions data

6.   International  water quality data from
      Canada (GEMS)

7.   Biological Data  (BENCHMARK)  from
      NASQAN  stations

8.   Complete GLIDE Interface with Age
      Adjusted Mortality Rates,
      100-200 demographic variables
                                                                           TYPICAL OUTPUTS

                                                          1.  DOE NE Region Health and Energy Facility siting
                                                               studies
                                                          2.  DOE Midwest Region Health and Energy Facility
                                                               siting studies
3.  CEQ  studies and improved report generation



4.  DOE  and CEQ  studies



5.  CEQ  studies

6.  CEQ  Annual Report and USGS studies


7.  CEQ  and USGS NASQAN studies


8.  o  More sophisticated transforms

    o  Apply NCHS comparable ratios for
         1959-61 and 1968-71 mortality rates
                                                    Figure III-2

-------
                   DATA/CAPABILITY  ADDED

    9.  Dat.a base management system for specific
          mortality
        Industrial concentrations and
          Occupation variables

   10.  Mapping enhancements
   11.  Enhanced report generation
                   TYPICAL OUTPUTS
  9.  EPA/OTS studies
10.  •  State and regional  levels
     •  Spot mapping of rare causes of death
     •  Batch Specification Maps

11.   •  Tabular displays
     •  %Distribution and statistical
         significance testing
v£>
                                              Figure III-2  (cont'd)

-------
                                                      UPGRADE
                                                  PROJECTED USES
                           DATA  ADDED
     1.   Ambient Air Data
M
O
2.  Multiple  Causes of Death



3.  Morbidity Statistics




4.  Increase Drinking  Water Data to  Nationwide

5.  Altitude

6.  Source Data
     7.   Phytoplankton Data and National
           Eutrophication Survey (NES)
           Program Data
                         TYPICAL OUTPUTS

      1.   o  Air Quality maps  for different pollutants  and
              different aggregations

         o Report on Air Quality correlations with  mortality,
             water quality, and other  variables

      2.   o Maps of geographical variations in contributing
              causes of death with drinking water, water
              quality, and  air quality

      3.   o Maps of morbidity

          o Report on morbidity versus air and water
              quality

      4.   o Drinking water versus cancer

      5.   o Altitude versus heart  disease

      6.   o Report relation of point source  trends  to
              ambient quality  trends

      7.   o Geographical distributions  and representations
              (phytoplankton and  environmental factors)

          o Statistical evaluations  of the environmental
              requirements of well-known problem and
              special interest  algae

          o Development of phytoplankton indices of  trophic
               state
               (testing index  level modification,  and  develop-
               ment of species level indices to water quality)
Figure III-3

-------
                      DATA  ADDED
8.   Hanes  II Data
9.  Wisconsin's Industrial Effluent Data
9.
               TYPICAL OUTPUTS

o Retrieval  of baseline phytoplankton data within
    geographically  restricted areas

o Retrieval  of data from like areas for making
    preconstruction predictions relative to 208
    planning

o Selection  of lake subsets of special interest,
    (e.g.,  high or low productivity),  for com-
    parison with community  structure

o Provision  of ambient water quality and/or
    sensitive biological components for inclusion
    in multiparameter models for land-use and
    watershed management

o Opportunity to interface the water quality and
    phytoplankton  data with other information of
    specific user interest

o CO exposure in the blood by geographic area
    and/or  community

o CO levels in breath samples

o Correlation of source and  ambient water data
    in the Wisconsin  area

o Use as  pilot study for extending efforts to other
    areas
                                            Figure III-3 (cont'd)

-------
SECTION IV

-------
                   IV.

A.  Cost Assumptions
EVALUATION OF ALTERNATIVE SOLUTIONS
     1.  There are a number of assumptions that must be made to analyze costs.
They are as follows:
               Category

               Life Cycle

               Amortization
               Remote Terminal
                Station
               User Unit
                        Assumption
               5 years
               Data Extraction
                per User Unit
               Computer Usage
               Fixed Costs will be spread on
               straight line basis over 5 years

               TEKTRONIX 4014 CRT Graphics Unit
               TEKTRONIX 4631 Hard Copy device
               1200 Baud Modem

               15% of Total Analyst Time
                @ $30K/yr
               15% of Total Intern Time
                @ $10K/yr

               SAROAD = 1 per quarter

               STORET = 1 per quarter

               A user unit will average 2 runs/day
                @ 35 min/session with 4 data sets.
                It produces 40 plots/day and 10
                maps/day.
B.  One-Time Costs

     1.  Hardware Implementation

         Based upon the evaluation reports, this analysis will assume 9 poten-
tial EPA users with only 4 requiring a full remote terminal station.  The
other 5 either have terminals or access to one.  The cost of a remote terminal
station is:
         Terminal Hard Copy Printer Purchase:

         a.  TEKTRONIX 4014 Terminal with full graphics
         b.  TEKTRONIX 4631 Hard Copy Printer
         c.  1200 bps Modem
                                               11,600
                                                4,400
                                                  750
                                               16,750
     The total cost of 4 stations is $67,000.  This cost applies to OPTIONS I,
II, and III.  For Option IV there is no cost.
                                   IV-1

-------
     2.  Software Implementation

         UPGRADE modifications ranked as  essential  by EPA users are:   restruc-
turing the data base or data compression  at a cost  of $7,500;  refinement of
UPGRADE'S terse mode at a cost of $10,000;  the addition of multiplot at a cost
of $10,000; and the beginning of an internal data management system at a cost
of $5,000.  These modifications have already been funded by other users.  Only
the addition of moving means remains to be  funded by EPA for software implemen-
tation.  The total cost of software implementation  is $10,000  - this cost
applies to Options I and II.  There is no cost for  Option III  and IV.

     3.  Non-Recurring Operations - Fixed

         This is the cost of program and  data base  enhancements ranked as
necessary and desirable by the EPA users.
Type
N Data Base
N Data Base
N Data Base
N UPGRADE
N
N
N
N
D
D
D "
Year
Enhancement Planned Cost
STORET Interface
SAROAD Interface
Data Extraction from Tape Storage
Superterse Mode
Add User Analysis Routines
Plot Size
Add User Defined Models
Data Save Capabilities
Overlying Point on Graph
Alphanumeric Axis Description
Add Interactive SPSS
1
1
1
2
2
1
2
1
3
3
3
$ 15,000
15,000
3,000
35,000
9,000
7,500
20,000
30,000
10,000
5,000
15,000
$164,500
Time
(Weeks)
12
12
2
24
4
6
16
24
8
4
12
124
     The data base enhancements will be funded to completion in the first year
and the UPGRADE enhancements will be prioritized and spread out over the first
three years of the 5-year life cycle.   This cost applies,  in full,  for Option
II and would be 20% for Option III because of the limited  EPA involvement.
Option IV has no cost.   For Option I,  this cost will be replaced by the recur-
ring cost of maintaining a User Support Staff.

     4.  Non-Recurring Operations - Variable

         This cost item is only applicable in Option I, if UPGRADE  is moved
from the computer installation at NIH-DCRT to EPA-COMNET or EPA-RTP.  Based
                                  IV-2

-------
upon the CEQ User Support Group, the cost to develop UPGRADE amounts to
$411,000 and the reprogramming cost involved in moving it to EPA-COMNET is
$82,000.  The cost to move UPGRADE to EPA-RTP is $164,000, as estimated by the
CEQ User Support Group.

     5.  Summary

         Figure IV-1 summarizes the one-time costs that would be incurred for
each option in the use of UPGRADE.

C.  Recurring Costs

     1.  Fixed

         Recurring fixed costs are the same for Options I, II, and III and are
derived from a User Units' use of the system as defined in paragraph IV A.
The costs are based on an estimate of 9 potential units.   The average annual
costs are:

                                            Annual Cost             Total Annual
          Cost Item                          Per User                   Cost

Analyst Time                                  $ 4,500                 $40,500

Intern Time                                     1,500                  13,500

Data Extraction and Load                        1,000                   9,000

Terminal Hardware Maintenance
         Terminal                               1,340                  12,060
         Printer                                  660                   5,940
         Materials                              1,000                   9,000

                                    TOTAL     $10,000                 $90,000

     2.  Variable

         The average annual costs that vary between and within options are
detailed herein using the use assumptions of paragraph IV A for 9 User Units.

         a.  Option I

             Under Option I, EPA has its own version of UPGRADE and must incur
the cost of an EPA User Support Group consisting of 6 persons:  two for Data
Base, two for the system, and two for user interfaces.  This group would
perform all UPGRADE enhancement work after program conversion for COMNET and
RTP hardware alternatives.  The average contractor cost per person for this
group is estimated at $50,000/year.  Computer/Storage/Plotter charges were
acquired from the EPA evaluations, CEQ UPGRADE User Support Group test runs,
phone contacts with installation personnel, and charge algorithm analysis.
The costs under this option are:
                                   IV-3

-------
            Life Cycle One-Time Costs (in $1,000)
                         Option I


COST ITEM
Hardware Implementation
Software Implementation
Non-Recurring Operations-Fixed
Non-Recurring Operations-Variable
Years
1
NIH
67
10
0
0
COMNET
67
10
0
82
RTF
67
10
0
164
GSA
67
10
0
0
2

0
0
0
0
3

0
0
0
0
4

0
0
0
0
5

0
0
0
0


TOTAL
67
10
0
82/164
TOTALS
 77  159
     241
       77
                         OPTION II

COST ITEM

Hardware Implementation
Software Implementation
Non-Recurring Operations-Fixed
Non -Recurring Operations-Variable
Years

1
67
10
70.5
0

2
0
0
64
0

3
0
0
30
0


0
0
0
0

5
0
0
0
0

TOTAL

67
10
164.5
0
TOTALS
147.5  64
      30
                          241.5
                         OPTION III

COST ITEM

Hardware Implementation
Software Implementation
Non-Recurring Operations-Fixed
Non-Recurring Operations-Variable
Years


1
0
0
33
0


2
0
0
10
0

3
0
0
10
0

4
0
0
10
0

5
0
0
0
0

TOTAL

0
0
63
0
TOTALS
 33
10
10
10
                          Figure IV-1
                          IV-4
63

-------
ITEM
Computer Usage
Data Storage
Plotter
User Support Group
TOTALS
(* in $,000)
Annual Cost Per User
NIH
12.5
4.32
1.25
-
18.07
COMNET
75
.864
0
-
75.864
RTP
52.5
5.6
0
-
58.1
GSA
50
10.8
1.25
-
62.05
Total Annual Cost
NIH
112.5
38.88
11.25
150
462.63
COMNET
675
7.77
0
150
982.77
RTP
472.5
50.4
0
150
822.9
GSA
450
97.2
11.25
150
858.45
         b.  Option II

             Under Option II EPA, as a co-sponsor of UPGRADE, requires a
central UPGRADE co-ordinator and CEQ liaison.  This would be one person at a
contractor cost of $50,000/year.  The costs (in thousands) for this option
are:
          Item

          Computer Usage

          Data Storage

          Plotter

          Coordinator

                TOTAL

         c.  Option III
Annual Cost
 Per User

   12.5

    4.32

    1.25
   18.07
Total Annual
    Cost

    112.5

     38.88

     11.25

     25.

    187.63
             The costs under this option would vary with user involvement and
a central UPGRADE coordinator would not be necessary.   The costs involved
would be the same as Option II's Annual User Cost of $18,070.  For 4 users,
the average annual cost would be $72,280.

         d.  Option IV

             No cost is involved.

D.  Development Lead-Times

     Approval of this Feasibility Study by MIDSD in the normal time frame
requires one month.  Should approval be granted for EPA participation in the
development and use of UPGRADE, preparation of an appropriately worded IAG
between EPA and CEQ may be required.  This IAG would take about one month to
be prepared and approved.  The IAG may include provision for programming the
User requirements and necessary system conversions, depending on the option
                                   IV-5

-------
approved.   After IAG approval,  under Option I,  contract  negotiations may be
required for acquisition of an  EPA UPGRADE User Support  Group.   Under Option
II, an EPA UPGRADE Coordinator  would be designated.   Also  at  this time,  poten-
tial users would request terminal equipment,  if necessary.  Those users  with
equipment would begin using the system to the extent possible until the  Essen-
tial User Requirements are operational - 3 months.   Even in Option I where
conversion is required, limited use of the system could  be made, until such
time as the alternate hardware  installation is  operational.   This would  be
approximately 6 months for COMNET and 12 months for  RTP.   Charts of these time
relationships are presented in  Figures IV-2,  3, and  4.   There are no charts
for Options III and IV.  Option III requires  only MIDSD  approval, preparation
of individual user lAGs with CEQ, and use of  the system.   Option IV requires
no further action.

E.  Cost/Benefit and Cash Flow  Analysis

     1.  Cash Flow Analysis

         Cash flow analyses were made for Option I (all  alternatives) and for
Option II in order to provide a measure of the  value of  each  option and  the
alternatives within the options.  Cash flow analyses of  Options III and  IV are
meaningless because of the number of potential  UPGRADE users.   All costs were
taken from paragraph IV-B, One  Time Costs, and  from  paragraph IV-C, Recurring
Costs.  Deviation of quantifiable and non-quantifiable benefits are described,
in paragraph 2 of this section.

     Figures IV-5 thru 9 show the costs incurred for Development and Operation
of each of the Options and the  results (total present values) in order of
precedence are as follows:

                                                    Total  Present Value  ($000)

           Option II                                          1099.7

           Option I, Alternative 1                             210.8

                         11       4                           (1439.7)

                                 3                           (1455.7)

                                 2                           (2040.5)

     It is obvious that the computer usage charge is the dominant factor in
the analyses, and with this small volume of potential users (and thus limited
benefits), only use of the NIH  computer facility allows  a  positive total
present value over the costed life cycle (5 years).   An  increase in the  number
of users, or an increased judgment in non-quantifiable benefits would improve
the present value picture for other alternatives, but as long as NIH computer
charges are significantly less  than other facilities, the  comparative analyses
will remain the same.
                                    IV-6

-------
    Action Steps
Secure MIDSD Approval
Prepare IAG for Use
of UPGRADE
Program Essential
User Requirements
Contract for User
Support Services
        1 mo. I
                   3 mos.
                                                       Development Lead-Times
                                                              Option I  Alternatives  1  and  4
                                              Continuous
                3 mos.
                    / ]\   Secure MIDSD Approval

                    /2\   Approve IAG

                     '3
Order additional User terminal
   stations as requested
                                                             6  mos.       12 mos.    24 mos.  36 mos.  48 mos.     60 mos.
Issue User Support Group
   contracts.

Install terminal equipment
   upon arrival
                                                           Figure IV-2

-------
CO
           Action Steps
       Secure MIDSD Approval
       Prepare IAG for Use
       of UPGRADE

       Convert UPGRADE for
       New Installation
       Program Essential
       User Requirements
       Contract for User
       Support Services
               1 mo.
                                       A
                                                             Development Lead-Times
                                                                    Option I  Alternatives  2  and  3
                           6 itios •
                                          .Alt. .3...
                                           3 mos.
                                                   Continuous
                                                                                                          -4-
       0              3 mos.

/1\   Secure MIDSD  Approval

 ^2\   Approval IAG

       Order additional User  terminal
          stations as requested
                                                           6 mos.    9  mos.    12 mos.   2k mos.  36 mos.  48 mos.  60 mos.
                                                                                     Issue User Support Group
                                                                                        contracts

                                                                                     Install  terminal equipment
                                                                                        upon  arrival
                                                                  Figure IV-3

-------
    Action Slepa
Secure Ml USD Approval
Prepare  IAU  for  use
of Ul
Program Essential
User Kc(|iiliuuiunLu
Program Necessary
     DU Requirements
Program Nucetiaary
llacr ut'CUADU
Uiiijulruiuentti


fru^rum Den 1 rattle
Ituer III'CKADU
Kct|ul ruumnca
3 moa.
                                            3 noa.
               3.5 mos.
                                    Development  Lead-Timea

                                         for Option II
                           I     B.6
                                                             n>OB.
                                                                                                                                -l-
               6  moa.
12 mos.
24 moa.  36 moa. 48 mos.   60
                          moa.
                           Secure MIDSU approval

                           Deulgnace  EPA UPGRADE Coordinator

                           Order additional User terminal
                              stations  aa requested
                          rA\   Install terminal equipment
                                   upon arrival

                                Train Users aa requested
                                                           Figure IV-4

-------
     2.  Benefit Derivations

         Benefits include quicker response, better accuracy, a wider range of
problem solving capabilities, a decrease in time consuming repetitious manual
calculating, an increased ability to analyze crossmedia data, etc.  With all
these improvements comes better visibility and greater respect from other
agencies.  However, many of these benefits are not directly quantifiable so
most of the benefits calculations shown in Figures IV-5-9 will be in the form of
cost avoidance or cost savings.

     Users have estimated that during their testing of UPGRADE, analyses and
plots could be produced 5-10 times faster using UPGRADE as could be produced
manually, if they could even be produced manually.  At least 40% of the tests
were considered too complex to have been done manually at all.

         Approximately 10% of the users doing testing already had access to
some automated system with which they could minimally perform the test func-
tions.  These users estimated that use of these existing systems takes approxi-
mately twice the cost of UPGRADE to produce the same results.

         Therefore, benefits were calculated from the cost avoidances in not
having to perform the analyses as at present i.e., 10% by inferior automated
means, 50% by manual means, and 40% too complex to quantify completely.  Thus,
the quantifiable and non-quantifiable benefits have been calculated as follows:

         a.  Quantifiable Benefits.

             (1) Replace present automated procedures (10% of users)
                 Assume costs are the same as UPGRADE system.
                 Operating Costs (exclude all Development costs.  Use factor
                 of 2, and 10% of users.  277.6 X 2 X .10 = 55.5
             (2) Replace present manual procedures (50% of users)
                 Assume User Personnel, Data Extraction and Plotting cost 5-10
                 times more than UPGRADE when done manually.  Data Storage is
                 the same and other costs are not involved.  Also assume that
                 the UPGRADE enhancements improve efficiency so that during
                 the first year, UPGRADE can be used 5 times more efficiently
                 than manual means, the second year 7 times, the third year 9
                 times, and 10 times more efficiently thereafter.
                 1st year  (54+0+11.25)( 5)(.50)+(38.9)(.50) = 205.1
                 2nd year  (54+9+11.25)( 7)(.50)+(38.9)(.50) = 279.3
                 3rd year  (54+9+11.25)( 9)(.50)+(38.9)(.50) = 353.6
                 4th year  (54+9+11.25)(10)(.50)+(38.9)(.50) = 390.7

         b.  Non-Quantifiable Benefits

             (1) Worth equivalent to manual procedures (40% of users)
                 Assume the complexity of manual procedures described above
                 for those 40% of analyses which would probably not now be
                 attempted.
                 1st year  (74.25)( 5)(.4)+(38.9)(.4) = 164.1
                 2nd year  (74.25)( 7)(.4)+(38.9)(.4) = 223.4
                 3rd year  (74.25)( 9)(.4)+(38.9)(.4) = 282.9
                 4th year  (74.25)(10)(.4)+(38.9)(.4) = 312.6
                                       IV-10

-------
                                  OPTION I ($000)
                                Alternative 1 (NIH)
Development Costs
   Remote Terminals Purchase
   Software Implementation
   Software Enhancement
   Computer Site Change
   User Support Personnel

Operating Costs
   Remote Terminal Maint. & Materials
   User Personnel (Analyst & Intern)
   Data Extraction
   Data Storage
   Computer & Plotter Usage
   User Liaison Personnel
Total Costs
Quantifiable Benefits
Non-Quantifiable Benefits
Net Savings
Present Value Factors
  (10% Discount Rate)
Year
N
67.0
10.0
0
0
300.0
27.0
54. 0
9.0
38.9
123.7
0
629.6
260.6
164.1
(204.9)
1.00
N + 1




300.0
27.0
54.0
9.0
38.9
123.7
0
552.6
334.8
223.4
5.6
0.91
N + 2




300.0
27.0
54.0
9.0
38.9
123.7
0
552.6
409.1
282.9
139.4
0.83
N + 3




300.0
27.0
54.0
9.0
38.9
123.7
0
552.6
446.2
312.6
206.2
0.75
N + 4




300.0
27.0
54.0
9.0
38.9
123.7
0
552.6
446.2
312.6
206.2
0.61
Discounted Savings
(204.9)      5.1    115.7    154.7    140.2
Total Present Value
 210.8
                                       Figure IV-5
                                     IV-11

-------
                                  OPTION I ($000)
                               Alternative 2 (COMNET)
                                                              Year
Development Costs
   Remote Terminals Purchase
   Software Implementation
   Software Enhancement
   Computer Site Change
   User Support Personnel

Operating Costs
   Remote Terminal Maint. & Materials
   User Personnel (Analyst & Intern)
   Data Extraction
   Data Storage
   Computer & Plotter Usage
   User Liaison Personnel

Total Costs
Quantifiable Benefits
Non-Quantifiable Benefits
N
67.0
10.0
0
82.0
300.0
27.0
54.0
9.0
7.8
675.0
0
1231.8
260.6
164.1
N + 1




300.0
27.0
54.0
9.0
7.8
675.0
0
1072.8
334.8
223.4
N + 2




300.0
27.0
54.0
9.0
7.8
675.0
0
1072.8
409.1
282.9
N + 3




300.0
27.0
54.0
9.0
7.8
675.0
0
1072.8
446.2
312.6
N + 4




300.0
27.0
54.0
9.0
7.8
675.0
0
1072.8
446.2
312.6
Net Savings
Present Value Factors
  (10% Discount Rate)
 C807.1)   (514.6)  (380.8)  (314.0) (314.0)
    1.00      0.91     0.83    0.75    0.68
Discounted Savings
 (807.1)   (468.3)  (316.1)  (235.5) (213.5)
Total Present Value
(2040.5)
                                       Figure IV-6
                                     IV-12

-------
                                  OPTION I  ($000)
                                Alternative 3  (RTF)
                                                              Year
Development Costs
   Remote Terminals Purchase
   Software Implementation
   Software Enhancement
   Computer Site Change
   User Support Personnel
                                                N
                N+l   N+2   N + 3   N + 4
        67.0
        10.0
         0
       164.0
       300.0    300.0   300.0   300.0   300.0
Operating Costs
   Remote Terminal Ma int. & Materials
   User Personnel (Analyst & Intern)
   Data Extraction
   Data Storage
   Computer & Plotter Usage
   User Liaison Personnel
27.0
54.0
9.0
50.4
472.5
0
27.0
54.0
9.0
50.4
472.5
0
27.0
54.0
9.0
50.4
472.5
0
27.0
54.0
9.0
50.4
472.5
0
27.0
54.0
9.0
50.4
472.5
0
Total Costs
Quantifiable Benefits
Non-Quantifiable Benefits
      1153.9    912.9   912.9   912.9    912.9
       260.6    334.8   409.1   446.2    446.2
       164.1    223.4   282.9   312.6    312.6
Net Savings
Present Value Factors
  (10% Discount Rate)
      (729.2)  (354.7)   (220.9)   (154.1)  (154.1)
         1.00     0.91    0.83    0.75    0.68
Discounted Savings
      (729.2)  (322.8)  (183.3)  (115.6)  (104.8)
Total Present Value
     C1455.7)

Figure IV-7
                                     IV-13

-------
                                  OPTION I ($000)
                                Alternative 4 (GSA)
                                                               Year
Development Costs
   Remote Terminals Purchase
   Software Implementation
   Software Enhancement
   Computer Site Change
   User Support Personnel
          N     N+l   N + 2   N+3   N + 4

        67.0
        10.0
         0
         0
       300.0    300.0   300.0   300.0   300.0
Operating Costs
   Remote Terminal Maint. & Materials
   User Personnel (Analyst & Intern)
   Data Extraction
   Data Storage
   Computer & Plotter Usage
   User Liaison Personnel

Total Costs
Quantifiable Benefits
Non-Quantifiable Benefits
27.0
54.0
9.0
97.2
461.2
0
1025.4
260.6
164.1
27.0
54.0
9.0
97.2
461.2
0
948.4
334.8
223.4
27.0
54.0
9.0
97.2
461.2
0
948.4
409.1
282.9
27.0
54.0
9.0
97.2
461.2
0
948.4
446.2
312.6
27.0
54.0
9.0
97.2
461.2
0
948.4
446.2
312.6
Net Savings
Present Value Factors
  (10% Discount Rate)
      (600.7)   (390.2)  (256.4) (189.6) (189.6)
         1.00     0.91    0.83    0.75    0.68
Discounted Savings
      (600.7)   (355.1)   (212.8)(142.2) (128.9)
Total Present Value
     (1439.7)

Figure IV-8
                                     IV-14

-------
                                  OPTION II ($000)
                                                               Year
Development Costs
   Remote Terminals (4) Purchase
   Software Implementation
   Software Enhancement*
   Computer Site Change (Reprogramming)
   User Support Personnel
    N     N+l   N + 2   N + 3   N + 4

  67.0
  10.0
  70.5     64.0    30.0
   0
   0
Operating Costs
   Remote Terminals Maint. & Materials
   User Personnel (Analyst & Intern)
   Data Extraction
   Data Storage
   Computer & Plotter Usage
   User Liaison Personnel (Coordinator)

Total Costs
Quantifiable Benefits
Non-Quantifiable Benefits

Net Savings
Present Value Factors
  (10% Discount Rate)
27.0
54.0
9.0
38. 9
123.7
50.0
450.1
260.6
164.1
(25.4)
1.00
27.0
54.0
9.0
38.9
123.7
50.0
366.6
334.8
223.4
191.6
0.91
27.0
54.0
9.0
38.9
123.7
50.0
332.6
409.1
282.9
359.4
0.83
27.0
54.0
9.0
38.9
123.7
50.0
302.6
446.2
312.6
456.2
0.75
27.0
54.0
9.0
38.9
123.7
50.0
302.6
446.2
312.6
456.2
0.6i
Discounted Savings
 (25.4)    174.4   298.3    342.2    310.2
Total Present Value
1099.7
*As a cosponsor of UPGRADE this cost is a worse case analysis.  Some of
 these costs will probably not be incurred by EPA but by other users.

                                        Figure IV-9
                                     IV-15

-------
F.  System Acceptance Ratings

     Determination of System Acceptance Ratings  for EPA using UPGRADE has been
measured against the Acceptance Criteria and presented  in Figure IV-10.   The
analysis for each rating is discussed in the following  paragraphs.

     1.  Number of EPA Users - The number of identified users with defined
requirements is 9.  The user level for each option is as follows:

                         OPTION            User  Level
                            I                 >15

                           II                 5-15

                          III                   <5

                           IV                    0

The rating assigned to Option I is "2" because there are less than those
required.  This applies to all alternatives of Option I.  The number of users
fits within the limits of Option II so it is assigned a rating of "10".   It
also exceeds the user limits of Option III and is also  assigned a rating of
"10".  Under Option IV, a rating of "0" is assigned because there are identi-
fied users and Option IV would not satisfy any of their needs.

     2.  Satisfy User Needs - Option I will satisfy the user needs in varying
degrees.  On NIH's computer, user needs will be  completely satisfied, this
system has good response and little downtime. It is assigned a rating of
"10".  At EPA's COMNET installation, UPGRADE would have to operate in a non-
virtual memory mode.  Some users will have to wait until others are finished
because of the number of UPGRADE size memory partitions available.   It is
assigned a rating of "9", because the wait will  depend  on the number of users
on the system at one time.  Users of EPA's computer at  RTP, N.C. already have
delays getting on the machine.  Installation of  UPGRADE on this computer would
foster further delays.  It is assigned a rating  of "8".  A commercial computer
facility such as GSA's Boeing Computer Services  will provide good user service
to stay in business.  It has the same computer equipment as NIH, thus it is
assigned a rating of "10".

     Option II implies that EPA, although it will have  great influence, will
not be able to completely control UPGRADE'S development and satisfaction of
EPA User needs.  Thus, it is assigned a rating of "9".
     Option III gives EPA little influence or satisfaction of EPA user needs.
For the most part, EPA must use what is available.   It is, therefore, assigned
a rating of "4".

     Option IV satisfies none of the user's needs and is assigned a rating of
"0".

     3.  Expansion Capacity - Under Option I, the NIH installation has data
base expansion constraints due to the limitations imposed for on-line disk
storage.  It is assigned a rating of "2".  EPA's COMNET computer has greater
data expansion capability but has severe user expansion contraints by the
                                   IV-16

-------
current operating system.  Effectively, each user requires the maximum amount
of UPGRADE core for the entire analysis session.  Four users require 2 million
bytes of core dedicated.  It is, therefore, assigned a rating of "7".  If the
operating system changes in the future, this rating would change.  The compu-
ter at RTF has a finite capacity for on-line users; the limit is 40.  There
are other users of this computer.  Therefore, a rating of "6" is assigned.
The GSA installation will add capability to support users with money.  It is,
therefore, assigned a rating of "10".

     Option II has limited expansion capability, again constrained by NIH's
data storage limitations.  However, under this option it is assigned a rating
of "5" because other agencies will also be underwriting system expansion.

     Option III is assigned a rating of "1" because EPA will have little
influence over the nature of future system expansion.

     Option IV is assigned a rating of "0" because where there is no involve-
ment, there is no expansion.

     4.  Implementation Cost - For Option I at NIH, the costs would be nominal
because the system is running there.  At COMNET, the cost would increase
greatly because major program modifications are required.  It is, therefore,
assigned a rating of "4".  Before the computer is used at RTP, the system must
be rewritten, thus a rating of "1" is assigned.  At GSA no reprogramming is
required.  The computer is compatible with NIH, a rating of "9" is assigned.

     Option II is assigned a rating of "9" because only nominal costs are
involved.

     Option III is assigned a rating of "10" because the cost of co-sponsor-
ship is removed.

     Option IV is rated "10" because no costs are involved for this option.

     5.  Operating Cost - The recurring operating costs over the 5-year life
cycle are low at NIH, but under Option I the total cost of a User Support
Group must be added.  The NIH alternative, under Option I, is thus assigned a
rating of "8".  At COMNET, computer charges are significantly higher, thus a
rating of "6".  At RTP, the computer charges are comparable to COMNET, thus
RTP is also assigned a rating of "6".  At GSA the computer charges are higher,
thus a rating of "5" is assigned.

     Option II is assigned a rating of "9" due to the cost of co-sponsorship
for the User Support Group.

     Option III is assigned a rating of "10" because it has the lowest opera-
ting costs.

     Option IV has no operating costs; it also is assigned a rating of "10".

     6.  Cost Savings - Identified cost savings are significant and should
increase with use of the system.  However, they must be offset by Implementa-
tion costs amortized over the 5-year life cycle along with annual operating
costs.  Relative cost savings are high in Option I.  On the NIH computer
                                      IV-17

-------
alternative,  the cost savings are reduced by the  total  cost of the User Support
Group, thus a rating of "9" is assigned.   At COMNET operating costs are higher
and some reprogramming is involved to offset savings,  thus a rating of "7".
At RTF, considerably more reprogramming is involved,  reducing cost savings
again, thus a rating of "5".   At GSA, the cost savings  are only reduced by the
higher operating costs.  A rating of "8"  is assigned.

     Option II produces the most cost savings, thus a  rating of "10".

     Option III has reduced capabilities, thus reduced  analysis and reduced
cost savings.  A rating of "5" is assigned.

     Option IV is assigned a rating of "0".  "No  use"  implies no savings per
application.

     7.  Time Benefits - Increases in response time range from 5-10 times in
some applications to immeasurable in other.  The  ability to respond to ad hoc
queries is one of UPGRADE'S most useful features.   In  Option I, Alternative I,
NIH, the response time benefits are rated "10".   Alternative 2 is rated "9"
because of the possibility of delay for a user getting  on the system.   Alter-
native 3 is rated "8" for the same reason.  Alternative 4, GSA, will have'the
same response time benefits as NIH, thus  a rating of "10".

     Option II response time will be slightly lower,  a  rating of "9",  because
EPA does not have complete control over satisfying user needs.

     Option III control is even less, thus a rating of  "8".

     Option IV has no time benefits, thus a rating of  "0".

     8.  Analysis Benefits - The analysis improvements  are high with UPGRADE.
On a relative basis, all alternatives of Option I rate  a "10" because EPA can
fully satisfy all user needs.  Under Option II, EPA is  a co-sponsor and has
reduced control and a rating of "9".  Under Option III, EPA has very little
control so the rating is reduced to "4".   And under Option IV, EPA has no
control, thus a rating of "0".
     9.  EPA Controlability - Under Option I,  EPA controls the system, thus a
rating of "10" for all alternatives.  Under Option II that control is reduced
and a rating of "7" is assigned.   Option III entails very little control, thus
a rating of "2".  Option IV has a rating of "0".
    10.  Risks - There are both economic and technical risks involved in using
any system.  Fortunately, UPGRADE is a system that has been in operation for
approximately 3 years, so the overall risks are significantly less than with a
new system.

     Under Option I, the NIH alternative is assigned a rating of "7" because
of EPA divorcing itself from CEQ and supporting its own version of UPGRADE.
The risk is even greater with the COMNET alternative, a rating of "4", due to
the program modification required.   At RTP, the risk increase is even greater
because of complete reprogramraing.   A rating of "1" is assigned here.  At GSA,
the risk is comparable to NIH, thus a rating of "7".
                                   IV-18

-------
     Option II has a rating of "9" strictly because of the co-responsibilities
of sponsoring a User Support Group.

     Option IV has considerable risk by not responding to outside mandates and
criticism.  A rating of "0" is assigned.

    11.  Totals - The weighted rating is derived by multiplying the factor
weights by the raw ratings.  The complete picture is contained in Figure
IV-10, p. IV-20 .  The System Acceptance Ratings are:

           OPTION            Weighted Rating            Rank

             I - 1                417                     3
               - 2                373                     5
               - 3                315                     6
               -4                429                     2
            II -                  505                     1
           III -                  385                     4
            IV -                   80                     7
                                    IV-19

-------
                                              UPGRADE  ACCEPTANCE CRITERIA
(Rating is from 0-10)
(Weight is from 0-10)

Max.
Score Criteria

100 1. Number of EPA Users
60 2. Satisfy User Needs
60 3. Expansion Capability
40 4. Implementation Cost
40 5. Operating Cost
60 6. Cost Savings
80 7. Time Benefits
80 8. Analysis Benefits
20 9. EPA Controlability
60 10. Risk




Weight

10
6
5
4
4
6
8
7
2
5
.System Acceptance
Racing
Rank
OPTION
I
ALTERNATIVES
1
NIH
R
2
10
2
9
8
9
10
10
10
7
W
20
60
10
36
32
54
80
70
20
35
77
417
3
2
COMNET
R
2
9
7
4
6
7
9
10
10
4
W
20
54
35
16
24
42
72
70
20
20
68
373
5
3 1
Rr
R
2
8
6
1
6
5
8
10
10
1
pp
W
20
48
30
4
24
36
64
70
20
5
57
315
6
GSA/Boeing
R
2
10
8
9
5
8
10
10
10
7
W
20
60
40
36
20
48
80
70
20
35
79
429
2
OPTION
II


Cosponsor
R
10
9
5
9
9
10
9
9
7
9
W
100
54
25
36
36
60
72
63
14
45
86
505
1
OPTION
III

Limited
Use
R
10
4
1
10
10
5
8
4
2
10
W
100
24
5
40
40
30
64
28
4
50
64
385
4
OPTION
III

NO
Invol\
R
0
0
0
10
10
0
0
0
0
0
ement
W
0
0
0
40
40
0
0
0
0
0
20
80
7
<1
o
                                                       Figure IV-10

-------
SECTION V

-------
                              V.  RECOMMENDATIONS

A.  Initial Recommendation

     Based upon the evaluation of Section IV, the most cost effective solution
for implementation of the UPGRADE system is Option II.  It has the highest
System Acceptance Rating and the greatest net cost savings.  Its cost savings
are at least $350,000 more than the next option.  It presents an additional
analysis capability for EPA that has been successfully utilized by CEQ, DOE,
and some of EPA's evaluators.  Other options may be selected at EPA discretion
should the number of users significantly change.  UPGRADE enhancements termed
"Essential" by the user evaluations and not currently funded by other UPGRADE
users have been reduced to one.  It is the addition of "Moving Means".  EPA,
being a co-sponsor of UPGRADE, may wish to foster an increase in user training
for UPGRADE analysis to obtain maximum capabilities and benefits.

B.  Subsequent Recommendation

     During this study, a very detailed program analysis was conducted of
UPGRADE.  From this analysis, a number of ways were found to improve program
operation and reduce core requirements (see Appendix F).  EPA, as a co-sponsor
of UPGRADE, may wish to foster maintenance and software configuration control
programs that fulfill the users defined requests for standardization of termi-
nology and follow the recommendations in Appendix F.

     In the event the combined user base of UPGRADE should ever exceed the
capabilities of the NIH-DCRT installation, EPA may wish to consider the alter-
nate site analysis within Option I of this study.  That is, UPGRADE can be
moved, when necessary, to a commercial installation similar to the GSA-Boeing
computer center and the NIH-DCRT installation could be retained as a develop-
ment installation for cost purposes.

C.  Time Phasing of Enhancements

     Accomplishment of EPA user requested enhancements must be on an UPGRADE
user-wide priority basis.  However, it is recommended that EPA provide funds
and exert influence to ensure the enhancements defined in Section II and
scheduled for accomplishment in Section IV be programmed in the time frame
recommended.

D.  Project Plan

     The next step upon MIDSD's approval of this study, is for EPA to prepare
the appropriate IAG with CEQ containing program requirements and schedules, as
provided in Figures IV-2, 3, and 4 and Figure V-l.

     The project plan for programming enhancements is based upon the practical
limitation that, only a programmer experienced in UPGRADE coding can make the
modifications and enhancements required.  The programmer must also be capable
of operating within the CEQ Support Staff conventions for handling UPGRADE
Software configuration control.  Thus, a minimum of parallel efforts are
scheduled.
                                     V-l

-------
     The sequence of tasks is determined by relative priority of the feature.

     Each task can be broken down into the standard subtasks of maintenance/
enhancement programming:

            review design vs. existing code and files
            verify interfacing of new routines into system
            technical review of approach
            prepare detailed program (routine) specification
            prepare test specifications
            code
            check code in test version of UPGRADE
            compile program documentation
            update system documentation to reflect new features
            review
            system regression test
            install code in production version of UPGRADE

Management tracking by these subtasks (or a subset thereof for small enhance-
ments) will provide a management visibility to the the project as detailed as
management chooses to make it.
                                    V-2

-------
              ACTION STEPS
f
CO
MIDSD APPROVAL
JAG
ESSENTIAL RQMTS
  MOVING MEAN
NECESSARY D.B. RQMTS
  Improved STORET Interface
  Improved SAROAD Interface
  Data Extraction from
    tape storage
NECESSARY UPGRADE RQMTS
  SUPERTERSE Mode
  Add Users Analysis
    Routines
  Plot size
  Add User-defined Models
DESIRABLE UPGRADE RQMTS
  Data Save Capabilities
  Overlying Points on Graph
   Improved Alphanumeric
   Axis Description

  Add Interactive SPSS
                                                                                                  PROJECT PLAN
                                                                                                 ALL  USER NEEDS
                                                                                                  PROGRAMMING
                                                                                             (PERSON  TIMES  IN MONTHS)
                                                                     \
                                                                     12
                                                                                              I
                                                                                              24
36
                                                                         CALENDAR MONTHS
                                                          Figure V-l

-------
APPENDIX A

-------
    APPENDIX A




HISTORY OF UPGRADE

-------
                              Section I - History


Under NEPA, the CEQ was established to (among other requirements):

     •  gather timely and authoritative information concerning the conditions
        and trends in the quality of the environment..., to analyze and inter-
        pret such information...
     •  review and appraise various (Federal) programs and activities in light
        of NEPA... make recommendations to the President...
     •  develop and recommend...  national policies to foster and promote the
        improvement of environmental quality...
     •  conduct investigations,  studies, surveys, research, and analyses
        relating to ecological systems and environmental quality...
     •  document and define changes in the natural environment... accumulate
        necessary data and other information for a continuing analysis of
        these changes or trends  and an interpretation of their underlying
        causes...

     Early on, CEQ recognized that to be fully responsive to the requirements
of its charter, it would need some type of cross-disciplinary analysis tool
which would assist in correlating environmental, natural resources,  public
health and related data.  Because environmental policies, programs,  and deci-
sions at all levels had to be guided by the best analytical information avail-
able, the required analysis tool - and its resultant data - would have to be
readily accessible to both policy makers, or managers, and scientists.

     With a clear concept of what is needed, CEQ developed the UPGRADE system
to:

     •  provide easier access to computerized environmental data
     •  facilitate more efficient and convenient environmental assessments
     •  foster increased uses for available environmental data
     •  provide better capabilities for identifying correlations between
        factors represented in different computerized data banks, and
     •  improve environmental research and data collection programs  through
        the insight and feedback provided by the users of the UPGRADE system.

     Sine 1975, CEQ has been actively collaborating with other federal agencies
in screening and selecting data  for use in the UPGRADE system.  From the time
it was first implemented, UPGRADE has been used to support various federal and
state government projects including:

        CEQ Annual Reports
        Other CEQ analyses for reports to the White House and/or OMB
        State of New Jersey environmental analyses
        The Environmental Protection Agency
        The Department of Energy, and
        'Mapping of NCHS cancer death rate data.

     (Several case histories of water and air quality studies, using UPGRADE,
are presented in Section II of this appendix).

-------
     Support for the above projects has come in the form of:   straight data
analyses; maps (for NASQAN, NCHS,  and county and population-weighted air,
water and health data); and graphs, charts,  and bar-graphs relative to air
and water quality and health,  oil  spills and coal data.   In addition, UPGRADE
has been used to support various as-needed projects such as:

     •  Pollution funding grants vs. city population
     •  Energy use in the U.S. (from 1850 -  1975), and
     •  ORDIS program elements vs.  funding.

     Thus far, most of UPGRADES's  data have  been selected from data banks
maintained by the Environmental Protection Agency (primarily SAROAD and STORET),
the National Institutes of Health,  the National Oceanic  and Atmospheric Admin-
istration, and the U.S. Geological Survey.

     With the basic system development and extensive testing of UPGRADE com-
pleted, CEQ is planning to increase the available access and user support for
UPGRADE to serve other federal, state, academic, and private organizations.
                                       A-2

-------
                          Section II - Case Histories

Twelve Rivers Study

     425 monitoring stations were selected on the mainstreams of 12 rivers.
These stations were chosen from STORET, EPA's computerized water quality data
system, on the bases of sampling location, monitoring methods, period of
record, sampling frequency, and other criteria.  In general, the data base
chosen for this evaluation includes the best ambient trend monitoring stations
on the 12 rivers under consideration.

     Of the several kinds of-statistical analyses and indicators used, the
most informative trend indicator seems to be the composite violation rate.
This indicator represents the proportion of all measurements of a specific
water quality variable which exceeds the "benchmark" value for that variable.
Benchmark values were determined on the basis of published water quality
criteria, standards, or in some cases, arbitrarily chosen reference points.

     Fecal coliforra bacteria was the main variable chosen to illustrate trends
in sanitary water quality.

Done by - GKY
Dave Tucker - Principal Investigator
Background - EPA did a study on water quality trends - as called for in section
305A of the FWPCAA - report to Congress - CEQ studied 12 rivers as a follow-up
to the EPA report.


25 Cities Study - Water Quality Trends

     The CEQ-EPA analysis of sanitary water quality and sewage treatment in 25
U.S. municipal areas from 1968 - 1976.  In general, the trend was toward
better sanitary water quality.  Areas chosen are a representative cross section
of the nation's municipal areas.  An important factor in the selection was the
adequacy of water quality data in EPA's STORET data bank.  Water quality data
from a total of 100 monitoring sites were analyzed (average of 4 sites per
city).  Monitoring sites and data were screened in consultation with EPA
regional personnel for adequacy of the variables measured, measurement frequen-
cy, period of record, geographic location, and other criteria.  The UPGRADE
system was used to analyze the data.

     Coliform bacteria, DO and BOD were considered in the water quality anal-
yses for the study areas.

Investigators:  Jim Reisa, Steve Fullerton, Mimi Hayman, Bill Chapman, Ed
Pechen, Alec McBride (EPA)
Background:  CEQ looked at top 150 cities (population-wise) which they anal-
yzed for available data - came up with 25.


Pesticides

     Between 1972 and 1975, EPA cancelled the sale of and prohibited most uses
of the persistent insecticides DDT, aldrin, dieldrin, chlordane, and heptachlor.
                                       A-3

-------
To determine how the ban of these chemicals  and  trends  in pesticide use affect
aquatic contamination,  CEQ studied pesticide residues in the water and sedi-
ments of streams and other water bodies in Texas,  Louisiana, and Oklahoma.
The study area was chosen on the basis of a  history of  heavy pesticide use and
availability of adequate stream monitoring data.   Sixty stream monitoring
stations operated by the U.S.G.S., in cooperation  with  state agencies, pro-
vided data for the study.

Investigators:  Reisa,  Fullerton, Hayman and Dale  Bottrell


Phenols

     CEQ analyzed phenol pollution of the Ohio River and its major tributaries;
measured at 22 monitoring stations by the Ohio River Sanitation Commission.
Two benchmarks were used:  (1) the 1 microgram-per-liter water quality criterion
(the standard used by Ohio and Illinois and  recommended by EPA for the protec-
tion of public water supplies) and (2) the 5-microgram-per-liter level.

Investigators:  Reisa,  Fullerton, Hayman
Coverage:  1968 - 1976


Mahoning River Study

     CEQ analyzed water quality in the Mahoning  River Basin.  The emphasis of
the study was on water quality impacts of 'industrial effluents, including iron
and steel producers, blast furnaces, and coking  operations.

     The Mahoning River Basin has been the subject of intensive analysis by
EPA because of the fear that imposing strict pollution  control requirements on
the antiquated iron and steel production facilities would lead to widespread
social and economic dislocation.

     The study was done for the Administrator of EPA as a backup for making
policy decisions regarding the 1976 lawsuit  filed  by the state of Pennsylvania,
the Sierra Club, and others.

Investigators:  Reisa and Fullerton
Time frame:  ('68) 1972 - 1976 (note:  data  available only from 1972)


NASQAN (National Stream Quality Accounting Network) Studies

     This is an on-going study - reported annually - with different variables
selected each year, depending on priorities.  The  (network) program was started
during the 1975 water year (i.e., Oct 74 - Oct 75).  It is a mapping project,
as opposed to other studies.

     NASQAN collects uniform data at the downstream ends of 334 subregional
drainage basins that collectively cover the  entire surface of the nation.
Each of the NASQAN monitoring stations measures  the same water quality vari-
ables, with the same frequency, using the same methods.
                                       A-4

-------
Investigators:  Reisa and Fullerton
Geographic Area:  Nation, including Alaska and Hawaii and Puerto Rico


Urban Air Quality Impacts of the National Energy Plan; An Assessment of Six
Cities

Principal Investigators:  K.H. Jones, T. Chapman, Randi Ferrari, John
Walker (ERCO)

     EPA's Office of Air Quality Planning and Standards and DOE's Office of
Planning, Analysis and Evaluation have made estimates of future emissions down
to the country and city level, respectively.  EPA projected emissions for 902
counties while DOE did the same for all of the 243 air quality control regions
(AQCR's). The results of these two studies were used to generate the emission
projections for this study.  All three studies, as far as increased coal utili-
zation was concerned, assumed that all provisions of the NEP would be fulfilled.

     The EPA study projected that 126 counties could have air quality viola-
tions in 1985 if the NEP were instituted based on single station data in each
county.  For this study, six counties were selected which represented the
worst air quality (with respect to three criteria pollutants) in 1975 of the
counties cited by EPA.  The air quality data for TSP, SO. and NO. were collated
using the CEQ UPGRADE capability.  Some restrictions were placed on the air
quality data base.  Several monitoring methods are known to be unreliable and
therefore excluded from the analysis.  These data were then analyzed in accord-
ance with the averaging times of the primary health related standards, e.g.,
annual and 24 hourly intervals for TSP and SO. and annually for NO..  A short-

term hourly distribution of NO. was also run to simulate the potential problems

which might be associated with a short-term standard for this pollutant.  In
order to reflect the population impacts, the monitoring sites where the criti-
cal 24 hour data were taken were located on urban population density maps.

     A simple proportional rollback/rollup model was used to relate future
emissions to future air quality at all of the stations in a county.  This is,
in most cases, a conservative estimating technique.  Generally, most emissions
do not impinge equitably on all monitoring sites in an urban region.  If we
assume that all of the emissions increase the air quality at all of the sta-
tions we are projecting a worst case from a population exposure point of view.
The only cases where this would not be true would occur where (1) present tall
stack sources would become low level sources of emissions or (2) all of the
emissions growth was allocated to a very few point sources in heavily popula-
ted areas.  The use of more sophisticated diffusion models are needed to
resolve such geometry errors in proportional air quality projections.  Models
and their necessary special and temporal inventories have yet to be developed
for most cities.
                                          A.-5

-------
A Methodology for Estimating Differential Populations  Impacts Under Various
Ozone Standard Scenarios

Principal Investigators:  Kay Jones (CEQ),  Tim Chapman,  Mike Airey, Mark
Feldman

     EPA is reevaluating the national ambient air quality standard (NAAQS) for
ozone based on a reexamination of the health related data.   The question
arises as to what are the possible differential health risks between the
current standard and alternatives which may be more lenient.  In order to
perform such an evaluation one needs to be able to estimate actual population
exposure under various projected control scenarios. CEQ, over the past year,
has been developing a risk analysis capability as part of its overall environ-
mental data analysis system i.e., UPGRADE.

     An ozone air quality data base was extracted from CEQ's UPGRADE data bank
and examined for quality and completeness.   The 6 Denver ozone monitoring
sites were then located on population density maps so  that sectorial popula-
tions could be assigned.  Using a transformation routine in UPGRADE, base year
distributions for 1975 were generated and distributions assuming different
control levels were projected.  These distributions were then analyzed in
terms of various ozone standard scenarios and population exposure statistics.


Air Section CEQ "Annual Report" 77-78

Description:

Trends were/will be assessed using CEQ's air quality data as retrieved from
EPA's SAROAD and manipulated by UPGRADE into frequency distribution.

Project Principal Investigator:  Kay Jones, Ph.D.

Study Coverage (Variables, Geographical Area, Time Period):  Selected sites
(good data - complete) from nation.  Years 73-76


Internal (draft) work on implications of proposed short-term NO^ standard.
Description:

UPGRADE analytical procedures were/will be used to compile "Laundry List" of
cities in violation of proposed new standard as compared to baseline cities.

Project Principal Investigator:  Kay Jones, Ph.D.

Study Coverage (Variables, Geographical Area, Time Period):  Selected sites
(continuous monitors) for regions in violation of current standard.  N0_,
1973 - 76
                                          A-6

-------
Air Section CEQ "Annual Report" 77 - 78

Description:

Risk - frequency distributions for sites in selected SMSA's were/will be
generated to evaluate exposure parameters for risk assessment.

Project Principal Investigator:  Kay Jones, Ph.D.

Study Coverage (Variables, Geographical Area, Time Period):  Selected SMSA's
(good data - complete) CO and Ozone.  1975 base year.
                                         A-7

-------
APPENDIX B

-------
                APPENDIX B




COMPARISON OF UPGRADE TO OTHER EPA SYSTEMS

-------
                  COMPARISON OF UPGRADE TO OTHER EPA SYSTEMS

     The available documentation on the EPA STORET and SAROAD systems has been
studied to determine the extent of graphics capability built into each system
and to compare their capabilities to the graphics available in UPGRADE.  (See
Table B-l.)

     The comparison reduces to STORET vs. UPGRADE, since, according to avail-
able SAROAD documentation, there is no graphics capability in SAROAD.  Although
the SAROAD documentation is from 1971 (User's Manual), 1973 (Terminal User's
Manual), and 1974 (SAROAD Interactive Access System), there has been no ad-
dition of graphics to SAROAD in more recent years.  The SAROAD system is ori-
ented primarily toward the storage and retrieval of aerometric data and some
rudimentary statistical operations on the data, such as grouping, means and
standard deviations, are available.  These are also available during UPGRADE
processing of SAROAD data.

     According to the June 1977 STORET manual, the following graphics programs
are available:  PLOT, LOG, MSP, and REG.  Each of these will be discussed in
relation to the graphics capabilities of UPGRADE, and also the STORET program
STAND, which is not a graphics program but does produce output similar to the
bar-charting capability in UPGRADE.  Copies of the relevant pages of the STORET
manual are included in Appendix F.

I.   STORET PLOT Program

         The PLOT program produces scattergram or polygon-type plots of vari-
     ables versus time.

                PLOT                                     UPGRADE

     only time on x-axis                   time or any other variable on
                                            x-axis

     stream loadings                       not currently available (requires
                                            data loaded to UPGRADE)

     log values on y-axis                  log values either axis, also
                                            probability axes

     limits for variable values            full data filtering capability

     batch mode                            interactive mode

     produces CALCOMP plot (one            TEKRONIX hard copy or CALCOMP, or
      plot per tape)                        line printer, etc.

     Scaling can be made uniform           can be done with slight difficulty
      over series of plots

II.  STORET LOG Program

         The LOG program is the mapping program for STORET.  It serves mainly
     to show the locations of the water monitoring stations and does not plot
                                    B-l

-------
00
I
to
Requirements
Data
Access
For:
Air Quality
Water Quality
Demographic
Health
Other Fnvir.
User's Daca
DATA Listing
Data Manipulation
Basic Statistics
PolvKon Pines
Dar Charts
Reqrassion
Percent Jliis
Happing
Correlation
Interactive vs. Batch
User background rcq.
Dace of information
STORE!
NO
Main EPA DB
NO
NO
NO
ONLY IF IN STORET
YFS. BATCH JOB
FILTERING
BATCH JOB
STRAIGHT LINE
NO
REG PROGRAM
DFGREE=1
STAND program
LOG program
REC program
B.itch
SOME TEXT EDITOR &
FOLIOl' MANUAL
June '77
ADP01T
NO. bur could
SET UP TOR IMMEDIATE
COULD HE ADDED
COULD Bi: ADDED
COULD BE ADDED
YES, THROUGH UTS
TO TERMINAL OR OTHER
I/O PFVirF.
FULL TRANSFORM
CAPABILITY
YES; PLUS AGGREGATION
AND OTIII:R STATISTICS
STRAIGHT OR DASH
SHADED WIDTH & NUMBER
TONTROI
DECRTE-1 ro 9
FULL USLR PROGRAMMED
NO
NO as standard
CAN BE USER PROGRAMMED
INTFRACTIVE
FORTRAN or BASIC helpful
not rcq'd; learn manual
June '75
SAROAD
Main EPA DB
NO
NO
NO
NO
ONLY IF IN SAROAD
YES, BATCH JOB
FILTERING
BATCH JOB
NO
NO
NO

NO

llarrh
Some text editor &
follow manual
"71, <71, «74
UPGRADE
BY TAPE FROM SAROAH
BY TAPE FROM STORET
VERY LITTLE
BY TAPE FROM NCHS
SOME: oil spill, etc.
YES, GENERAL INTERFACE
TO TERMINAL OR DISK
FILTERING TRANSFORM PLANNED
YES; PLUS SAS
STRAIGHT OR DASH
SHADED
NUMBER CONTROL
DFfiRFF-l rn A
PARTITIONING SUBROUTINES
NASQAN, COUNTY,
DFMnrP «nuio
AS PART OF REGRESSION
TNTFRAfTTVF
No computer's data analysis
•3 It 1 lie ran'H (no fir il 1 \
Aiipucr '7ft
                                  Table B-l.   Graphics Capabilities vs. Other EPA Systems

-------
     any indication of values of variables;  in fact, it does not retrieve any
     data.   Plots can include county lines,  city outlines, lakes, rivers, etc.
     The scale of the map is controlled by the user.

         The UPGRADE capability most similar to this, currently covers county
     level and NASQAN basin maps.  UPGRADE produces the appropriate data set
     which is then mapped using VITRO software and plotters.  At present, the
     data bases for the NASQAN and the demographic county maps are being put
     directly on UPGRADE.

         The UPGRADE mapping capability produces maps with up to five shading
     levels (cross-hatching, etc.) giving a visual indication of the geographic
     and/or demographic-geographic distributions of pollutants, morbidity/
     mortality, and other variables available on UPGRADE.

III.  STORET MSP Program

         The MSP (Multiple Station Plot) program produces plots of statistical
     values of variables observed over a specified time period; y-values are
     plotted against an x-axis scale of distance along a stream - thus, mul-
     tiple stations on the stream.  Again, this is batch mode and produces
     CALCOMP or line printer output.  UPGRADE does not now have this exact
     capability.  The statistical capabilities are there; stations can be
     grouped, and plots produced for any two variables.  One technique that has
     been used on UPGRADE for similar analysis is to use automatic sequencing
     to produce a series of plots, each one showing time series or violation
     percent at one station.  Studied in sequence, these plots provide a similar
     picture of the situation as a function of the station's position on the
     stream.

         Also in UPGRADE, users of the IDB (Integrated Data Base) have avail-
     able to them an option of plotting data versus a "geographical profile" on
     the x-axis.  Since this data is at the county level, the interpretation is
     not as straightforward as for stations on a stream.  Some water, health,
     and other variables are available on the IDB and more are being added.

IV.   STORET REG Program

         The REG program produces a scattergram plot of two variables, or a
     variable and time, along with a fitted regression straight line.
                REG
           UPGRADE
     Batch mode
Interactive mode
     Output on line printer; user
      must draw in line
     Only first order polynomial
     Can plot two different (or
      same) variables at two
      different sites
Plot on TEKTRONIX or CALCOMP or
 line-printer; much higher reso-
 lution available

Up to 6th order polynomial fitted
 to data

Can only plot data at one site, or
 grouped stations, but not one state
 versus another.  (IDB can plot one
 point for each of many stations.)
                                   B-3

-------
V.   STORET STAND Program

         This is a non-graphics program that can compare data values to a
     standard and compute various violations summaries.   Through the use of the
     partitioning of data in UPGRADE, a user can achieve these same statistics
     and then go on to graphically present the results (typically with a bar
     chart showing percentage or number of violations on y-axis versus time on
     the x-axis).

         Other UPGRADE graphics capabilities not found in STORET programs in-
     clude the use of shaded bar charts to represent data, user control of grid
     lines, tick marks, axis annotations and other "plot modifications", SAS
     statistical procedures applied to data to be graphed, data partitioning on
     graphed data, etc.

         One of the big differences is that UPGRADE is interactive and STORET
     is batch mode.  Thus, the UPGRADE user can produce many different graphs
     in an hour, each one influencing the next, while the STORET User must wait
     hours or days for each plot (and is cautioned not to do many in a short
     time).
                                     B-A

-------
Comparison of Data Analysis




            and




     Graphics Features








          ADROIT




            vs




          UPGRADE
           B-5

-------
     ADROIT (Automated Data Retrieval and Operations Involving Timeseries) was
developed by UNIDATA, Inc. under contract to the state of Michigan for use in
research on the effectiveness of water quality control procedures.  It is
oriented toward the analysis of large amounts of data extracted from STORET and
(apparently) stored in a manner similar to STORET1s, i.e., using parameter
numbers, STORET's station codes, etc.

     ADROIT is an interactive system for analysis of water quality data by
rapid retrieval,  statistical processing, and graphic display.  It is basically
an interpreter for a special purpose, problem-oriented programming language,
designed to produce retrospective statistical analysis of this data and report-
ready graphs of user-selected results.

     ADROIT had two major subsystems:  the ADROIT Computation Subsystem (ACS)
and the ADROIT Display Subsystem (ADS).  An additional related program (COMPOSE)
is available to further process, combine, and replot any of the graphs produced
by ADS.  (Detailed documentation for all programs is contained in the June 1975
ADROIT manual.)

     ADROIT operates under MTS* in a manner similar to the way UPGRADE operates
under TSO.   Thus, a complete evaluation of the ADROIT system would require
study of the MTS capabilities.

     The ADROIT special purpose interpretive programming language was designed
specifically for the analysis of timeseries data types.  In addition, two new
data types - obs and timeint - were invented.  Obs is a four-tuple of values,
comprised of the mean, sample variance, sample weight, and time associated with
the data.  Timeint holds beginning and ending time along with interval width,
thus allowing easy time period restriction and data aggregation.

     Functions available to operate on variables in ADROIT include:  Type
Conversion (obs to numeric, numeric to obs, timeint to scalar, etc.); Statis-
tical (inverse normal, Chi-square, Student's t test, Inverse Fisher's F);
Summation (sum vector); Informational (minimum, maximum, length of vector);
Numeric Computational (ABS, SQRT, EXP, LOG, etc.);  Time Series (aggregation,
'simultaneous' observations, extract 'simultaneous1 observations, restrict time
range, etc.), and Hydrological (dissolved oxygen saturation, Water Quality In-
dices [on temperature, turbidity, DO, BOD, pH, etc.]).

     One very important feature of ADROIT is that it permits the user to set up
a library of user defined procedures.  The basic purpose is to allow the user
to store a set of commands and execute all of them by typing in the name of the
procedure (with any passed parameters).  Also, control flow statements are
fully available for use in procedures, thus giving full programming flexibility.
This current capability of ADROIT goes at least as far as the planned "superterse"
capability in UPGRADE (and probably well beyond it in terms of "subroutine"
nesting, access to auxiliary files, etc.).
     *MTS:  Michigan Terminal System - the time-sharing operating system of the
University of Michigan Computing Center.

                                      B-6

-------
     The following is a sample procedure which uses control flow in checking
end of file on the station number file:

         PROCEDURE WQIPHOS.(STRING,OBS,TIMEINT)
         OBS WQINDX, EPAPAR
         OPEN &1
         READSTA
         WHILE .NOT. EOF
         WQINDX = WQI.(&3)
         EPAPAR = RESTRICT.(&2,&3)
         PRINT CURSTA, WQINDX, EPAPAR
         READSTA
         ENDWHILE
         CLOSE
         RETURN
         ENDPROC

         This procedure could be invoked by typing WQIPHOS.('HURSTA1,P665,TIME
         70 THRU 74)

     The above sample points out the fact that ADROIT uses non-English language
as compared to the English prompting questions of UPGRADE.  Use of ADROIT would
therefore require a thorough reading of the manual to learn the system's capa-
bilities, and would probably require referencing the manual for a while to
learn the exact forms of the commands.

     Once learned, however, and especially with use of the user procedure lib-
rary provision, ADROIT could respond more easily and quickly to the require-
ments of the user.  In this regard, UPGRADE favors the casual or occasional
user.  (Although a programming background would reduce learning time for a new
ADROIT user more than for a new UPGRADE user, such a background is not con-
sidered to be a requirement for learning either system.)

     These ADROIT user procedures can also contain graphics control commands of
the ADS.  An example of a more complex procedure that would produce either
graphs or data printout (depending on value of 'LOGICAL') follows:

         PROCEDURE WQIPHOS.(STRING,OBS,TIMEINT,STRING,LOGICAL)
         OBS WQINDX, EPAPAR
         OPEN &1
         READSTA
         WHILE .NOT. EOF
         WQINDX = WQI.(&3)
         EPAPAR = RESTRICT.(&2,&3)
         IF &5
         %BEGIN DISPLAY COMMANDS
              GRAPH WQINDX,'10000'
              EXEC
              PCHR = +
              LINE
              CRVE =1, 2, DASH
              HOLD
              EXIT
              GRAPH EPAPAR, &4


                                      B-7

-------
              AUTO
              PCHR = +
              LINE
              CRVF - 1, 2, DASH
              EXIT
         %END DISPLAY COMMANDS
         ENDIF
         IF .NOT. &5
         PRINT CURSTA, WQINDX, EPAPAR
         ENDIF
         READSTA
         ENDWHILE
         CLOSE
         RETURN
         ENDPROC

     A sample graph as might be produced by this procedure is shown in Figure
B-l.

     In the ADS, command keywords are available to modify the "structure" of
the graph.  A distinction is made between "background" elements of the graph
and "data" elements of the graph.  Here is a list of the two sets of elements:

         Background                        Data Group

     a)  the axis system               a)  plotting characters
         (x-horizontal,y-vertical)     b)  solid line
     b)  x tick marks                  c)  dashed line
     c)  y tick marks                  d)  smooth curve
     d)  x grid lines                  e)  least squares curve
     e)  y grid lines                  f)  bar
     f)  x tick mark labels            g)  general text
     g)  y tick mark labels
     h)  x axis title
     i)  y axis title

     Figure B-2 shows samples of background elements; figure B-3 shows data
elements.

     After specifying many graphs by automatic or manual modes, the ADROIT user
may then use the stand-alone program COMPOSE to format several graphs on one
page of final CALCOMP output, and also to add other text, arrows, boxes, etc.
Figure B-4 shows addition of text to two graphs combined on one page.  Figure
B-5 shows addition of graphics (circles, lines, boxes) to these graphs.  Figure
B-6 shows possible composition layouts (a maximum of six graphs per page is
allowed).

     The following table shows some differences between ADROIT and UPGRADE that
are not mentioned in the preceding discussions.
                                    B-8

-------
               ADROIT
          UPGRADE
     •   axes:1inear,log

     •   provision for 3 colors
          on Calcomp

     •   curve smoothing (connect
          points with smooth curve)

     •   "explain" facility for
          commands

     •   multiple plots per graph
          frame

     •   up to 6 graph frames per
          page

     •   regression: degree =
          1 through 9

     •   draw confidence limits or
          standard deviation bars
          around plotted data
          aggregation points

     •   Make changes to graph without
          redrawing on screen

     •   Produce plots on CALCOMP or
          display screen

     •   water quality and user data
         No map interface
SUMMARY
•   axes:linear,log,probability

•   only one color


•   not available


•   "help" explanations of
     questions

•   only one plot per graph
     frame

•   only one frame per page
    regression: degree
     1 through 6

    not available
•   Redraw for any change
    plots on CALCOMP, ZETA, line
     printer or display screen

    water, air, health, and user
     data

    NASQAN, county, demographic
     maps
     The main difference between ADROIT and UPGRADE is the mode of user inter-
action.  UPGRADE asks English questions; the user answers "yes", "no", "help",
or "01", "13", and so on, as appropriate.  The user doesn't have to really
understand where he or she is going, though the results may be questionable if
this understanding is lacking.  On the other hand, ADROIT simply waits for the
user to issue a specific command instruction, executes it, then waits for
another command (via keyword commands).  More is required of the user, but
more analytic capability is readily available (especially through the procedure
library) as a result.

     Other differences are relatively small or cosmetic since the basic motiva-
tion of the two systems is remarkably similar i.e., providing an interactive
graphics system for environmental data.
                                   B-9

-------
                                         STN  =  810002
0.05
             71        72        73       74
                TIME  OF  OBSERVATION
75
            Figure B-l.  Total Phosphates at Huron River Station

-------
7
                   Y-AXIS TITLE
                 14.-s
               CD

               X
               o
Y-TICK MARK
  LABEL   —^
                             Y-TICK MARKS
                                   AXES
                                      +
                                                                 X-GRID LINE
                                                                      Y-GRID LINE
                                                       X-TICK MARKS
                                                   +
                                      10.       15.       20.

                                   TEMPER ATUREr DEG.  C
                               X-AXIS TITLE
                                                           25.
30,
                                                                    X-TICK MARK
                                                                      LABEL
                            Figure B-2.  Graph Background Elements

-------
            GENERAL TEXT
"
                                                   SMOOTH CURVE
THIS  IS
GENERAL
 TEXT
           DASHED
            LINE
                             SOLID
                             LINE
               LEAST
              SQUARES
                FIT
         PLOTTING
         CHARACTERS
                                                              BAR
                           Figure B-3.  Graph Data Elements

-------
WATER QUALITY INDICATORS
       100.T
     UJ 80.
     z
       so.
       40.
     < 20
        0.
             !
         70
71    72    73
  TIME OF OBSERVATION
                                   75
                          STATION 580047
                           HURON RIVER
                            BERLIN TWP.
        0.05
          70
 71    72    73    74
   TIME OF OBSERVATION
       Figure B-4. Original Graphs with Textual Annotation
                    B-13

-------
TER  QUALITY  INDICA TORS
  100.T
 X
 8 80-
 Z
   60."
 a «•
 fg

 I-
   o.
•      ^       J
    70
  0.05
       LOW
         _L
     71     72    73    74
       TIME OF OBSERVATION
                               75
                     STATION 580047
                      HURON RIVER
                       BERLIN TWP.
         71     72    73    74
           TIME OF OBSERVATION
                          75
    Figure B-5. Examples of Graphical Annotation
               B-14

-------
          1
1
                                                              1
                                                             1
-2:
              Figure B-6.   Possible
               Composition Layouts


                      B-15

-------
PGM=PLOT,

The PLOT program plots the values  of each selected  parameter  (y-
axis) for each selected station  for the specified time period  (x-
axis).  Values plotted may take  the form of raw concentrations
(e.g. mg/1)  or loadings  (e.g.  Ibs/day).  Options include scale
control and  plotting of symbols.   Plots are produced  on EPA's
digital plotting equipment  (CALCOKP) and disseminated to users.
             ITMCT
              cnnom     LNGL
             «C !« 91.2 0*9 Zl »«.C 1
              IMG IIWE 10 111 c
             WOSS    (iicnicm
             I (WE Sift* I Or      OTO7U
             HI OHTOMICBN HIVtK
             IMMFM
              eooo fen BWTM ttws oo
             tint wrrs
This  PLOT program output plots the values  of  parameter 010«5
 (Iron,  Total, ug/1)  stored at station 070009  for the years  1973
through the first half  of 1976.
                                B-16

-------
PGM=LOC,

The LOG program plots a map of a user defined area,  and  plots  a
symbol to denote the locations of all stations within  the area,
using CALCOMP plotting routines.  Printed output  from  the mapping
routine includes a listing of all stations and their associated
latitude and longitude coordinates.  Plots can optionally include
outlines of cities, lakes, reservoirs (where available),  and
county lines.  Plotted stations can be tagged with coded
identifiers which cross-reference to the printed  output.
                                                        . '"O'fCTiaw OCCMCT

                                                    STORET SYSTEM
                                                   tt*l.i '• IMMI It > H »LIIM«*
This LOG program output plots the  locations  of  stations along the
Flint River in Michigan,  along with the outline  of  the  counties
through which the river flows.  The stations  are tagged for easy
cross reference to a  list of station  locations.
                              B-17

-------
PGM=STAND,

The STAND program compares the observed values of selected water
quality parameters with a set of values (criteria)  specified
within the retrieval request.  These criteria could be the state
or Federal standards currently in force for a particular stream
segment of interest.  Stored parameter values which do not
satisfy the criteria comparison are flagged with an asterisk  (*)
in the program output.  The program provides for various output
formats, including violations lists and violations summaries.
VIOLATIONS KITH
AKB NT/LAKE

DATE TIHC
73/06/04 0*00
71/09/11 0915
14/07/19 0110
SUPPORTING PARAMETERS
00011 00300 00400 00(10
MATER DO Pd NB3-N
TEMP
rAHH
(1.
((.
78.
TOTAL
HC/L SI) HC/L
9. 7.
1. (.
e. 7.
070D09 LONCL
46 14 11.2 0(9 21 4B 0 1
LONG LAKE 10 41 E HATERSMEET
J4053 HICBIGAN
LAKE SUPERIOR 070793
MS OHTONACOH RIVER
I4ACNFS9
0000 PEET DEPTH CLASS 00
01045 31J01
I ROM TOT COLI
PE.TOT HP I NENDO
UG/L /lOOKL
1500 0>
•00.0*
(20.0* 10 0
AM ITT/LAKE
SUMMARY OP VIOLATIONS ON




•0 OF VALUES
MEAN
MEDIAN
NO or VIOLS
PERCENT VIOL
HINIHUR VIOL
MEAN VIOL
MAXIMUM VIOL

00011
MATER
TEMP
PAHN
32
eg.
(2.
0
0
0.
0.
0.

00300
00

MG/L
22
10.
9.
0
0.
0.
0.
0.

004CJ
PU

SU
32
(.
7.
0
0.
0.
0.
0.

00(10
MH3-N
TOTAL
MC/L
S
0.
0.
0
0.
0.
0.
0.

070009 LOkGL
4( 14 31.2 019 21 4B 0 1
LONG LAKE 10 MI E HATERSMEET
24053 MICHIGAN
LAKE SUPERIOR 070793
MB ONTONAGON RIVER
14ACNPS9
0000 PEET DEPTH CLASS 00
SAMPLES COLLLCTED PROM 73/01/0] TO 74/07/13
0104S
IRON
PE.TOT
UC/L
15
1S6.I
120.0
1
7.
(20.0
(20.0
(20.0
500.0
J1501
TOT COLI
HF1MENDO
/100HL
10
294. S
ts.o
2
20.
UO.O
1150.0
1SOO.O
(00.0
Two formats of the STAND program are shown above: Violations with
Supporting Parameters, and a Violations Summary.      ati™s wxtn
                             B-18

-------
PGM=MSP,

The MSP  (Multiple Station  Plot)  program performs a number of
statistical computations on the values of selected parameters,
and plots the resulting values as a function of the stations
selected.  The program allows the user rather extensive  control
over the format of  the resulting plots.  Parameters to be
plotted, scaling and  axes  control, statistical values to be
plotted, stations to  be grouped, and line printer or digital
plotter output are  all user-optional specifications.
     stontt
            o
          KILCJ
                         to
                                5THTICN PlOT IKSfl
                             run IUI2I 10 isioa?
                               SO    120
                                          ISO
                                                110
net no. i
>IO "IKS
                                                      
-------
  PGM=REG,

  The REG program allows the  computation of  the best-fit  straight
  line  relationship between a parameter and  time, between two
  different parameters  at the same  site, or  between  the same or
  different parameters  at two different sites.   The  program permits
  specification  of features of the  linear regression analysis to  be
  performed, including  the specification of  time periods,  and
  maximum abscissa and  ordinate values.
             SUMMARY t>ACC

CORRCLATIO* t »£G»r.SS!0« ANALrSIS HO DUALITY l-ARAHETERS SAKE SITE.
  STATION! 070009       LONG LAKE 1C MI C (UTERSHECT
  LATITUDE I LONGITUDE: 46 II 31.2 08i 21 46.0 1
  ABJCISSA PARAMETER: 00011
  OROINATt PARAKETEK: OC300
HATER
 DO
TAHN
MG/L
  •EOUtSTtO
  ANALYSIS rROK: 1»73/ 1
                TOi l»7i/ 7
  KCtlVU
  ANALYSIS tun, l»7}/ I/ J TOi 1«7»/ 7/1J
  REGRESSION LINE:
  ORIGIN IS 0.0
            Y.   I7.J43 .
 CORRELATION COEFriCIENT:    -0.»5
 CUErnciENT or DETEIMINATION: o.*i
       STANDARD ERROR OF ESTIMATE:     O.S4}>!
       STANDARD ERROR Or IHTCKCEPTl    O.S^OJO
       STANDARD LRROR OF bLOPc:      U.00906
       T VALUE TOR 1NTCRCCPT:       31.334)2
       T VALUE rOR SLOPE:         14.11313
                                                  4I.2&00     ii.bOOO     il.7100

                                                       »TER   TEMP  rum
                                                                           71.0000
  This run of the REG  program  displays the  best-fit  straight line
  relationship  between two parameters  (water  temperature  and
  dissolved oxygen)  at a single station location  (station 070009).
  A summary of  the statistical computations performed is  provided
  on a summary  page which accompanies  the REG print  plot.   The
  asterisks appearing  in the plot margin denote the  intercepts of
  the regression  line  with the plot axes.  Users may draw a line
  between  these points to show the regression line.
                                   B-20

-------
                                                     PGM=PLOT,
PROGRAM DESCRIPTION:
          The PLOT program retrieves data from the WQF and plots
          the values of each selected parameter for each selected
          station  (Y-axis) for the specified time period (X-
          axis).  Values plotted may take the form of raw
          concentrations  (mg/1), loadings (Ibs/dy), or
          logarithms.  The program allows for the control of the
          plot format including size control and plot symbols
          used.

GENERAL KEYWORD APPLICABILITY:

          All general keywords described in Sections H and 5 are
          valid with the PLCT program.  The parameter keyword P
          may be specified up to 10 times.

OTHER NOTES AND COMMENTS:

          The program generates the plot data onto a magnetic
          tape which is forwarded to EPA headquarters for
          plotting and dissemination to the user.

          Users should exercise constraint in producing a large
          number of plots in a relatively short period of time.
          Each specification of fPGM=PLOTr' generates a separate
          plot tape which is sent to EPA by the computer services
          vendor.  Requests for a large number of plots in a
          short period of time may result in a backlog of tapes
          to be plotted at EPA and a depletion of the number of
          tapes available for use as plot tapes.  If you have any
          questions relating to your particular plotting
          requirements, please call STORET User Assistance.

          The following keywords described in the MEAN program
          may be used with the PLOT program to plot calculated
          values:

               LOAD      allows computation of stream loadings
               LOG       calculates logarithmic values
               LL1...    specifies acceptable ranges for parameter
                         values
               LV        establishes lower limit  for a parameter
                         value
               HV        establishes upper limit  for a parameter
                         value
               calculation of dissolved oxygen saturation
               calculation of un-ionized ammonia


                             B-21

-------
PROGRAM OUTPUT:
            Plot values of iron (P=10<»5)
            The starting  date  for the
            x-axis (time)  is to be
            January  1,  1973.
                    PGM=PLOT, PURP=305B/STA,
                    A=1UAGNFS9,5=070009,
                    P=10U5,
                    BD=730101,
                    PRT=NO,
1
                 STONfl
                  070009     LOHCL
                 If IV 31.2 085 21 UB.O
                  LO«C LBKE 10 HI t HOT
                 26053    KlrnlCIlN
                 L>I«E SUPEMO"      070
                 KB ONTONRGOK hIVER
                 IMOUFSI
                  0000 fEt' OEP'H CLRSS 00
                   too  m

                TJME OPT?
                                 B-22

-------
CLASSIFICATION:  Program Associated Keywords
                                                       FACT
                                                        SYM
                                                     PGM=PLOT,
USE:
          These keywords may be used with the PLOT program to
          specify the size of each plot and the plotting symbol
          to be used to represent the plotted data points.
KEYWORD FORMATS AND VALUES:
FACT=n,    where n is any numerical value, including a decimal
           value, between 0.1 and 5 inclusive.  The specified
           value is the multiple by which the basic plot size of
           5 1/2" x € 1/2"  (FACT=1.0) is increased or decreased.

SYM=mss,   where ss is any two digit whole number that
           equates to one of the plotting symbols shown
           below, and where m is an optional minus
           sign which specifies that the plotted symbols
           are not to be connected by straight lines.
                               +   X  0
             00
01
02
03
04
                                           05
             Z   Y   X
             08
09
10
11
12
                                           1«
06

-5"
20
X
 07
           The value of 20 specifies that no symbol is to
           be plotted.

           Leading zeroes must be specified.
DEFAULT VALUES:  FACT=1.40,  (7.75" x 9.0")
                 SYM=02, (plot using the symbol
                         straight lines.)

NOTES ON USAGE:
                                                   ,connected by
           These keywords may be specified only once in a
           retrieval request.
                             B-23

-------
EXAMPLE (S) :


 Plot values of water temperature      PGM=PLOT,PURP=305B/STAr
 and DO as a function of time  .        A=1UAGNFS9,5=070009,
 using an "X" to mark data points.     P=11rP=300,
 The size of the plot produced         BD=730101,
 is to be approximately 11" by 13".    FACT=2,SYM=ur
                                       PRT=NO,
                             B-24

-------
                                                       sc

                                                     PGM=PLOT,
CLASSIFICATION:  Program Associated Keyword

USE:      This keyword may be used with the PLOT program to
          specify that the scales of the axes of the plots are to
          be uniform throughout the program.  The keyword SC
          specifies that the scales of all plots associated with
          stations and parameters are to be identical.

KEYWORD FORMATS AND VALUES:

          SC=A,      examines all of the data retrieved and
                     then sets scales according to the maximum
                     value for each parameter retrieved and the
                     maximum number of days.  This causes the
                     scales for the various plots to be the
                     same.

DEFAULT VALUES:  None.

NOTES ON USAGE:

          If *SC=A,f is not specified, the x- and y-axes will be
          set to the maximum and minimum sampling dates and
          maximum parameter values for each parameter at each
          station.
                              B-25

-------
EXAMPLE(S) :
  Plot values of DO for  the three
  stations specified  (i.e. three
  plots will be produced) .  The.
  sampling characteristics of the
  three stations are  as  follows:
                            PGM=PLOT,PURP=305B/STA,
                            A=1UAGNFS9,5=070002,
                            8=070006,5=070009,
                            P=300,
                            BD=730101,
                            SC=A,
                            PRT=NO,
 station
  070002
  070006
  070009
     date
730108/760713
730108/740516
730103/760713
max, value
   11.0
   16.U
   13.9
  Since  *SC=A,f is  specified the axes
  will be set as  follows  for all three
  plots.
           origin point
               730103
                  0
                end point
                  760713
                    16.U
                             B-26

-------
                                                        NOPLOT,
                                                       PGM=PLOT,
CLASSIFICATION:  Program Associated Keyword

USE:      This keyword may be used with the PLOT program to
          eliminate the plotting cf specified parameters.

KEYWORD FORMAT AND VALUE:

          NOPLOT,    specifies that the values of the immediately
                     preceding parameter are not to be plotted.
                     There is no value associated with this keyword.

DEFAULT VALUE:  Not applicable.

NOTES ON USAGE:

          If not specified, all requested parameters will be
          plotted.

          NOPLOT applies to each parameter keyword it follows,
          and can be specified as many times as is required
          within a retrieval request, up to a maximum of 10
          times.

SOME REPRESENTATIVE USES:

          For calculating loadings, a flow parameter must be
          retrieved, but need not be plotted.
EXAMPLE (S)
Plot loadings for parameter
650 at the specified station
but do not plot values for the
flow parameter (60) .
                                           PGM=PLOT,PURP=305B/STA,
                                           A=11 15D050,S=255U20,
                                           P=60, NOPLOT. P=650, LOAD,
                               3-2?

-------
                           PGM=LOC,
KEYWORDS SPECIAL TO THIS PROGRAM:
  The following keywords apply only to this program:

  SCALE      selects the desired scale for the area to
             be plotted
  NOCOUN     suppresses plotting of county boundary lines
  NOPOLPLT   suppresses plotting of the polygon
  TAGS       tags stations with a cross reference number
  STREAMS    plots streams
  CLR        plots outlines of cities, lakes, and reservoirs

  See Section 7, Advanced Retrieval Programs and their
  Special Keywords, for additional capabilities of this
  program.
                              B-28

-------
                                                      PGM=LOCC
PROGRAM DESCRIPTION:

          The LOG program plots a map of the area defined with
          any station selection method, and plots a symbol to
          denote the locations of all stations within that area,
          the state boundaries, and optionally, the county
          boundaries and polygon vertices.  The map will be a
          maximum of 2U" high  (north/south), and 49" wide
          (east/west) to include the area plotted.  Included with
          the map is a listing of all stations and their
          associated latitude and longitude coordinates.

GENERAL KEYWORD APPLICABILITY:

          All station selection keywords described in Section u
          are valid with this program.  Data selection keywords
          are not valid since the LOG program retrieves no data.
          The HEAD keyword may be specified and will appear in
          the title block of the map.   (When used with LOG, the
          text' of the HEAD keyword may not exceed 35 characters.)
          SHIFT, PRT and PRMI are not valid for LOG.

          All advanced general retrieval keywords described in
          Section 5 are valid except the PM keyword.

OTHER NOTES AND COMMENTS:

          This program utilizes a relatively large amount of
          computer resources and conseguently can be rather
          expensive to run.

          Although all station selection methods are valid, it is
          recommended that polygon selection (LT,L keywords)  be
          used to ensure the most accurate locations of stations
          plotted.
                             B-29

-------
PROGRAM OUTPUT:
 Plot a map of the area
 described by the specified
 polygon,  with symbols plotted
 to denote stations belonging
 to agency 21MICH.
PGM=LOC,PURP=305B/STAr
TAGS, STREAMS, CLR,
SCALE=250000,
L=U320,L=8UOU,L=U315f
L=833730,L=4330,L=8315,
L=U330,L=830730,L=U30730,
L=830U,L=U32230,L=825230,
L=U33730,L=825230,L=U33730,
L=82U7,L=U3a5,L=82U7,L=43U5,
L=8315,L=«»33730,L=8315,
L=a330,L=83U5,L=U330,L=8«405,
U= 21 MICH,
                            B-30

-------
CLASSIFICATION:

USE:
       Program Associated Keyword
                                                       SCALE
                                                      PGM=LOC,
The SCALE keyword may be used with the LOG program to
specify the desired scale for the map to be plotted.
KEYWORD FORMAT AND VALUES:
          SCALE=scale,
                      where scale is any numerical valqg
                      indicating the scale desired.  No
                      commas are to be embedded within
                      the value.  Units are in real-
                      distance/map-distance, so
                      SCALE=500000, will produce a
                      map at scale 1:500,000.
DEFAULT VALUE:  None.
NOTES ON USAGE:
          If this keyword is  not  specified,  the  system  will
          maximize the  scale  used,  based  upon the  size  of  the
          area to be plotted.

          If the scale  specified  would  result in a  map  over  2U"
          high  (north-south direction)  or 
-------
                                                       NOCOUN,
                                                      PGM=LOC,
CLASSIFICATION:

USE:
       Program Associated Keyword
The NOCOUN keyword may be used with the LOG program  to
suppress the mapping of county boundaries.
KEYWORD FORMAT AND VALUES:

          NOCOUN,    There is no value associated with  this
                     keyword.

DEFAULT VALUE:  Not applicable.

NOTES ON USAGE:

          On maps covering large geographical areas, the  presence
          of county boundaries on the map may hinder the  study
          and interpretation of the map.  Specifying NOCOUN  can
          help alleviate this condition, as well as reduce the
          cost of the plot.
EXAMPLE (S) :

    The map specified will be
    plotted with county lines
    suppresed.
                           PGM=LOC,PURP= 3 05 B/STA,
                           SCALE=250000, NOCOUN,
                           LT=I,
                           L=1320,L=8UOU,L="315,
                           L=833730,L=4330,L=8315,
                           L=U330,L=830730,L=U30730,
                           L=830U,L=U32230,L=825230,
                           L=U33730,L=825230,L=U33730f
                           L=82U7,L=a345,L=8247,L=<*3<»5,
                           L=8315,L=U33730,L=8315,
                           L=U330,L=8345,L=«330,L=8105,
                           U=21MICH,

-------
                                                     NOPOLPLT,
                                                      PGM=LOC,
CLASSIFICATION:  Program Associated keyword
USE:
The NOPOLPLT keyword may be used with the LOG program
(when the area whose stations are to be plotted is
defined by a polygon) to suppress mapping of the
polygon.
KEYWORD FORMAT AND VALUES;
          NOPOLPLT,
                There is no value associated with this
                keyword.
DEFAULT VALUE:  Not applicable.

EXAMPLE (S) :

   The map  specified  will be
   plotted  without  the
   polygon  outline.
                                  ,PURP=305B/STA,
                                  50000,  NOPOLPLT,
PGM=LOC,PURP=
SCALE=250000,
LT=I,
                              B-33

-------
                                                       TAGS,
                                                      STREAMS,
                                                        CLR,
                                                      PGM=LOC,

CLASSIFICATION:  Program Associated Keywords

USE:      These keywords may be used with the LOC program to tag
          stations with a cross reference number, and to plot,
          where available, streams, cities,  lakes, and
          reservoirs.

KEYWORD FOFMAT AND VALUES:


    TAGS,       specifies that each  station plotted on the map  is
               to be  tagged with a  coded  identifier which relates  to
               a listing of descriptive information for the
               stations.

    STREAMS,    specifies that stream traces are to be plotted,
               if the data are available.  (Areas  which have
               such data include the State of Michigan and
               the Southeast Region.)

    CLR,        specifies that the outlines of cities, lakes,
               and reservoirs are to be plotted,  if the data
               are available.   (Areas which have  such data
               include  the State of Michigan and  the
               Southeast Region.)

    There  are  no values associated  with any of these keywords.


 DEFAULT VALUES:   Not applicable.

 NOTES ON USAGE:

           If  TAGS is specified,  only 300 stations may be
           retrieved/plotted.   STORET will produce as many maps as
           necessary to avoid overprinting any of the tags.   Users
           should  not use this  keyword when  plotting  many stations
           located in a relatively small  area.

-------
EXAMPLE (S) :

  For the previous map  include      PGM=LOCf PURP=305B/STA,
  cross reference tags  for  the      TAGS, STREAMS ,CLR ,
  stations, stream traces,  and      SCALE=250000,
  outlines of cities, lakes        LT=I,
  and reservoirs.                   L=U320,L=8UOU,L=U315,
                                    L=833730,L=U330,L=8315,
                                    L=4330,L=830730,L=430730,
                                    L=830U,L=a32230,L=825230,
                                    L=U33730,L=825230.L=433730,
                                   L=8315,L=tt33730,L=8315,
                                   L=a330,L=83U5,L=4330fL=8U05,
                                   U=21MICH,
                             B-35

-------
APPENDIX C

-------
              APPENDIX C




         INTERAGENCY AGREEMENT




              BETWEEN THE




COUNCIL ON ENVIRONMENTAL QUALITY (CEQ)




                AND THE




 ENVIRONMENTAL PROTECTION AGENCY (EPA)

-------
                                                       EPA-IAG  D7-01226
                             INTERAGENCY AGREEMENT
                                    BETWEEN
                       COUNCIL ON ENVI30&&SNIAL QUALITY
                                    AND THE
                                      PROTECTION AGENCY
 I.  PURPOSE:  The Office of Air, Land, and Water Use (QALNU),  Office
     of Research and Development, wishes to enter into an agreement
     with the Council on Environmental» Quality (CEQ), Executive Office
     of the President, 722 Jackson Place, N.W., Washington,  D.C.  20006,
     to carry out a project, to evaluate and prepare the CEQ UPGRADE
     environmental data analysis system for use and possible co-sponsor-
     ship by EPA.  The interagency agreement will be" coordinated with
     the Office of Monitoring and Technical Support,  which will provide
     support and technical direction.  The duration of the agreement
     is one year, October 1,  1977 through September 30, 1978.

          The UPGRADE system has been developed over the period of the
     last three years by CEQ.  The system has the capability of inter-
     facing with the irajor EPA data systems.  The systerrs ease of  access,
     analytical capability, and sophisticated output formats would irake
     it a valuable tool for research, and environmental po.: -.cy and  manage-
     ment decisions at the national and regional level of £PA.   As a
     result of this agreement EPA will have directly available for its
     use this powerful analytical and decision making tool.

II.  SCOPS OF WOPK;  The work will be performed by a combination of two
     CEQ sole-source contracts plus CEQ staff efforr.  The work to be
     carried out is comprised of the following 5 tasks:

     1.   Compilation of a system survey for possible installation.
          of UPGRADE into the EPA system.

     2.   Completion of a user needs survey including demonstrations
          t!o potential users.

     3.   Completion of a system design analysis to identify possible
          need for redesign or reconfiguration.

     4.   Completion of a system management requirements analysis  to
          estimate resource requirements for installs, tico-and
          operation.

     5.   Completion, of a user's documentation package for the UPGRADE
          system and its data bases.  This manual will be widely
          circulated in draft form among EPA offices and regions,  other
          Federal agencies, State and  local users, and academic insti-
          tutions, and will be revised to reflect the comments received.


                                C-l

-------
           In addition, the documentation will include complete
           explanations of the criteria used to select data  for inclusion
           in each data base.  It will be written in clear,  easy-to-
           understand language.

           The results of these tasks will be presented  to EPA in the
      form of a final report due to EPA no later than September 30,  1978.
      CEQ will provide EPA with the original plus ten copies of the
      final report.

III.  PROVISIONS:   Changes in the work schedule or in the terras of the
      agreement rray be made by irutual consent of the Project Officers
      representing the respective Agencies, provided that no irajor change
      in the scope of the project or in the cost to the  funding agencies
      is involved.  Any irejor changes in scope or in cost shall require
      the approval of the Authorizing Officials.

 IV.  DURATION OF AGREEMENT:  This agreement is from October 1, 1977
      through September 30, 1978.  No extension of this  agreement is
      contemplated.

  V.  REPORTS:

      a.   Notice of Research Project;  Within 20 days from  the effective
           date of this agreernent the Council on Environmental Quality
           shall submit an executed copy of EPA Form 5760.1, Notice  of
           Research Project, to:  Technical Infonration  Division (RD-680),
           Office of Monitoring and Technical Support, Office  of Research
           and Development, U.S. Environmental Protection Agency,
           Washington, D.C.  20460.

      b.   Final Report:  The report resulting from this agreement for
           delivery" to EPA will be prepared in accordance with current
           ORD publication requirements.  Detailed instructions are
           provided in the attached "Handbook for Preparing  ORD Reports,
           ^5ay 1S76."  Because the effort described in this  interagency
           agreement is part of an overall Agency program, the final
           report, if published, will be assigned an EPA number and  have
           a standard EPA cover.  The title page and content of the
           report will clearly recognize the source of the described
           results to credir or identifyt the CEQ.

 VI.  PROJECT Or'FICCR:

      a.   For the CEQ

           Dr. James J. Reisa                      202/633-7107
           Council on Environmental Quality
           Executive Office of the President
           722 Jackson Place, N.W.
           Washington, D.C.  20006

                                   C-2

-------
       b.   For the EPA

            Dr. Lance A. Wallace                         202/426-4153
            Office of tonitorir.g & Technical Support
            Office of Research and Development
            U.S. Environmental Protection Agency
            Washington, D.C.  20460

 VTI.  FUNDS:

       a.   The total cost of the Interagency Agreement in FY'77 is
            estimated to be $100,000, all of which is to be paid for
            by EPA.  EPA1 s~'funds will be provided in approved FY'77
            funds from Program Element 1KC519.  It is anticipated that
            funds will be advanced in a single block.  Request for pay-
            ment should be made to the Accounting Operations Branch,
            Financial Management Division (PM-226), U.S. Environmental
            Protection Agency, Washington, D.C.  20460.

       b.   Appropriate accounting data follows:

            Appropriation No:                    687/80107
            Account No:                          761926WOA2
            Document Control No:                 W10011
            Object Class:                        25.70
            Amount:                              $100,000

VIII.  AUTHORITY:  This agreement is entered into pursuant to the Provisions
       of the Clean Air Act and the Federal Water Pollution Control Act, as
       amended, and the Safe Drinking Water Act.

  IX.  APPROVALS;


       Environmental Protection Agency         Council on Environmental Quality
       Thcraas A. i-Xirphy                        Edwin H.  Clark,  II
       Deputy Assistant Administrator          Acting Executive Director
       Office of Air, Land and Water Use
       Office of Research and Develocment
               B /      ^ ?-
                 DATE
                                    C-3

-------
APPENDIX D

-------
        APPENDIX D




UPGRADE EVALUATION REPORTS

-------
                                  APPENDIX D

                          UPGRADE EVALUATION REPORTS

1.  This appendix contains the EPA procedure to follow in evaluating UPGRADE
and the subsequent reports generated by EPA.  Each also contains quantification
data, if available.  They are in the following order.

      1.  Procedures to follow in Evaluating UPGRADE

      2.  Office of Toxic Substances

      3.  Office of Air and Waste Management, OAQPS/RTP.

      4.  Office of Research and Development, OMTS/HQ

      5.  Office of Research and Development, EMSL/Las Vegas

      6.  Office of Research and Development, EMSL/Las Vegas

      7.  Office of Research and Development, OHEE/HERL/CINN.

      8.  Office of Planning and Management

      9.  Office of Enforcement

     10.  Region III

     11.  Region X

     12.  Transaction Data from CEQ User Support Group Test.
Note:  Those EPA reports not included will be inserted when available.
                                    D-l

-------
 "-= \      LiNi^ED STATUS ENVIRONMENTAL. °RO7ZCTiON AGENCY
  '1 '.'-
-.. '„...•                       WASHINGTON. DC  2C--.SJ
   '"
                                                          REI-EARd- •^(•JC. L'^ViTI-O"'!': r-1'
                         to Follow ?.E. i:v*-lu?-:::-*s 'JI-G^VH
    irRCv:     Project Off deer frr "PGUAflS  Evaluation
              Monitoring Technology Division
              Office of Monitoring & Technical  Support (RD-680)

    TO:       UPGRADE Coordinators
              See Bel 01:

         All five UPGr.iDE Coordinators  from the Environmental Protection.
    Agency Program Offices have no\: been selected (see enclosure A).  As I
    hi. -/a aibcusscil pej.-nona.lly with yov, I  am  transriitting a fornaJ npr-j
    t.-s sf.~.ci.f>-v in pjL-f-.auer detail, the  o"r>jecci\es ai.d y ocedui.es tn be fcllowt-I
    in the Asai.cy-Wic!s LPGK/J)!; £\£.laatiou.

         We will need from the UPGPADE  Coordinator for each offica a list of
    people within his office vho ray reasonably be expected f:o have an interest
    in using or inspecting the UI'Gi-JiDE  S3'stein.   (Sone of you have already
    supplied this list.)  Tnece individuals \rill form a nucleus for whom we
    will describe the evcilnacicn project,  arnnge de^on&trationE, trair-^ng
    sessions, and whatever else is required ^o  allow an objective evaluation
    of the system.

         Any individuals who wish to learn more about UPGPJU5E or to evaluate
    the sysren will be given the opportunity  to do so over the next three
    months.  (That is, a terminal will  be  made  available and user's support
    will be forthcoioing.)  At the end of their  evaluation, we would appreciate
    receiving a report on their experiences in  a memorandum to their UPGRADE
    coordinator (with a copy to me)  containing at least the elements listed
    in enclosure B (i.e. , a description of their work with UPGPADE. the problems
    they found, and their reconmendaticns  regarding modifications in UPGRADE
    and their expected level of use) .   The evaluators may be contacted again
    to discuss the memorandum.  The  conclusions from every memorandum will be
    included in the final evaluative report which I will prepare.
                                   D-2

-------
     We need to have a Summary Memorandum from each coordinator, summa-
rizing the experiences of his office in the evaluation of UPGRADE and
including the documentation from individual evaluators.  We will need
these reports by June 1, .1978, so we can prepare a final summary report
describing" the Agency wide evaluation and making recommendations about
UPGRADE.
                                    Lance Wallace

Addressees:

     Ray Smith (AW-443)
     Bruce Rothrock (EN-320)
     Warren Muir (WH-557)
     Phil Taylor (WH553)
     Elijah Poole (PM-218)

Attachments

cc: J. Reisa (CEQ)
    K. Jones (CEQ)
    L. Milask (CEQ)
    M. Dorlester (VITRO)

bcc: ORD-CRU
     Dr. Gage (RD-672)
     Mr. Trakowski (RD-680)
     Mr. Brunot (RD-680)
     Dr. Wallace (RD-680)
     Mrs. Warner (RD-680)
     MTD (RD-680/chrono)

prepared by: RD-680/LWallace/ep/3809 WSM/ 426-2177/2-16-78
                           D-3

-------
                              Enclosure "A"
                          UPGRADE COORDINATORS
OFFICE    UPGRADE COORDINATOR                  TELEPHONE

ORD       Lance Wallace (Project Officer)       426-2175

OAWM      Ray Smith                            755-0470

OE        Bruce Rothrock                       755-0724

OTS       Warren Muir                          755-4871

OWHM      Phil Taylor                          755-1567

0PM       Elijah Poole                         755-0916
                             D-4

-------
                              Enclosure "6"
        ELEMENTS TO INCLUDE IN WRITTEN REPORT EVALUATING UPGRADE

I.   Introduction

     1.   Identification of the evaluator
               Name, telephone number, Office and Division, mail drop

     2.   Brief description of evaluator's function as it relates to UPGRADE
               Kind of monitoring data normally dealt with (source/ambient,
               air/water/food, etc.), uses of data (annual reports,  one-
               time research studies, support to regions/states)

II.  Description of Experience

     3.   Extent and nature of evaluator's experience using UPGRADE
               Number of people and total man-hours spent
                    a) Familiarizing self with system from documents &
                       demonstrations
                    b) Using the system

     4.   Description of the tasks or goals that the evaluator set for
          UPGRADE
               Major needs of the evaluator, (other data bases, rapid
               analyses, graphics capabilities, etc.)

     5.   Evaluation of UPGRADE performance in meeting those objectives
               UPGRADE features that satisfied requirements;
               UPGRADE features that did not satisfy requirements

III. Recommendations

     6.   Evaluator's recommendations
               What UPGRADE features need to be modified?  What new  data
               bases should be added?  Would his office find UPGRADE
               useful for any purpose?  If so, estimated amount of use
               UPGRADE would receive per month.  What level of user's
               support would be adequate for his office?
                          D-5

-------
                 ENVIRONMENTAL PROTECTION AGENCY




                    2nd ANNUAL ADP CONFERENCE




                       LANCE WALLACE,  ORD






Introduction




     The Office of Research and Development has recently  entered into




an Interagency Agreement with the President's. Council  on




Environmental Quality (CEQ) to carry out an evaluation of UPGRADE,




an automated, interactive graphic and statistical analysis system.  UPGRADE




(User Prompted GRAphic and Data Evaluation) has been developed by CEQ to




analyze information from a variety of environmental and related economic




and demographic data sources.  The system includes an integrated database




developed by the Council to study the relationships between environmental




pollutants and health.




     The objectives of the IAG are to determine how UPGRADE  might




best be implemented within the EPA.  It is important that all potential




users whether in the regions, laboratories, or headquarters be notified




of the existance of the IAG and be given the opportunity to evaluate




the usefulness of the system within individual program areas.




     The intent of the presentation and the following on-line




demonstration of UPGRADE (give locational information and time) is




to introduce you to the nature and scope of the UPGRADE system and its




related databases.




Background




     UPGRADE has been developed during the past 2% years at an




approximate cost of  $500,000  CEQ's initial design requirements




dictated a system that could provide advanced statistical analysis
                            D-6

-------
and graphic display of environmental trends and interdisciplinary




relationships, particulary between environmental pollutants and national




health.  The system also had to be completely accessible to the Council's



staff who did not possess specialized computer training.




     UPGRADE contains the following features:




           .Interdisciplinary analysis of w.-ter (STORE! and USGS/




           WATSTORE), air (SAROAD), health, demograaric, and




           related data.  Any type of digitized data can be used on



           the system.




           .English language prompted — knowledge of syntax




           structures or computer  systems and languages are not




           required to use the system making it accessible to




           scientists and managers without specialized training.




            Interactive and graphic orientated — UPGRADE gives the



           user  immediate feedback through on-line computing which




           allows the user to efficiently evaluate and manipulate




           analytical results.   This capability allows instant production



           of bar charts, scatter  plotting, regression analysis and




           plotting, and off-line  production of maps.




           .Access to advanced statistics — the statistical analysis




           system(SAS) is being  interfaced to the system.   SAS includes
                                /


           a wide variety of advanced statistical procedures which




           can be used  through UPGRADE




      UPGRADE has been used by  the  Council on Environmental Quality




 for a variety of environmental and health studies during the past




 year as well as  the 1977 annual  report  to Congress.
                               D-7

-------
EPA's interest in UPGRADE, stems from the AgencvJ.s perception of several



basic needs:



     • The Need for Linking Health and Environmental  Data



            Recent reports, including the 11-volume,  5-mi11ion-dollar



       NAS study of the EPA, have remarked on the inability of the Agency



       to relate ambient environmental quality to health effects.   UPGRADE



       cara provide a rapid determination of correlation between mortality



       rates and environmental variables.  These correlations  can  then be



       investigated further to see if cause-effect  relations  are involved.



       Thus in this relatively mechanical way, the  system  can  act  as a



       hypothesis-generator in much the same way as the Cancer Atlas does.



     • The Heed for Increased Analysis of Environmental Data



            The same reports have pointed out the imbalance between the



       Agency's collection of environmental data and its analysis  of that



       data.  By making such analysis available to a wide  audience of



       users, including those without computer training, it  is possible



       that much more useful analysis will result.   (There is, of course,



       the danger  that untrained or naive users will conclude more than



       the data can bear;  combating this danger will be a  challenge to



       the people  participating in the present evaluation.)



      • The Need for Rapid  Investigation of Available Monitoring Data on



       a Given Pollutant




            The "Pollutant of  the Month" syndrome is likely to continue




       in the foreseeable  future, with the attendant Congressional




       inquiries  and other requests for fast-turnaround analyses and




       naps,  charts, graphs, or other information aids. The rapid graphics
                                   D-8

-------
          capabilities of the system can help in this regard,




          although at present the environmental variables represented




          are drawn largely from the most familiar air and water




          pollutants.




     Since UPGRADE gives promise of satisfying several important needs for




the agency, Dr. Stephen Gage of the Office of Research and Development




has authorized the present evaluation.  Dr. Lance Wallace of EPA's




Office of Monitoring and Technical Support is the Project Officer for




the evaluation, and can be reached at 426-4657.  Dr. James J. Reisa is




the CEQ representative.  The subcontractors are SIGMA DATA CORP and




VITRO Automation Industries, represented by Larry Milask of SIGMA Data




Computing Corp. (202) 633-7074\ and Marc Dorlester of VITRO Laboratories



(301) 871-2512.




     The IAG prescribes five tasks:




          1.  System Installation Survey




               Presently UPGRADE is supported on the NIH System.




               Can it be supported either on COMNET or the UNIVAC




               installations at EPA?




          2.  User Needs Survey




               Prospective users must be found, given time to




               familiarize themselves with UPGRADE, and allowed




               to arrive at recommendations concerning its present




               utility in satisfying  their needs, its potential




               utility if the proper  modifications are made, etc.




               This will require an information campaign, including




               demonstrations at the  Regions and at the laboratories.
                                D-9

-------
     A questionnaire will be developed  and  circulated




     to all prospective users.




3.  Systems Design Analysis




     The systems design analysis will depend  on the




     findings of the installation survey  and  the user needs




     survey.  Detailed design specifications  will be developed




     tailored to the special installation requirements and




     the most relevant user needs.




A.  Management Requirements




     Requirements for optimum maintenance and operation of




     the system, including a data base  management system to




     be developed for UPGRADE, will be  determined based on




     the findings of the first three phases of the evaluation




     and the recommendations of the prospective users and the




     affected members of MIDSD.




5.  Documentation




     A user's manual will be developed
                     D-10

-------
                                                         22 August 1978
MEMORANDUM



To:    Lance Wallace

From:  Charlie Foole

Subj:  UPGRADE Evaluation


       Attached you will find my evaluation of the UPGRADE system.  I
have broken it down by illustrative projects and studies I have per-
formed using the system and its data bases.

      The estimates for resources expended are best guesses only.  I
have tried to keep the evaluation limited solely to projects and
studies which made use of data sets and analysis/display capabilities,
excluding the actual building of data bases and development of
capabilities.

      If the evaluation team needs more information, I will be glad to
provide it.
                                   D-ll

-------
PROJECT:
DATA BASE:
CAPABILITIES USED:
RESOURCES:
COMPARABLE RESOURCES:
COMMENTS:
NEEDS:
Product maps to study the geographic patterns of
County-level mortality in relation to water quality
across the U. S., 1968-1972.

IDE

A.  Mortality files
B.  NASQAN files

A.  SORT/RANK procedure to obtain percentiles for
    shading intervals
B.  Mapping capability (interactive) - Regular and
    population adjusted maps

A.  Approximately 100 mortality maps

    1.  Approximately $4,000 computer time, plotter
        time, etc.
    2.  Approximately 1-2 weeks analyst's time
        (obtaining percentiles, specifying maps
        interactively)

B.  Approximately 15 NASQAN maps

    1.  Approximately $600 computer time, plotter
        time, etc.
    2.  Approximately 3-5 days analyst's time

Producing maps of these data would have been impossi-
ble without this system. Manual production of the maps
would have been unthinkable.

These maps have been and continue to be used for a
number of studies and other projects. Several are
mentioned and under other Projects. One which is not
is the brief discussion of respiratory disease
mortality distributions found at the end of the 1977
CEQ Annual Report's Environmental Health Section.

Slides, poster-size enlargements and "quick and dirty"
Xerox copies of these maps are on hand and are used
for demonstration, discussions, talks, etc.

The interactive, "always ready" nature of the data
and the system enable the production of special maps
"to order."

A.  Batch-specification of maps
B.  State and regional maps (possibly on-screen)
C.  Capability for mapping time-trends, age-specific
    rates, and additional geographic areas (e.g.,
    State Economic Areas).
                                      D-12

-------
PROJECT:
DATA BASE:
CAPABILITIES USED:
RESOURCES:
COMPARABLE RESOURCES:
COMMENTS:
NEEDS:
Study the relationship between cardiovascular disease
mortality rates and constituent levels in drinking
water.

IDB

A.  Extraction software
B.  Data listing
C.  Ist-order regression/correlation

A.  Approximately 2-3 months analyst's time
B.  Approximately $1,000 - $2,000 computer time

For production of approximately 2,000 plots of
selected water quality variables and demographic
variables vs. selected cardiovascular disease
mortality rates on a race/sex-specific basis.

At least 5-10 times the above to perform these
analyses by hand.

This work appeared in the 1977 CEQ Annual Report
Environmental Health Section. Only a subset of the
correlatives were chosen to report. The speed and
ease of UPGRADE analysis enabled the production of
many plots from which we could choose in presenting
results.

Related studies (e.g., altitude vs. heart disease,
drinking water vs. cancer, etc.) are possible and
some have begun.

A.  Nationwide drinking water data (current IDB has
    data for 300-400 counties only), environmental
    data (e.g., altitude), and demographic data.
B.  More sophisticated and quicker extraction pro-
    cedures
C.  Batch processing
                                     D-13

-------
DATA BASE:
CAPABILITIES USED:
PROJECT;                Study time trends and mean violation rates of drinking
                       water constituents in selected surface supplies of
                       public drinking water systems.

                       STORET stations for drinking water supplies (specially
                       prepared tapes of STORET raw data retrievals)

                       SOTRET interface

                       A. • Station combining/automatic sequencing
                       B.  Regression/correlation plot production
                       C.  Batch processing

  	                 A.  Approximately 4 weeks intern's time
                       B.  Approximately 2 weeks analyst's time
                       C.  Approximately $2,000 computer time for production
                           of approximately 3-5,000 plots

COMPARABLE RESOURCES:  At least 5-10 times for manual production
RESOURCES:
COMMENTS:
NEEDS:
                       Preliminary results from the mean violation rate
                       analysis presented in Environmental Health Section
                       of 1977 CEQ Annual Report.  Both studies (violation
                       rates and time trends) are  currently waiting for
                       personnel to complete in-depth investigation (both
                       look promising).

                       UPGRADE "as-is" handled those analyses fine. More
                       sophisticated batch processing would have been
                       helpful. Biggest need now is someone to follow up
                       in the 2 studies in progress.
                                    D-14

-------
PROJECT:
DATA BASE:
CAPABILITIES USED:
RESOURCES:
COMPARABLE RESOURCES;
COMMENTS:
NEEDS:
Prepare ad hoc materials with quick turnaround time
for administrators, Congressional interest, etc.

A.  IDB almost always
B.  STORET drinking water tapes occasionally

A.  Extract software and data listing usually
B.  Occasionally special statistical analyses or
    special maps are prepared.

Usually 1-2 days (including midnight oil) analyst's
time. Usually $50-150 computer time.

These special analyses would not be possible in the
required time frame without the system in place and
operational.

Those needing the information would otherwise have
to rely on existing information without it (see
example under "COMMENTS" below).

These ad hoc reports are of UPGRADE'S most useful
features, yet they are the hardest to document.
Examples are the two packages of briefing materials
on cancer mortality I prepared for Steve Jellinek
(AA for Toxic Substances) while I was still at CEQ.

The first was a package on cancer mortality in
California. Mr. Jellinek was on his way to a meeting
with officials in that state when he learned that
one of them had recently been quoted describing
California as a cancer "hot spot". In less then 24
hours (literally overnight) I provided a briefing
document, including maps, which discussed in a fair
amount of detail California's absolute and compara-
tive cancer mortality status. The overall conclusion
was that the situation was not as bad there as the
official's statements had indicated. About a month
later, Environmental Health Letter reported statements
by independent scientists in California disagreeing
with the official and reaching essentially the same
conclusions we had provided in 24 hours.

Mr. Jellinek subsequently requested briefing materials
in cancer for the U. S. as a whole. Within 48 hours
(some overtime again) he had a package which included
several poster-size enlargements of maps to use in
presentations. These materials included a few points
on which our maps revealed different geographic pat-
terns than those displayed by the NCI Cancer Atlas.

A.  Quicker, more sophisticated mechanism for gener-
    ating mortality "status reports" as specified by
    the user.
B.  More publicity for this capability of the system.
C.  More data bases (especially mortality rates cover-
    ing different time periods and age-specific mor-
    tality rates).
                                            D-15

-------
                                               UPGRADE USER EVALUATION
USER
OTS

















USE
Produce maps
to study the
geographic
patterns of
County-level
mortality In
relation to
water qunlit;
across the
U. S.
1968-1972
Study the
relationship
between
cardiovascu-
lar disease
mortality
rates and
constituent
Levels In
drinking
water
DATA RASE CAPABII.ll IES
CATEGORY
IDE: Mortality
files
NASQAN
tiles





IDB










AVAILABLE
Yea






Drinking water
300-400
counties








NEED
.






Nation-
wide









RANK
.






N










UPCRADh CAPABILITIES
CATEGORY
SORT/RANK
mapping





Data Extraction










AVAILABLE
Ye-i
Interactive
neutral text
to CALCOMP





Yes










NLCD
_
State &
RcRJonal
maps on
screen





Quicker










RANK
_
D





D
















































a

-------
                                                 UPGRADE USER EVALUATION
USER
OTS





















USE
Time Trends
& Mean Vio-
lation races
of drinking
water con-
stituents In
selected
surface-
supplied
public drink
Ing water
system

Prepare
Ad hoc
materials
with quick
turnaround
time for
administra-
tors.
Congression-
al Interest.
etc.
DATA BAKE CAPABILITIES
CATEGORY
STORE!









IDB
STORET










AVAILABLE
Manual
Interface








Yes
Manual
Interface









NEED
_









lore data
_










RANK
_









N
_










UPGRADE CAPABILITIES
CATEGORY
Station combining/
Automatic sequencing

Regression/Correla-
tion plot production


Batch processing



Data Extraction
Data Listing

Statistical Analysis

Mapping






AVAILABLE
Yes


Yes


Yes



Yes
Yes

Yes

Yes






NEED






_



_
_

_

_






HANK






_



_
_

_

_




















































I
M
-J

-------
TRANSACTION DATA
USER
(TASK)
OTS (water/morta-
lity)





OTS (Cardiovascular
mortality vs drink-
ing water)
OTS (mean violation
rates of drinking
water)


CTS (Adhoc/average
per request! abort-
time f rame)
Extract from DB
STORET








X



X


SAROADS















NIU















IDE
X





X





X


Store
In
IDB















Analysis
& Terminal
PLOT






Regression/
correlation
2000 Plots
3-5000
Plots



Various
members

OFF-
LINE
PLOT
100
morta-
lity
naps

IS NA-
SQAN
maps

NO



NO


Number of
Terminal
Sessions &
Time Per
Session
1-2 weeks
10 maps/
1*5 hrs

3-5 days


2.5 months
10-20% on
machine
W Intern
80Z on
tern
2U analyst
5% on mach
1-2 days


COST
Per
SESSION
$4.000.
Total


$600.
Total

$1-2000.
Total

$2000.
Total



$50-150.


COST
Per non-IDB
Extraction
	





	

$50.



$50.


Could
Not
Do.
X


X


	

	



X


MANUAL
TIME
Indeterminati


Indeterminatf


19 months

3OW Intern
15W Analyst



Indetermlnat



-------
                  UNITED STATES ENVIRONMENTAL PROTECTION AGENCY


SUBJKCT:  UPGRADE Evaluation                                   DATE:   5/1/78

                                           /S      /$•*  ft L/ -^^f
FROM:     Jon B. Clark and Neil H. Frank    x,->«->   *->  LX"^—
          Monitoring and Reports Branch    (J

TO:       Lance Wallace
          Kay Jones
          Council on Environmental. Quality

              Scope of Evaluation

              The UPGRADE evaluation was done in conjunction with the analysis
          performed for the CEQ Annual Report.  The evaluation of UPGRADE was
          based on the system on the NIH compute*1  available to EPA at that time.
          Many of our recommended changes may have been made or are currently
          being made.  Due to the changing nature  of the system, no attempt
          v/as made to determine which, if any, of  the recommendations had been
          completed.

              The UPGRADE system was evaluated only in regard to its ability
          to handle air quality data.  While the use of each UPGRADE procedure
          was investigated, emphasis was placed on (1) the procedures used in
          the analysis performed for the CEQ Annual Report (ref. 4-3-78 memo
          from Neil Frank to Kay Jones), and (2) the ability to handle the
          Ozone Study (ref. 2-9-78 memo from Jon Clark to Lance Wallace).  As
          a result of these applications, we concluded that the system is easy
          to use because of its English language and conversational prompting
          mode, but we also observed that the system is inefficient at the
          present time and performed the analyses  in a laborious manner.

              Discussion

              The UPGRADE system is a general purpose interactive analysis
          package developed for unsophisticated computer users.  The package
          presently operates on an IBM computer.   There is no similar package
          available on the UMIVAC computer.  Th:.- most effective package for
          OAQPS usage would, of course, be a package for the UfllVAC computer
          which could easily access the AEROS data bases.  Because the UMIVAC
          has no such system, our analysis of air  quality data is accomplished
          through the use of a collection of computer programs.  Although soira
          of these programs can be used in an interactive mode,  like UPGRADE,
          they are usually intended for the more sophisticated computer user.

              The UPGRADE system contains the following procedures:  3 plot
          procedures, a  regression procedure, and  some data manipulation oro-
          cedures.  In addition to these general analysis techniques, UPGRADE
          has one procedure that was specifically  designed to meet the needs
                                      D-19

-------
of air data analysis—rollback of a  frequency  distribution of data.
Although this procedure could benefit from sorr.a  further refinement,
it is, nevertheless, useful  in its present form  and highlights one
of the unique features of the UPGRADE system.

    Cost of Using System

    All of the costs for the OAQPS1  use of the UPGRADE system were
charged to CEQ.   Requests have been  made for accounting information
regarding run charges, data  loading  charges, and interactive session
charges.  This information has not been made available to us.   How-
ever, partial accounting information which was available indicates
that the system can handle small  to  moderate data bases in a cost
effective manner.

    Findings

    The system has potential as an aid to personnel with no data pro-
cessing knowledge who wish to perform certain  types of air analyses.
Prior to spending money for additional  developmental work on UPGRADE,
all alternative general purpose analysis software should be studies.
Because OAQPS does not have  the personnel to perform such a comparative
evaluation, it should be performed by a central  group such as MIDSD.
They have more expertise in  the area.  In addition, any general soft-
ware package would undoubtedly be useful to all  EPA users.

    The remainder of the report deals with recommendations for chang-
ing UPGRADE as well as problems noted in the system.  The next section
discusses the three major areas of concern with  the UPGRADE system.
The last section discusses suggested enhancements, minor errors, and
minutia concerning the system.

    Three Major Problem Areas

    The first area of concern for any user of  air quality data is
how will that data be made available to the UPGRADE system.  Currently,
this can only be accomplished by a couple of conttactor personnel
working for CEQ.  The process of loading any air data set for use by
UPGRADE consists of 8 processes.   Any user is  now totally dependent
on the contractor performing these necessary steps before UPGRADE can
be used.  It is necessary that some method be  established whereby
any user ca.n easily build his own data sets quickly and efficiently.

    The second area of concern is the limitation on the amount of
data that can be handled by UPGRADE.  The original ozone problem
specified continuous data for 105 cities.  This  requirement was later
pared to 10 cities.  Upon arrival in Washington, it was discovered
that UPGRADE could not handle 4 years of continuous data for 1 site.
Only a fraction of the analysis was  finally performed using 1 site
                            D-20

-------
year of continuous data.  The data handling problems are caused by
limitations at MIH as well as problems in the UPGRADE file design.
Unfortunately, one feature of any interactive system is that it can-
not handle extremely large data sets.  Hov/ever, a few years of con-
tinuous data for a couple of pollutants at a few sites is a reasonable
requirement for any general purpose software system.

    The third major concern for any user of UPGRADE is the plodding
required by the system to accomplish any repetitive task or any fairly
large task.  The UPGRADE system is supposed to have 3 conversational
modes.  The "Terse" mode and the "Verbose" mode are currently avail-
able.  Both are very similar.  They are easy for beginners; however,
there is a very flat learning curve.  The user cannot make the system
perform significantly faster even after he has had many hours of
experience with the system.  This causes the system to be much too
slow and frustrating for most users.  The "Super Terse" mode which
is to be developed should provide for a much higher learning curve
and provide the capability to perform repetitive tasks much more
efficiently.

    Other Findings

    1.  A generalized transformation capability should be developed
to allow tho user to create new variables for analysis.

    2.  Better capabilities are needed to merge certain data, i.e.,
combine sunder data from many years.
                                                       •

    3.  The UPGRADE system should allow for the incorporation of
user developed routines.

    4.  The terminology used by the system is often confusing and
should be standardized.

    5.  The plot procedure currently displays at most 400. data points.
Since this can limit the analysis of air quality data, users should
be allowed to determine the amount of data displayed.

    6.  The prediction procedure should be modified so that it can
be better utilized for air quality analysis.  The modification should
include an improved choice of plotting positions so that the maximum
observed concentrations can be included in the analysis.

    7.  The following specific minor problems were noticed while using
the system:
                               D-21

-------
    (a)  At one point in the partitioning procedure,  the system gives
plot options for the Y axis.  The options are  1  to 8  with 8 represent-
ing summation.   Uhen 8 is requested, the system responds that this is
an invalid entry.

    (b)  In the partitioning procedure, there  is an error in the
routine that sets interval  widths.

    (c)  In the bar chart plotting procedure,  the titles for the vari-
ables should be changed to match those in the  summary table of the
partitioning procedure.

    (d)  In the graphics procedures, the annotation for end year on
the X axis is not correct.

cc:  W. Barber
     R. Campbell
     R. Neligan
     J. Padgett
     D. Goodwin
     R. Rhoads
     W. Cox
     J. Reisa
     W. Ott
                                D-22

-------
                     UNITED STATES ENVIRONMENTAL PROTECTION AGENCY

   DATE.  April  3, 1978

SUBJECT:  Accomplishments During Detail to the Council on Environmental Quality
    T0:
Monitoring and Reports Branch

Kay Jones
Council on Environmental Quality

     The purpose of this memo is to summarize my accomplishments during
the CEQ detail of January 23 to March 15, 1978.  The primary arras of
accomplishment are (1) familiarity with CEQ's UPGRADE system, and (2)
data reduction of CO and oxidant air quality for trends analysis.  These
items satisfy requirements outlined in Kay Jones' memorandum of February
23, 1978.

     Concerning the evaluation of UPGRADE, a formal discussion of UPGRADE'S
analysis capabilities and suggested areas for improvement will be contained
in an overall OAQPS evaluation.  Concerning CO and oxidant trends analysis,
the following discussion will summarize this activity.

               TRENDS ANALYSIS OF CO AND OXIDANT FOR CEQ

     Analysis Plan

     I selected a list of trend stations (Attachment 1) for the analysis
based on CEQ's requirement that data be available for each year during
1973-1976.  Twenty-five oxidant stations in 11 AQCR's and  39 CO stations
in 18 AQCR's were selected.  An annual data completeness criterion of 75;J
(6570 hours) for CO and 67%  (5840 hours) for oxidant was utilized.  The
trends analysis was to be based on  the daily maximum hour oxidant and daily
maximum 8-hour CO.  The analysis was to be performed on UPGRADE with data
summaries  prepared in  format compatible to the 1977 CEQ Report.  Data
retrievals from SAROAD were  to be obtained by CEQ/SIGMA personnel in order
to create  UPGRADE data sets.

     During this time,  I obtained available NEDS emission data  from  the
National Aerometric Data Branch for possible comparison to the air quality
data.

     Data  Availability
               The  air  quality  data  bases  covering  the time period  1974-1976 were
          accessible by UPGRADE on  February 29,  1978.   The 1973  data  has  apparently
          been  retrieved from SAROAD but are not currently available  on UPGRADE.
          Based on  available emissions  data, 1973-1975 estimates wore tabulated  for
          the 22 AQCR's covered by  the  air quality  data.   Preliminary 1976  data
          expected  August 1978  (reference  3/2/78 memo  from Chuck Mann).
                                                                       arc
 EPA r...... 1171 t if,.
                                         D-23

-------
     Data Summaries

     The air quality summaries for 1974-1976  daily  maximum 8-hour
CO were prepared using UPGRADE.   These are shown  in Attachmsnt 2.
Based on modifications to the UPGRADE data sets,  1976  su-merics for 9
sites could not be prepared.   The format Tor  data suii-nsrios were frequency
distributions and bar charts, in agreement with presentations in the
1977 CF.Q Report.  Because of the lerge amount of  man-hours required to
extract data from UPGRADE, a flRB data retrieval capability for extracting^
daily maximum hour data was employed to generate  the required datr.  su.-paries
for oxidant using the EPA UNIVAC.  These are  shown  in  Attachment 3.
Appropriate graphical presentations can be easily prepared from this
information.

     Emissions summaries for hydrocarbons and carbon monoxide covering the
time period 1973-1975 for 22 AQCR's are shown in  Attachment 4.

     Discussion

     The information presented in this report provides basic information
for an analysis of CO and oxidant trends.  The use  of UPGRADE to extract
the CO summaries was hampered by the small number of sites per data set;
four UPGRADE data sets were required.  The efficiency of this type of
analysis can be improved if each data set only contained the data for
a single year, thereby allowing more sites on a single data set.  Further
efficiency would be possible if the UPGRADE data  sets did not allocate
space for as many as 50 variables regardless  of the number of variables
needed.

     The presentation of AQCR trends in a format compatible to the 1977
CEQ Report can be accomplished by representing the AQCR with  the data
for a "typical" station or an area-wide composite.   A series of  four
frequency diagrams could be presented, one for each year.  In order to
improve upon the graphical presentation, a series of trend lines showing
cumulative frequency of occurence of selected concentration levels can
be employed.  Superimposed on this type of graph, emission levels  could
also be displayed.  These two methods of presentation are  shown for oxidant
trends at the Azusa station in Los Angeles (Attachment 5).

Attachments 5

cc:   Robert E.  Neligan,  HDAD
     Lance Wallace,  ORD

-------
! FACHMEfil  1.           AQCR'S WITH TRLNU STATIONS FOR CO AND OX1DAFITS


                                         NUMBER OF SITES BY POLLUTANT

        AQCR      AQCR IIAME            OXIDAMT/03       CARBON KO.'IOXIDE


        052        West Central  Florida
                   (Tarnpa-St.  Petersburg)    1


        050        Southeast Florida
                   (Miami)                   1                  1


        079        Metropolitan Cincinnati   1


        119        Metropolitan Boston       1                  1

        173        Dayton, Ohio              1

        178        Northwest Pennsylvania-
                      Youngs town             1


        024        Metropolitan Los Angeles  10                 11


        031        San Joaquin Valley
                   (Fresno)                  1                  2


        028        Saramento Valley          2                  1


        030        San Francisco Bay Area    4                  6


        220        Uasatch Front
                   (Salt Lake City)          2                  2


        009        Northern Alaska
                   (Fairbanks)                                  1
                                             D-25

-------
             AQCR'S WITH TREND STATIONS FOR CO AMD OXIDAIITS


                                NUMBER OF SITES BY POLLUTANT

AQCR      AQCR NAME            OXIDAHT/03      CARBON MONOXIDE


036       Metropolitan Denver                          1
094       Metropolitan Kansas
            City                                       2
099       South Central Kansas                         1


078       Louisville, Kentucky                         1


042       Hartford-New Haven-
            Springfield                                1


131       Minneapolis-St. Paul                         2
085       Metropolitan Omaha-Council
            Bluffs                                     1
148       Northwest Nevada (Reno)                      1


193       Portland, Oregon                             2


229       Pugct Sound (Seattle)                        2
                                   D-26

-------
              "?;~:H  '.3,197fl I  LIST Or STATIONS FOX TJE RETJJIEUftL CflLLEDl
         C£Q.£A.-<0.':3.£OcJ.CC.03.T»;i;.l.!;J.U02
CCDI

   1
   2
   3
   4
   5

   7
   S
   3

  11
STATION
 COLE
                 c- ic
 SA.WO
 AGFIiCV
  CODE

03*505113

03s|5j-l?/

•i-isC "2C3
                                   448C42C3

                                   74837523
                                                 S7A7ICH DESCRIPTION
                            1.1
                            i.ae3 esf.sziojai^i  AZUSA
                            1.ZC3 CS3G5CC31U1  Lft H.-i3Rfl
                            1.434 253SC.13.J1121  LEMMOX_

                            licSS $3-!!SC23lI31  LOS'"
                            1.337 851183833131  I.CS
                            1.2.18 es-izeaeoiiei  LOS ANGCLES co
                            1.303 »5Sl26e3lI61  HEUHALL
                            I.eiC 25£cb8CS4I81  SA/I DIEGO
                            l.eil C57223C84F&1  SA.-fTA t.VDARA
                            1.012 es87t"aooiiai  UHITTIER
                                                                         6j
                                                                                         CO
                                                                                   ' •  •*    I
             TKE SCxECJi TO  Pr.ESERVE THIS LISTING AN9  THEN DEPRESS THE RETUSH KEV TO CONTINUE
t>

-------
         -V.-CH £1.1073  :  LIST  Or STATIONS FOR THE RETRIEVAL CALLED:
    CEO. 5A.!?C-iO.E37S. CO. 03. TRENDS. VI 1
                  ftGEHCY
                   CODE
                                           STATION DESCRIPTJCM
      2
      3
   S
   7
   8
   5
  10
  11
  12
:C?V THE
  nn
STATION
 CODE:

CG5C-2C1COO!3~01  CSCCCiEO   1

CD I . *• ! i i y.'. x i CONTINUE 10 00


-------
O

ro
SO
     r'.VICH  22, ID'S : LIST OF  STATIONS FOB THE Kt
CEG . tV,RC,'-.- . £073 . CO . 03 . 1 KuKDS . VI 3
                                                              yftL CALLED:
      rooz
         •j
         3
         6
         7
         S
         "3
        13
        11
      DOPY
                     lJl
                            COD
S72505-O
E'i-iC 13-50
                          71530053
                          76CC&9S9
                          7G030D80
              STATION DESCRIPTIOM
             C01
             CJ2
             033
          !.C)3S
                                                  lS37eC001G01
                                                         SPRINGFIELD
                                                         Mri.'-.EAf-OLIS
                                                         H!ll/i£a?OLJS
                                                         Giu'nfi
337
038 381-!G337Er-31
009 ^£05^^>^Gi^a=
    -I918-!e251f01
    -4513-<&053Ffil
                                                        POSTLft.'ID
                                                        FCRTLfytD
01C
                              SEflTTLE
                              SEflTTLE
                                    03   Ox  O)
                                                                                           Xx
                                           II
                                           !<*
    THE SCREEN TO PRESERVE THIS  LISTING WiD THEN DEPRESS THE  RETURN KEY TO CONTINUE

-------
tl

o
              W.
         CEO. fj
        
-------
      4.
   OiISSU),'lS Of HYDROCARBONS AMD CARBON I-'ONOXIDr., 1973-197$ FOB 22
                             SCLLCTCD AHCP.'S
     The following Table contains Hydrocarbon and Carbon Monoxide pnir.sicns
for 22 selected AQCR's in the years 1973 - 1975.  The 1973 data uerc
derived from the 1973 National Emissions Report (EPA-450/2-76-007),
and the 1974 and 1975 data were obtained fron National Air Data Branch.
All Figures For each pollutant were found by taking the "Grand Total"
of pollutant emissions and subtracting the totals for the "Industrial
Process-Point" and the "Miscellaneous-Area" categories.  This method
was used in order to record the most stable and consistent categories
and exclude those whose methodologies have been changed during the
period From 1973 - 1975.
                                  D-31

-------
             Emissions  of  Hydrocarbons  and Ccirbon  !'/ji!Oxic!<>
AQCR £            Major City                     1973       1974       H'7B


 OG2              Tampa-St.  Petersburg,
                      Florida

                      Hydrocarbons                 172        *          124
                      Carbon Monoxide             913        *          803

 050              Miami,  Florida

                      Hydrocarbons                 151        *          179
                      Carbon Monoxide             813        *        1,173

 079              Cincinnati,  Ohio

                      Hydrocarbons                 113        106         99
                      Carbon Monoxide             600        618        582

 119              Boston,  Massachusetts

                      Hydrocarbons                 178        163        151
                      Carbon Monoxide           1,012      1,049        980

 173              Dayton,  Ohio

                      Hydrocarbons                  78         77         71
                      Carbon Monoxide             431         46        428

 178              Youngstown,  Ohio

                      Hydrocarbons                 112        104         98
                      Carbon Monoxide             583        595        555

 024              Los Angeles, California

                      Hydrocarbons                 732        761        715
                      Carbon Monoxide           4,001      4,840      4,603

 031              Fresno,  California

                      Hydrocarbons                 164        157        148
                      Carbon Monoxide             953      1,012        968
                                    D-32

-------
             Emissions of Hydrocarbons and Carbon Monoxide
AQCR f/            Major City                    1973       1974       1975
028              Sarnmento, California

                     Hydrocarbons                132        118        111
                     Carbon Monoxide             711        710        673

030              San Francisco, California

                     Hydrocarbons                351        357        337
                     Carbon Monoxide           1,936      2,212      2,102

220              Salt Lake City, Utah

                     Hydrocarbons                 52         65         51
                     Carbon Monoxide             259        411        234

009              Fairbanks, Alaska

                     Hydrocarbons                  877
                     Carbon Monoxide              45         28         31

036              Denver, Colorado

                     Hydrocarbons                104        108        104
                     Carbon Monoxide             583        728        703

094              Kansas City, Missouri

                     Hydrocarbons                101        102         96
                     Carbon Monoxide             545        671        637

099              Wichita, Kansas

                     Hydrocarbons                 51         55         50
                     Carbon Monoxide             242        301        277

078              Louisville, Kentucky

                     Hydrocarbons                 46         45         44
                     Carbon Monoxide             251        2G8        2!i7
                                      D-33

-------
             Emissions of  Hycirocdrbotis and Carbon F'.onoxick1
AQCR /.'


 042
 131
 085
 148
 193
 229
 Major Ci Ly
Hartford-Hew Haven, Conn.

Hydrocarbons
Carbon Monoxide

Minneapolis - St. Paul
Minnesota

    Hydrocarbons
    Carbon Monoxide

Omaha, Nebraska

    Hydrocarbons
    Carbon Monoxide

Reno, Nevada

    Hydrocarbons
    Carbon Monoxide

Portland, Oregon

    Hydrocarbons
    Carbon I'.onoxide

Seattle, Washington

    Hydrocarbons
    Carbon Monoxide
1973
197",
1975
                                                  806
                                                  120
                                                  664
                                                   43
                                                  232
                                                   14
                                                   61
                                                  149
                                                  772
                                                  115
                                                  582
            124
            812
             41
            260
             18
            103
            147
            894
            122
            751
                       139
                       843
            118
            776
             39
            243
             16
             87
            142
            861
            121
            751
 *   Data  not  currently available.

-------
Or
           O'*''2fa7 AT
— - ..;


I
i
i
i.

^tt'
te-
$L-
-S-
1 •!
•$'
 il
	 	 i 	 1 	 ! 	 i 	 1- l i
i
1






	




^J8-
Ho
ti

-H-
t»-
-«-


1-8
-1o-
.7/1
--s




—



._..
*!_
1


— .
	




T
|
1
r


a

i

i
i
.3
1
i

i
»j»j
"
i
i
i
i


— _.f ... .
i

I
t
-, —
1
T 	
1
5'-
i

!

"

,.4._
i
.. 	 j_ ....
i !
nk\
VT
i
j
1 ' !
1
" " T
I

•
-™. _«
1

—
•—•-I——
	 ]-...,
i

4



j
i
1
i

i — , — „ 	 j — . — . — i '
20 ^o {,0 3/j )fl« |7i) hto '.!
1 	 . 1 1 i .' : | '. *
i —
—

	
-
	
—
	
i
i'
if
[
J
3

0 i


D-35

-------
" 9
  /.T
      TT
   -J0o-
  •£uj

  .S-J-JfcO-

  ti-i'O
  JK2 I _.
   u)
   u
 I rvJ<
    .JzoJ
                        [4"pf^
           i	i . r_v	L	
     Et:&ri?riJ
—M	
-U-U
      iT~tI2-
      i  I  i ii i  •>'	i	i
   •~.—i  \~~        .iii
        |
-------
                                                UPGRADE  USER EVALUATION
USER
OAUM
(QAQl'S)
(RTP)





























USE
Data Reduc-
tion of Air
Quality
Data for
Trend
Analysis


























DATA BASE CAPABILITIES
CATEGORY""
SAROAD
Interface






























AVAILABLE
Manual
Interface






























NEED
Auto































RANK
N































UPGRADE CAPABILITIES
CATEGORY
Data Extraction





Data Manullpatlon

Meaningful size data
Bets




Variable size for
variables




Faster system
operation

Additional
Variables


Additional
Analysis
Routines


AVAILABLE
Yes, but too
much tine
required



Contractor
Interface
1 site year of
cont Inuoua
data



Fixed size
field for 50
variables



Verbose and
terse modes

Predefined
Variables


Predefined
Analysis
Routines


NEED
User
manageabl
SAROAD
extrac-
tion

User
Control
Multi-
year con-
tinuous
data for
multi-
sites
Variable
sized
field
depending
on no. ol
variable!
Super
Terse
Mode
User
creation
of New
Variables
Ability
to add
user
developed
routines
RANK

i E




E


E





E





N


N



D





































































o

-------
                                                UPGRADE USER EVALUATION
USER
OAUM
(QAQPS)
(RTP)





USE
Data Reduc-
tion of Air
Quality
Data for
Trend
Analysis


DATA BASE CAPABILITIES
CATEGORY |








AVAILABLE








NEED








RANK








UPGRADE CAPABILITIES
CATEGORY
Plotting of a large
number of data
points



Standardized
terminology
AVAILABLE
Plot max. 400
data points




Confusing
terminology
NEED
User
specify
(1 of
data
points

Standard
Ize
RANK

E




N



















o

u>
CD

-------
                                                   TRANSACTION DATA
o

CO
VO
USER
(TASK)
OAWM/OAQPS/RTP
(Trend Analysis for
Mt Quality)

Extract from UB
STORE!


SAROAOS
2 extracts

NI1I


IDS


Store
In
IDB
438K By tea
($50)

Analysis
& Terminal
PLOT
100 Plots

OFF-
LINE
PLOT
0

Number of
Terminal
Sessions &
Tine Per
Session
5-10/2 hi.

COST
Per
SESSION


COST
Per non-IDD
Extraction
$50/per
($100)

Could
Not
Do-


MANUAL
TIME
Existing
Systems are
comparable
for this
task

-------
 ORD-HQ Evaluation


 Main evaluator:  L.  Wallace

 Total time spent learning system:   30-40 hours.

Types of analysis or graphics routines used heavily:

     Data listing; basic statistics; polygon plots;  bar charts;  regression;
     partitioning; the corr routine in SAS; filtering;  plot mods,  including
     windowing and axis changes (linear to logarithmic  or  probability axes);
     Stepwise in SAS; GLM in SAS.

 Types of analysis or graphics routines not used heavily:
     The Autoregression routine in SAS; neutral text; mapping;
     shading and other plot mods.

 Total time spent using system:  150-250 hours.

 Data interfaces used heavily:

     General purpose; IDB; Air

 Data interfaces not used heavily:

     STORET; NASQAN

 Main effort:  Correlations between mortality rates  (1968-72) and
     drinking water quality variables.  About 800 correlations run;
     200 linear regression maps produced.  Each graph cost about lOc
     (estimated cost of paper—$50/roll @ 500 sheets per roll.)  in
     paper.  Computer costs for the 200 graphs appeared to be about $100.
     Another 600 correlations were run without making graphs (using the
     CORR routing); here the costs appeared to run about $40.

     Advantages:  Instant access to 191 health, demographic, and
     environmental variables, in compatible form for running
     correlations.


 Other Uses of UPGRADE

     INTERCORRELATIONS OF ENVIRONMENTAL VARIABLES.

     Colinearity of the independent variables causes problems in
     attributing the variance of the dependent variable to either
     one or the other.  The UPGRADE format allows easy  tests of
     relationships among the independent variables.   For example,
     a series of water quality variables were tested against "hardness",
     a somewhat ill-defined quantity, to see in fact which variables
     correlated most closely with "hardness".  About 100 correlations
     were run for the drinking water variables.

-------
 INTERCORRELATIONS OF MORTALITY VARIABLES.

 Many causes of death are more or less closely related.  Finding
 the  exact relationships allows the analyst to group those causes
that vary similarly to achieve greater numbers and more stable
rates.  For example, several types of digestive system cancer
(stomach, rectum, and intestine) correlate fairly well and can
be grouped; however, two main urinary system cancers do not
correlate well across the nation (cancer of the kidney and cancer
of the bladder) and should not, therefore, be grouped.  (In
fact, many previous studies do so wrongly according to the
1968-72 data).  About 300 such intercorrelations were run.

 CROSS-CHECKING EPA AND USGS WATER QUALITY MEASUREMENTS.

 Many variables, such as hardness, fecal coliform, and certain
heavy metals, are measured both by the stations operated by EPA
personnel and those forming part of the USGS national system
known as NASQAN.  Comparing these variables indicates how well
or poorly these independent measurements agree.   About 50 such
correlations have been run.

 CORRELATIONS WITH DEMOGRAPHIC VARIBLES.

 Often demographic variables such as urbanization or employment
in manufacturing are far stronger predictors of a changed mortality
rate than environmental variables.  Learning- which diseases are
 so related is a necessary fist step in determining environmental
influences.  About 300 such correlations have been run.  Graphs
 of 75-100 have been made, at about the same costs as indicated
 previously.

 MULTIPLE REGRESSION USING BOTH ENVIRONMENTAL AND DEMOGRAPHIC VARIABLES.

 Once the diseases affected by both environmental and demographic variables
have been determined, the relative power of each variable can be
estimated by running a multiple regression, either the stepwise
 variety or one of five other types contained in the SAS GLM
(General Line Model) Procedure.  Several multiple regressions
were run.
                                      D-41

-------
UPGRADE USER EVALUATION
USER
ORD/OMTS/
I1Q











USE
Correlate
mortality
rates and
drinking
water








DATA BASE CAPABILITIES
CATEGORY
IDB


User Created
Data








AVAILABLE
Yes


Yes, no
documentation








NEED
More
data



Ability
to ex-
tract
data
from
Tape
Storage

RANK
H


N

N







UPGRADE CAPABILITIES
CATEGORY
Data manipulation


Graphic Analysis





Batch Mode



AVAILABLE
numeric data


4 type plots





Neutral Text



NEED
alpha-
numeric

multi-
plot
A/N axis
descrip-
tion

Batch
Produc-
tion
Mode
RANK
D


D

D



E



TRANSACT ION
VOLUME













COMPARABLE
MANUAL
TIME
Could not do.













-------
                                                     TRANSACTION DATA
USER
(TASK)
ORD/OWTS/HQ
(Correlate mortality
races (1968-72) and
drinking water
quality variables



Extract from DB
STORET







SAROAOS







NIH







IDE
X



X


Store
In
IDB







Analysis
& Terminal
PLOT
800 correla-
tion runs £
200 linear
Regression
maps
600 correla-
tions (no
graphs)
OFF-
LINE
PLOT







Number of
Terminal
Sessions &
Tine Per
Session
2 days
6 hra/day



III hrs
total time

COST
Per
SESSION
$100 total
+ $20 for
paper


$40.


COST
Per non-IDD
Extraction








Could
Not
Da.
NOT



NOT


MANUAL
TIME








J>-
OJ

-------
I.   Introduction

     The purpose of this document is to describe areas where UPGRADE capa-
bilities can be applied to the analysis, interpretation and graphic display of
the EPA's National Phytoplankton Data Base.

     A major effort in the classification and enumeration of phytoplankton
algae was initiated at EMSL-Las Vegas in 1972, in support of the National
Eutrophication Survey Program (NES).  Since 1973 these activities have been
performed at the EMSL-Las Vegas facilities by the University of Nevada, as in-
house contractors, and monitored by resident experts in the Water and Land
Quality Branch, MOD.  During this period, a phytoplankton data base of over
44,000 entries has been developed which represents information from nearly 600
lakes in 37 states.  Work is being continued on sample material from New England,
New York, Michigan, Wisconsin, and Minnesota which will complete the only com-
prehensive nationwide (48 contiguous states) phytoplankton data base in exist-
ence.

     The sampling, preservation, classification, and enumeration techniques
used were consistent throughout, proving unique opportunities for comparisons
and contrasts, both within (geographic distributions, and community assoc-
iations) and between the phytoplankton data base and the other physical,
chemical, and biological information gathered concurrently (over 2 1/2 million
data points from the NES alone).  These combined data bases provide a unique
opportunity to identify specific water quality indicators, develop trophic
classification methods, relate factors in lake problem conditions, and to
predict and/or control algal forms associated with taste, odor, and/or toxicity
in potential drinking water sources.

II.  Data input requirements

     Several levels of data input are identified below which progressively
increase the utility, and flexibility of UPGRADE in meeting our needs, and the
actual or potential needs of other users.  Optimally the entire phytoplankton
data base would be transferred to UPGRADE from COMNET along with the corres-
ponding lake-date mean physical/chemical parameter values calculated for each
of the 800 lakes in the NES program.

     A.  Access, through UPGRADE, to our phytoplankton data files in COMNET
(without permanent transfer and storage), would enable us to examine geographic
distributions and community structure.  This approach would considerably reduce
the interactive advantages of UPGRADE by increasing job times during data
transfer.  Perhaps the most significant loss would be to other users who would
not otherwise have access to the National Phytoplankton Data Base.

     B.  Transfer of the entire phytoplankton data base to UPGRADE would be
optimal; of lesser value would be the transfer of phytoplankton data for select
species.  At this level, geographic distributions and community structure could
be analyzed on a national or regional basis with minimal time loss due to
initial data transfer from one computer system to another at the beginning of
each user session.  This level of data transfer would give all UPGRADE users
access to the National Phytoplankton Data Base.
                                      D-44

-------
     C.  The mass of physical and chemical data collected by the NES program is
being reduced to lake-date mean parameter values (1/2 complete at this time).
Since about 800 lakes were sampled at least 3 times each (spring, summer, and
fall) there will be nearly 2400 sets of mean values.  This entire data set
should be transferred from COMNET to UPGRADE for several reasons.

     1.  It, in conjunction with the phytoplankton data, would enable users to
examine, statistically and graphically, the relationships between phytoplankton
species (genera) and the environmental conditions associated with their occur-
rence.  Conditions associated with the growth of special interest and problem
algae could be analyzed in more depth and with greater flexibility than is
feasible with the present system.

     2.  Increased effectiveness of 208 planning could be achieved through
geographic selection of data for use in preconstruction prediction of phyto-
plankton response to proposed environmental modification.

     3.  The physical and chemical data is in itself unique, and when married
to UPGRADE would provide the means for users to characterize and analyze lake
subsets of special interest.

III.  Report needs

     Computer programs have been written and tested which perform the routine
compilation of environmental data associated with the occurrences of the var-
ious phytoplankton forms.  The execution of these programs has resulted in a
wealth of information leading to the initiation of several reports.   Each of
the reports, in order to satisfactorily explain the data, requires extensive
use of various plots and histograms.  UPGRADE would enable us to fulfill these
needs as well as initiate investigations into areas previously untouched due to
the size of the data base and the inadequacy of our present system to handle
geographic problems.  The following are specific areas where UPGRADE would be
useful to our programs, and suggested uses of our data by other organizations
with access to UPGRADE.

     A.  Geographic distributions and representations (phytoplankton and
environmental factors).

     B.  Statistical evaluations of the environmental requirements of well-
known problem and special-interest algae.

     C.  Development of phytoplankton indices of trophic state (already well
under way at the genus level).  UPGRADE would be useful for testing, index
modification, and development of species level indices to water quality.

     D.  UPGRADE, with its interactive capabilities, could be used for econo-
mical screening of data-analysis approach techniques on small sub-sets of data
before applying them to entire data sets.

     E.  Retrieval of baseline phytoplankton data within geographically re-
stricted areas.

     F.  Retrieval of baseline phytoplankton data within geographically pre-
dictions relative to 208 planning.


                                   D-45

-------
     G.   Selection of lake subsets of special interest,  Ce-g-»  high or low
productivity), for comparison with community structure.

     H.   Provision of ambient water quality and/or sensitive biological com-
ponents for inclusion in multiparameter models for land-use and watershed
management.

     I.   Opportunity to interface the water quality and  phytoplankton data with
other information of specific user interest.
                                     D-46

-------
                          U.S. ENVIRONMENTAL PROTECTION AGENCY
                              OFFICE OF RESEARCH AND DEVELOPMENT
              ENVIRONMENTAL MONITORING AND SUPPORT LABORATORY - LAS VEGAS
                  P.O. BOX 15027. LAS VEGAS. NEVADA 89114 • 702/736-2969 (FTS 595-2969)

   Date: April 6, 1978

Reply to
  Attn of: MSA

 Subject: Establishing a SAROD  Subset Data Base in the UPGRADE System
    TO: Dr^-Wayne Ott
       RD-680

       This is to confirm my telephone  request to establish two SAROD data sets:
       one for Phoenix, Arizona, and  the  other for Tampa/St. Petersburg,- Florida,
       covering the period  from 1974  to 1976.

       We would like the hourly average for NOx,  03,  and hydrocarbons included
       in the data set for  each station,  if available.

       Larry Mylask indicated  they  are  having  some problem storing large amounts
       of data but they would  have  a  solution  to  that problem within two months.
            ,   -,-,
       E. A". Schuck, Chief
       Monitoring Systems Design
       and Analysis Staff
       Monitoring Systems Research
       and Development Division

-------
"Robust" Correlations of Mortality Rates  (br*!j races & sexes) with  Drinking  Wntcr Constituent1-
1
Cancer, Total
" Esophagus
Stomach
11 Intestine
Liver
11 Kidney
Leukemia
I
00
Major
Cardiovascular
Disease
Hypertension
Chronic Ischemlc
Heart Disease
Cerebrovascular
Disease
Arteriosclerosis

	
— '


+
4-
—

—
—



	


4-




—






4-
-h
—


•

• ni


—

-1-

	


	


	













	
+




— .





	
4-















+



•
—








— ,
4-
|
•


4-










—











—
4-










	
4-











4-




















-------
                                               UPGRADE USER EVALUATION
USER
ORU

(KMSL -
Las Vegas


















USE
Correlate
Drinking
Water
Concentra-
tions and
mortality
















DATA BASE CAPABILITIES
tAiTCORT
IDB





















AVAILABLE
DB Is avail-
able but very
limited data



















NEED
Larger
Data Set




















RANK
E





















UPGRADE CAPABILITIES
CATEGORY AVAILABLE
Data manipulation
capabilities


Analysis and plot 1
variable to 1 or more
stations
Correlate two vari-
ables after both
variables arc plotted
to stations
Restart capabilities
after computer mal-
function
Additional
Analysis
Routines
Formal access to
Upgrade User Support
Group
Faster System
operation
Limited



Yes


Can plot two
variables
across groups
of stations
Start over


SAS


Per Contract


Verbose and
terse modes
NEED
Increase






Specific
Lnstruc-
tlon

)ato Save
Capabili-
ties
Added
lou tines

Jo cost
iccesa

Super
terse
RANK
N










D


N


N


D

TRANSACTION
VOLUME






















COMPARABLE
MANUAL
TIKE






















o




VO

-------
                                                   TRANSACTION DATA
USER
(TASK)
OUTS/Las Vegas Lab
(water/mortality)
Extract from DB
STORET

SAROADS

NIH

IDB
50K
data
point!
Store
In
IDB

Analysis
& Terminal
PLOT
Regression
IK plots G
20-200 pta
OFF-
LINE
PLOT
0
Number of
Terminal
Sessions &
Time Per
Session
15/2 hr.
COST
Per
SESSION


COST
Per non-IDB
Extraction


Could
Not
Do •


MANUAL
TIME/COS t
3lirs/plot
(S51/plot)
(Jl
o

-------
                                               UPGRADE USER EVALUATION
USER
OlEE/HERI
CINN

•
USE
Correlate
Cardiovas-
cular vs.
rid 10 1
hcirdncss
DATA BASE CAPABILITIES
CAtTCOKY
User data base



AVAILABLE
CEQ user
support group
interface


NEED




RANK




UPGRADE CAPABILITIES
CATEGORY
SAS
Plotting


AVAILABLE
Yes
Yes


NEED




RANK




TRANSACTION
VOLUME
5K Records



COMPARABLE
MANUAL
TIME
Should save
10 times over
use of COHNET-
lo Graphics
at COMNET.
o
Ul

-------
      UNITCU S1ATLS  ENVIRONMENTAL PROTECTION  AGENCY
                        WASHINGTON D I   20460
SEP 1 1  1978
                                                              on .n 0'
                                                        PLANNING AND
 SURJECT:  Evaluation of UPGRADE System

 FROM:     Ten Gardenier,  Statistician /— -^-__ ,, ^
          Statistical Evaluation Staff,^-PM-22'j-

 TO:       Lance Wallace,  Environmental Scientist
          Monitoring Technology Division,   RD-680
CC:
          Marcia Williams
 Thanh  you  for getting me started on the UPGRADE system.   I had a chance
 to have sane "hands on"  experience with both the extract and upgrade
 analysis options.   Since I promised I would provide some feedback to you
 on the use of the  system,  let me give you my preliminary impressions.  I
 already talked with Joe  Higgins and I understand Marcia  routed over to you
 comments from our  staff.  We had a chance to discuss it  briefly last week.

 1.   First,  let me say that I think the idea of the system is very good
 because it gives us the  flexibility to access actual data which,  as
 statisticians,  we  are able to analyze.  This is a very strong point,  since
 data already complied in tabular form in other reports may not correspond
 to sumraries that  we may want to generate.  For example,  maximum concen-
 trations at point,  sources may have been used as input; but we may want to
 have an index of variability (standard deviation)  around the point source.
 Unless access to time-series data is available this would not be possible.
 It would take personal contacts and administrative time  to make actual
 data available to  statisticians to analyze and perhaps reanalyze; thus the
 system orovides a  capability and diversification in our  job function
 strengtheningthe versatility of our results.

 2.   Second,  environmental indicates leading to a specific health hazards
 arc many and they  involve a multitude of pollution sources.  Consider, the
 example of blood lead concentration which could be transported through
 ambient air or drinking  water.  If one is interested in  the relationship
 between morbidity  and/or mortality of a specific health  condition and
 environmental variables, more than one possible source of contribution is
 far more superior  to use as input parameters than a single parameter.
                                   D-52

-------
Data collection efforts in the form of a  "matrix" are often tedious
and occasionally impossible if one is interested in a particular location
and a specific time interval.  I am impressed with capability to search
and rctrive for several health indicators  (such as cancer of various
types, kidney infections, various types of heart and respiratory diseases)
as well as air and water quality variables.  With this capability the user
is able to create a subfile for correlations, cross-tabulations, or other
analyses.

3.   Tliore are a fe,v points about the system which emerged while I was
working with it which probably need attention.  Again, there may be features
within the system which I am not familiar with to bypass these issues.  If so,
I will know in time.

     a.  When in the extract mode, a "matrix is created out of the data
containing a full set of observations on all variables selected.  That is,
one receives a matrix the size of which is dependent on the number of
variables sleeted and the number of observations in the variable with
minimun common data points.  While I was searching within water quality
data I also searched for the variable dealing with "population/sq. mile"—so
I received as few observations as a couple of hundred.  This was because
the population data is scanty and in the process  of update.  Perhaps it
might be useful to receive a "response" indicating how many observations
exist in each variable searched.

     b.  There is a state and county code identification in the main data
base, however unless the user asks for it as a "variable" the subject extract
datafile does not have any I. D. associated with each record.  The matrix
includes a  generated code of I to n as a sequential keycode for each record
and the specific observation on the variables.  This delimits the use of
specific lists.  I will now always include the inquiry for state and county
code. Yet, it might be worthwhile to automatically write into extract files
a specific reference I. D. for the record.

4.   An idea related to the previous point in 3b might  be to provide sunmary
data grouped by AQCR  (Air Quality Control Region) for air quality related
variables.  I do not know if a corresponding type of categorization exists
for water quality data, on whether the locations comprise the same territory.
Tins might be an idea for generating "national sanplc" data with a
feasible number of observations in the matrix.  Otherwise, the more than
3000 entries for all countries appear in  the extract database.
                                  D-53

-------
7-.iothcr idea for creating national subsets would be to "randomize" on.
selected demographic parameters within state and county records.  I can
visualize a 1/k stratified sample generating subroutine which could
create 3 subset national samples on one or more selected demographic
parameters.

5.   The new modifications to UPGRADE evidently needs more disk space; so
when logging in, the user needs to specify  Region (500).  Perhaps until
automatic logon procedures are implemented, there  could be a message to the
user with the request for logon,  requesting the additional statement.

6.   I also tried the system on an Andersen Jacobson printer rather than
the Tcktroniy wo were using.  I found it convenient for listing large
data sets, since it takes a while to obtain a copy of the screen on the
Tektronix.  At the time I did not want any graphics but tried the option
anyway.  I received no documentation showing how one queues the output.
If the "no" option is given everything seemed to be clear-cut.

7.   I also followed up on your comment on skewness and kurtosis.  As I
mentioned, there is an approximation to skewness using the median.
This is:    3(X - median)
          standard deviation
There is also a similar approximation to kurtosis using the semi- inter-
quartile range and 10th and 90th percentiles which is    ^(Q., - Q,)
                                                         P   - P
                                                          90    10
I hope these comnents were helpful.  I will keep you posted as I get
more experience with the system.  Do let me know if I can provide any
assistance.

-------
UPGRADE USER EVALUATION
USER
or&M






USE
Air quality
and health
data




DATA BASE CAPABILITIES
CAiTCoHir
County ID and
descriptive
data reference




AVAILABLE
Good aelectloi
of variables





NEED
Total
popula-
tion




RANK
X/N






UPGRADE CAPABILITIES
CATEGORY
Quick exit capability
Documentation of In-
terface between up-
grade and extract and
where the extract
file la GLIDE

AVAILABLE
None
Not vet avail-
able




NEED
Posaiblll-
bility to
trite dati
into cardi
ind/or
Jlsk tape.

RANK
X/N






TRANSACTION
VOLUME
Three a week
estimated use
for approx 12
variables and
5 sites.


COMPARABLE
MANUAL
TIME
light take
too long to
locate data.
so statistics
may not be
lone on actual
data.

-------
              EVALUATION OF UPGRADE
 WATER ENFORCEMENT DATA SYSTEM INTERFACE OPTIONS
                       By
            William C. Blackman, Jr.
                    June 1978
The President's Council on Environmental Quality
                Washington, D.C.
                   D-56

-------
                        EVALUATIOfl OF UPGRADE
            WATER ENFORCEMENT DATA SYSTEM INTERFACE OPTIONS
                                  by
                       William C. Blackman, Jr.*
INTRODUCTION

     The Council on Environmental Quality (CEQ) has, since 1975,
sponsored and principally utilized a highly versatile, computerized
data system known as UPGRADE (User-Prompted Graphic Data Evaluation).
The system is interactive, employing ordinary English language in-
structions, stap-by-step analyses, graphic display and hard copy, and
line printing.  This system is designed to enable managers, scientists
and angir=ers with no computer training to accass and analyze a wide
range of environmental natural resources, public health and related
data.  The software package presently provides for a variety of graphic,
statistical and procedural options.  These options are detailed in
Appendix A.

     As directed by President. Ca.rter in his May 23, 1977 Me??=r;<3 to
Congress, CEQ has convened an interagency task force "	to review
present environmental monitoring and data programs, and to recommend
improvements that would make these programs more effective."  Accord-
ingly, CEQ is examining possibilities for interfacing the UPGRADE
system, with systems having environmental data bases, in order to:

     1.   Assist the agencies by making the capabilities of the
          versatile UPGRADE system available where applicable, and
     2.   Achieve the access to existing data bases that is necessary
          in order to carry out the Presidential Directive cited.
   While on temporary assignment to CEQ.
                                 D-57

-------
OBJECTIVES


     The specific objectives of this analysis are:


     1.   To provide an assessment of various options for interfacing
          UPGRADE with existing or planned automated data systems
          operated by the Office of Water Enforcement (OWE) and Enforce-
          ment Divisions in the Environmental Protection Agency (EPA);

     2.   To identify the adaptations that are necessary to enable
          EPA Enforcement entities in Headquarters,  Regional Offices,
          and Field Offices to utilize the analytic, graphic and map-
          ping capabilities of the UPGRADE system;  and

     3.   To illustrate to CEQ the options believed available to re-
          trieve information from a broad environmental  data base
          (including parametric data not now accessible) for use by
          CEQ in achieving the general requirements of its mission,
          and the specific requirements of the President's May 23,
          1977 message.


     This evaluation is in no way intended to conflict with either

studies of UPGRADE capabilities by the EPA Office of Research and

Development presently in progress, or the work of the Interagency

Task Force on Environmental Data and Monitoring.   Rather, it is in-

tended to enhance and assist boui of these efforts  by indicating and

ranking several options that could initiate interfacing activities in

a specific area of mutual interest to all parties.
DATA SYSTEMS PRESENTLY SERVING EPA WATER ENFORCEMENT ACTIVITIES


     Automatic data processing (ADP) in EPA's enforcement operations

is fragmented, parochial, and largely housekeeping in nature.   It is

unnecessary here to reconstruct the history of a series of contractors

and their respective proprietary systems, except to say that the extant

situation grows from a chronic absence of an operational centralized
                                D-58

-------
system.   The most recent episode Involved the inadvertent destruction
of the General Point Source File (GPSF), which at one time was to
have been the centralized system for storage, retrieval and analysis
of National Pollutant Discharge Elimination System (NPDES) data.

     After years of frustration, the regional, field office and Head-
quarters program managers have opted for local systems that meet
their immediate needs, or in some cases, make maximum use of limited
resources.

     More recently, a trend toward centralized data processing in
regions! offices has limited the accessibility of data processing
facilities by technical program managers and their staff members.
Although the approach varies between Regional orfices, the present
trend is to locate terminals in the Management Division where the
technical program manager must send data processing requests.  There,
the ADP manager mai.-.^ain^ -he discretion as to when and if the request
is to be filled.  This leaves the requesting technical program with
little opportunity to "tinker" with the data or to experiment with
various analytical procedures.  This "management" phenomenon may be
appealing from a budgetary standpoint but it is not effective in
terms of utility of data.

     All Regional NPDES compliance operations and the Office of
Enforcement in Headquarters utilize a bookkeeping system known as
Permit Compliance System (PCS) for tracking compliance schedules and
permit renewal dates, forecasting events such as due dates for in-
spections, preprinting Discharge Monitoring Reports (DMRs) and similar
functions.  The system is up on COMNET, the present EPA contractor
facility, and can be accessed by a local terminal in each Region.
The system is not designed to store or process raw parametric data.
It is the single data system that is used at least to some extent by
Water Enforcement operations in all Regions and the Headquarters.
                          D-59

-------
     STORET, a well  known automatic data base used by EPA for storage
and retrieval of stream quality data,  has recently been adapted for
storage of effluent  data by some users.   Regional  enforcement data
processors were canvassed (see Appendix  B) to determine the extent of
this use.   It appears that at least three Regional  Surveillance and
Analysis Divisions are now routinely storing effluent data generated
by their inspections, and that one additional S  and A Division intends
to do so.   Although  there is some doubt  about such use by a few states,
Regional contacts indicate that nine state agencies are presently storing
effluent data in STORET and that three others are  preparing or planning
to do so.   The generalities but not the  specifics  of this information
are confirmed by analysis of a retrieval from STORET.

     To enable a more accurate understanding of  the present use of
STORET for storage and retrieval of effluent data,  a retrieval was
designed to indicate recent pertinent activity.  The retrieval included
only stations having 1976 and lazer data, and only stations at which
any of five effluent parameters (BOD,  TSS, pH, fecal coliforms, and
flow) had been stored.  This retrieval indicated that twenty-two states
have stored at least some measurements of the specified parameters.
These states are:
          NEW JERSEY*                   CONNECTICUT
          ALABAMA                       MASSACHUSETTS
          MARYLAND                      ARIZONA
          KENTUCKY                      PUERTO RICO
          INDIANA                       SOUTH CAROLINA*
          NEW YORK*                     NORTH CAROLINA*
          FLORIDA*                      WEST VIRGINIA
          TENNESSEE                     OHIO
          NEW MEXICO                    TENNESSEE
          VIRGIN ISLANDS                WYOMING
          MISSOURI                      UTAH
          PENNSYLVANIA
     Indicates significant amounts of data.   Among these states only
     Maryland, Florida,  South Carolina and North Carolina have stored
     appreciable amounts of effluent data since January 1, 1976.
                          D-60

-------
     These findings indicate that a very sma-11 portion of the ef-
fluent data being generated by the NPDES Permit Program se'lf-monitor-
ing requirements is currently accessible through STORET.

     Three of the regional Surveillance and Analysis Divisions (II,
VIII, X) indicate that parametric data from monitoring being con-
ducted in conjunction with compliance inspections by S & A Divisions
is being stored in STORET.  A representative of Region IV indicates
that preparations are being made to use STORET similarly.  Although
the present commitment of EPA inspection data to STORET is limited,
the data so generated should be of higher quality than the self-
monitoring data.  Numerous quality control evaluations by EPA's
National Enforcement Investigations Center (NEIC) have shown that the
reliability of the self-monitoring data varies widely.

     Several of the Regional operations have developed local systems
to meet water enfcrcGrce.-it needs, including the capability to store,
retrieve and manipulate effluent data.  Salient features of these
systems are as follows:

     REGION I - A local  system, Region One (I) Enforcement Data base
(ROEDS) consists cf three components that function as cheir-titles
indicate:  1)  Compliance Schedule File (COMP);  2) Self-Monitoring
Report File (SMON); and 3) Managment Information and Control System
File (MICS).  The system does not receive nor store parametric data.
COMP and SMON perform the PCS functions that are similar in most
regions.

     REGION II - A five-component system which is resident on COMNET
includes:  1) Status of Permit Development (SPD);  2) Status of Permit
Compliance (SPC); 3) Local Effluent Data System (LEDS);  4) Regional
Industrial  Contributors  System (RICS);  5) Non-Filer System (NOFS).
SPD and SPC are similar to PCS, and Region II interfaces the two systems
with PCS.  LEDS is the system of major interest to CEQ in that it uses
                                   D-61

-------
a data base that includes parametric effluent data.   RICS contains
information that will  be of central  importance in  achievement of
pre-treatment requirements.   NOFS is of diminishing importance as
non-filers are identified and brought into compliance.

     REGION V - Supplementing the normal  PCS, Region V employs a sub-
system dubbed ENF-V to track enforcement actions  such as response
dates, orders, etc.  (See later discussion of Region V proposed inter-
face with state systems.)

     REGION X - The Surveillance and Analysis Division operates a Point
Source File that stores permit and effluent data  on about 450 sources.
This number includes all major dischargers and what are termed "signifi-
cant" minors.

     Some state agencies have developed ADP systems for storage and
retrieval of effluent data.   Those known to have  ADP capability include:
Pennsylvania,  Ohio,  Indiana, Michigan and Kansas.   State officials in
Dreg.-.-,, Washington and Idaho are Et various stages of development of
ADP systems.   The nature, extent, and requirements for interfacing
these systems  is not known, however, the associated EPA Regional En-
forcement Divisions are, in some cases, planning  to interface them.
DATA SYSTEMS UNDER DEVELOPMENT

     Two systems are in development by EPA contractors at this time
that are of immediate interest to CEQ.   These are:   1) Compliance
Analysis System (CAS) which is to be developed for the Office of Water
Enforcement; and  2) an interfacing system as yet unnamed, that is
intended to enable the Region V enforcement staff to interface four
state-operated systems with CAS.
                          D-62

-------
     CAS is designed primarily  as  a national permit tracking system
that will  store permit conditions, pre-print DMR's, detect'viola-

tions, compare performance  of dischargers, etc.  It is supposedly

designed to accept raw parametric  data.  The concept has progressed

through an initial feasibility  study  by Arthur Young and Co., was

deemed unaccep tably costly by  EPA, and a simplified approach is in

the process of a second feasibility study by Young and Co.  OWE

expects to have that study  in hand within the month.  EPA staff
estimates  that if a decision to develop CAS is made, operational
status will trail that decision by at least 27 months.
OPTIONS AVAILABLE


     It appears that four options  several sub-options are available
for interfacing the UPGRADE system with water enforcement-related

data systems.   None of the options hold promise for immediate access
to a nationwide base of parametric data on water pollution sources.
OPTTON  I -  Interface  UPGRADE with the ex+e*?t  PCr
             sy&te..;  uu.ci  a>.*cu.;: cperazic^i
             the proposed CAS system.
                                                         cf
           PRO -  This approach would  give to CEQ early
                   access   to  a  nationwide  base  of  non-
                   parametric  data  that would permit studies
                   of  the  administrative  aspects  of  the
                   NPDES  program.   For example, numbers
                   of sources  due to achieve  compliance in a
                   particular month or quarter, numbers of
                   major  sources  not in  compliance,  geog-
                   raphical  locations of major sources, etc.

                   This approach would initiate the working
                   level contacts and cause  development of
                   the communications channels  that will  be
                   necessary to establish  any kind of  inter-
                   facing  activity.
                                   D-63

-------
                 EPA staff would gain the opportunity  to
                 work with UPGRADE,  experience  its im-
                 pressive capabilities, and  examine  its po-
                 tential  for  a wide  range  of applications
                 by  EPA   Headquarters  and  Regional
                 Offices.

          CON  -  As  indicated above,   a  system  that  is
                 limited  to analysis  of the  housekeeping
                 type data that  is  stored in PCS  is  of
                 questionable  utility.   Apparently,  some
                 two  to three years  of development  time
                 lie  ahead  for  CAS  if   development  is
                 authorized.   Thus,  the  opportunity for
                 timely   interface  with  a  parametric  data
                 base is  not inherent in  this   option.

 OPTION  II(a)  -  Interface UPGRADE with STORET.

          FRO  -  CEQ  would,  in  a  shirt  p-oficd,  be
                 enabled to access a  parametric data base
                 of limited scope.
          CON  -
OPTION II (b) -
The base  would  be  data-rich  for  only
five states.  Most of  the  accessible data
grows  from self-monitoring  (as  differen-
tiate  fie::: C2ir.pjiei:ce n?ofc.hc,. hȣ  ay the
regulatory agencies),  thereby adding an
element  of  questionable  quality  control.

There  is no uniformity of format between
contributors  of  data  to  STORET,  thus
causing  the  effluent  data stored  therein
to be  difficult  to  use for comparisons or
statistical analyses.

Interface UPGRADE with STORET  in con-
junction  with and in cooperation with, an
EPA program of  greatly expanded  stor-
age  of effluent  data.   This  option would
require  a  modest  commitment  by Water
                                    D-64

-------
               Enforcement personnel  in  the Headquar-
               ters and/or  Regions.*

        PRO  -  EPA  would  gain  the ability at an  early
               date  to perform  a  wide  range  of com-
               parisons,   statistical   evaluations,   re-
               gressions,   maps,    automated  calcula-
               tions, etc.

               CEQ  would  gain  access to a  meaningful
               data  base  that would  grow  steadily  in
               scope,  richness and utility.

       CON -  In  the  absence of  a switch  to another
               ADP  contractor or  the  improvement  in
               responsiveness  of  COMNET the chronic
               difficulties   of   working   with  STORST
               could   be  expected  to  plague  tne  op-
               eration .

               It  is  unrealistic  to expect that all EPA
               Regions  and   delegated   state  agencies
               would voluntarily participate  in the pro-
               gram.   If  not,  the  gaps  would deprive
               CEQ  of a  nationwide data base,  and  the
               extent  or  seriousness of that deprivation
               would  be  directly   proportional  to  the
               numbers of non-participants.

OPTION  III -  Interface UPGRADE with  individual sys-
               tems  in EPA Regional Offices.  This  ap-
               proach  does not hold  promise with  re-
               spect to the ROEDS system  in Region I
               since the accessible data would be  essen-
               tially   PCS  data  from  one Region.   It
               holds marginal promise  in terms  of  the
               LEDS component of  the  Region II system
 Details would require  negotiation with EPA but the proposition
 is that EPA Regional and delegated states be shown the capabili-
 ties and flexibilities that would be made available by adopting
 UPGRADE.  Participants would be asked to store effluent data gen-
 erated by compliance monitoring inspections and enforcement
 evaluations.  Some  provision for separate retrievals should be
 provided in the event  Regional and State program directors wish
 self-monitoring data to be included.
                                 D-65

-------
       in that LEDS contains parametric effluent
       data.   However,  the  data consists  of
       pre-calculated loads,  and  is  generated
       by    self-monitoring    operations     in
       (presently) only one  state.  It is thus a
       very limited data base, having the  qual-
       ity  control  questions  associated   with
       self-monitoring  data.   It  limits  demon-
       stration of UPGRADE capabilities  in that
       flow    measurement-raw    concentration
       manipulations, regressions,  calculations,
       and other  "data-tinkering"  type  opera-
       tions  are  not possible.   Stated another
       way, LEDS requires the user to manually
       (or  in some  other  external system) do
       the   calculations   that  the  interfaced
       UPGRADE   should   do,  if  its  full  capa-
       bilities are to be  used.   The  Region X
       system  appears  ideal  for  interfacing,
       number  of  sources (450)  represented by
       the  data.

       The most  favorable  of  the regional sys-
       tem/ UPGRADE interface  options  is with
       th-i  system  being developed  by  a con-
       tractor for Region V.   Regional staff has
       no  access  to  DMR  data.   All  states  of
       Region V have been  delegated the NPDES
       program and are maintaining DMR  data  in
       state  files  of which three (and  soon  a
       fourth)  are automated  storage and re-
       trieval systems.   The  system  presently
       under development is  intended to inter-
       face the state systems with the  Head-
       quarter's CAS system,  which again is  at
       least  27 months from operational  status.

PRO - Interface  with  UPGRADE  could   enable
       Region  V  to  achieve  access  to  state
       agency files at a  much earlier date than
       will  be possible  with CAS  (assuming that
       CAS is  brought   to operational  status).

       Interface with  UPGRADE  would provide
       CEQ with  an  early  opportunity  to work
       with  a  significant  data  base,  and  to
       demonstrate UPGRADE  potential.
                         0-66

-------
      CON - The  data  base  is  thought  to  be  con-
             stituted primarily of  self-monitoring data
             and thereby  embodies the quality control
             doubts.   There  are  two related  consid-
             erations:   1) though quality of  the data
             may be questionable, the interface  would
             provide the opportunity to test, improve
             as  appropriate,   and   demonstrate  the
             utility  of  UPGRADE;  2) Region  V rep-
             resentatives  might  find  it  possible  to
             negotiate  provision  for identification  of
             compliance monitoring data generated by
             the  regulatory  agencies.    The   latter
             would  enable Region V and  CEQ  to ac-
             complish meaningful data analyses.

OPTION  IV -Although  this analysis  was to have been
             confined to  considerations  of suitability
             for interface with Water Enforcement data
             systsT.G,  it is  necessary *o include con-
             siderations that lie outside  of that realm.
             Various  EPA  operations  are  presently
             contracting with  consultants  (have  re-
             cently  done so,  or planning to do  so) to
             develop  separate data  systems.    There
             are now or will  be,  data systems  devel-
             oped for  each  major EPA media program
             in addition to the operational and admin-
             istrative support systems.

             The   independent   development  efforts
             have both negative and positive aspects.
             Certainly the  needs  of  the media  pro-
             grams  differ with each other  and with
             those   of  the  housekeeping operations.
             Moreover,  individual  program  managers
             upon  finding that no  data  system that
             meets  the program needs is  extant  within
             the Agency,  cannot  be faulted for taking
             the steps necessary to  meet those needs.
             However,  economics-of-scale,  intermedia
             analyses, and  technology transfer  would
             be enhanced if a single system  could  be
             developed.

             Exploratory discussions  between  CEQ and
             EPA  could  ascertain  the  possibility  of
             structuring a  single system, incorpora-
             ting or interfaced with  UPGRADE so that
                         D-67

-------
                  the  Council  staff  would  be enabled to
                  access  the  full  range of  environmental
                  data.   The  trade-off  for  EPA  would be
                  the  acquisition  of  capability  by Head-
                  quarters  and  Regional staffs  to  exploit
                  the  capabilities  of  the UPGRADE  system
                  by returning environmental  data analysis
                  to the  technical  and scientific  personnel
                  who  should  conduct  such   analyses.
                  Hopefully, the present study by the  EPA
                  Office of  Research  and Development will
                  reach  that  conclusion and provide  the
                  details  of  the  process  which  would be
                  necessary   to  initiate  that  approach.
SUMMARY AND CONCLUSIONS


     UPGRADE is a versatile,  interactive, user-prompted data analysis

system that could be exceptionally  useful to Water Enforcement operations

in EPA if interfaced with appropriate data bases.  The system can be

adapted to perform a wide variety of calculations and statistical

analyses, and to produce  hard copy  print-outs, plots, maps, regres-

sion curves, etc.   It is  designed to enable the evaluation of en-

vironmentally related factors against others, for example, morbidity

statistics can be analyzed in terms of the concentrations of a par-

ticular air pollutant, or first-through sixth-order regressions can

be performed for a water  quality parameter on flow.


     It is deemed highly  desirable  by CEQ, the sponsor and presently

the principal user of UPGRADE,  that the system be interfaced with EPA

data storage and retrieval systems, such that CEQ could access Water

Enforcement data bases.   Such access would be very helpful to CEQ in

the accomplishment of its mission and objectives, but the present

options to gain the desired access  are limited.  There are no exist-

ing systems that contain  a nationwide base of parametric data on wa-

ter pollution sources. Such a system may shortly go into development
                             D-68

-------
with operational status approximately 27 months later.   One system
(PCS) contains "housekeeping" data such as compliance dated, Dis-
charge Monitoring Report pre-print data, etc.  Relatively small
amounts of source data have been stored in STORET, and if all or most
compliance monitoring data were placed in STORET it could be made to
provide a satisfactory data base with which to interface.  Various
EPA Regional Offices operate Water Enforcement-related data bases.
Of these, only the Region X system contains parametric data.  Region
V is developing a system that will enable accessing of pollution
source data in the State files.

     In view of the absence of a uniform nationwide storage system
for pollution source data, and the present trend toward development
of separate systems for each media program, CEQ might render an im-
portant service and at once achieve its accessing goals by proposing
discussions with EPA to consider joint development of a comprehensive
data system that:  a) would be designed to interface with UPGRADE;
b) meet the needs of all media programs in EPA;  and c) thereby
reverse the trend of concentrating technical data processing in the
administrative elements of EPA, and return that function to the
technical and scientific operations where it can best be employed.

     It is recommended that .CEQ consider the latter options to be
most favorable and the former to be least favorable, i.e., if it is
possible to join EPA in a comprehensive data management system with-
out undue delay of media program objectives, that approach embodies
the greatest potential effectiveness.  Failing that possibility, CEQ
should negotiate an interface agreement with Region V:   a) to adapt
UPGRADE to Water Enforcement needs and fully develop the system's
inherent capabilities and, b) demonstrate those capabilities to the
other EPA entities.
                          D-69

-------
     The STORE! interface option is. workable and would increase in
effectiveness as the number of State agencies and Surveillance and
Analysis Divisions that could be persuaded to participate.   An under*
standing of the utility of UPGRADE to Regional  and State technical
staff will be the key to gaining participation by those entities.

     The PCS interface option is recommended only as the last resort.
The non-parametric data therein is of limited value and the wait for
operational status of CAS is a major inhibitor.
                               D-70

-------
                           APPENDIX A
                    UPGRADE ANALYSIS OPTIONS
(Reprinted from the pamphlet "The  UPGRADE  System User's Overview"
  - President's Council  on Environmental Quality - August  1977)
                          D-71

-------
                   UPGRADE ANALYSIS OPTION'S

UPGRASS Graphics Options

1.   Scatter plotting -- '3. plot of data points arid their
     distribution
2.   Polygon (point-to-point) plot — data point plot with a
     straight line connecting successive points
3.   Bar chart — each data value is represented by a bar.
     (Numerous shading and density options are available)
4.   Polynomial fit — scatter plot with up to 6th order
     least-squares fitted line and table of m, r, t, and
     f values.
5.   Multi-plotting (FY 1978) -- to allow up to 5 y-axis
     variables on same graph.
6.   Multiple-site plotting  (FY 1978) — to allow the
     plotting of sites  (instead of a variable) on the
     x-axis.

Plot Modifications
A large variety of analytical and cosmetic options are
available,  allowing the user to "tailor" graphic output
froai the UPGRADE systea.  Future development will include
the addition of even more plot-aod options.

           Interchange axes -- x becones y and vice-versa.
           Reverse axis  scaling -- scale from max. to ciin.
           instead of vice-vers2.
                              D-72

-------
Change scale factors — allows user to
specify scale ranges.  This mod can be
used for "windowing" a plot so only
datapoints within a specified range are
plotted.
Change number of axes annotations — to
modify precision of scale divisions.
Change number of axes tick marks — to
modify precision of scale divisions.
Add or delete grid lines — to divide a
plot into quadrants.
Change letter size of axes annotation —
to nodify legability of scale annotations.
Change s7=1*001 and symbol size of datapoiats-
will also be used to differentiate between
different variables when multi-plotting
becomes available.
Change graph title.
Add 2nd line to graph title.
Eliminate or restore plotted datapoints —
if a regression line without the plotted
points is desired, for instance.
Change line type — will also be used to
distinguish between  lines for different
variables when multi-plotting becomes
available.
             D-?3

-------
Change axes length --  to modify dimensions
of entire graph.
Change axes scale type -- to allow use of
log and probability scales.
Change degree of polynomial  fit — to use
up to a 6th order fit  for regression
analysis.
Eliminate outlying datapoints from a fitted
plot — to "window" a  regression line to a.
selected range of data values.
Eliminate or restore current date printout
that appears on every  plot.
Change number of decimal places used for
axes annotation — to  modify precision of
annotations.
Change number of decimal places used for
bar annotation — to modify precision of
bar chart annotations.
Suppress statistics printout on fitted plot —
if Eit r, f, t, ndp values are not needed.
Change bar density and  shading specifications
for a barchart — see graphics 8 and- 9 for
examples.

-------
Statistical Options
1.   Sort and rank -- to obtain a table of median,  quartiles,
     tertiles and 15th and 85th percentiles for any one variabl
     or complete sort arid/or rank for each data point.
2.   Data filtering and listing — to eliminate, outliers ,  selec
     a range of data values, or obtain a listing of data point
     valves.
3.   Linear regression — to produce a table of coefficients o:
     variance for regression analysis.
4.   Data partitioning — to group data values for the x-axis
     into class intervals and plot against a partitioned y-axi:
     variable.  Statistics for class intervals and partitions
     can be produced.
5.   Basic statistical srr-naries of selected data,
     including minimum, maximum, mean, standard
     deviati'on, number of data points, and historical
     period of record.
6.   Data transformation FY 1973 — to allow user to perform
     arithmetic operations on variables to obtain, ratios, etc.
     User will also have capability of using a variable for
     exclusion purposes to obtain a selected range of data
     values for that variable.
                           D-75

-------
7.    SAS  FY  1978  -- an  integrated system for data management
     and  statistical  analysis.

     Highlighting SAS's statistical capabilities are  its
     versatile  least-squares procedures, which produce  a
     wide variety of  linear and non-linear regression
     analyses,  analyses of variance and covariance, and
     multivariate analyses of variance.  One can produce
     highly  specialized analyses vith  comprehensive matrix
     manipulation procedures.

     SAS  can also produce multiple and partial correlation
     coefficients, Spearman's and Kendall's correlation
     coefficients and contingency table chi squares.

     It has  several procedures  for analyzing time-
     series  data.  One  can calculate summary statistics
     and print them or  use them directly for further  analysis
     One can obtain frequency and cross-tabulation  tables
     and analyze them as well as. perform, discriminant
     analyses,  factor analyses,  and cluster analyses.  One
     can construct and  evaluate Guttman scales, and can
     perform t-tests or tests of goodness-of-fit or probit
     analyses.
                               D-?6

-------
                APPENDIX B
     CANVASS OF WATER ENFORCEMENT DATA
SYSTEMS IN EPA REGIONS AND DELEGATED STATED

-------
            fr   ~\ -z
    A i  A /'« •"«•"«/) ?*
        x         J                     WATER ENFORCEMENT DATA HANDLING QUESTIONAIRE
      1.  Region  3—  .  person Interviewed jiff' •?•*-&'  IWC' *******            . Phone  2 ^ ' *3 - ^ ^ 7 &

                                           fft.ii°"A ' <^f/«" ^i,/«v*/«~~'                              '

      2.  General Dcscripbion of Region's system RQ&P$  —  ?  )•*>"
                                                     ce>**6j~'- •£*"*-•**'(•**
     3.1s  industrial  effluent  data being stored    up     ?  Majors  **°   7  Minors	7  Self-



     MonihorJng	?   EPA monitoring	7  State/local agency monitoring	7



     1. Is municipal  effluent  data being stored	**        ?  Wat states
                                  7  Self-monitor ing	7  State/local monitoring
     EPA monitoring
     7.   Does system store and retrieve compliance schedule data_
     B.   Ooes system store and retrieve receiving water quality data  CjgS  *" *«»

                                                                     "7^



     9.   What hinds of analyses are performed
10. What hardware components are  in  use
                                             C.twJf.  ftaM^   -Qt>r*o> • «••• I f

-------
WATER ENFORCEMENT DATA HANDLING QUESTIONAIRE


            *    Xr.
 1.  tteglon_JLlL_. person  Interviewed  y*-^****    /r.\ _ . Phone

                                              I  /    '/     7 /     /  ?  /          A   -«:.
 2.  General Description of Region's system Uvl^^J^'-^i £ *******  f***-/^.^..*     /k -r  (3>^/OS   '•«••/ /'//£• -f /*/> <•.»/-»  &  A *)*- J

          - /J*^ - t". Ls Jf*..: /i —                                           f^ /..'/. /-A
 J.la  inclutiLrictl effluent  data being stored   ^j«*>     ?  Majors  LI**   ?  Minors  ^''^     ?   Self-
                                             •                    ^^^^^^^^^^^^^^^^^
                                              lL ••»"')

HoniL Airing  tj ^ •.• _ ?  EPA  monitoring^- ' ^//C' __?  State/local  agency monitoring _ ?
                  J  '
A.  Is  municipal effluent data  being stored   i*f*.	7  Self-monitoring  ^^v	?  EPA
                i  i                            t '                               r


monitoring _y>/ 'f>	?  State/local monitoring c4£-«v	?
                                                  «


b.  Is  dnta from NPDES States being stored Ufe-	?  What statos
                               ? Self-monitor ing ^^^	?  State/local monitoring
EPA monitoring
6. Is dutn  from non-NPDES States being stored  tV'A	?  Wat  states
                               ?   Self-monitor ing  4^>	?  State/local monitoring
EPA mouitoring_
7.  Doeb  system store and retrieve  compliance schedule data
0.  Ooes  system store and retrieve  receiving water quality data J&'rt f~ 3k rt  t+fp




9.  What  kinds of analyses are performed ^V-^  ]£.**?•* n*-*v  -•&/*  6«*  £'<>+..  4e"-*+j /H A>
                                          «I^_^MM^M«*iM«MBMM-M^WM«a^B*^BH^«M«^B^B«Ha^HM«BB^^MI^HM^^^MM«M«^^aB*^"^^H">IH
                                                            '-'£'  -•»***+*-
10. What hardware components are  in  use

-------
                                       WATER ENFORCEMENT DATA HANDLING QUESTIONAIRE


     1.  Region HE  .  Person Interviewed  J?f£  A^'S _ . Phone
     2.  General Description of Region's system    b?~"  '•*>*+,' t>   $e-lv& '«"_<>   _ ?  Majors _ 1  Minors _ ?  Self-


     Monitor iny _ ?  EPA monitoring _ ?  State/local agency monitoring _ ?


     4.  Is municipal effluent data being stored   (fiO _ ?  Self -monitor ing _ ?  EPA


     monitoring _ ?  State/local monitoring _ ?                      ^

                                                                                     "
     5.  Is dnta from NPDES States being stored    U£>    ?  What states
                                    Self-monitor ing	?  State/local monltoring_
£,   EPA monitoring

0
                                                                .
    6. Is data  from non-NPDES States being stored   (AC> _ ?  Wat states
                                 _?  Self-monitor ing	?  State/local monitor ing_
    EPA monitoring_
    7.  Does system store and retrieve compliance schedule data
    8.  Xoes system store and retrieve receiving water quality data   HO


    9*  What kinds  of analyses are performed.
              r—	3-
    10. What hardware components are in use
        ,.,-.   A -,  f,..,.,  -./w/f^l-'.  /.. ' .'j'f'&JCf-' 7'-

-------
                                   WATER ENFORCEMENT DATA HANDLING QUBSTIONAIRE




                  Person Interviewed  ^
    1. Reg iep_
    2. General Description  of Region's system
                                                                                         VfcJ f-
3 . la industrial effluent data being stored




Monitoring __ ?   EPA monitoring
                                                   we?
                                                      _?  Majors
            ?  Minors
4.  Is municipal effluent data being stored




monitoring _ ?  State/local monitoring
                                                     ?  State/local agency monitoring^



                                                             ?  Se If -roon itor ing
    5.  Is dnt.i from Nl>DES States being  stored   V\D




    	? Self-monitoring_
                                                    ?  What states
_?  Self-




	?
                           ?  EPA
?  State/local monitoring
l   CPA monitor ing
Ou
                                                                        l mo

                                                                        tl^
    6.  la data from non-NPDCS States being stored
                                                          ?  Wat states
                                 _?  Self-monitoring_
                                                           ?  State/local monitoring^
    EPA monitoi:ing_
    7.   Doeu system store and retrieve compliance schedule data_
    U.   ':-oes system store and retrieve receiving water quality data




    9.   What kinds of analyses are performed  6*4£: $70*& /
 I ' C*^> £*\5
                                                     -4.if I*
10. What, hardware  components are in use
                                                         f ctrllv 12 /

-------
WATER ENFORCEMENT DATA HANDLING QUESTIONAIRE

    A     '0   t  fr
   /Jif«S'(r    Let**-1'
      1. Region       . Person Interviewed    if«S'r     et**-1' _ .  Phone
      2. General Description of Region's system ft-*£ - 7£*< :*"% »>^  c*««f '•**,* c  •'• L,,l,.f*t  f?ff''ss.i f^^
                    <>/ <•  r..                            .v-
      3.1s industrial effluent data being stored   />ft _ ?  Majors _ 7  Minors _ 7  Self-
      Monitoring _ ?  EPA monitoring _ ?  State/local  agency monitoring _ ?


      4.  Is municipal effluent data being stored  ./"*£* _ ?  Self-monitoring _ ?  EPA


      monitoring _ ?  State/local monitoring _ ?
      5.  Is data from NPDES  States being stored   h&    ?  what sttes   itjts L*t.e  /Jfff&S /•fj*"'",, ^  /

              jf/U
                   jfr-J^ i,a.'t > jf < A**-*- ? Self-monitor ing	?  State/local  monitoring

F
J5     EPA monitoring	?


      6.  Is data from non-NPDES  States being stored            ?  Wat  states	
                                   _?  Self-monitor ing	?  State/local  monitoring_
     EPA monitoring_
      7.   Does system store and  retrieve compliance schedule data
     8.   .Ooes system store and  retrieve receiving water quality data _

                                                  ** /       *             f        f\ff/    \   fi r    *
     9.   What kinds of analyses are  performed  CASisc&*v&• */  ^IXK^  ^t^-^^    {stri**e+~>T  ff&r*-  •**.
     10. What hardware components  are in use^

-------
                                   WATER ENFORCEMENT DATA HANDLING QUEST ION A I RE
 1.  Region"VIZ  .  Person Interviewed  LJle**,  />i-*l~, _ . Phone "72. ^ ""
 2. General  Description of Region's system  fs*elc ' */»•*«/ /-tf«~«*-  £tt+&/*i m  t»^^*-^  *r**t>S*
                                    5
3.1s  induiiLrial effluent data being stored   V\t> _ ?  Majors _ ?  Minors _ ?  Self-
Monitor Jmj _ ?  EPA monitoring _ ?  State/local agency monitoring^ _ ?
4. Is municipal effluent data being stored   (/\&        7  Self-monitoring _ ?  EPA
monitoring _ ?  State/local monitoring _ ?
5. Is data from NPDES States being stored   Mf>     ?  What states
                              ? Self -monitor ing            ?  State/local monitoring
EPA monitoring_
6. Is data  from non-NPDES States being stored   HO	? ' Wat states^'   '   *^<
	?  Self-monitor ing	?  State/local monitoring   /X
EPA monitoring_
7.  Does system store  and retrieve compliance schedule data
Q.  Ooes system store  and retrieve receiving water quality data
0.  What kinds of analyses are  performed    <*£>*» ^	
10. What hardware  components are in use_

-------
                                  cln-h*.       1ft
 1.  negion      •  Person Interviewed 6*'        ?                     . phono
                                             ENFORCEMENT DATA HANDLING QUESTIONAinB


                                           '^ Ah I
     2.  General Description o£ Region's sysbem      I £S   ^ -//ticuT^y  ci»  }*ve(g> f-~>~y
                                                      4-
     3.1s  industrial effluent data being stored . ytb	?  Majors       7  Self-


     Monitoring	i/tt?     7  EPA monitoring ,» M    7  State/local agency monitoring	7


     4.  IS municipal effluent data being stored   U/-S ^"    7  Self-monitoring   ^^     7  EPA


     monitoring  	7  State/local monitoring	7
    5. Is data  frort ttPDEB States bein^ stored	1  What states
                                 _? Self-monitoring	7  State/local monitoring  £7. A*;*,
    EPA monltoting
2                                                                            .
    6. is data  from hon-tfPDBS States being stored _ 7  Wat states1^ "
                                 _7  Self-monitoring	7  State/local monitorlng_
    EPA monitoring_
     *  boed systeiii  storfl  and retrieve compliance schedule data
        toes system  store  and retrieve receiving water quality data   V\O
9.  What kinds bf analyses  are  performed "&$ -  \^.t,{4^^    5>TPKt«=T    /isK~H5 f  .



                                oT
                                                                                                    7
    10. What hardware components  are  in use     (tf***   will   (_QfiF*i?(

-------

                                    WATER ENFORCEMENT DATA HANDLING QUEST ION A I RE



                                       K.?
 1.  HegAonTTTTT' . Person  Interviewed  K.?y*f   'e.i+t' _ .  Phone  33 7 -
 2.  General Description of Region's system   '//«•**. »-w»~(    k-t>n ^   -»        A^ / —
 3.1s  industrial effluent data being stored   *"£ _ ?  Majors f»^-*   ?  Minors M      ?  Self-
MoniLr-rintj /j4-£ _ ?  EPA monitoring p*'*   __?  State/local agency monitoring  ti P
4.  la municipal effluent data being stored  U/*s _ ?  Bel f -mon itor ing pt'- > _ ?  EPA



luoiiiborinr; __H>£4 _ 7  State/local monitor ing  *O     1               ^.

             7                               ut$                    CD
ti.  Js .  la datn  from non-NPDES States being  stored **&	?  Wat states
                              _?  Self-monitoring i**>	?  State/local monitor ing    *»p
EPA
                                                                                         I      A

7.  lioea  uyateiu sLore and retrieve compliance schedule data y**» —  lJ--^t P*"-y /v»~*T**'••( fa



0.  «:-oea  ayaLem store and retrieve receiving water quality data .M^ — f^  yiPffrf  - 0" *
'J.  Wliak kinds  of analyses are performed^
                                                                   &-L.  M
 &--f  /^tf-*-."-*-^  £^-\*.Sl>  « ff-(

10. what hardware compononta are in use
                                                            »fi»  0

-------
                                       WATER ENFORCEMENT DATA HANDLING QUESTIONAIRE

    1. RegionJL£-_. Person Interviewed  M-/\f  /K^cV- _ . Phone *&C?
    2. General Description  of Region's system  /"uf-" — ~    f^* &.•'••<    / 1  *:> -^ */<"••  !•-
    3.1s industrial effluent  data being stored   V& _ ?  Majors   **'?  7  Minors _ ?  Self-

    Monitor ing____*J£ _ ?   EPA monitoring  t*-i"* _____ 7  State/local agency monitoring __ ?

    4. Is municipal effluent  data being stored /ll?_ _ ?  Self -monitor ing   H.& _ ?  EPA

    monitoring    /ig» _ ? State/local monitoring _ ?
    5.  Is dntn from NPDES States being stored  tft*-f _ ?  What states
r   _ ?  Self -monitor ing    b\c* _ ?  State/local monitorlng_
OD   — — — ^— — — — — — ^-^— — ^— —                 __^___— __                         _
O\
    EPA monitoring
    6.  Is data from non-NPDES States being stored      _ ? 'Wat states
                                  ?   Self-monitoring    ^^	?  State/local monitoring_
    EPA monitoring    AC7	?

    7.   Does system store and retrieve  compliance schedule data   IJS1> - WT nc> t'H*e'V*li &>•/  t^.*f-

    8.   Ooes system store and retrieve  receiving water quality data

    9.   What kinds of analyses  are  performed
10. What hardware  components are in use
                                             '?  ^'^•h£^^-l  ft.vir' **•'*•
                                             ft

-------
 1.  Region
                                   WATER  ENFORCEMENT DATA HANDLING QUESTIONAIRE> -




                . Person Interviewed  Jc*^£" f>i k OS  *L '.'           . Phone T?^ *9 - / 2 c.,

 2.  General Description of Region 'a^sy at em5

                              '
                                                        .//, s.'j «,«.,. '/xi'«

                                          '
3. la industrial effluent data being stored  n <* <*    ?  Majors

                                             y
                                              A*'"'
                                                                           Minor s_£f^
Monit-jring
                       ?  EPA monitoring
                                                  ?  State/local agency monitoring




4.  Is  municipal effluent data being stored  ty £* _ ?  So It-monitor ing  tj£ *
                                             y                                /

                  Of

monitor JIKJ _ ^ t? •*   ?  State/local monitoring           ?




5.  la  dntn from NPDES States being store




__ ? Self-monitoring
                                                                                             Self-




                                                                                                 7
                                                                                       ?  EPA
EPA monitoring_
6. Is data  from non-NPDES States being
                                                            ?  State/local monitoring
                                                            ?  State/local monitoring
7.  nooj, oy»Lom storo and retrieve compliance schedule data
0.  ':f-'/l

-------
                  UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
                                 Region III - 6th & Walnut Sts.
                                    Philadelphia, Pa. 19106
 SUBJECT:  Region  III Evaluation of UPGRADE
 FROM.
 TO:
                                                              DATE:
Dee Ortner
Information Systems Branch, 3MA20

Dr. Lance Wallace
Office of Monitoring and Technical Support, RD-680
          The approach taken  in our evaluation  focused on  the  comparison  of
          UPGRADE with two graphics packages  presently supported  by  the Agency.
          Experience of our staff  in  the use  and  capabilities  of  graphics
          software and output devices was extremely  limited  prior to this
          analysis and thus results reflect an  unbiased assessment.

          Evaluation results  are structured in  accordance  with Enclosure  "B"
          of your memorandum  dated February 17, 1978.  Additionally, I have
          been  in contact with Mr. Joe Higgins  of VITRO Laboratories to provide
          him with as much data as available  from our analysis.

          If Region III may be of  further assistance do not  hesitate to contact
          me or either of the evaluators indicated on the  following  pages.

          Attachments
EPA-III.013.73.T
                                            D-88

-------
                REGION III EVALUATION OF UPGRADE
Identification of the Evaluation

   Principal Investigator for the evaluation was
       Maria Gilbert      597-9769 (FTS)
       Information Systems Branch, 3MA20
       Management Division
       EPA - Region III
   Ms. Gilbert was assisted by
       Steven Belczyk      597-9964 (FTS)
       Management Division
       EPA, Region III

Evaluator's Function as it Relates to UPGRADE
   Both Ms. Gilbert and Mr. Belczyk perform systems analyses and provide
programming assistance as support to Regional personnel.  Presently the
evaluators are investigating the use of UPGRADE as a means of supporting
the Region's efforts in developing Environmental Profiles and in dis-
playing data for public information/awareness.  Other probable applications
of UPGRADE would be in the areas of 208 planning and the drinking water
and new source programs where public health effects are an integral part
of impact assessment.

Evaluator's Experience Using UPGRADE
   Neither evaluator was familiar with UPGRADE prior to the evaluation.
Ms. Gilbert had overall responsibility for reviewing the documentation
and directing Mr. Belczyk in system use.  Time spent in documentation
review approached 50 hours and time on the system totaled 10 hours.  Both
evaluators found UPGRADE easy to use once data was stored.  However more
detailed documentation with examples for both the general user's guide
and the GLIDE manual would be helpful.

Goals Set for UPGRADE
   The approach taken in the evaluation focused on the comparison*of
UPGRADE with two graphics packages presently supported by the Agency:
   CALCOMP (California Computer Products)
   IGP - Interactive Graphing Package (TEKTRONIX, Inc).
Criteria by which all systems were evaluated are shown in Figure 1.  The
intent of the evaluation was to rate each system independently and not in
relation to either or both of the other systems.
                                 D-89

-------
                            Figure 1
                       Evaluation Criteria


1.  Input
    a.  Ability to read data from tape and disk
    b.  Ability to input interactively
    c.  Ability to easily modify data
    d.  Ability to store data for later use
    e.  Ability to use all types of data, STORE!,  AEROS, etc
    f.  Ease of Data preparation and entry

2.  Output
    a.  Types of graphs; barcharts, linegraphs, 3D plots, isovariate plots,  maps
    b.  Display features
    c.  Devices: calcomp, textronics, line printer
    d.  Ability to save output
    e.  Ease of output structuring
    f.  Overall aesthetic appeal

3.  Special Features
    a.  Statistical capability
    b.  Ability to interface with EPA Headquarters and Regional
        software
    c.  Transferability between NCC and WCC
    d.  Ease of operation for non-ADP user

4.  Quality of Installation
    a.  Completeness of documentation
    b.  Performance

5.  Costs of Test Graph
    a.  Sign-on
        -time
        -cost
    b.  Execution cost
    c.  Programming time
                                  D-90

-------
Performance
   Each of the systems was rated according to points allocated for
each of the following five categories of evaluation criteria:
       CATEGORY                                     POINT SCORE RANGE
       Input                                              0-5
       Output -                                           0-5
       Special Features                                   0-5
       Quality of Installation                            0-5
       Costs of Test Graph                                0-3

*range goes from low (0) to high (5) performance rating

Performance rating for each system is the summation of category scores.
Results of UPGRADE'S performance and that for CALCOMP, and IGP are shown
in Table 1.

Recommendations
   From a Regional perspective it would be difficult to estimate
the amount of use UPGRADE would receive or to anticipate the necessary
level of user's support without first making the following modifications
to the system:
   -include the capability for analyzing functions and the plotting of
more than one variable (Y-axis) per graph;
   -include the capability for generating 3D and contour graphs;
   -include the capability to interface with Headquarters/Regional
software (extend STORET, MSIS, PCS);
   -expand data modification capabilities;
   -enhance the "data save" capabilities;
   -prepare more detailed documentation on system use and the update
of user manuals.

   Presently the Region has access to a great number of graphics packages.
The extent to which we would use UPGRADE, consequently, would be dependent
on the number of modifications or enhancements made to the existing
system.  The Agency should consider this perspective before authorizing
the "sponsorship" or "co-sponsorship" of the UPGRADE system.
                                     D-91

-------
                                                                                                               page 1 of 4
    Evaluation
     Criteria
             UPGRADE
             CALCOMP
               IGP
INPUT:

- Ability to read
  data from tape
  and disk

- Ability to
  input inter-
  actively
  Ability to
  easily modify
  data
  Ability to
  store data for
  later use
  Ability to use
  all  types of
  data;  STORE!,
  AEROS, etc
  G
  \o
  ro
SCORE:  3

Data must be stored in one of the
UPGRADE interfaces for accessing.
Up to 50 variables of a user's
data can be input*interactively
for analysis.
Modification before graphing and
between successive graphs is
limited to selecting a range of
data values and/or eliminating
specified data points.
Data entered interactively can be
stored; data selected from UPGRADE
interfaces cannot be stored, the
selection process must be re-
peated for subsequent analysis.

Any data can be stored.
SCORE:  4

Data can be read from tape and
disk.
CALCOMP is not an interactive
system; however, conversational
jobs can be run for which data is
inputted interactively.

Input data can be modified, either
by editing the data or modifying
the program; modifying data
becomes difficult if a data set
cannot be accessed over the
terminal.

Any input data can be stored.
Any data can be stored.
SCORE:  2

This capability does not exist.



All data is input interactively.
Data can be inserted, deleted and
changed before and after each
graph is produced.
This capability does not exist;
data values are lost after
terminal session is ended.
No data can be used unless
entered interactively.

-------
                                                                                                               page 2 of 4
    Evaluation
     Criteria
             UPGRADE
             CALCOMP
                IGP
- Ease of data
  preparation and
  entry
OUTPUT:

- Types of graphs;
  barcharts, line
  graphs, 30
  plots, iso-
  variate plots,
  maps

- Display
  features
  Devices:  CAL-
  COMP, tektronix,
  line printer

  Ability to save
  output
Data inputted  interactively  is
easily entered.   Stored data is
prepared very  easily  for analysis.
SAROAD data  is entered by  CEQ
through user request; STORET data
must be transferred to tape  and
mailed to CEQ  with a  copy  of the
data layout.

SCORE:  3

3D and contour graphs cannot be
produced.  Only one y-axis
variable can be plotted per  graph.
Functions cannot  be plotted.
Only one graph can be displayed on
the screen.  Text cannot be shown
within the graph.

Plots can be produced on tektronix,
line printers, CALCOMP printers
and microfiche.

-Could not be determined-
Some data may be preprocessed
before use.  For STORET, a'
retrieval would first be run then
accessed by CALCOMP.  For some
Regional (other user) data, it Is
feasible to write extraction
programs to feed data to CALCOMP.
SCORE:  4

Any type of graph exclusive of maps
can be produced.
Data is easily entered.
All display features determined as
Regional needs are available.
Plots can be produced on tektronix
and CALCOMP printers, but not a
line printer.

Output can be stored on tape for
plotting on the CALCOMP printer.
Output cannot be stored for plot-
ting on the tektronix.
SCORE:  3

3D, coutour, maps or curve fitting
graphics cannot be produced.   Bar-
charts without variable shading
can be graphed.
All display features needed for
Regional use are available.
Tektronix printers only can be
used.
Up to ten (10) data sets con-
taining the instructions for
graphics display can be stored.

-------
                                                                                                               page 3 of 4
    Evaluation
     Criteria
             UPGRADE
             CALCOMP
                     IGP
  Ease of output
  structuring
- Overall
  aesthetic
  appeal

SPECIAL  FEATURES:

- Statistical
  capability
  Ability to
  interface with
  EPA Head-
  quarters and
  Regional soft-
  ware

  Ease of opera-
  tion for non-
  ADP user
,VO
Display features cannot be altered
without restarting the Data
Selection or Analysis section.
Graphics look acceptable for
public distribution (see Attach-
ment 1).

SCORE:  2

UPGRADE can perform basic sta-
tistics and can interface with the
Statistical Analysis System (SAS)
to perform more complicated
analyses.

No direct data access capability
is available, therefore data must
first be stored.  Updating of data
must be maintained.
Special screen displayed prompts
assist the user during terminal
sessions.  Aid of an ADP person
may be required for transferring
data to CEQ for storage in
UPGRADE.
Any structure is possible but some
display controls cannot be'changed
Graphics look professional  (see
Attachment 1).
SCORE:  3

No statistics are available.
Extraction programs or retrievals
of data may first be needed.
Knowledge of FORTRAN, EXEC 8,
and ALPHA is needed by user.
JCL
      An exhaustive  list  of display
      options  are  available.  Problems
      were encountered  using  some of
      these options.

      Quality  of graphics adequate for
      in-house use (see Attachment 1).
      SCORE:   2

      No statistics  are  available.
      Interface is  irrelevant as data
      only can  be inputted  inter-
      actively.
Easily operated by a  non-ADP
person, IGP has a "HELP"  command
to aid the user during  a  terminal
session.

-------
page 4 of 4
Evaluation
Criteria
QUALITY OF
INSTALLATION:
- Completeness of
documentation





- Performance


COSTS OF TEST
GRAPH (see
Attachment):
- Sign-on:
time
cost
- Execution
- Programming
time
TOTAL SCORES
CJ
i
NO
V.n
UPGRADE

SCORE: 3

The User's Overview and GLIDE
Manual did not provide sufficient
information to learn specifics
of UPGRADE. Screen instructions
(prompts) during terminal sessions
were good.


Problems encountered using some
display features, the neutral text
option and mapping.

SCORE: 3



35 minutes
$6.52
N/A
N/A

SCORE: 14




CALCOMP

SCORE: 4

Subroutines were well documented
and examples provided. Variables
used by each subroutine were
adequately explained. Instructions
for the 3D and contour plots were
difficult to understand. In-
structions for using CALCOMP at
NCC and WCC were good.
A problem with the THREED sub-
routine was encountered.


SCORE: 1



20 minutes
$9.00
$8.28
15 minutes

SCORE: 16




IGP

SCORE: 3

User's Guide was not available
at time of evaluation. More
detailed explanation and examples
are required.




HELP command did not function;
commands for storing/retrieving
data sets did not work. Some
problems were encountered with
axis range, bar size and tic marks.
SCORE: 2



27 minutes
$9.88
$1.59
N/A

SCORE: 12





-------
     UPGRADE
T
U
R
B
1
D
I
T
V
     15.90


     13.59


     12.00
      9.04


      7.5*
3.W


i.se
                                                                                                   0>
                                                                                                   o

                                                                                                   I"
                                                                                                   3
                                                                              i J. .v

-------
    CALCOMP
) 00   2.00
               YflKIMfl  RIVER BflS IN
                    RT CLELLUh  RlVf.n
                         1969
u.oq
6.00    8. UP
     MONTH
lu
1^.00
IM.UO    16.00

-------
                  IGP
                                                             •-
oo
         T
         U
         R
         B
         I
         D
         I
         T
         Y
              15-
             10
                                                              t
                                                              I
                                                              »»
                                                              •",

-------
UPGRADE USER EVALUATION
\SS£.fl
Region
III







USE
Correlate
water
turbidity
vs. time
(months)




DATA BASE CAPABILITIES
CATEGORY"
Custom data
set



Data save
capabilities
Data Extraction

AVAILABLE
Limited data
modification





Manual

NEED
Expanded






Automa-
tic
RANK
D




N

N

UPGRADE CAPABILITIES
CATEGORY
Plot






Display features

AVAILABLE









NEED
3D
Contour

fuiction
idd user
irmlysls
•ou tines.
nultlplal

Quick
exit
RANK
N
N

N

E

D

TRANSACTION
VOLUME









COMPARABLE
MANUAL
TIME










-------
                                                    TRANSACTION DATA
USER
(TASK)
Region III
(Correlate water
turbidity versus
time (months of
year) monthly
values)




Extract from DB
STORZT
X




SAROADS





NIH





IDE





Store
In
IDS
X




Analysis
& Terminal
PLOT
No analysis
1 plot




OFF-
LINE
PLOT
	




Number of
Terminal
Sessions &
Tine Per
Session
1 session
35 nln.




COST
Per
SESSION
6.52




COST
Per non-IDD
Extraction
	




Could
Not
Do .
	




MANUAL
TIME
Calcomp
tlpe-20mln
slgnon cost-
S9.00 term -
$8.28 exec
Prog=15 rain.
1CP=27 nln
slgnon $9.88
=1 . 59 exec
tek. plot
o

I-"
o
o

-------
        U.S.  ENVIRONMENTAL  PROTECTION  AGENCY

,«••*>,                        REGION  X
      *r                    120°  SIXTH AVENUE
       t              SEATTLE. WASHINGTON  98101
            3*5

    September  7,  1978

    Mr.  Lance  Wallace
    Environmental Protection Agency
    Office  of  Research & Development
    Washington,  D.C.   20460

    Dear Lance:

    As requested I have summarized Region 10 '3 evaluation of the UPGRADE
    system. Please bear in mind that our review may not be as comprehensive
    as desired since we were unable to devote more time with this task due to
    other priority commitments.   The following paragraphs contain our
    summarization .

    X.  Introduction

        1.   Identification of evaluator

            •Bruce Cleland & Shirley Towns
            EPA, Surveillance & Analysis Division
            M/S 345        (FTStf 399-1193 or -1106)

        2.    Brief description of ''valuator's function

              •Both ambient and source data for air and water.
              •Data uses varied from annual reports, one-time quick
              responses, and support to states.

     '•'•.  Description of Experience

         3-    Extent and nature of evaluator 's experience using UPGRADE

              •Two people and approximately 25 man-hours were spent.

                  a)   10 hours gaining familiarization with system from
                       documents and demonstrations

                  b)   15 hours using the system
                                      D-101

-------
    U.    Description  of tasks  or goals  that  the evaluator set for UPGRADE

         •Major needs of the' evaluator  centered on graphical and
         statistical  analyses  of only one data base (e.g. SAROAD or
         STORET,  etc.)  to generate  environmental profiles, to determine
         trends,  and  to define source/receptor relationships.

    5.    Evaluation of UPGRADE performance in meeting those objectives

         •Portions of UPGRADE  did provide good graphics display and
         statistical  analyses  with  respect to some of the more general
         purpose items (scattergrams, means  standard deviations, etc.).

         •Since the data's initial  input will be from the National
         Aerometric Data Bank  the most  current data may not be available.

         •Overall, UPGRADE, when needed, was too general to be specific
         for most of  the regions analytical  needs.  Simple "in-house"
         programs exist which  accomplish the same things UPGRADE provides
         at a considerably lower cost  to the user.  In addition, it is
         much easier  to "customize" these programs to satisfy specific
         needs at the regional level.

Ill.  Recommendations

    6.    Eva'luator's  recommendations

         •About 75$ of the time spent  using  the system was devoted to
         getting through the sometimes  confusing prompts confronting the
         user before  the desired analysis could be performed.  As a
         whole, it was felt that UPGRADE was too general purpose to be
         useful at this time.   With present  budget cuts and  increasing
         work loads,  the time needed to sit  down and run UPGRADE simply
         does not exist, particularly when "in house" programs give most
         of the same  outputs.

I hope this information wilL.be of some assistance to you in preparing
the final summary report.  If you have any other questions regarding this
summarization, please contact Shirley Towns.

Sincerely,
William B. Schmidt, Chief
Air Surveillance & Investigation Section
                                D-102

-------
  •DATE-


"IBJECT-
          UNITED STATES ENVIRONMENTAL PROTECTION AGENCY

April 11, 1978

UPGRADE System
  FROM.
    TO:
Ben Eusebiol'Chief
Surveillance Branch

Lance Wallace
EPA RD-680
401 M Street S.W.
Washington, D.C.  20460

On March 29 & 30 we discussed the possibilities of utilizing the UPGRADE
system for Region X's air data analysis.  In order for the region to
evaluate this system, we request that the following air monitoring
stations be included:
         Pollutant/Method

         TSP  11101 91
         0,  44201 11
         S02   42401 14
         CO   42101 11
                                  Station ID
All Region X Sites
38
38
38
38
49
49
49
49
49
49
13
13
13
13
13
13
13
02
02
02
02
02
49
49
49
49
49
49
1200
0560
1460
1580
0960
0980
1560
2140
2220
1840
0840
0840
0840
1420
1420
1420
0220
0040
0160
0160
0160
0160
1840
1840
1840
1840
2040
2040
001
008
002
Oil
001
010
002
001
007
058
002
Oil
012
021
026
001
007
013
002
012
013
014
051
059
062
063
012
013
F01
F01
F01
F01
101
F05
F01
F01
F01
101
F02
J02
JO 2
F02
JO 2
F02
F01
F01
G01
G01
F01
G01
F01
F01
F02
F02
F01
F01
Time Period

  69-77
  74-77
  74-77
  70-77
  73-77
  74-77
     77
  74-77
  74-77
  74-77
  76-77

  76-77
  75-77
  75-77
  75-77
  75-77
     77

  75-77
  74-77
  72-77
  73-77
  73-77
     77
  72-77*
  72-76*
  74-77
  74-76
  71-76
  76-77
         * A portion of the time frame for these sites already exists in UPGRADE.
 EPA FORM 1320-6 (REV. 3-76)
                                            D-103

-------
If you are unable Co retreive 1977 SAROAD data from NADB, please contact
Ray Nye from our Data Systems Branch (399-1580).  He should be able  to
assist you in retrieving this information.

As you are aware, we have used the 1-year, 2-year, or 3-year (depending
on the amount of available data) moving means to depict trends.  We  would
like to see UPGRADE set up a program that would select the appropriate
trend and automatically produce the chart(s) and/or graphs(s) with the
1-year, 2-year, 3-year or x-year moving means.  The enclosed data chart
shows a calculated 3-year moving means from one of our particulate
sites.  Please note that the quarterly geometric means are used in
determining the moving means as opposed to the monthly arithmetic means
used for other pollutants (continuous data).

Also, when analyzing data for profiles, we projected the estimated number
of days in violation for sites with less than 365 monitoring days by
using the following formula:
                                       (365)
                where:  r = number of actual violation days
                        n = number of actual sampled days

                (Reference Guideline:  A Mathematical Model for Relating
                 Air Quality Measurements to Air Quality Standards AP-89-
                 Ralph I. Larsen, Ph.D.)

We would like this calculation to be provided as an option.

Some of the statistics most often used in Region X are not currently
summarized in the UPGRADE system.  Would it be possible to program
additional summary reports?  Basically, Region X's summary reports need
to have the following conditions:

    1.  Flexibility to retrieve the data by various parameters such as:
        geographical boundaries (Region, State, AQCR, county, city),
        number of violations, and second max, etc.

    2.  Conformity of units to standards:
             Micrograms per cubic meter
             Milligrams per cubic meter (carbon monoxide)

    3.  One averaging period per report:
            1-hr
            3-hr
            8-hr
           24-hr

-------
    4.  Averages for the first hours of a month should include values
        for the last hours of the previous month.

    5.  Excursions are counted on the day of the ending hour period in
        which the violation occurs.

    6.  Compute, but flag data which does not meet NADB reporting
        criteria.

    7.  The Monthly Summary Report should be updated each quarter as a
        continuous report for one calendar year.  On completion of up-
        dating the fourth quarter data, the data should be computed to
        give annual totals on the Monthly Summary Report.  These totals
        will also be used to comprise the Annual Summary Report.

    8.  An option to retrieve data for one or several calendar years
        should be available from the Annual Summary Report.

The outlines for the summary reports and suggested printout format are
attached.

If you have any questions or need further clarification, please contact
Shirley Towns (399-1106) of my staff.

Enclosures

cc:  Ray Nye
                              D-105

-------
TRENDS El TSP  OF  AQCR 062              3/77
CLARKSTO:.' — CITY HALL
ASOTIN COMITY  —  4903R0001 F01
DATA INPUT  FOR 3  YEAR Rir^JIIIG AVERAGE
MEASURED HI IIICROGRAMS PER CIIBIG METER

YEAR  QTR   -JOBS  GEO MEAN  RUN MEAN  1 YEAR  2:iD HIGH

 1971
                                                  210
1
2
3
4


3
19


117.64
103.26


0.00
0.00


0.00
0.00
 1972
1
2
3
4
20
21
13
20
115.13
90.01
108.26
68.54
0.00
0.00
0.00
0.00
0.00
104.22
104.41
92.32
                                                  177
 1973
 1974
 1975
 1976
1
2
3
4
2
12
1
3
174.35
83.91
147.00
32.14
0.00
0.00
0.00
0.00
87. 28
85.22
79.20
80.01
1
2
3
4
13
10
8
8
92.04
89.65
119. fil
116.54
0.00
92.89
93.76
93.05
80.74
82.51
88.56
101.29
1
2
3
4
14
13
15
13
85.02
126.84
130.63
78.62
89.03
92.42
94. SQ
98.39
98.28
108.45
112.87
103.14
1
2
3
4
17
15
14
13
101.97
104.18
127.97
115.60
97. 9R
100.10
102.26
105.70
107.70
103. 00
102.18
111.24
                                                  204
                                                  276
                                                  224
                                                  219
                                             D-106

-------
                                              T3P 7 KEND3
  U
  G
  X
o n
        160 -T-
        IB0 --
100 -L-
                                     CLARK3TQN  --  CITV  HAUL

                               A3OTIN  COUNTY	4-30380001 F01
                -I—I—I—f—f
1371
1372
                                     1373
                                                          1374-
137B

-------
                MONTHLY SUMMARY REPORT FOR CONTINUOUS DATA

                                                     (i.e. CO, 0  )
SITE INFORMATION

1.  Pollutant: Name & Code (include Method Code)
2.  Calendar Year
3.  Site Identification
         AQCR Number
         County Name & Code
         Site Code Number
         Site Name & Address (include city)

OBSERVATIONS

1.  Number of samples observed for each month
2.  Number of days sampled for each month
       Minimum of 18 hours of a 24 hour period  (within  one
       calendar day) constitutes a valid day (includes  zeros)
3.  Maximum Value for each month
       Nonoverlapping
       Includes midnight
       Reported to nearst tenth
4.  Second Maximum Value for each month
       Nonoverlapping
       Includes midnight
       Reported to nearest tenth
EXCURSION OF STANDARDS

1.  Number of actual excursions (non-overlapping, including midnight)  for
           each month
2.  Number of days exceeding standard (non-overlapping,  including
           midnight for each month.
3.  Number of days exceed secondary but less than primary  standard
4.  Number of days exceed primary but less than alert  level for
           each month
5.  Number of days exceed alert level for each month

MEANS AND STANDARD DEVIATIONS

1.  Arithmethic Mean for each month
            Report to nearest tenth
2.  Arithmetic Standard Deviation for each month
            Report to nearest hundredth
3.  Geometric Mean to each month
            Report to nearest tenth
4.  Geometric Standard Deviation for each month
            Report to nearest hundredth
                                  D-108

-------
              MONTHLY SUMMARY REPORT FOR NON-CONTINUOUS DATA

                                                     (i.e. TSP)

SITE INFORMATION

Same as in Continuous Data Report

OBSERVATIONS

1.  Number of days sampled for each month
2.  Maximum Value for each month
         Report to nearest whole integer
3.  Second Maximum Value for each month
         Report to nearest whole integer

EXCURSION OF STANDARDS

1.  Number of days exceeding standards for each month
2.  Number greater than secondary standard but less that primary standard
3.  Number greater than primary standard but less than alert level
4.  Number greater than alert level

MEANS & STANDARD DEVIATIONS

1.  Quarterly Arithmetic Mean — report to nearest tenth
2.  Quarterly Arithmetic Standard Deviation — report to nearest hundredth
3.  Quarterly Geometric Mean — report to nearest tenth
4.  Quarterly Geometric Standard Deviation — report to nearest hundredth
                                 D-109

-------
                                                MONTHLY SUMMARY REPORT FOR CARBON MONOXTDB--8IIOUR       Calendar Yrar 11 Id       !>tate:  ureRon
  AQCR.  193  County!  rluUnomh Site fade:  38 1460 075 101   Sice Address:  Hollywood Arcade, Sandy  Blvd.. Portland   Method; NDIR   Site Typei  NA»)TS
                                 1UMBI R OF
              f
 AQCR: ( 193   Countyi  Hultnoniih
.Site Code:   J8K6flOJ5Kol      JAH '
[Site Aildrein:  llnllywood ArcndH'R',
<;     ',        Snnily Hid      HARi'
'	"	I'ortlnnd	—'I
:Melhod-  MDIR               Al'R 1|
;Slt« Type:   NAQTS            M\Y
 I                           JUN
                             JUL
                             AUC
                            " SI I'
•i
:r
 i


6S7.

7*7

737
               Of  ] PrHCrS'T OS-}  MAXIMUM
      11-VALID DAYS n MTA .MI'.CUNO --VAJ.UE'
      i|   SAftmD  || ,-CUTERLA Ii
                             OCT
                             SOV
                             nrr
                              If
                                                      II
                                               J/
                                               30
                                                                          I
                                                                              SECnin
                                                                                         MJMRl'R
                                                                                                   HUMnrR OF
                                                                                             ,_-._-d.. DA»S ---.
                                                                               VALUE
                                                                  j   7f
                                                                  L-/P.7-
                                                      i;  -1--
                                                   ._.!'?/*_
                                                                   M.L..    \JJ£     j—^Z-
                                                                 ..//?.._LL/i«_J_  jy__  _
                                                                      i                i            i
                                                             .    ^
                                                                 !',..,
                                                                                        KXCURSIOHS
UO
7.1
.	  j_	;/
1          P
15
             y
             /
             0	
                                                                                       II
                                                                                       !i
          d
          'd

          o
          ia
         /i*
       _^_
                             jl  SEVERITY  i  ARITIWITIC  CI-OHI'TRIC
                             IOF! I.XCURSIONG ,   -, =    i. ,   ,
                             'Pel.    Alert1'Hrin  Scd Dov Mran.Scd D<»
                                                                                                  «.'
                                  ; f+s-t-
                                  ! i1/xx
                                   *XAX
                                   IOJ3
                                   itJrt
                                  j M/IXf
                                  I //x/
                                                                      !Foc Cc^ot,'r\B^OXid«.-^.^  HOL*R. .
                                            i   '      !  j    ' J   ;'"i    1     -1 '    ;      ' ''    '     !l '    '
                                            ,frt.- j-i  *rt.	U.»,x	

                                            ! >"x    |  -y/.j
                                            I mt    ,:: t^
                                             33^    *! w    ij
                                /

-------
                                                    RtPofvr  OF   PrxiTic.yi.flTe  OAT/S  Fop,   C.V
    «••=-- rs. -  .-i— i
                              Number o(  i

                               «y» , ••
    i1

193  1.IIIN
• ]« 00:0 002 rni
 i

      Councv Cburtboit^r

      Hmrborn Si.
 I!    Albuny


 'j        Ill-Vol
            -VO4          .•



                SLAMS
                                  £
                               	j	:_
                                                HaKinum

                                                Ji-hr V»luc
                                                       "5


                                                       5/.
                                               1
                                               -ir
Second"""* I] Nuitber ol~ l"! Murobor of'F.licuriionii DyTsavcrlty    f  Arithmetic '  r.m<^

Highlit.—.--|l E«cur«lon_ - -.  n  II   111 |i - |  — .-u- -  t   —-. tftfn  Std Diy Mrc»  y^ F*v
                                                     -210-
                                                       1      il
                                  •      i

                                  I      !
                                                                            o
                                                                            '
          :; i   f>
    ).°°    \   o
                                                               ^&—
                                                                         I
                                                                   Jj.
                                                                    !• i
                                                                    i- i
                                                                    i *
                        ,  o
                       -!>-
o
!
-------
UPGRADE USER EVALUATION
USER
Region
V

























USL
Air Quality
data for
Trend
Analysis and
Summary
Reporting





















DATA BASE CAPAHIMllES
CATEGORY
SAROAD
Interface

























AVAII.AH1.L
Manual
Interface

























NI'.l.D
Auto
Current
Data
























HANK
N
E

























UPGRADE CAPAUII.111LS
CATEGORY
Data Extraction by
User specified
Parameters


Standardized
Units
Statistic aummary
Routines
Various Mean and
Standard Deviation
Analysis Routines
Extraction from more
than one data set and
combining data sets
Ability to summarize
data

One averaging period
per report


Meaningful size data
sets

Compute and Flag
Data not meeting
report criteria
Varying parameters
for plotting
AVAI1.A1ILL
Extraction
levels: 1. SA-
ROAD programs
2. Data filter
Ing
Whatever units
are used In
data base
Partitioning
statistics, SAS
Partitioning
statistics.
SAS
CLIDE. IDB
techniques

Partitioning
statistics

Dependent on
data available


3.9H bytes per
data set

Filtering


Any pair of
parameters nay
be plotted
Nlll)
Addition-
al such
aa 2nd
max

User con-
trolled
variable
transfor-
mations
	
loving
neans

More di-
rect user
control
I/O for
atorlng&
reusing
Addition-
al avera-
ging tran
s forms
nultlyear
ilr data
Idd
flagging

	
RANK
E




N


E





E


E



E

N



TRANSACTION
VOI.IIMT



























COWFAHAuLli
MANUAL
TIM




























-------
TRANSACTION DATA
USER
(TASK)
CEQ user
Support Group
(Teat-Average
Session)
Extract from DB
STORET

SAROADS

NIK

IDS
X
Score
In
IDB
400 data
points
Analysis
& Terminal
PLOT
65AS analy-
sis 12 Std
Tables 20
Plots
OFF-
LINE
PLOT

Number of
Terminal
Sessions &
Time Per
Session
1 Session
35 oln.
COST
Per
SESSION
S25.52
+ 2.00
paper
COST
Per non-IDD
Extraction

Could
Mot
Do


MANUAL
TIME



-------
APPENDIX E

-------
            APPENDIX E




CONVERTING UPGRADE TO OTHER SYSTEMS

-------
                                   APPENDIX E

                       CONVERTING UPGRADE TO OTHER SYSTEMS

 I.   Introduction

     Portability, transferability and convertability of computer software from
like-to-like and like-to-unlike computers remains one of the thornier problems
of system development and use.  When this is combined, as is the case with UP-
GRADE, with a system developed on a specific computer configuration for a
single user whose objective was the rapid demonstration of the basic capabili-
ties of the system, problems for both conversion and software configuration
control are to be expected.  UPGRADE has been developed piecemeal to meet the
original development goals; it has been further developed to an initial level
of a controlled, production-oriented software package, as evidenced by its
successful portability to like-configured computers.

     This Appendix discusses the technical factors involved in the conversion
to unlike computers and the transfer to similiar computers with differing
utility software (especially operating systems).  The nature of UPGRADE, its
software environment, and the technical characteristics of the target computers
are discussed.  These considerations form the basis for the estimates for
conversion in the feasibility study document.

II.   Discussion

     UPGRADE has been co-evolving with the NIH-DCRT computer installation since
early 1975.  As DCRT made larger TSO regions available and, in a one-year con-
version process, shifted from MVT to MVS operation, UPGRADE has been changing
and growing to use these and other additional resources.

     This discussion will focus on those aspects of the current version of
UPGRADE which would present problems in the process of transferring UPGRADE to
another computer center.  Of course, no one will have a complete list of con-
version complications and problems until after any given conversion is com-
pleted.  The discussion is broken into four parts to correspond to four levels
of potential differences between NIH-DCRT and the target computer center.
These four levels are computer type, operating system, interactive system, and
specific installation.  For example, UPGRADE currently operates on IBM-370/MVS/
TSO/ NIH-DCRT.  The comparable description of COMNET's EPA system would be
IBM-370/MVT/ALPHA/COMNET-EPA.

     The use of IBM computers from the very start of UPGRADE development to-
gether with the demands of users for a more powerful and flexible system has
resulted in the use of certain programs, programming languages, and techniques
which are unique to IBM computers.  These would present problems in trans-
porting UPGRADE to any non-IBM system, e.g., the Univac at Research Triangle
Park  (RTF), North Carolina.

     The most basic aspect of this area is the computer architecture as it has
affected UPGRADE architecture.  IBM computers use a four byte, thirty-two bit
word as the basic unit of  information for data program instruction storage.
Other computers use a different scheme for encoding data and program instruc-
tions, employ differing methods of input and output of data between storage
                                    E-l

-------
media and the computer core storage,  and differ from IBM computers in other
underlying ways.  To the extent that  any of these differences have become in-
corporated in the structure of UPGRADE and the way that it handles data, changes
will have to be made to adapt UPGRADE to a different make of computer.

     Some more specific areas are currently identifiable where a move to a
non-IBM machine would require changes.  These include the SAS subsystem, Sort/
Merge Program, Assembly language, and FORTRAN extensions.

     The Statistical Analysis System  (SAS) is sold and maintained by the SAS
Institute.  They currently maintain and support SAS only for use on IBM com-
puters and have no plans to expand this coverage to other types of computers.
Some version of SAS has been made available on the RTP Univac; this would make
conversion problems to that computer  slightly less than for any other computer.
The source coding for SAS is approximately 35% IBM Assembly language, 60% PL/I,
and 5% FORTRAN.

     The IBM Sort/Merge program is used in several parts of UPGRADE for sorting
of data.  Since this is a proprietary IBM product, a move to any other computer
would require modification of UPGRADE to use the available sort program.  Most
computer manufacturers supply such software; however, if none were available,
one would have to be written to meet  the sorting requirements in UPGRADE.

     IBM Assembly language is currently used for about 5% to 10% of UPGRADE
coding (about 1000 lines).  The functions performed by this code include ter-
minal input/output, SAS subtasking, linkage to IBM Sort/Merge, manipulation of
the "help" libraries (which are stored in a partitioned data set) , dynamic
allocation, RHB and IRB routines.  In a conversion,  all of these subroutines
would have to be rewritten in the appropriate language of the target computer.
This could be a major problem area, since, at this level of programming, source
code statements correspond to specific machine dependent instruction set opera-
tions.

     The  remaining area of computer  type dependence again refers to differences
in the software available on the target machine as compared to the IBM software.
The IBM FORTRAN Gl compiler has been  used in UPGRADE development from the very
start.  As a result, there has been a tendency to use all features of the IBM
compiler rather than restricting the  programmers to the contents of ANS FORTRAN.
Some of the IBM features will also be available in other compilers; other fea-
tures will not be available.  A specific determination could be made for any
given target compiler.

     Some features would be relatively easy to change to ANS FORTRAN coding;
for example, replacement of literals  delimited by apostrophes with the ANS
character count format would be a simple task.   Conversion of certain other
features (such as direct access files, END= in READ statements, etc.) would
require varying amounts of reprogramming or even restructuring of the UPGRADE
system.

     The second general area of potential conversion problems concerns the
operating system of a potential target IBM 360/370 computer (any other model of
large IBM computer is now very rare and could be considered to present similar
problems as going to a different manufacturer's computer).   Other operating
                                     E-2

-------
systems in use on IBM 360/370 computers include MVT and MFT.  Any of the areas
discussed in this section would also be of concern in any move to a non-IBM
computer.

     Perhaps the most important problem would be the overall size of UPGRADE.
Currently the UPGRADE system as run on the NIH-DCRT MVS system requires approxi-
mately one-half million bytes of core storage.  On the MVS operating system,
some of this is actual core, some is "virtual," i.e., residing on a direct
access device (typically a magnetic drum storage unit).  On an MVT or MFT
system, all 500,000 bytes would have to be actual core.  At most such installa-
tions, this size region of core is either never available or available only
late at night or weekends.  Also, costs of using such a large region on MVT or
MFT would usually be high.  UPGRADE might have to be restructured to fit into a
smaller core size.  The costs and difficulty of this reprogramming would be
dependent on the reduction in core size required.

     Subtasking is currently used in UPGRADE for the operation of the SAS rou-
tines.  Since subtasking in the same manner is not available in MVT or MFT,
changes in UPGRADE would be required.  Two other areas that would require
changes are the DAIR (Dynamic Allocation Interface Routine) used to allow
changes in the dataset - I/O channel assignments during a single UPGRADE session,
and certain TSO MACRO'S which are somewhat different in MVT or MFT.

     The third general area to be considered is the interactive or timesharing
system to be used.  UPGRADE now uses TSO.  A change to a different processor
(e.g., ALPHA on EPA's COMNET IBM system) would require some reprogramming;
again, a change to a non-IBM computer would probably require even more changes.

     The attached pages list in some detail those parts of the current version
of UPGRADE which may cause problems in moving UPGRADE to some installation
other than NIH/DCRT.  The list is based on my analysis of UPGRADE program
listing, plus several discussions with Sigma Data.  Of course, no one will have
a complete list of conversion complications until after any given conversion is
completed.

     The list is broken into four sections to separate, as far as possible,
those parts specific to:

        •  MVS

        •  TSO

        •  IBM

        •  NIH/DCRT

     This makes  it easier to see the potential complications of any given con-
version.
                                    E-3

-------
                        Current parts of UPGRADE
                         which are specific to MVS*

 •  OVERALL SIZE - approx.  500K - this much core is not usually available
                     except on a virtual memory system.

 •  ISO MACROS - certain ones (e.g.,  TGET & TPUT) are said to be different
                     under MVS.

 •  DAIR - Dynamic Allocation Interface Routine

 •  SUBTASKING  - used with ATTACH macro for operation of SAS.
*These present difficulties in moving UPGRADE  to  a  non-MVS IBM computer
 (e.g., Vitro of COMNET),  and even more problems  in going to non-IBM.
                                 E-4

-------
                            Current parts of UPGRADE
                             specific to IBM*

     •   FORTRAN EXTENSIONS - IBM extensions to ANSI FORTRAN are in use,  e.g.,
                             list directed I/O, subroutine entries,  END  = in
                             read statements, direct access files, object time
                             dimensioning of arrays, >3 dimensions in array,
                             etc.  Some of these may be found in other FORTRAN
                             compilers.

     •   ASSEMBLY LANGUAGE  - about 1,000 lines (currently) - about 5-10% of the
                             UPGRADE code.  Performs terminal I/O, SAS sub-
                             tasking, linkage to IBM sort, manipulation  of
                             "HELP" libraries (PDS), dynamic allocation, RHB &
                             IRB routines.

     •   SORT/MERGE Program

     •   SAS (Statistical Analysis System) - SAS Institute support SAS only for
                             IBM machines.
*These present problems in going to a non-IBM machine (e.g., RTP Univac).
                                    E-5

-------
                            Current parts of UPGRADE
                             which are specific to IBM TSO*

     •  CLISTS - Various CLISTS are used to allocate data sets, set terminal
                 characteristics etc.

     •  DAIR - Dynamic Allocation Interface Routine

     •  TSO MACROS - used in some ALC subroutines.
*These present difficulties in moving U/G to a non-TSO environment (e.g.,
 COMNET), and even more in going to non-IBM.
                                     E-6

-------
                            Current parts of UPGRADE
                             specific to NIH/DCRT

•  RHB and IRB routines              - ALC code written at NIH.  A few are
                                       available at COMNET

•  IFF (Integrated Plotting Package) - used to produce graphs

•  IPP-Tektronix Resident Processor  - converts IPP neutral text to commands
                                       for Tektronix graphics software .

•  NIH/WYLBUR                        - used for setting up jobs auxiliary to
                                       UPGRADE and maintaining file records,
                                       etc.  Other versions of WYLBUR exist at
                                       some other computers.
                                  E-7

-------
APPENDIX F

-------
           APPENDIX F




POSSIBLE CORE SAVINGS IN UPGRADE

-------
                                  APPENDIX F

                       POSSIBLE CORE SAVINGS IN UPGRADE
     A number of the UPGRADE evaluators found themselves limited by present
internal storage constraints in the system in terms of the kinds and amounts
of data they need to handle.

     A number of these constraints exist because of the manner in which UPGRADE
was developed.  The system has only in the past year reached a level of software
stability permitting production packaging.  It was developed in the context of
rapid proving of basic capabilities, followed by the rapid sequential add-on
features found needed by its initial CEQ user community.  Rapid response to
frequent increases in requirements provides the developer with a severe problem
in software configuration control and system optimization.  Software configura-
tion control has been achieved, with separated test and production versions of
UPGRADE, with a controlled and orderly procedure for building new features into
the system.  With CEQ the only user, these controls had not previously been
required.  Thus, at its present stage of development, UPGRADE can be improved
considerably in performance by retuning the internal design of the system.

     As part of the analysis of UPGRADE capabilities, the overlay, file alloca-
tion, and coding conventions were reviewed in the context of the now larger set
of system requirements identified by EPA's potential user community.  This
Appendix lists the results of the analysis, referencing the UPGRADE Version II
program listing.  It is recommended that EPA support the review of the internal
software structure of UPGRADE, and its rebuilding along the lines suggested
by this Appendix, for subsequent production versions.  The result will be an
increase in efficiency (and consequent reduction in cost) of UPGRADE processing,
and more importantly, increase in the size of the data base processable by
UPGRADE.  Refer to Figures F-l and F-5 for a pictorial presentation of the
program structure.
                                     F-l

-------
TOTAL
UPGRADE;
LENGTH <

247.816  I
                                                               LENGTH 136-080
                                                    OVERLAY A
                                                    OVERLAY C
                                                    OVERLAY D
                                                                                    LENGTH 8.224
S LENGTH 16.872
           LENGTH 86.640
                                            Figure F-l.  Overlay Structure

-------
                                   START
        Move C.O.  Save '1FC'
•n
1*1
Move to A.C.D. - Save '6FC'
        Move C.O. -Save-11 A'
                                    HPDCB
X~N
x^\
/^\
s~\
HPINIT
HPENO
HPLIST
TOTAL LENGTH HPACCS
136,080 (Decimal) PPPTEK
/-v PPPTK
Type (A) TK|N|T
* TKSIZE
TKCHR
NTKSYM
TGETER

GTALPH
(A) MAIN


HDCOPY
TABDSP
DIREC
ANALYS
SASDCB

(A) SASSTT
^ AiunnnnF

AOUTST
A10UT
DRWABS
HOME
IHOECOMH
V
;
IHOCOMH2

MOVABS
NEWPAG
IHOSATN2

IHOSSCN

CSIZE
IN ITT

RHB230
IHOSSQRT

IOW AIT
RHB240
PPPBUF
PPPOPN
PPPBFL
PPPSIZ
PPPDRW
SSPECS
PPPSPC
TOUTST
IRB229
IHOLDFIO
I

I
IHOLI02
RHB201
RHB213
RHB218
IHOFOXPI
IHOSLOG
APROB
IHOSEXP

IHOFRXPR

IHOFRXPI

ALFMOD
ANSTR
CWSEND
KAM2AS
KA12AS
TSEND
IHOFCVTH

IHOEFNTH

IHOEFIOS



V i
;
IHOFIOS2
IHOUOPT

VECMOD
XYCNVT
BUFFPK
IHOERRM
IHOUATBL
NEWLIN
RESET
RHB241
RHB242
SASA
TOUTPT
AOEOUT
CARTN
IHOFCONI

IHOFCONO

IHOETRCH

LINEF
PLTCHR
IHOFTEN



\ i
;
RASA
CONMON
ANSERS
TEXT
ANAL
STATS
NAMES
ACCESS
n W I* k «JW
STASEQ
PCOM
CNCHS
NCHSDA
SASCOM
IPSKAT
HAROCY
CACCS4
BARS
LIMITS
SVSTAT
CFILT
CCLIST
XTB
STASL
PARTTT
PLOT
VARSEL
CTRANS
ANOTA
TKTRNX
PPPZET
                                     Figure F-2.  Root Segment

-------
                 Length 8224 (Decimal)
r
1 1 1
ACCES4 ACNCHS IPSCAT








1 1
XPART SAINIT
SAENO







1 1
SAATCH ACCMAN
SAWAIT
SASDTH
SASOUT
CORESZ
SASINT
SASGET
KEYCHK
RHB206
1 1
FILTER DATLST
RRANGE







                                                           Figure F-3.  Overlay A

-------
length 16.872 IDccinull
1
IVSAVE
IV01GP
IV03GP
IV02GP
IVINIT
IVCALC
1 1
XTABLM XTABGR
INTLIM




1
STNSL





1 1
STEPWS AUTO





1
GLM





1
FRED





1
OBCT





1
IVEW
EOF02




1
WKSORT
RHB233




1
VAHOSP
DISP1
BASIC



1
MAPN
STITQ
STITH



1 1
NCHSTA NCHSCO





1 1
IDBRO PLOTIT





1
BARPLT
BLOK
ABIOK
SHADEX
GLINE
SHAID
1
AXISPT
ANOTAT




1
AUTOST





1
PLTMOD
IPSU61




                      Figure F-4.  Overlay C

-------
length 86640 (Decimal)
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
REGR STPMOD SORT SASMON XTABCT PLM001 PLM002 PLM003 PLMOD4 PLMOOS BARSHA OAT APT OATASH HCXCO HR6HT
POLY GLMMOO FVAR WINDO HXOUIV MCCSVH
POLSUM VARDEP ASMSRT FRANCE STMR01 IMXC1 SYHOFF
MINVRS VARIND BPAS1 IMXC2 UOVECH
MATMOL FRQMOD DATARD INXC3 OSPEGS
TRANS1 VAHTBL DYNAM HXTFRU
TRANS2 PPPEMO
BPROB SIGHOH
ZZOMES
FABLGX
FABLIX

„ AXPRPR
1 P5LGLI
"^ PSLILG
PSllll
PSLIPR
GOLGLG
GDLGLI
GDLILG
GOLILI
GDLIPR
PSLGLG
PSLGPR
PSPRLG
PSPRLI
PSPRPR
OLLILG
DLLILI
DLLIPR
GDLGPR
GDPRLG
GDPRLI
I
r *
GOPRPR
SLLIL6
SLLILI
SLLIPR
DLLGLG
DLLGLI
DLLGPR
DLPRIG
OLPRLI
SLLGLG
SLLGLI
PRSPEC
FABLGY
FABIIY
FABPRX
FABFHY
UOVPEN
SAXLGB
SAXLGL
SAXLIB
SAXLIL
SAXPRL
AXLILG
AXLILI
AXLIPR
NODLIB
MODLIL
NOLGB
NDLGL
MOPRB
NOPRL
SAXPRB
AXLGLG
(.
r ;
AXLGLI
AXLGPR
AXPRLG
AXPRLI
SLLGPR
SLPRLG
SLPRLI
DECVAL
DIVLG
DIVLI
DIVPR
OLPRPR
GTITLE
SLPRPR
TITLES
TITLET
BLGSCL
BLISCL
BPRSCL
DECBCD
DIGITS
TITLE L
AXSMRK
ORWDL
DRWSL
GRDURK
LGLG
LGLI
LGPR
LILG
LIU
LINLAB
LIPR
I
\
LOGLAB
MATCH
NCHARS
NLOGB
PRBLAB
PRLG
PRLI
PRPR
PSNLIN
SIMPNO
SLNLIN
TITLE





















                                             Figure F-5.  Overlay D

-------
                   Possibilities for saving core in UPGRADE

I.    Certain arrays could share core storage by use of equivalence.  If data in
      array "A" needed to be kept while array "B" used the same core space, array
      "A" could be written onto disk to be save'd.

      Representative Examples:

          ACCMAN              p. 40 may be STATS and NAMES commons

          STNR01              p. 57 PEARZ (500) seems of little use - reordering
                              of the "STATS""arrays and taking first 50
                              entries - maybe shows "STATSVL .arrays are too big.

          ACCSB2              p. 62 & 87-89 does sort using tore storage, could
                              use disk

          FILTER              p. 122 1 24 common SV STAT

          HISTO (and others)  p. 131 1 26 common XTB

          IDBRD (and others)  p. 139 1 22 common CNCHS

                     •

                         others exist
II.   Some arrays are simply bigger than they need to be (space allowed for future
      expansion, or for ease of programming).  Reduce these to minimum.

      Representative Examples;

         Several arrays associated with data variables are dimensioned 100 (e.g.,
         all those in NAMES and STATS commons - see ACCMAN p.40 for example).
         I don't think all of these are now used - maybe 65 are used.  At any
         rate, number of variables could be < 100 and thus save core.

     Representative Examples:  (continued)

         Several (most) of the large arrays in XTB common are 20% to 40% larger
         than need be (unused space) - (//intervals = 60, 50 used) (# STATS = 10,
         8 used)

         - Others -

III.  Some subroutines could be broken up and called selectively or sequentially
      to reduce region sizes.

     Representative Examples:

          ACCMAN - is already broken up but is all in one region of the overlay

          ACCES4

          ACNCHS

                                     F-7

-------
          IP SCAT

          SASINT

          ACCMAN

          PLTMOD

          1PSUB1

          XTABLM

          INTLIM

          XTABGR

          STNSL
            and most all others.

         The first ones to be hacked up should be those which are making parti-
cular regions long.

IV.   Some groups of subroutines (in one segment),  and entry point areas, could
      be moved to lower region, thus decreasing size of upper segment.

          Of course this would increase the size of the lower region, unless
      put in as separate entries in the overlay structure.

          For examples, see list under Item III, as many of the small sections
      that could be taken out of these segments could be moved (or would have
      to be moved lower to retain logical calling sequence and proper overlay
      structure.

V.    Where possible,  use I *2 in place of I *4 integer variables.

          ACCMAN p.  40 1 25-26

          IPSCAT p. 144 1 36

          SAS1   p. 339 1 25-26
      and most other areas where integer variables are actually used for
      arithmetic, not text, arrays.

VI.   Rewrite "ANSER" handling to use less space - perhaps with set of logical
      variables held in common to TGETER

      Or use computed GOTO

                                     F-8

-------
     Examples;

         ACCMAN p.40 1 51-54

         ACCMAN p.41 1 109-114

         ACCMAN p.42 1 166-171

                p.46 1 378-386; 1 394-401



                others
            •
            •

           AUTOST p.98 1 129-137, 147-154

           	 and nearly every place that TGETER is used to bring in an
           "ANSER" - most of these could be handled by returning an INTEGER
           value corresponding to "YES," "NO," "HELP," etc. and using GOTO
           (100, 200, 300, xxxx), INTEGER

VII.  Put test of questions into partitioned data set library and call similarly
      to "HELP" library.

      Examples;

         ACCMAN p.40 1 41-49

         ACCMAN p.42 1 159-164

            •
               examples in nearly all subroutines
         Also, nearly all places that a question prompt is given to user, core
      would be saved by having the text of the question in a. -BDS and" using CALL
      QUES (nn,m) to put the text on the screen.

VIII.  Several routines have unchanging text stored in arrays - could be put
       on disk data set, with keyed or direct access -to--bring inappropriate text
       (or PO library)

          IVINIT (ACCMAN) p. 50 1 617-632

          ACCS4           p. 68 1 36-52

          MAPN            p. 170 1 39-56
                                      F-9

-------
            Several other subroutines have lesser amounts or text arrays could
     be initialized by BLOCK DATA - would save core in some cases.

  IX.  Use LOGICAL *1 to replace L *4

       Variables (a 75% core savings!)

          ACCMAN  p. 40 1 27

          ACCS4   p. 68 1 54

          1PSCAT  p.145 1 61-63

          XTABGR  p.302 1 55-56

   X.  Reduce capabilities of UPGRADE (e.g., drop SAS, reduce // of variables
       allowed, etc.)

       SASINT and other SAS subroutines could be dropped, etc.

  XI.  Move certain subroutines out of root segment by making multiple copies of
       appropriate lower region segments

          DATCON

          HDCOPY

          TASDSP

          NTKSYM

          RHB routines

            and many others could be determined by more analysis.

 XII.  Determine if any of the FORTRAN, IPP, or TEKTRONIX subroutines added to
       root segment are never used (e.g., may be trigonometric functions).

       Perhaps some type of program monitor could be used to determine which
       subroutines are not used in a full exercise of UPGRADE.

       The trigonometric functions for tangent, sin, cosine are in the root
       segments, and I would bet are never used.

XIII.  Replace HPLIST (7200 bytes of "HELP" xxxx number lists) with code to
       generate needed number.

 XIV.  Some areas of the detailed code could be made smaller by writing more
       complex (therefore harder to maintain or change) code.
                                    ¥-10

-------
     Examples:




        AXISPT    p.107-109, 1 349-480




        DATCON    p.166, 1 299-353




        NCXCO     p.178-181




        PLOTIT    p.204-207




        SORT      p.271 1 80-95




        SAS1      p.354 - duplicate code - could be subroutine.




        also SASWT p.371-2






        	and probably many more areas.






XV.  Conversion table for NCHS to VITRO mapping codes is-.now in. a common -




     could be put on disk.




     This conversion table is now about 25K bytes (INXC1 through INXC3).  With




     the current dynamic allocation, it could now easily be put on disk and




     thus release this core.
                                    F-ll

-------
APPENDIX G

-------
  APPENDIX G




UPGRADE REPORTS

-------
                             APPENDIX G




                           UPGRADE REPORTS




The following is a list of EPA reports contained in this appendix.




     1.  Office of Research and Development, OPR




     2.  Office of Research and Development, OEMI/IERL/CINN




     3.  Office of Planning and Management




     4.  Statistics and Data Management Office




     5.  Epidemiology Branch, FSD, HERL
                                  G-l

-------
                          U.S. ENVIRONMENTAL PflQTECTIOI AGENCY
                             OFFICE OF RESHflCH AND DWsLOP.Ic.17
                             MOHiTOaW& AND SUPPORT LABORATORY - LAS VESAS
                 P.O. BOX 15027. LAS VESAS. NeVAOA 59114 * 702/7:5^69 iFTS 555-£cS)

       JUN 22 1978
   Date


"?piy to
  Attn ofc   MSD


 Subject   Evaluation -of OTGSSDE-

    To:   Dennis A.  Tirpak
         Office of  Planning and Review, BD-675


         Per your telephone request, we have used the  UPGRADE system to examine
         certain data bases in order to become familiar  with  the usefulness of
         the system.

         Summarizing Our experience suggests the following:

              1. UPGRADE is a good tool for displaying  graphical  functions and
         for interacting, directly with data.

              2. The system requires considerable additional software development
         before it  becomes a useful tool.

              3. UPGRADE could ba developed into a useful management tool, partic-
         ularly at  the national level.   It might also  have usefulness as a research
         screening  technique.

              4. Beyond the developmental phase, it will be  necessary to maintain
         a central  computer staff for interaction between data bases and users.

              5. The amount and kinds of data available on the  system at the present
         time are very limited.

         Our experience shows the accessible data are  basically  trend data, i.e.,
         yearly or  monthly averages on the geographical  scale of counties.  Of
         course, special data bases can be constructed and interfaced with the
         system. We have, in fact, had the contractor generate  such a data base
         from air quality data.  This was accocplished in a reasonable tine franse
         as have our requests for interface of new data  manipulation procedures.
         However, note that we are not privy to information regarding the contractor
         costs incurred as a result of these requests  and, therefore, have no basis
         for judging the cost effectiveness.
                                           G-2

-------
Vhether UPGRADE should ba developed  %r.t,o  a ti.auagamer.t  taal depends
        to such questions a^»:
     1.  What organizations wouJJEi mak's. iise* of  Chi^y'sCSrtl.t'in  i/'r>at  manner,
and how much use?

     2.  What is the investment required  in order  to stake ithe system usefsl?

     3.  What is the maintenance cost  of  the systen?

Obviously, these are interrelated questions, i.e.,  the  required- investment
will be a function of who, for what, and  how much  usa?   This, of course,
requires a detailed survey of the potential users  and their needs.  For
example, would individual Regions make use of raationvPida-  take 'Into account potentia.i rtsl-:-= frc:?.
a variety of disease states when deciding' how' ouch "hardness,  etc.* is healthful.
Also note the suggestion' that " the higher' tKs.BOD the be'ttcr  tVie waiter may b«
                              G-3

-------
for human consumption.  Reflection would suggest that BOD determinations
are carried out on water supplies beiora treatnent.  High 300 requires
better or different treatment before being deemed satisfactory for
human consumption.  Thus,, this particular finding could be indicating
not that high BOD is good, but that the better or different treatment
given high BOD waters does result in a more healthful water supply.  In
any event, the foregoing is speculation*and as stated, merely points
out areas for further investigation.

                           TABLE 1

                           Number of Correlations with Various Disease State
                           Positive         Negative         Total
Water Quality Variables    Correlations     Correlations     Correlations

Dissolved Oxygen                459
Hardness                        279
Sulfate                         246
BOD-5 Day                       066
Chloride                        325
Calcium                         145

la point of fact, we cannot claim by such a preliminary examination that
the 921 correlations showing less than a 952 confidence level are unim-
portant.  We have at this time no knowledge of the quality assurance of
the data used nor- knowledge of the concentration distribution of the water
quality data.  These factors can have a decided effect on whether or not
potential correlations are indicated.

This effort required 120 man-hours of work of which one-third was required
to become familiar with the system.  However, we understand that the con-
tractor is modifying the program so that this type of survey involving
approximately 1000 correlations could be accomplished in about 30 hours.

What this exercise did do is give us enough hands-on experience to be able
to forward our evaluations of UPGRADE as given in the opening paragraph.
This evaluation is admittedly from a restricted viewpoint, yet our experi-
ence indicates UPGRADE, in its present state, is not ready for general use.
Decisions regarding its further development should be based as indicated
on a survey of users and the development and maintenance costs.
Edward A. S chuck.
Acting Deputy Director
Monitoring Systems Research
and Development Division

3 Enclosures

cc: "w/o -enclosures
A. 'C. Trakowski, .RD-680
•H<. M. Bills, RD-680
                                 G-4

-------
                     INDUSTRIAL  ENVIRONMENTAL  RESEARCH  LABORATORY
                  UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
                                  CINCINNATI, OHIO 45268
  DATE-  June 28,  1978

SUBJECT-  UPGRADE  Evaluation
  FROM: David  R.  Watkins
        OCPB
    TO: Lance Mai lace
        USEPA, ORD/OMTS
        401  M Street, SW
        Washington,  DC 20460
             I  would like to thank  you  for  the  computer demonstration  that
        you conducted on  5/25/78 in Cincinnati.   We  have utilized  some of the
        health  data that  was generated  during the demonstration.

             The UPGRADE  data was found to  be of greater value  than .that  data
        generated by a contractor of the'IERL-Ci. The  reason for  this.was  the
        more current mortality entries  in UPGRADE.   These results  were compared
        with those of a system which had been in operation for  a longer period
        and found to be of greater  value fop our oarticu-lar case,

             I  found the  UPGRADf.system very impressive and of  obvious benefits.
        However, we do not have the personnel nor Allotted time to conduct  this
        type of in-house  searches.   Contractors for  lERL-Ci  normally conduct
        these types of investigations on their  own systems or subcontracted
        systems.

        cc:  E. E. Berkau
                                         G-5

-------
                                       23
 8BBJEC?;   UPGJVUDE Styt*

 FROK:      Mareia Villian, Chief
           Statistical Evaluation Staff,  IW-223

 TO?        Jjanee  WaUae*,. Eovirpnnental Scientist
           Monitoring Technilpgy filvision, RD-660
Hir staff and * have- r^vdav^ the UPGRADE system "Users Overview" docucent and
I v»ld  like to pass along  tvo coananta.  First, ve think it vculd be useful
to expand and  clarify the section of the manual dealing vith the description
of included data.   In particular, it vould be useful to itemize each variable
vhlch  la contained  In. each  data  set-vlthin UPGRADE.

Second,- VB haye a nunfcer of auggestions about djata vhich ve v>uld lite to have
added  to the UPGRADE ayeten*   Sone of these data 'should be readily available
vhlle  other data vlll need  to  be searched out:

     1*  Itanber of  eaployeea by  SIC codes by county (Ed Brooks did this for
'1959,  1967, 19T3);

     2.  Itan&er of  etaployeee by  SQC codes by county (this should be available
Iron the 1980  census) J-

     3*  Demographic- and cllaate' ajeaeurea by county (Ed Brooks did 30 such
measures);

     4»  Additional age  adjusted abrtality 'ratea (Ed Brooks did 56 cause of
dealth categories and these could, be used to supplement existing categories);

     5«  Transportation  related  data by county such as nunber of vehicles,
nuaibcr of VMTS, number of ailes  of roads (DOT hex this);

     6*  Stationary and  aobile source emission data, by county, (tons per year)
from HED6,;

     T.  Certain census  type information, by county, such as type of vater system,
type of ,  hoae beating eyaten, existence of hose A/C system, etc*;

Ac other thoughts, occur  to  me, I will pass them am.  Call me if you vent tc
discuss  further.


         by:   FM-223/MWalliaaa/a«/$/21/T3/1J72-5CAO
                                      G-6

-------
                     UNITED STATES ENVIRONMENTAL PROTECTION AGENCY

   DATE.  October  12,  1977

SUBJECT:  The  UPGRADE  System.
  FROM:   William C.  Nelson, Ph.D.
          Chief,  Statistics and Data Management Office

     T0:   Lance Wallace (RD-680)
          Office  of Monitoring and Technical Support

               Thanks for sending us the draft User 's-OvsfvleW'' and otfier mateVla-ts
          for the graphics and data base system UPGRADE and for the opportunity
          to give you our comments.

               The UPGRADE system obviously- has an  elaborate display and analysis
          capability.  However, we do/hay?. sev*£fal,jreservatlons 'cfbottf Hs usefulness
          to EPA  in general and to HERk.r:RT{MrL* parti culajr*.

          1 .  Concern over conversioni.probTe
-------