PB-241 150


PROCEEDINGS NUMBER 1  OF THE OR AND D ADP WORKSHOP HELD

IN BETHANY COLLEGE, WEST VIRGINIA ON OCTOBER 2-4, 1974

Denise Swink


Environmental  Protection Agency
Washington, D.  C.


January 1975
                       DISTRIBUTED BY:
                       National Technical Information Service
                       U. S. DEPARTMENT OF COMMERCE

-------
                            TECHNICAL REPORT DATA
            (Please read Instructions on the reverse before completing)
 1. REPORT NO.
    600/9-75-002
 4. TITLE AND SUBTITLE
    ADP Workshop No. 1
    PB 241 150
 5. REPORT DATE
    January 1975
 6. PERFORMING ORGANIZATION CODE
    ORD
 7. AUTHOR(S)
    Denise Swink
 8. PERFORMING ORGANIZATION REPORT NO.
    600/9-75-002
 9. PERFORMING ORGANIZATION NAME AND ADDRESS
    Office of Program Management
    Office of Research and Development
-------
                NOTICE

THIS DOCUMENT HAS BEEN REPRODUCED FROM THE BEST COPY FURNISHED US BY THE SPONSORING
AGENCY. ALTHOUGH IT IS RECOGNIZED THAT CERTAIN PORTIONS ARE ILLEGIBLE, IT IS BEING
RELEASED IN THE INTEREST OF MAKING AVAILABLE AS MUCH INFORMATION AS POSSIBLE.

-------
         OR&D ADP WORKSHOP
            PROCEEDINGS
                 NO. 1
              Sponsored by

     Denise Swink, OR&D ADP Coordinator
    Office of Program Management (RD-674)
      Office of Research and Development
     U. S. Environmental Protection Agency
          Washington, D.C. 20460
    OFFICE OF PROGRAM MANAGEMENT
 OFFICE OF RESEARCH AND DEVELOPMENT
U. S. ENVIRONMENTAL PROTECTION AGENCY
        WASHINGTON, D.C. 20460

-------
                                                FOREWORD
     The OR&D ADP Workshop was held October 2-4, 1974, at Bethany College, Bethany, West Virginia. In response to
the need for better communication among the scientific community utilizing data processing, the OR&D Workshop was
designed to promote exposure to state-of-the-art data processing techniques and available resources, and the sharing of
experience and knowledge related to Agency operating programs involved in scientific data processing applications. The
workshop was sponsored by the Office of Research and Development and was open for participation by all organizations
within the Agency.


                                                    Denise Swink
                                                    OR&D ADP Coordinator

-------
                                       TABLE  OF  CONTENTS


                                                                                        Page Number


  FOREWORD                                                                                   iii


  AGENDA


  STORET - The Status of the System
       Charles S. Conger                                                                          9

  A Method for Improving User Access to STORET
       Michael I. Friedland                                                                      37

  Two Data Storage Programs for STORET
       Kenneth V. Byram                                                                       45

  The EPA Scientific Applications Software Study
       Elijah L. Poole                                                                           49

  BIOMED, SAS, and Other Tools of the Statistician
       Robert Kinnison, Ph.D.                                                                    50

 MSP (Model Specification Program)-An Easy-to-Use Linear Statistical Analysis System
      Larry M. Male                                                                            56

 Concepts Verification Testing of the ENVIR System
      John W. Scotton                                                                          63

 SEAS - The Strategic Environmental Assessment System
      Edward R. Williams                                                                        69

 A Computer Controlled Data Acquisition and Retrieval System for Use in Air Monitoring Programs
     Marvin Hertz, Kirby Kyle, Anne Duke, and George Ward                                         88

 An Aquatic Ecosystem Simulator and VS/8
     David M. Cline                                                                            96

 An On-line PDP-12 Spirometry Package That Supports Both Onsite and Remote Stations
     Sam D. Bryan                                                                             99

An On-Line Real-Time Multi-User Laboratory Automation System
     William L. Budde, Edward J. Nime, and Jack Teuschler                                          104

Computerized Chromatography - Mass Spectrometry
     William L. Budde and D. Craig Shew                                                        114

-------
CALCOMP Software
     Theodore R. Harris                                                                         118

Harvard Graphics Packages
     Curtis S. Lackey                                                                            132

EPA's Regional Air Pollution Study
     Robert B. Jurgens                                                                          148

Capabilities of Tektronix Software
     David M. Cline                                                                            156

Automated Laboratory Management System in Region V
     David Rockwell                                                                            159

Automatic Data Processing and Regional Laboratory Quality Assurance
     Billy Fairless, Ph.D.                                                                         167

Sample Tracking Data Management System
     G.C. Allison, M.J. Madsen, and R.N. Snelling                                                   182

SIDES
     David Barrow                                                                              194

Selection of Probability Models for Determining Quality Control Data Screening Range Limits
     Wayne R. Ott, Ph.D.                                                                        208

A Systematic Approach to Water Quality Trend Analysis
     Merlin H. Dipert and Jon A. Abraytis                                                         217

Sample File Control for the Automated Laboratory
     Roman I. Bystroff                                                                          219

A Data Reduction System for an Automatic Colorimeter
     K.V. Byram, F.A. Roberts, and L.A. Wilson                                                    226

A Chemical Information System
     Stephen R.  Heller                                                                          234

Mass Spectral Search Systems
     John M. McGuire                                                                           236

Bowne Time Sharing (Word/One)
     Catherine Tittle                                                                            254

Introduction to the UNIVAC 1110 at Research Triangle Park - Capabilities
     T.L.  Rogers                                                                               257

-------
 RTCC Software and Accessibility                                                                 265
     Maureen Johnson

 Technical and Environmental Information System                                                   271
     Donald L. Worley

 The Conversion of CHESS and Other Systems                                                      282
     Andrea Kelsey, Gene R. Lowrimore, and Jane Smith

 Present State of EDP Policy and Development in EPA                                                289
     Theodore R. Harris

 Status of the Washington Computer Center Procurement                                             296
     Denise Swink


 APPENDIX A - List of Attendees                                                                 A-1


 APPENDIX B - Areas of Expertise                                                                B-1


APPENDIX C - Areas of Interest                                                                  C-1

-------
                                            WORKSHOP AGENDA
Opening Remarks
     Speaker: Denise Swink
                                                   Session I
                                               October 2, 1974
 Session Title: Mathematical, Scientific and Statistical Applications Software
Session Coordinator: Kenneth V. Byram

STORET - The Status of the System
     Speaker: Charles S. Conger                                                                        Page   9

     Abstract: A history of the STORET system, from its beginnings more than ten years ago to its present status,
     is presented. The functions of the user assistance group and the programmer staffing demands of the system are
     discussed. Some idea of the diverse nature of the data in the system and the volume of data it contains, along with
     plans for the future, is presented.

A Method for Improving User Access to STORET
     Speaker: Michael Friedland                                                                        Page  37

     Abstract: The need to manipulate data stored in the STORET data base imposes a relatively complex programming
     problem on the user. The author has developed a FORTRAN subroutine which performs the housekeeping involved
     in making STORET data available in card-image form to the relatively unsophisticated user; that is, the user with
     limited WYLBUR and STORET vocabularies, and essentially no JCL or programming knowledge.

Two Data Storage Programs for STORET
     Speaker: Kenneth V. Byram                                                                       Page  46

     Abstract: One of the difficulties in preparing a large batch of STORET data for input is proofreading
     it before it goes into the system.  The DIP input format, for example, is easy to prepare but difficult to check. This
     program tabulates an input data set in any format before data storage and also adds the ability to check values against
     upper and lower limits.
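
     As a rough illustration of the limit check mentioned in this abstract, the Python sketch below flags values that
     fall outside per-parameter upper and lower limits. It is not the program presented at the workshop: the record
     layout (station, parameter, value), the parameter names, and the limits are all assumptions made for the example.

```python
def check_limits(records, limits):
    """Return the records whose value lies outside the limits for its parameter.

    records: iterable of (station, parameter, value) tuples (hypothetical layout).
    limits:  dict mapping parameter -> (lower, upper); parameters without an
             entry are not checked.
    """
    flagged = []
    for station, parameter, value in records:
        if parameter not in limits:
            continue  # no limits defined, nothing to check
        lower, upper = limits[parameter]
        if value < lower:
            flagged.append((station, parameter, value, "below lower limit"))
        elif value > upper:
            flagged.append((station, parameter, value, "above upper limit"))
    return flagged


def tabulate(records):
    """Format the records as fixed-width lines so they can be proofread."""
    lines = [f"{'STATION':<10}{'PARAMETER':<12}{'VALUE':>10}"]
    for station, parameter, value in records:
        lines.append(f"{station:<10}{parameter:<12}{value:>10.2f}")
    return "\n".join(lines)
```

     A pH entered as 72.0 instead of 7.2, for instance, would be flagged against limits of (0.0, 14.0) before the batch
     is stored.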

The EPA Scientific Applications Software Study
     Speaker: Elijah Poole                                                                              Page  49

     Abstract: EPA uses several computer facilities encompassing many intermediate and large-scale digital computers. The
     smaller systems are generally special purpose computers that service a single laboratory, while the large systems have
     remote terminals serving a large user community. Recent EPA computing system requirements studies have excluded
     "purely scientific"  systems, and  the scientific applications study  now  getting underway is designed to help in
     determining and fulfilling the Agency's needs in this area.

Statistical Analysis System (SAS)
     Speaker: Richard Johnson                                                                         No Paper

     Abstract: SAS is developed and updated by the Department of Statistics, North Carolina State University. Since its
     public release in 1969, it has gained some 30,000 users in approximately 170 installations and
     five foreign countries. Some of the reasons for its success and growth are discussed.

-------
 BIOMED, SAS, and Other Tools of the Statistician
      Speaker: Robert Kinnison                                                                         Page 50

      Abstract: A comparison of the capabilities of the standard BIOMED package and of SAS is made from a user's point
      of view. The new P-Series BIOMED package, which is as yet unreleased, is introduced. Some available analytical
      routines, such as Goodman's nonparametric anova, chemical kinetic, and simulation packages, are described.

 MSP (Model Specification Program)-An Easy-to-Use Linear Statistical Analysis System
      Speaker: Larry Male                                                                              Page  56

      Abstract: Most linear statistical programs severely limit a user to a fixed set of statistical models. Analysis of variance
      programs, for example, assume a fixed model for estimating parameters and test a fixed set of hypotheses. The user
      is unable to specify his own hypotheses, which  may be of most interest. Although he can use "general linear model"
      programs, they require input of a cumbersome coefficient matrix with an entry for each experimental observation.
      The statistical analysis program described here allows a user to specify easily any number of models that he may want
      to analyze.

 Concepts Verification Testing of the ENVIR System
      Speaker: John Scotton                                                                            Page  63

      Abstract: The Data  and Information Research Division, Office of Monitoring Systems, ORD contracted for six
      months with  the Gulf Universities  Research Consortium (GURC). The objective was to demonstrate and evaluate the
      application  of the Environmental Information Retrieval (ENVIR) System  to the collection, sorting, correlative
     analysis and interpretation of selected categories of EPA data.

     This presentation discusses an  evaluation relative to a concept verification testing of the ENVIR System on water, air
     and  pesticides data bases. It  also briefly  discusses the comprehensive technical management information system
     known as Environment Dependant Management Process Automation and Simulation (EDMPAS) of which ENVIR is a
     primary model.

 Interactive Computing - A Case History
     Speaker: Richard Feldmann                                                                       No Paper

     Abstract: Time sharing computer systems, as distinct from CRJE systems such as WYLBUR, have been available for
     some time, and in many applications have been enthusiastically utilized. This discussion compares the three modes of
     computing (time sharing, batch, remote job entry), pointing out the advantages and disadvantages of each mode for
     various applications (program development, graphics, number crunching, and information retrieval). The material
     presented reflects the author's experience with the NIH PDP-10.

SEAS - The Strategic Environmental Assessment System
     Speaker: Edward R. Williams                                                                      Page  69

     Abstract: The Strategic Environmental Assessment System (SEAS) is a computer-based forecast and impact analysis
     technique that provides a national  economic forecast  which associates industrially related environmental residuals
     plus environmental treatment effects and costs. Additionally, it allows these factors to be distributed on a Federal
     region and state basis, and it forecasts environmental residuals associated with transportation and space conditioning.
     The model, operating in prototype form in the OSI computer, is being applied in analysis for Cost of Clean
     Air, Economics of Clean Water, Energy Conservation Analysis, Support of the Industrial Sludge Task Force, the
     Council on Environmental Quality, the National Commission on Water Quality and the National Institutes of Health.

-------
                                                  Session II
                                               October 3, 1974

Session Title: Applications of Minicomputers
Session Coordinator: Bill Budde

A Computer Controlled Data Acquisition and Retrieval System for Use in Air Monitoring Programs
     Speaker: Marvin Hertz                                                                             Page  88

     Abstract: The heart of the CHAMP network is an automated data acquisition system employing intelligent controllers
     at both a central site and a network of remote monitoring stations. A  minicomputer at each of the local stations
     controls and  acquires  data  from aerometric instrumentation and transmits the data by phone lines to a central
     computer facility  where the values  are recorded and  validated for  use  in health effects studies. Retrieval and
     processing of a large data base such as that acquired from the CHAMP network require a specialized approach. Total
     system capabilities  are realized using a dual processor controller in the central facility operating in a foreground/
     background mode; semi-real time acquisition  via telecommunications  system and data recording are  handled in
     foreground, and processing, validity  checking,  tabulation  and reduction are accomplished as background tasks. A
     functional description of the system is given, including a data flow diagram and a discussion of the machine validation
     criteria. Emphasis is placed on the software considerations which make the CHAMP system unique.

An Aquatic Ecosystem Simulator and VS/8
     Speaker: David M. Cline                                                                           Page  96

     Abstract: Aquatic Ecosystem Simulator (AEcoS), a one-million-dollar computer-controlled research facility used for
     physical simulation of freshwater ecosystems, is described with major  emphasis  on its computer system.  The
     computer system consists of a PDP-8/E minicomputer and a real-time virtual operating system (VS/8) which is being
     developed at the Southeast Environmental Research Laboratory.

An On-Line PDP-12 Spirometry Package That Supports Both Onsite and Remote Stations
     Speaker: Sam  Bryan                                                                              Page  99

     Abstract: A spirometer (literally  a "breath meter") is a device that can  measure rate of air flow. When a  person
     breathes through a  tube connected to a spirometer, it can be used to  measure how fast his lungs move air and how
     much air is moved. An electronic spirometer can be used in conjunction with a computer to provide a large array of
     measurements that relate to lung  function and  can provide them instantaneously. This paper describes such a system
     that is based on a PDP-12. This particular system can support an electronic spirometer that is connected directly
     to the computer or one that is located at a remote site that is connected  through an analog telecommunications link.
     A graphics terminal at the remote site gives the attending technician feedback on the computer processing. A 1200
     baud line is used for this digital telecommunications link.
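
     To make the kind of measurement described above concrete, the Python sketch below derives "how much air is moved"
     and "how fast" from a series of sampled flow readings. It is an illustration only and does not reproduce the PDP-12
     package: the input format (flow in liters per second at a fixed, whole-number sampling rate), the function name, and
     the three quantities returned are assumptions.

```python
def spirometry_measures(flow_lps, sample_rate_hz):
    """Summarize one exhalation from evenly sampled flow readings.

    flow_lps:       list of flow samples in liters per second (assumed format).
    sample_rate_hz: whole number of samples per second (assumed).
    """
    dt = 1.0 / sample_rate_hz
    running_volume = 0.0
    volumes = []  # cumulative volume after each sample (rectangle rule)
    for flow in flow_lps:
        running_volume += flow * dt
        volumes.append(running_volume)

    total_volume = volumes[-1] if volumes else 0.0
    # Index of the last sample inside the first second of the maneuver.
    first_second_index = min(int(sample_rate_hz), len(volumes)) - 1
    volume_first_second = volumes[first_second_index] if volumes else 0.0
    peak_flow = max(flow_lps, default=0.0)

    return {
        "total_volume_l": total_volume,
        "volume_first_second_l": volume_first_second,
        "peak_flow_lps": peak_flow,
    }
```

     Integrating flow over time gives volume, so a single stream of flow samples is enough to report both the rate and
     the amount of air moved.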

An On-line Real-Time Multi-User Laboratory Automation System
     Speaker: Bill Budde                                                                              Page 104

     Abstract:  Small to  medium scale  time sharing chemical laboratory automation systems were first developed during
     the late 1960's. The Methods Development and Quality Assurance Research Laboratory (MDQARL) and NERC-
     Cincinnati Computer Systems and Services Division are working with the Lawrence Livermore Laboratory (LLL) of
     the Atomic Energy Commission (AEC) to develop a time-sharing laboratory automation system with several novel
     features. The principal goal of the project  is to automate presently owned chemical analysis instrumentation that is
     currently widely used at MDQARL and a number of other EPA laboratories for water quality parameter measure-
     ments.  The significant feature is  to do the automation in a fashion that will permit transfer of this technology to
     other EPA laboratories at a very significant cost and time savings. Another feature is the open-ended design that will
     permit  the attachment of nonwater parameter  instrumentation to the same system. These features will  be substan-
     tially supported by the choice as the principal  programming language of an extended  version of BASIC. One other

-------
       NERC-Cincinnati research laboratory, an Office of Enforcement and General Counsel Field Center, and the Region V
       Surveillance and Analysis Laboratory  have  chosen  to  participate in  this development. The salient features of this
       system are described with the exception of the sample file control program which is presented later in the workshop.

  Computerized Gas Chromatography-Mass Spectrometry
       Speaker: Craig Shew                                                                             Page 114

       Abstract: One of the most widely used automated experiments in EPA is the computerized gas chromatography-mass
       spectrometry system. About 25 complete systems valued at nearly three million dollars are available in a variety of
       laboratories for the identification and measurement of organic pollutants in environmental samples. Most of these
       systems employ 8K PDP-8/E or M minicomputers with a real-time disk operating system. Several systems use Varian
       620 and Hewlett-Packard 2100 minicomputer systems. Cathode ray tube (CRT) displays are used on many of the
       systems for alphanumeric and graphic display at 1200 characters/second and user interaction. Programs have been
       developed for CPU-CPU communications with remote time-sharing systems. The PDP-8 based system, which controls
       the spectrometer scan, acquires data, does data reduction, and outputs data, is described.


                                                   Session III
                                                October 3, 1974

 Session Title: Applications of Interactive Graphics in Analysis of Environmental Monitoring Data
 Session Coordinator: David M. Cline

 CALCOMP Software
      Speaker: Theodore R. Harris                                                                       Page 118

      Abstract: Presentation is a discussion of EPA computer graphics capabilities. Basic information is given on Calcomp
      plotting software and Tektronix CRT software.

 Harvard Graphics Packages
      Speaker: Curtis S. Lackey                                                                         Page 132

      Abstract: Presentation is a discussion of the Harvard Graphics Packages. The status of the system as well as
      applications are presented.

 EPA's Regional Air Pollution Study (RAPS)
     Speaker: Robert Jurgens                                                                          Page 148

     Abstract: Presentation is a paper entitled "Regional Air Pollution Study Graphics System." The development and
     current status of EPA's Regional Air Pollution Study (RAPS) and the methodology used for collecting and
     accessing the RAPS data bank are briefly reviewed. Computer graphics provide one of the primary media for the
     presentation of RAPS data. Justification of graphics for RAPS and hardware/software support at EPA, RTP are
     discussed. The graphics system which is being developed under contract is explained, including its philosophy,
     hierarchical structure and classes of displays.

Capabilities of Tektronix Software
     Speaker: David M.  Cline                                                                          Page 156

     Abstract: Presentation is a brief tutorial on the use of the Tektronix software as installed on the OSI computer
     facility as well as a PDP-8/E minicomputer. The Terminal Control System, the Advanced Graphics II System, and the
     Calcomp Preview Routines are discussed.

-------
                                                 Session IV
                                               October 3, 1974

Session Title: Laboratory Data Management
Session Coordinator: Wayne Ott

Automated Laboratory Management System in Region V
    Speaker: David Rockwell                                                                         Page 159

    Abstract: The automated handling of point source compliance monitoring data is presented for Region V Surveillance
    and  Analysis Division. The  dominant characteristic of this system  is the use of automatic processing from the
    conception of a field survey  to the storage of data in STORET. The programs involved in the automatic processing
    are LABORATORY-LABEL-INTERACTIVE-SIDES and QUALITY CONTROL.

Automatic Data Processing and Regional Laboratory Quality Assurance
    Speaker: Billy Fairless                                                                            Page 167

    Abstract: The Region V Central Regional Laboratory (CRL) will perform analyses during the current fiscal year.
    Sample types range from the very pure surface water of Lake Superior to sewage sludge. Concentrations  vary from
    subnanograms per  liter to essentially pure compounds and analyses are routinely requested for over 200 different
    parameters. This variety of problems presents a formidable challenge to the chemist to maintain control of each of his
    experiments and the quality of the resulting data. Steps the CRL has taken to define and measure the variables
    needed to evaluate the quality of data reported for each of several parameters are discussed. Methods are suggested in
    which automatic data processing might be used to assist laboratory personnel.

Sample Tracking Data Management System
    Speaker: Robert N. Snelling                                                                       Page 182

    Abstract: A Sample  Tracking Data Management System (STDMS) is under development at NERC-LV to provide
    management information relating to sample analysis status as well as result reporting. Approximately 3000 samples
    per month are received by the  laboratory for stable and radiochemical analysis. With an average of three analyses per
    sample, this results in approximately 9000 sample-analysis combinations per month to track. The system utilizes two
    files: a Master file containing sample identification, collection, and parametric data; and a  Status file maintaining
    statistics related to analysis status. Completed analysis results added to the Master file serve to update the Status file.
    An exception reporting technique is utilized to access information on number of samples received, analysis requested,
    analysis  completed, and analysis overdue.  Status data may be retrieved for a given sample, analysis  category, or
    program.
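
    The two-file arrangement and the exception report described above can be sketched in Python as follows. This is
    a minimal illustration rather than the STDMS design: the record fields (sample_id, analysis, due, result) and both
    function names are hypothetical, and the Master file is modeled as a list of dictionaries.

```python
from collections import Counter
from datetime import date


def build_status(master, today):
    """Derive status counts from the master records (hypothetical layout)."""
    status = Counter(requested=0, completed=0, overdue=0)
    for record in master:
        status["requested"] += 1
        if record["result"] is not None:
            status["completed"] += 1      # a stored result marks the analysis done
        elif record["due"] < today:
            status["overdue"] += 1        # unfinished and past its due date
    status["samples_received"] = len({record["sample_id"] for record in master})
    return status


def overdue_report(master, today):
    """Exception report: list only the analyses that need attention."""
    return [
        (record["sample_id"], record["analysis"])
        for record in master
        if record["result"] is None and record["due"] < today
    ]
```

    At the volumes quoted in the abstract (3000 samples per month times three analyses each, or 9000 combinations),
    reporting only the exceptions keeps the output short enough to act on.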

SIDES
    Speaker: David R. Barrow                                                                         Page 194

    Abstract: SIDES (STORET Input Data Editing System) was developed to facilitate the preparation of data for entry
    into STORET. SIDES offers the following features: an easy-to-use card format for entering data particularly from the
    standpoint of keypunching; a straightforward method of entering sample identification information; a comprehensive
    pre-STORET  editing system to  allow a user to catch keypunch and transcription errors prior to entering the data into
    STORET; a simplified method  of correcting data already entered into STORET through SIDES. Several data reports
    may be run from SIDES intermediate disk files to aid in data checkout and in obtaining data printouts with statistics
    prior to entry into STORET.

-------
  Selection of Probability Models for Determining Quality Control Data Screening Range Limits
       Speaker: Wayne Ott                                                                                 Page 208

       Abstract: Environmental monitoring laboratories carrying out routine chemical analyses increasingly are using
       computers to process and store the data.  If these laboratories process  large quantities of data, errors may be
       introduced in the data handling phase, and there is some risk that these errors may go undetected. Such values, which
       frequently result simply from keypunch mistakes, may be  stored as "valid" data  and may create serious problems
       when sample  means  or other statistics are calculated. The  computer  can be used,  however, to screen the  data
       automatically  as they are entered to determine  if any value of a given parameter is "unusual", or lies outside an
       "acceptable range." Such unusual values can be "flagged" and brought to the attention of laboratory personnel where
       errors can be corrected and new analyses performed if necessary.

       The  Environmental Protection  Agency (EPA) is now  undertaking a data  analysis effort in which probability models
      are fit to historical water  quality data and range limits calculated. These limits then are being built into a quality
      control data screening computer program which serves as an integral part of the computerized Laboratory Data
      Management System (LDMS). The basis for selecting and applying these probability models is discussed in detail.
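
      A minimal sketch of this kind of range screening follows. It assumes a normal probability model and limits of
      the mean plus or minus k standard deviations; the abstract does not say which models were selected, so the model,
      the value of k, and the function names range_limits and screen are illustrative assumptions only.

```python
import statistics


def range_limits(history, k=3.0):
    """Return (low, high) screening limits from historical values.

    Assumption: a normal model, with limits at mean +/- k standard deviations.
    """
    mean = statistics.fmean(history)
    sd = statistics.stdev(history)
    return mean - k * sd, mean + k * sd


def screen(values, limits):
    """Return (index, value) pairs for entries outside the acceptable range."""
    low, high = limits
    return [(i, v) for i, v in enumerate(values) if not (low <= v <= high)]


# Made-up historical readings for one parameter, then a new batch in which
# one value has a misplaced decimal point, as a keypunch mistake might produce.
history = [7.0, 7.2, 6.9, 7.1, 7.0, 6.8, 7.3]
flagged = screen([7.1, 71.0, 6.9], range_limits(history))
```

      Flagged values would be referred to laboratory personnel rather than rejected automatically, as the abstract
      describes.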

 A Systematic Approach to Water Quality Trend Analysis
      Speaker: Merlin Dipert                                                                             Page 217

      Abstract: In investigating water  quality trends, compounding of errors can be avoided by separating the errors of
      measurement  from the  errors  of fitting. The fundamental  data for consideration are the data points (individual
      points, replicated means, periodic means or some similar value and their calculated errors).

      For the fitting procedure consider the general n dimensional mathematical model:
      where L(t) is the linear trend line used for the model, F(t) is the nonlinear component of time, C(t) is a function of
      the nonlinear components and S is a vector of seasonal components in the Box-Jenkins sense. Least squares estimates
      of the parameters are calculated and confidence limits in the quality control sense are drawn. The general model is
      relatively easy to fit since there is only one nonlinear function.

      A computer program has been developed to retrieve the data from STORET and calculate the points and their errors,
      estimate the parameters of the curve in the least-squares  sense, and plot the curves and data. The program is
      constructed so that the user can set his model or change it in one program statement. The user can fix any  parameter
     desired  by a control  card.  To date water temperature,  nitrates, and phosphates have been  fitted to the general
     harmonic model:

         y = A + Bx + C sin(D + Ex) / exp(Fx)

     The nonsignificant parameters were set to zero and the period fixed at one year.  This  article is a presentation of a
     cost-effective procedure for water quality analysis that is available as an analysis tool for qualified men.
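
     The general harmonic model can be written out as a short function. The sketch below is illustrative and is not
     the program described in the abstract. It assumes x is time in years, so that a one-year period corresponds to
     E = 2*pi; the names harmonic_model, linear_form, and to_linear are hypothetical. With F set to zero and E fixed,
     the sine term can be rewritten so that the model is linear in its remaining coefficients, which is one reason a
     least-squares fit of this form is tractable.

```python
import math


def harmonic_model(x, A, B, C, D, E, F=0.0):
    """Evaluate y = A + B*x + C*sin(D + E*x) / exp(F*x)."""
    return A + B * x + C * math.sin(D + E * x) / math.exp(F * x)


# Assumption: x is measured in years, so a period of one year means E = 2*pi.
E_ONE_YEAR = 2.0 * math.pi


def to_linear(C, D):
    """Convert amplitude C and phase D to coefficients (a, b).

    Uses C*sin(D + E*x) = a*sin(E*x) + b*cos(E*x)
    with a = C*cos(D) and b = C*sin(D).
    """
    return C * math.cos(D), C * math.sin(D)


def linear_form(x, A, B, a, b, E=E_ONE_YEAR):
    """Same curve as harmonic_model with F = 0, linear in A, B, a and b."""
    return A + B * x + a * math.sin(E * x) + b * math.cos(E * x)
```

     Once a and b have been estimated, the amplitude and phase are recovered as C = sqrt(a*a + b*b) and
     D = atan2(b, a).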

Sample File Control for the Automated Laboratory
     Speaker: Roman Bystroff                                                                           Page 219

     Abstract: Sample  file control is to be an  integral  part of the  automated laboratory  minicomputer system  which
     currently is being implemented  for several  EPA Labs as  part of the ALMS program. Presentation discusses some
     aspects of the implementation and the philosophy of using a conservative systems approach.

-------
A Data Reduction System for an Automatic Colorimeter
     Speaker: Kenneth V. Byram                                                                        Page 226

     Abstract: The System was designed to process results from a Technicon AutoAnalyzer on a medium-size computer. The
     system creates sample and standard lists. The analyst acquires data rerun at higher dilutions. The system incorporates
     quality control and allows considerable flexibility in handling irregularities.


                                                   Session V
                                                October 4,1974

Session Title: General
Session Coordinator: Denise Swink

A Chemical Information System
     Speaker: Stephen R. Heller                                                                         Page 234

     Abstract: This presentation deals with those parts of a chemical information system which could be used by Agency
     Labs in their routine analysis work. The areas covered include literature, structure and hard data.

Mass Spectral Search Systems
     Speaker: John McGuire                                                                             Page 236

     Abstract: Setting standards and monitoring effluent streams requires a sensitive analytical tool for precise qualitative
     analysis. Computerized gas chromatography/mass spectrometry provides  this tool for organic  pollutants, but
     generates large amounts of data. Various approaches exist to simplify interpretation  of these data. The EPA is
     developing three complementary systems to provide rapid, computer-controlled identifications from the data. One of
     these systems is now available to the public; the others will be in the near future. Presentation discusses the systems
     and their use in support of the EPA programs.

Bowne Time Sharing (WORD/ONE)
     Speaker: Catherine Tittle                                                                           Page 254

     Abstract: Ms. Catherine Tittle is Bowne Time Sharing's Washington Manager for Customer Service. She discusses the
     use  of BTS's  computerized text-processing service, Word/One, within the  Environmental Protection Agency.
     Word/One has demonstrated effectiveness within the  research community, and EPA specifically, where it has been
     used in the preparation  of documents such as EXPRO, OR&D Program Planning & Reporting Manual, and Methods
     for Chemical Analysis of Water and Wastes (1974). Ms. Tittle discusses how Word/One may reduce the time and cost
     required for document preparation while simultaneously increasing production.

Introduction to the UNIVAC 1110 at Research Triangle Park - Capabilities
     Speaker: T. L. Rogers                                                                              Page 257

     Abstract: This is a presentation of the hardware configuration and capabilities of the UNIVAC 1110 System installed
     at the  Research  Triangle Computing Center. Included is  a  discussion of the processors, storage, and  peripheral
     subsystems as installed. A description of the communications support including  line types, number of lines, and
     terminals is also presented.

RTCC Software and Accessibility
     Speaker: Maureen Johnson                                                                         Page 265

     Abstract: This presentation is a brief description of the various software packages available on the UNIVAC 1110 at
     RTCC. This does not include detailed  utilization techniques, but is an overview of available software resources.
     Information on how to become a registered RTCC user and obtain pertinent documentation is also included.

-------
          and Environmental Information System
      Speaker: Donald L. Worley                                                                       Page 271

      Abstract: TENIS is an Information Storage and Retrieval System for the EPA UNIVAC 1110 which is being used to
      support the Air Pollution Technical Information Center. TENIS was custom-designed to provide not only low-cost
      support to APTIC but also to provide the complete range of services. The system is  written in COBOL and was
      developed by the DSD staff at Research Triangle Park, N.C. Presentation discusses the system and its use to support
      APTIC.

 The Conversion of CHESS and Other Systems
     Speaker: Gene R. Lowrimore                                                                    Page 282

     Abstract: This presentation is a description of experiences of one large user with the conversion effort from the IBM
     360/50 to the UNIVAC  1110 system at RTP. The changes precipitated  by differences in hardware and software
     architecture are emphasized. The things which worked well in the conversion and those that did not are presented
     from one user's viewpoint. Recommendations are made for how to approach a conversion effort of this type. Plans to
     change the systems to take advantage of new capabilities which the UNIVAC 1110 offers are discussed.

Present State of EDP Policy and Development in EPA
     Speaker: Theodore R. Harris                                                                     Page 289

     Abstract: Presentation is an informal discussion of current items which the Management Information and Data Systems
     Division (MIDSD) is undertaking. These items include the EPA Feasibility Study Order (2800), the EPA
     Manual, the Standard Data Communications Terminal Procurement, the Interim EDP Resources Procurement and the
     Feasibility Study for the EPA Data Communications Network. The Whitten Committee Report and its effect on the
     Agency are presented. The EDP Plan is discussed under this topic.

Status of the Washington Computer Center Procurement
     Speaker: Denise Swink                                                                           Page 296

     Abstract: The current status of the development, approval, and implementation of the consolidated EPA computer
     resource (WCC) is discussed.

-------
                                    STORET - THE STATUS OF THE SYSTEM

                                              By Charles S. Conger
     The STORET concept, using magnetic tape as the
storage medium, was conceived in 1962 to support the
 120-station water quality network. STORET, an acronym
 for STOrage and RETrieval, is a computerized data
 base utilized by the Environmental Protection Agency
 (EPA), state and  local  pollution  control  agencies to
 define, record,  and monitor the cause/effect relationship
 of water  pollution. The  system consists of several files,
 the principal files being  the Water Quality  File (WQF)
 and the General Point Source File (GPSF). Five basic
 areas are  important in consideration of these two files
 (Figure 1).

     The first area, on the effects side, is the water use
 categories, which  the states  have defined  by stream
 segment.  The second area consists of sets of criteria or
 standards  to describe these  uses; the standards  were
 established with federal  approval by the states in order
 to numerically  and quantitatively describe the uses. The
 third area and  largest file in the STORET system is the
 WQF which describes the state of art of the river today,
 yesterday, a week  ago, a year ago, and so forth. If the
 quality is  known, it can be compared with the criteria
 and standards  to determine whether or not the desired
 use is being produced. The above three areas describe the
 effects of water pollution.

     The next consideration is  the cause of these effects.
In the  fourth area, we attempt to define these causes by
compiling  a discharge inventory  of point  sources  of
pollution  in order to determine  what discharges are
affecting water  quality. The next question concerns the
fifth area:  What can be done to either  upgrade  or elim-
inate the discharges that are causing problems? Histori-
cally an action  file of implementation plans has been
maintained; this file describes when and how to upgrade
or eliminate the problem discharges. These cause and
action  files have now been combined into the GPSF. The
GPSF  will  consist  minimally  of 12 data  collection
efforts.

     In order to integrate all of these files and to make
the STORET system meaningful, five identifiers are
associated with each piece of data. These five identifiers
are action, use, parameter, location and time. A complete
list of the pollution abatement actions available to
EPA is contained in Figure 2. Typically these actions
may be the granting of money for construction or court
action  resulting from  Enforcement  of  the  Permit
System. The second identifier, use, is described in terms
of Standard Industrial Classification (SIC) codes which
are also listed in Figure 2. The third identifier, param-
eter, is grouped by  the  following categories: physical,
chemical,  biological,  microbiological, waste treatment
and economic.  For  any of this data to be of value, the
user must  know the fourth identifier, location, which is
categorized as follows: political, hydrological, and geo-
graphical.  Finally, the user must have an indication of
the time frame of collection or occurrence.
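
     As an illustration only, the five identifiers can be pictured as a key attached to every stored value. The
sketch below is not the STORET record format: the class name Identifiers, its field names, the select function, and the
sample values are hypothetical. Only the list of parameter categories is taken from the text above.

```python
from dataclasses import dataclass
from datetime import datetime

# Parameter categories as listed in the text.
PARAMETER_GROUPS = {
    "physical", "chemical", "biological",
    "microbiological", "waste treatment", "economic",
}


@dataclass(frozen=True)
class Identifiers:
    """The five identifiers associated with each piece of data (hypothetical layout)."""
    action: str           # pollution abatement action
    use: str              # use, described by SIC category
    parameter_group: str  # one of PARAMETER_GROUPS
    location: str         # political, hydrological, or geographical reference
    time: datetime        # time of collection or occurrence

    def __post_init__(self):
        if self.parameter_group not in PARAMETER_GROUPS:
            raise ValueError(f"unknown parameter group: {self.parameter_group}")


def select(records, **wanted):
    """Return the records whose identifiers match every keyword given."""
    return [r for r in records
            if all(getattr(r, name) == value for name, value in wanted.items())]


# Sample values are invented for illustration.
records = [
    Identifiers("Discharge permit", "Manufacturing", "chemical",
                "County A", datetime(1974, 7, 1)),
    Identifiers("Grant award", "Government", "waste treatment",
                "County B", datetime(1974, 8, 15)),
]
manufacturing = select(records, use="Manufacturing")
```

     Because every record carries all five identifiers, records from different files can be matched on any
combination of them.
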
     Assuming  that most of the readers are familiar with
the mechanics of the system, let us examine the growth
and content  of the WQF. One of the more interesting
growth indicators is a plot of the number of stations in
the system versus time (Figure 3), which has some inter-
esting  inflections. Point A indicates where the system
started utilizing low-speed remote terminals and where
EPA initiated efforts to bring states on line. Point B
occurs at  the end of the USTS to BCS conversion, while
Points C and D cover  the first six months of operation at
OSI. Point A also has some additional significance  since
it was  at this point that  the support staff of the system
grew beyond two people. At point B the user assistance
group  was created. Presently there are four people in the
programming support group which began utilizing soft-
ware contracts to develop new software to support the
GPSF  in  1970. It  is believed that  the support group
should be twice its size and should contain at least two
junior  people to accomplish such things as library main-
tenance, backup functions, and so forth. The user  assis-
tance group now has four people in it and needs twice
that many to  support its functions, including mainte-
nance  of  all user routines. Lately, the user group has
been  performing extra  duties in terms of developing
input  routines  and  testing  contractor-supplied GPSF
software.  Never  in its history has the  user assistance
group  been in  the position to initiate contact with the
user community as  opposed  to reacting to user prob-
lems. EPA has recently awarded a contract to maintain
and disseminate  the  STORET Manual.  In the  last few
months, data element definition  and contact with new
users  at the headquarters level have been handled by
another group within the Monitoring and Data  Support
Division, which at least  has enough manpower to  com-
plete the job adequately. Since the move to OSI,  there
has been an attempt to shunt as many of the communi-
cations and central systems related problems as possible

-------
   to the OSI project officer and the OSI customer support
   group.  However, users  do not make a distinction be-
   tween the hardware and software. If a phone line is busy
   or the central system is down, STORET is not working
   as far as the user is concerned. Ideally, given an adequate
   computer  utility,  the  staffing pattern to  support
   STORET should consist of a user  assistance group of
   eight to  ten people, a  systems/programming group of
   eight to ten people, and an input/data element definition
   group of eight  to ten people plus the continuation of
   contract support for  the STORET Manual. A real asset
   would be a group of two people to keep track of utiliza-
   tion, budget  problems, justification  and defense of the
   system, and so forth. During the last six years the system
   has been  in a continuous state of audit; and in fact, at
   the present time, there are at  least two studies occurring
   simultaneously.  To respond to questions and  requests
   from audit teams, resources from the programming staff
   and from user assistance must be drawn upon; a manage-
   ment staff function would eliminate the necessity for
   this.

       Growth  and utilization statistics follow. Figure 4
  indicates the  level of activity  against  the WQF during a
  typical week. Figure 5 summarizes the data growth of
  the  WQF. Figures 6 and 7 show a historical  level of
  activity  against  the WQF. The GPSF has not  been in
  existence long enough to show historical trends, but the
  present  content  and level of  activity  are shown in
  Figure 8.


       New  developments   include  the publication of a
  STORET System Directory. This directory is designed to
  aid the new user in finding stations with enough and/or
  the particular data of interest. Figure 9 is an example
  indicating  the number of parameters at particular sta-
  tions. Figure 10 may be utilized to pinpoint the location
  of a particular station  on USGS maps.

      Quite a bit  of effort has been placed on developing
 graphical display of data from  the WQF. The Multiple
 Station Plot (MSP) package has been  available for over
 two years, but its new features are probably not ade-
 quately  documented  for  the  unsophisticated  user.
 Figures 11  through 16 contain examples of  features
 taken  from MSP. Another plot capability  that is cer-
 tainly not  utilized a great deal is PGM=LOC. A typical
 example can be examined in Figure 17. Figure  18 shows
 a plot produced  from  a CALCOMP subroutine. In terms
 of plans for the future, a  WQF preedit program is being
 designed to provide at least a minimal amount of quality
 control on input data.
      Assuming that EPA will have access again someday
  to a system that can support interactive graphics, a pilot
  project will  be  implemented  under an R & D grant
  completed two years ago with the state of Michigan (see
  Figures 19 thru 28). Figures 29 and 30 are examples of
  graphical  output  from the AUTOMAP subsystem and
  the Municipal Waste Inventory, respectively.

      The  following story  illustrates the importance  of
  considering both how you use the computer as well as
  what you  use it for. Once upon a time in a house in the
  woods lived three bears: Daddy Bear, Mummy Bear and
  Baby Bear. Now  Daddy  Bear was called ACCOUN-
  TANCY, and he made sure that the family got what it
  needed, not what it wanted. He was very fond of making
  tables, which he  did  by simply nailing five pieces  of
 wood firmly  together. However, he had a clever knack
 for making all four legs exactly the same length so that
 the Bears'  tables hardly ever wobbled.

      Mummy Bear was called STATISTICS and she took
 care  of the cooking. Although the Bears ate only porridge,
 she knew their likes and dislikes and could usually
 give them  the sort they liked. Of course it meant a lot of
 work keeping her recipe book up to date.

      Baby Bear was called MATHEMATICS and he had
 a tin whistle. On it he could compose catchy little tunes
 that  for some reason he called theorems. Although he
 really was  not very reliable, one of his little masterpieces
 could sometimes help his parents over a bad patch.

      One day when ACCOUNTANCY, STATISTICS and
 MATHEMATICS were away, presumably at a conference,
 who should come to  the house but little GOLDILOCKS
 COMPUTER.  Well of  course she tried Daddy
 Bear's tables, and tasted Mummy Bear's porridge, and
 blew Baby Bear's whistle. She was surprised to find that
 although the tables were very plain, they did not wob-
 ble.   The porridge in each  plate  tasted  different  even
 though, as  far as  she  could see, the only food in the
 house was  plain  porridge; and all she could get out of
 Baby Bear's whistle was a faint squeak.

     Anyway, the  Bears returned home  and found her
 there. Because she  really was a charming young lady,
 they  asked her to stay; and in the fullness of time, she
 felt she ought to do them a good turn.

     She told  Daddy Bear that she would help him de-
sign some really elegant tables. She asked Mummy Bear
if it really would not be better to make a big pot of basic
porridge and then let the family get what they wanted

-------
 from it.  Mummy Bear was a bit  doubtful  about the
 ingredients  of basic porridge,  but  Goldilocks said she
 would help sort  it out. She also made it quite plain to
 Baby Bear that he ought to have a  stock of tunes ready
 for bad times.

     As it happened the Bears thought that these  ideas
 were worth trying. Daddy Bear had  always felt his tables
 could  be  improved, Mummy Bear  was fed up with al-
 ways changing her recipe  books, and Baby Bear, to tell
 the truth, was a little guilty about being selfish. So they
 agreed to Goldilocks' plans.

     At first everything went  well. Plans  for splendid
 tables  were drawn up. Several dozen recipes for  basic
 porridge were  prepared,  and  Baby Bear  began  daily
 music  practice.  However,  the  bears  could not  help
 noticing that they were all so busy that they had no time
 left  for making tables, cooking porridge or composing
 tunes.  But they still felt everything would be all right.

     Then came the great day when they were ready to
 start their  new  life. Unfortunately Daddy  Bear  soon
 found that his new tables, which had at least a hundred
 parts,  wobbled when he could manage to get them to-
 gether. Mummy Bear realized that the only way to settle
 the recipe for basic porridge was to carry out a survey of
 Bears' attitudes towards porridge. She knew, however,
 that after some weeks of very thin porridge, their atti-
 tude to  it was distinctly  hostile. It was clear that she
 could  not complete her survey  until she had made  some
 good basic porridge, and she could  not find out how to
 make good  basic porridge without  a survey.  Baby Bear
 said he could not get any  inspiration under pressure and
 locked himself in his bedroom.

     The next day the Bears all asked Goldilocks Com-
puter  to  help  them  out of  their trouble.  However,
Goldilocks Computer was very  busy  looking into the
possibilities   for  making automatic  reclining chairs,
cooking coq-au-vin and playing alto saxophone. So she
said, "I really  know  nothing about the actual job of
making tables, cooking porridge and playing tin whistles.
But I have given you all the information you need, so
why don't you get on with  it?" Now Bears,  although
nice enough  characters,  are  only Bears and they have
very little sense of  humor, especially  where tables,
porridge and music are concerned.

     So Daddy Bear told Goldilocks Computer to stop
interfering with his tables,  and he showed  her his long
black claws just to make sure she understood. Mummy
Bear got out  her old recipe book, threw away  the mess
of basic porridge, and let Goldilocks see just the tips of
her sharp claws. When little  Mathematics Bear stopped
his practice  and showed Goldilocks his claws, she was
surprised to see how long and sharp they were and went
to the corner and cried. Soon afterwards her  beautiful
golden hair  turned grey. Because  she  could not make
tables, cook  porridge or play  tin whistles, there was
nothing left  for her to do  but the washing up. And if
you have ever washed  up after a party of porridge eaters,
you know how she felt.

     Why  use a moral  story at a serious  conference?
Well, the  first reason  is to draw the moral.  To stress it,
perhaps even  to labor  the point, the moral is: Ask to see
bears' claws before telling them what their problems are.
That moral  is  commended  to  all the computer  men,
analysts and  so on present. Engineers and managers are
also urged to keep it in mind.

     The lesson is that it matters at least as much how
you use the computer as what you use it for.  The con-
ference was  not intended  to promote computer  use.
What it has done is to bring together many diverse inter-
ests so that each of them can hear what the others have
to say about how they do things.

-------
                              STORET
WATER QUALITY (EFFECT):                 USE; CRITERIA OR STANDARDS; WATER QUALITY
GENERAL POINT SOURCE (CAUSE, ACTION):   DISCHARGE INVENTORY (CAUSE); IMPLEMENTATION PLANS (ACTION)

                                         MUNICIPAL WASTE INVENTORY
                                         INDUSTRIAL WASTE INVENTORY
                                             PERMIT FILES
                                                MICS
                                                SHORT FORMS
                                                LONG FORMS
                                                SELF  MONITORING
                                                RAPP
                                             VOLUNTARY
                                         CONTRACT AWARDS
                                         CITY MASTER FILE
                                         POLLUTION CAUSED FISH KILLS
                                         FEED LOT/FISH FARM (AGRICULTURE)
                                         MINE DRAINAGE
                                         DEEP WELL INJECTION SURVEY
                                         MUNICIPAL DRINKING WATER SUPPLY
                                         NEEDS SURVEY
                                         OCEAN DUMPING
                               Figure 1

-------
                              STORET

Action (Federal, State, Local)
    Effluent standard criteria established
    Effluent quality measurement
    Voluntary industrial waste inventory
    RAPP
    FPC survey
    Influent quality measurement
    Voluntary industrial waste inventory
    RAPP
    FPC survey
    Enforcement
    180 day notice
    Conference
    Court order
    Discharge permit
    Waste treatment facility construction
    Preliminary plan submission
    Preliminary plan approval
    Final plan submission
    Final plan approval
    Financing complete
    Site acquired
    Grant award
    Contract award
    Begin construction
    End construction and start discharge
    Operational level attained
    Waste treatment facility operation
    Training
    Technological advancement
    WC measurement
    WC standards established
    Use stopped

Use
    SIC
    Agriculture, forestry, fisheries
    Mining
    Contract construction
    Manufacturing
    Transportation, communication, electric, gas, and sanitary services
    Wholesale and retail trade
    Finance, insurance, and real estate
    Services
    Government
    Organization
    Plant
    Discharge
    Life support

Parameter
    Physical
    Chemical
    Biological
    Microbiological
    Waste treatment facilities (e.g.)
    Primary
    Intermediate
    BOD Removal
    Secondary
    BOD Removal
    Tertiary
    Phosphate Removal
    Disinfection
    Sludge processing
    Sludge disposal
    Outfall
    Deep ocean outfall
    Sanitary interceptor sewer
    Combined interceptor sewer
    Pumping station
    Force main
    Combined sewer overflow
    Economic
    Men
    Dollars
    Numeric

Location
    State
    County
    Cong. Dist.
    City
    RMI
    Lat./Long.

Time
    Planned (year, month, day, hour, minute, second)
    Executed (year, month, day, hour, minute, second)
    Map observation time

Figure 2

-------
(Plot residue: horizontal axis in years, 1964 through 1975; vertical axis ticks from 40 through 160.)
-------
(Table residue: yearly rows with TOTAL and ACTIVE STATIONS columns; values illegible.)

-------

(Table residue: per-station listing with STATION, DATA, and DATE columns; values illegible.)

-------
                                                                           Figure 6
-------

-------
                                           OPSF FILE CONTENTS
Table of facilities in file, with percent-increase and percent-in-versus-expected columns (values illegible in the source scan). The final column is obtained by subtracting the earlier percent-in-versus-expected figure from the later one.
                                                                                                               oo

-------
Station inventory listing (September 1974): stations with data, listed by station location, agency, station number, and type code, with parameter-group columns. Entries 1 through 50 are illegible in the source scan.

-------
Station inventory listing, continued: latitude, longitude, USGS 7.5-minute map name, and EPA major and minor basin codes for entries 1 through 50. Entries are illegible in the source scan.

-------
       The plot on this page was prepared by a digital plotter.
       Only DO data collected in the summer for the Hudson R. is
       plotted.  The two extreme lines with symbols show 15th
       and 85th percentiles.  The center line shows the median
       or 50th percentile value.  For additional examples of this
       program, see Section IV.
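The three curves described above can be sketched as a per-station percentile summary. This is a minimal illustration only: the station names and readings are hypothetical, and the interpolation rule (Python's "inclusive" method) is an assumption, since the source does not say how the plotting program computes its percentiles.

```python
from statistics import quantiles

def station_percentiles(readings_by_station):
    """Return {station: (p15, p50, p85)} for each station's DO readings."""
    summary = {}
    for station, values in readings_by_station.items():
        # quantiles(n=100) returns the 99 cut points for percentiles 1..99,
        # so index 14 is the 15th percentile, 49 the median, 84 the 85th.
        cuts = quantiles(values, n=100, method="inclusive")
        summary[station] = (cuts[14], cuts[49], cuts[84])
    return summary

# Hypothetical readings (mg/l) for two made-up stations.
readings = {
    "STATION-A": [6.1, 7.4, 8.0, 8.2, 9.5],
    "STATION-B": [4.0, 4.8, 5.5, 6.3, 7.1, 7.9],
}
for station, (low, median, high) in station_percentiles(readings).items():
    print(station, round(low, 2), round(median, 2), round(high, 2))
```

Plotting the first and third numbers against station position gives the two outer lines, and the middle number gives the median line.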
-------
   Figure 2.  Sample Outputs From Multiple Station Plot Program.
              The first plot, scaled in miles on the horizontal axis,
              shows DO values in the Saginaw R. from its mouth to
              confluence with the Cass River and then upstream on
              the Cass R. for several miles.  All symbols shown on
              the plot are medians (50th percentiles); lines were
              added manually for a quicker visual understanding of
              the data.  Plots on the next pages show seasonal
              grouping of DO data for the Hudson R.
                                              Figure 12

-------
                Plot on this page shows seasonal grouping of DO data taken
                since 1965.  Stations numbered sequentially on the horizontal
                axis reference agency and station numbers shown on another
                page of output.  Stations are in the correct river mile
                order, and, in this example, downstream (towards New York City)
                is to the right.
-------
SAMPLE PAGES OF OUTPUT:
                                          For stations located on the Mississippi R. around
                                          Minneapolis, plot [parameter illegible] as a function of
                                          station for data taken since [date illegible].
                  MULTIPLE STATION PLOT, PLOT NO. 1, horizontal axis in STATIONS
                  (Minneapolis-St. Paul STP)
-------
SAMPLE PAGE OF OUTPUT:
                                          For stations located on the Brazos R., plot
                                          [remainder of caption illegible in the source scan].
                  MULTIPLE STATION PLOT, PLOT NO. 1
                                                              Figure 15
Station locations legible in the plot legend include Brazos R. at Richmond, Tex.; Brazos R. near College Station; Brazos R. at Waco, Tex.; and Brazos R. at Seymour, Tex.

-------
     SAMPLE PAGE OF OUTPUT:
                                      For selected stations in Ft. Loudoun Reservoir,
                                      Tenn., and upstream and downstream points, plot
                                      DO since 1970.  Group DO measurements taken above
                                      10.1 feet and measurements below 19.1 feet.  Use
                                      linear [illegible] scaling.
   MULTIPLE STATION PLOT (MSP), PLOT NO. 1, horizontal axis in MILES (Holston R.)
                                                    Figure 16

-------
Sample retrieval for the Saginaw and Cass River basin. A digital plot shows polygon boundaries and station locations at a scale of 1 to 250,000, with distances in miles. The accompanying output page lists latitude, longitude, and location for each station; legible entries include Cass River at Frankenmuth, Mich., and Cass R. at Vassar, Mich.
                                                   Figure 17
                                                  Continued

-------
CALL FLOWER (X,Y,R,A)                       Load module on
                                         BCS011.TOBIN.LOAD.SUBLIB

X is x-coordinate, in inches, of center of large circle
Y is y-coordinate, in inches, of center of large circle
R is radius of large circle, in inches (1" used above)
A is angular resolution, in radians (.05 used above)
All arguments are single-precision floating point.
                               Figure 18

-------
             Figure 19
Base Map of Continental United States
               Figure 20
         Outline Map of Florida

-------
                                                      Figure 21
                                             County Overlay for Florida
                                                     Figure 22
                                          Hydrological Features for Florida

-------
NO. STATIONS = 789
                                       Figure 23
                        Zoom-Derived Map of Southern Florida with
                                     Geographic Grid

                                       Figure 24
                            Water Quality Stations in Michigan

-------
                                                     Figure 25
                                     Graphical Retrieval of Dissolved Oxygen Data
                                                    Figure 26
                                      Data-Zoom Graph for Dissolved Oxygen

-------
                                  Figure 27
                       Example of Standards Zones Display
                                 Figure 28
                      Example of Perspective Data Display

-------
                                              Figure 29
                               Graphical Output from AUTOMAT Subsystem
                                               Figure 30
                              Graphical Output from Municipal Waste Inventory

-------
                           A METHOD FOR IMPROVING USER ACCESS TO STORET

                                           By Michael J. Friedland
 INTRODUCTION

      Several projects at the National Environmental
 Research Center in Las Vegas, Nevada (NERC-LV),
 require that collected data be stored in the STORET
 information system. There is also a need to perform
 statistical analysis of the data using packages such as
 SAS, BIOMED, etc. User personnel have limited data
 processing experience (a maximum of one semester of
 FORTRAN), and no experience (or interest) in IBM Job
 Control Language (JCL). However, they are willing to
 learn a minimum of STORET and WYLBUR commands.
      In order to make the STORET-resident data
 available for manipulation, a FORTRAN program has
 been written to read a MORE file and write the results
 back out in card-image format; a subsequent step puts
 the file into EDIT format so the WYLBUR user can
 access and save it easily. The user then builds the JCL
 required to run SAS around the data set and submits it
 with the RUN command. The details of this procedure are
 illustrated in the figures.

 Figure 1

      The data set RUN.LAKE.READ contains the JCL
 and data needed to perform a STORET retrieval, run the
 [text missing in source] data set name) and, before
 RUNning it, SAVE it under the name of the output data
 set (in this case, [name missing in source]) to validate
 the DISP=OLD clause on the [...]EDSCARDS statement.
 The only other item the user must attend to is the
 control input to the reformatting program; that is, the
 first two columns of the one input card must contain
 the number of parameters retrieved.
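
The control card described above can be sketched as follows. This is an illustration only: the helper name `control_card` is hypothetical, and right-justifying the count in a two-column field is an assumption based on how a FORTRAN I2 field is normally read; the source states only that the first two columns hold the number of parameters.

```python
def control_card(n_params: int) -> str:
    """Build an 80-column card image with the parameter count in columns 1-2."""
    if not 1 <= n_params <= 99:
        raise ValueError("parameter count must fit in two columns (1-99)")
    # ASSUMPTION: the count is right-justified in columns 1-2 (I2 style);
    # the remaining 78 columns are left blank.
    return f"{n_params:2d}".ljust(80)

card = control_card(6)
print(repr(card[:2]), len(card))
```

A retrieval of six parameters would then be accompanied by a card whose first two columns read " 6".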

Figure 2

     The reformatting program produces the data set
shown in this figure. The data set has the following
attributes:

         Card image

         Six values per card

         Null values are indicated by -123.000

         Identification information (station-date-time-
         depth) appears in columns 61-79. A one-digit
         sequence number is in column 80.
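
A reader for one such card can be sketched from the attributes listed above. The width of each value field is an assumption (ten columns, obtained by dividing the 60 data columns evenly among six values), and the identification string in the example is hypothetical, since the source does not give the layout within columns 61-79.

```python
NULL_VALUE = -123.0      # null indicator stated in the text
VALUES_PER_CARD = 6      # stated in the text
FIELD_WIDTH = 10         # ASSUMPTION: 60 data columns / 6 values

def parse_card(card):
    """Split one 80-column card image into (values, ident, seq)."""
    card = card.ljust(80)
    values = []
    for i in range(VALUES_PER_CARD):
        field = card[i * FIELD_WIDTH:(i + 1) * FIELD_WIDTH].strip()
        if not field:
            values.append(None)          # field already blanked
            continue
        number = float(field)
        values.append(None if number == NULL_VALUE else number)
    ident = card[60:79]   # columns 61-79: station-date-time-depth
    seq = card[79]        # column 80: one-digit sequence number
    return values, ident, seq

# Hypothetical card: six values, a made-up 19-character ID, sequence 1.
data = "".join(f"{v:10.3f}" for v in (0.05, -123.0, 23.9, -123.0, 0.01, 4.0))
sample = data + "STA0001720601120005" + "1"
print(parse_card(sample))
```

Nulls come back as `None`, so a later statistics step can skip them without mistaking -123.000 for a measurement.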

     The SAS user typically performs a

     CHANGE '-123.000' TO '        ' IN ALL NOLIST

command to convert the null value indicator to a blank
field, as is most commonly required by SAS.
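
Outside WYLBUR, the same substitution can be sketched in a few lines. This is an illustration under one assumption: the null indicator is replaced by blanks of equal width, so that every other field stays in its original card column. The sample line is hypothetical.

```python
NULL_TEXT = "-123.000"

def blank_nulls(cards):
    """Replace every null indicator with blanks of the same width."""
    blank = " " * len(NULL_TEXT)   # same width keeps the card columns aligned
    return [card.replace(NULL_TEXT, blank) for card in cards]

# Hypothetical fragment of one card: value, null, value.
cards = ["     0.050  -123.000    23.900"]
cleaned = blank_nulls(cards)
print(repr(cleaned[0]))
```

Like the editor command applied to all lines, this replaces the string wherever it occurs, so it relies on -123.000 never appearing as a genuine value.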

Figure 3

     The mainline program used in this procedure is
shown here. It initializes subroutine GETDAT, the core
of this procedure, and performs the output operation.

Figure 4

     A listing of subroutine GETDAT is presented here.
This subroutine has proved general enough to be used by
several different applications that read STORET MORE
files. The commentary at the top of the routine is
intended to communicate all necessary information to the
user writing a mainline to call this subprogram.

-------
  1.      //rtNSWY  JOH  (ABO'S, HIN, 01 .30. 30) , • 5 TW- TO-W YL • »MSGLE VELs ( 0 . 0 >
  ?.      //  E-XtC  WOOIST
  3.      //OIST.CAKJFO UD *
 S.     P  610  OR  3»P  3??17,P 10*^ 77»
 6.     P  f>65»')R  4»P  67NP 630, UP 3.P 300, H 72025*
 7.     P
 8.     S
 9.     S=550000»S=5b9999«
10.     BU=720601.EO = 720731 .
11.     PPT=NO,
12.     HEAD=WET111.
13.     // t'XEC SnSIN.NAMf = «CNAHOb,^NS.LAKE.«EAO.bOUKCE'«OISK*TS0004,
13. S    // INPUT='&INP1«
14.     // EXFC
U.S    // INPUT=»MNP2«
IS.     // EXEC FORTGCLG.PARM=tNOSOUWCE,NOMflpi
15. S    //FORT.SYSIN  DO
16.     //  DO DSN = UNP
18.     //GO.FT15FOOI DO USN=\FCF ,DI SP= (OLD. DELETE )
1«.     //GO.FT12F001 DD DbN=
-------
[Figure: sample of the card-image data set. Each card carries six values; null values appear as -123.000; the station, date, time and depth identification appears in columns 61-79 and a one digit sequence number in column 80. The individual columns are not recoverable from the source scan.]
-------
C    MAINLINE PROGRAM TO READ STORET MORE FILE AND OUTPUT IT IN
C    CARD IMAGE FORM.
      DIMENSION KAHU(21)
      COMMON/GVALUE/VALUE(50)
      COMMON/GPARM /NOFIL,NUMRET,MORE,INFIL,IDNUL,VALNUL
      COMMON /GCODE/NSTA(16),NDATE(3)
     X    ,NTIME(2)
      COMMON /GDEEP/DEPTH
      LAKFIL=12
C   MORE=4 TO GET DEPTH
      MORE=4
C
C CALL SETNUM TO FIND NUMBER OF PARAMETERS RETRIEVED, AND IF
C   MORE = 3 RETRIEVAL WAS RUN.
C
      CALL SETNUM(NUMRET,MORE3)
C   CALCULATE NUMBER OF LINES PER SAMPLE
      LINES=1+NUMRET/6
      IF(MOD(NUMRET,6).EQ.0)LINES=LINES-1
      KASES=0
      IF(MORE3.EQ.1
   50 CONTINUE
      CALL GETDAT
      IF(NOFIL.NE.0)GO TO 200
      KASES=KASES+1
C
C WRITE 6 VALUES/LINE, F10.3  WITH STATION-DATE-TIME-DEPTH-SEQ IN 61-80
C
      IDEEP=DEPTH
      J1=1
      DO 40 I=1,LINES
      J2=J1+5
      WRITE(LAKFIL,65) (VALUE
-------
C SUBROUTINE IS CALLED TO EXTRACT DATA FROM STORET 'MORE' FILE
C    AND RETURN DATA VALUES VIA LABELLED COMMON. VARIOUS STATUS
C    PARAMETERS ARE AVAILABLE; FUNCTIONS ARE EXPLAINED BELOW.
C
C THE COMMON BLOCK NAMES, SIZES AND USES ARE:
C
C 1.GVALUE -- 50 WORDS,FLT.PT.; DATA VALUES
C 2.GNULL  -- 50 WORDS,INTEGER; NULL VALUE INDICATORS
C 3.GCODE  -- 21 WORDS,INTEGER; STATION ID(16),DATE(3),TIME(2)
C 4.GKNTR  -- 52 WORDS,INTEGER; COUNTS OF STATIONS, RECORDS, AND NULLS
C 5.GPARM  --  6 WORDS,5 INT,1 FL.PT.; EOF FLAG, OTHER PARAMS
C 6.GFORM  -- 11 WORDS,INTEGER; PROVIDES FOR ALTERNATE STATION ID FMT
C 7.GECHO  --  1 WORD,FLT.PT.,PRESET TO ZERO; IF SET TO 1,CAUSES
C      GETDAT TO ECHO EACH LINE READ
C 8.GDEEP  --  1 WORD FLT.PT., DEPTH. CHARS 300-304 OF MORE=4 DATA
C       RECORD.
C
C THE COMMON BLOCK VARIABLES' FUNCTIONS AND VALUES FOLLOW
C
C VALUE(I) CONTAINS THE QUANTITY FOR THE I-TH PARAMETER RETRIEVED.
C    IF THE VALUE IS NULL, VALUE(I) HAS 1.E-15. THE USER MAY
C    CHOOSE A DIFFERENT DEFAULT BY SETTING VALNUL, THE 6TH (FL.PT.)
C    WORD OF BLOCK GPARM, TO THE DESIRED QUANTITY. THE ARRAY VALUE
C    IS SET BY EACH CALL TO GETDAT; IT IS NOT PRESET.
C
C NULL(I) IS SET TO 10 IF VALUE(I) IS NULL. THE USER MAY CHOOSE A
C    DIFFERENT INDICATOR VALUE BY SETTING IDNUL, THE 5TH WORD(INTGR)
C    OF BLOCK GPARM, TO THE DESIRED VALUE. THE ARRAY NULL IS SET
C    BY EACH CALL TO GETDAT; IT IS NOT PRESET.
C
C THE STATION ID IS RETURNED IN THE FIRST 3 WORDS OF THE ARRAY NSCODE
C    IN THE FORMAT A4,A3,A2. THE USER MAY SUPPLY AN ALTERNATE FORMAT
C    AS DESCRIBED UNDER BLOCK GCFORM.
C
C THE DATE IS READ AS 3I2 (YEAR,MONTH,DAY).THE TIME IS 2I2 (HOUR,MIN)
C
C THE VARIABLE KNTSTA IS INCREMENTED WHEN A NEW STATION IS ENCOUNTERED
C    IT IS PRESET TO ZERO.
C
C THE VARIABLE KNTREC IS INCREMENTED WHEN A LINE OF DATA,NOT END-OF-
C    -FILE, IS READ. IT IS PRESET TO ZERO.
C
C THE VARIABLE KNTNUL(I) IS INCREMENTED WHEN VALUE(I) IS NULL.
C    THE ARRAY IS PRESET TO ZERO.
C
C  -VARIABLE-   ---USAGE---                       -DEFAULT-  -SET BY-
C     NOMORE     1 IF EOF                           ZERO      GETDAT
-------
C     NUMRET     NO. OF PARAMS IN RETRIEVAL         10        USER
C     MORE       TYPE OF MORE FILE (3 OR 4)         3         USER
C     INFIL      MORE FILE FORTRAN UNIT NO.         15        USER
C                MUST CORRESP WITH JCL FOR &FCF--SEE JCL BELOW
C     IDNUL      VALUE FOR NULL INDICATOR           10        USER
C     VALNUL     VALUE FOR NULL QUANTITY            1.E-15    USER
C
C SOME VARIABLES ALLOW THE USER TO GET THE STATION ID IN A FORMAT
C OTHER THAN IN THE FOLLOWING DEFAULT FORMAT:
C NSCODE(1) AND NSCODE(2) HAVE THE PRIMARY STATION NUMBER
C IN 2A4 FORMAT, 91X IN FROM THE BEGINNING OF THE FIRST STATION
C RECORD. THE SECONDARY NUMBER IS READ IN IN A4,A3,A2 FORMAT,
C AFTER SKIPPING 34 SPACES.
C KODSIZ IS THE NUMBER OF WORDS DESIRED, AND MAY BE AS MUCH AS 16.
C THE KFMT ARRAY MUST CONTAIN THE FORMAT SPECIFICATIONS DESIRED.
C
C ** JCL *** AND MINIMUM COMPLEXITY USAGE OF GETDAT
C
C //MJFKT JOB (LV03,J5,1.1),'USE-GETDAT',MSGLEVEL=(0,0)
C //RETREV EXEC WQDIST
C //DIST.CARDFD DD *
C PGM=RET,A=1119UHPF,MORE=3,RMT=35,
C B,S=002000130,S=PMN005,
C P=85520,P=65514,BD=720101,ED=721231,
C PRT=NO,
C /*
C //USE EXEC FORTGCLG
C //FORT.SYSIN DD *
C C THIS USER PROGRAM USES GETDAT
C C
C       COMMON /GVALUE/VALUE(50)
C       COMMON /GPARM /NOFIL,NUMRET
C C FIRST CALL PRECEDED BY SETTING NUMBER OF PARAMS RETRIEVED
C       NUMRET=2
C     5 CALL GETDAT
C       IF(NOFIL.NE.0)GO TO 90
C       PRINT 7,
-------
      COMMON
      COMMON /GFORM/KODSIZ,KFMT(10)
      COMMON /GECHO/ ECHO
      COMMON /GDEEP/DEPTH
      DATA KALL01/0/
      DATA VLNUL/Z2MBBEDO/
C  TEST-FLAG FOR DIAGNOSTIC OUTPUT
      KTEST=0
      KTEST=1
C   IF FIRST CALL DONE, GO GET DATA
      IF(KALL01.NE.0)GO TO 400
      KALL01=1
C           CHECK PARAMETERS THAT COULD ALLOW USER SELF-DAMAGE
      IF(NUMRET.GT.10.AND.MORE.NE.
-------
  302 CONTINUE
      READ(INFIL,KFMT,END=699,ERR=799)(NSCODE(IJ),IJ=1,KODSIZ)
      IF(ECHO.EQ.1)PRINT 305,(NSCODE(IJ),IJ=1,KODSIZ)
      DO 303 I=2,9
      READ(INFIL,304,END=599,ERR=799)
     X (STNID(I,J),J=1,10)
      IF(ECHO.EQ.1)PRINT 305,(STNID(I,J),J=1,10)
  303 CONTINUE
      KNTSTA=KNTSTA+1
  304 FORMAT(90X,10A4)
  305 FORMAT(' ECHO 3:',10A4)
  400 CONTINUE
C
C IF MORE =4 READ ALL 50 VALUE ITEMS AND DEPTH. OTHERWISE,
C JUST READ THE NUMBER OF PARAMS RETRIEVED, (NUMRET)
C
      IF(MORE.NE.4)GO TO 405
      READ(INFIL,410,END=699,ERR=799)NDATE,NTIME,VALUE,DEPTH
  410 FORMAT(25X,5I2,50A4,64X,F5.0)
      GO TO 415
  405 CONTINUE
      READ(INFIL,410,END=699,ERR=799)NDATE,NTIME,(VALUE(I),I=1,NUMRET)
  415 CONTINUE
      NUMREC=NUMREC+1
      IF(ECHO.EQ.1)PRINT 410,NDATE,NTIME,(VALUE(I),I=1,NUMRET)
  403 FORMAT('0ECHO 4:')
      IF(ECHO.EQ.1.AND.MORE.EQ.4)PRINT 411,DEPTH
  411 FORMAT(1X,'DEPTH:',F10.3)
      IF(NDATE(1).EQ.99)GO TO 302
C
C ADJUST VALUES
C
      DO 433 I=1,NUMRET
      NULL(I)=0
      IF(VALUE(I).NE.VLNUL)GO TO 433
      VALUE(I)=VALNUL
      NULL(I)=IDNUL
      KNTNUL(I)=KNTNUL(I)+1
  433 CONTINUE
C
C UP RECORD COUNT AND RETURN
C
      KNTREC=KNTREC+1
      RETURN
C
C EOF DURING PARAMETER ID RECORDS
C
  599 NOFIL=2
      PRINT 600,KNTSTA,KNTREC
      STOP 599
  600 FORMAT('1EOF DURING PARAM OR STATION ID READING, STA/REC=',2I7)
C
C EOF DURING DATA RECORD
C
  699 NOMORE=1
      RETURN
C
C ERROR DURING READING
C
  799 NOMORE=2
      PRINT 800,KNTSTA,KNTREC
                                              Figure 4 (Continued)

-------
      STOP 799
  800 FORMAT(' ERROR DURING READING')
      END
      BLOCK DATA
      COMMON /GNULL/NULL(50)
      COMMON /GKNTR/KNTSTA,KNTREC,KNTNUL(50)
      COMMON /GPARM/ NOMORE,NUMRET,MORE,INFIL,IDNUL,VALNUL
      COMMON /GFORM/KODSIZ,KFMT(10)
      COMMON /GECHO/ ECHO
      DATA KNTSTA,KNTREC,KNTNUL/ 52*0/
      DATA NOMORE,NUMRET,MORE,INFIL,IDNUL,VALNUL/ 0,10,3,15,10,1.E-15/
      DATA KODSIZ,KFMT/5,4H(91X,4H,2A4,
-------
                               TWO DATA STORAGE PROGRAMS FOR STORET

                                             By Kenneth V. Byram
  SCAN-A PREEDITING PROGRAM  FOR  STORET
         DATA

  Purpose

      This program and its two cataloged procedures
  were written  to  enable  a STORET "storer" to make
  arithmetic checks on his  data before it is actually stored
  in the STORET data base. The program also produces a
  fairly tabular listing of all of the data. It is hoped that by
  using  this  procedure  before  entry of  data into  the
  system, a storer can eliminate some of the more obvious
  kinds of errors, especially those that are caused by cleri-
  cal  mistakes stemming from the  nontabular  forms of
  most STORET water quality input. A pH value of 75,
  for  example,  is particularly embarrassing  and obvious
  when it appears neatly tabulated on a STORET retrieval;
  such an occurrence casts  doubt on all of the user's data.
  It is difficult  to eliminate those kinds of errors, partic-
  ularly  if the data  has originated on a long string of DIP
  input, where the 75 is adjacent to parameters where 75
  is a  legitimate value, or from a STORET standard format
  card with the value coded as "00400750012."

  What It Does

      This program,  for  one thing,  produces tabular
  listings of data, so that a  pH of 75 should be  somewhat
  more obvious. Under user control, it also makes "stan-
  dards" type checks on the data, checking against a user-
  input  set  of maximum  and  minimum values  by
  parameter. For example,  a minimum of 2  and a maxi-
  mum of  12 might  be  coded for  pH. Also under user
  control,  it makes  interparameter  comparisons. A mea-
  surement  of total  rivets, for example,  must logically
  always be greater than, or equal  to, a measurement of
  dissolved  rivets on  the same sample. This comparison
  would flag an occurrence of the values 10 for total
  rivets, and 15 for dissolved rivets. The program additionally
  makes checks for parameters which are duplicated within the
  input for that run. It will not detect duplications of
  data previously stored in the STORET
  system.
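
The three kinds of check described above can be illustrated with a short Python sketch. It is not the SCAN program: the data structures and function names are assumptions, parameter code 400 is used for pH, and codes 1 and 2 stand in as hypothetical codes for total and dissolved rivets.

```python
def limit_errors(sample, limits):
    """Codes whose value lies outside the user-supplied (minimum, maximum)."""
    return [code for code, (low, high) in limits.items()
            if code in sample and not (low <= sample[code] <= high)]


def comparison_errors(sample, pairs):
    """Pairs (first, second) where the first value is less than the second."""
    return [(a, b) for a, b in pairs
            if a in sample and b in sample and sample[a] < sample[b]]


def duplicate_codes(entries):
    """Parameter codes that occur more than once in one sample's input."""
    seen, dups = set(), []
    for code, _value in entries:
        if code in seen and code not in dups:
            dups.append(code)
        seen.add(code)
    return dups


if __name__ == "__main__":
    entries = [(400, 75.0), (1, 10.0), (2, 15.0), (400, 7.5)]
    sample = dict(entries[:3])
    print(limit_errors(sample, {400: (2.0, 12.0)}))
    print(comparison_errors(sample, [(1, 2)]))
    print(duplicate_codes(entries))
```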
 What You Provide as Input to the Scan

     To Get the Tabulation - No input need be provided.
 The program will always tabulate whatever parameters
 are there.

     To Make a Within Limits Comparison - The input
 provided is a parameter code, the minimum value for
 that parameter, and the maximum value. The parameter
 code goes in columns 1-5, right justified. The minimum
 goes in columns 11-20, with decimal point explicit, with
 the  maximum  similarly  coded  in  columns 21-30.
 Columns 6-10 should be blank.  (A maximum of 100
 such checks may be input.)

     To Make an Interparameter Comparison - The input
 provided is two parameter codes, and the resultant test is
 that the value for the first is greater than, or equal to,
 the value for the  second. An error message is printed if
 the first value is less  than the second. The first parameter
 code goes right justified in columns 1-5, and the second
 in columns 6-10. Other columns are not read by the pro-
 gram.  (A maximum of 100 such checks may be input.
 These are in addition to the 100 above.)
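
The two card layouts can be read with a few lines of Python, given here as a sketch under one stated assumption: a card with columns 6-10 blank is taken to be a limits card, and a card with a code in columns 6-10 is taken to be an interparameter comparison card. The text does not say how the program itself tells them apart, and the function name is hypothetical.

```python
def parse_scan_card(card: str):
    """Classify one scan card and pull out its fields by column position."""
    card = card.ljust(80)
    first = int(card[0:5])                 # columns 1-5, right justified
    second = card[5:10].strip()            # columns 6-10
    if second:
        return ("compare", first, int(second))
    minimum = float(card[10:20])           # columns 11-20, explicit decimal
    maximum = float(card[20:30])           # columns 21-30
    return ("limits", first, minimum, maximum)


if __name__ == "__main__":
    print(parse_scan_card("  400            2.0      12.0"))
    print(parse_scan_card("    1    2"))
```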

     To Sort or Not to Sort - There are two cataloged
 procedures, one that leaves the input data as is, and the
 other that sorts it by agency, station, date and time. In a
 sense,  the sort is  a disadvantage because it reorders the
 input,  listing errors detected by the program in an order
 different from  that  in the input deck. The sort is an
 advantage in viewing the tabulation, particularly if the
 input is in somewhat random order. It is also  advan-
 tageous if there  are  possibilities for  interparameter
 comparisons, or if  there  are duplicate  parameters at
 different places in the input deck. Without the sort, the
 comparison would not be made.

 Using the Program

     The program executes a simulated STORET storage
 run, bypassing the data sets that are accumulated for
weekly  mass  storage  runs.  Thus,  normal errors of

-------
 "station not presently in system," "invalid parameter,"
 etc., will be printed out. The program uses 190K of core.
 The deck setup is:

      // (job card)
      // EXEC SCANWQJ (or SCANWQJS, for the sort)
      // DIST.CARDF DD *
         followed by a standard STORET storage deck:
            7START
            701,02, 03 or 04... etc.
      // SCAN.SYSIN DD  *

 Scan program cards are described above. Cards are
 not necessary if no limits or interparameter comparison
 checks are to be made. Limit and interparameter comparison
 cards may be intermixed.

      Notice that running either procedure will not store
 the data in STORET. One would usually run a scan,
 correct the errors, run another scan to check the corrections,
 and then run a STWQJ to actually store the data.

 Modifying the Program

      For those interested in modifying the FORTRAN
 program to add checks of their own, the source deck is
 available on &CNA907.KVB.SCAN on TS0004. Once
 the modification is ready, JCL for making the modified
 STORET input run, compiling the program, and running
 it is available on &CNA907.KVB.SCAN.JCL. The sorted
 version is &CNA907.KVB.SCAN.JCLS. The existing
 FORTRAN program uses an assembler language subroutine
 which will be assembled in the run. The JCL
 must be modified to reference the program, the
 STORET input, and the program input data.

      For more information, call Ken Byram, at FTS
 503-752-4385, or commercial at 503-752-4211 ext 385.
 This writeup will be maintained and modified if program
 changes are made. It is available on
 &CNA907.KVB.SCAN.DOCUMENT.

 SUBROUTINE STORE

      STORE is a FORTRAN subroutine which prepares
 a complete card-image deck for data storage in STORET,
 including header cards, agency card and data cards, when
 called from a calculation program.

    Most STORET users input their water quality data
with either the DIP format or the "standard" (five
parameters per card) format. Neither of these formats is
particularly easy to produce from a running FORTRAN
program, and the user who wishes to use a program to
compute water quality data from intermediate factors
has the option of recoding the data after it is listed by
the program or manipulating his FORTRAN to produce
the required output. This subroutine allows the programmer
to put data into STORET card format directly from
his program by making calls to the subroutine, which
then does all necessary left zero filling and data formatting.
The deck produced by the subroutine contains the
necessary header cards, so with addition of the proper
JCL, the deck may be entered in the same job. As an
example, a user new to STORET may want to calculate
and enter chlorophyll data. The data  manipulation
requires 6 absorbance readings and a series of formulae which
check for unrealistic volumes, calculate concentrations,
round them off, and check for precision. Without having
to learn all the restrictions of the STORET system, the
user can write his program, create entry records, add the
JCL, and enter data into STORET.

HOW TO USE STORE

    There are five calls made from the main program to
store data in the STORET storage deck.

1.  Call AGENCY (I)

    This call  must be made at the beginning of the
program. It writes the header cards and copies the infor-
mation for the agency card from an array  to the storage
deck.  This  call is made only once and must  be made
before any other calls to the subroutine.

    (I) is the integer array name into which the agency
card has been copied. The expected format is (20A4).

2.  Call STOOBS (N,S,ID,IDATE,ITIME)

    This call  gives the subroutine the station, date, and
time for the entry of any  values  following the  call. It is
called once for each time  the  station,  date,  or  time
changes.

    (N) is the number of characters in the station code,
up to the maximum allowed, 15 characters.

    (S) is the array in which the station code is stored.
It may be from 1  to 4 words, depending on the number
of characters specified.

-------
       ID is the depth. It is an integer value used as a
   sample locating parameter. If it is negative, depth will be
   considered as not present as a sample locating parameter.

       IDATE is the 6 digit integer date in the form
   YYMMDD where YY is  the year, MM is the month, and
   DD is the day of the month.

       ITIME is  the 4 digit integer time, using a 24 hour
   clock.  If ITIME is negative, it will be considered as not
   recorded.
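
A small Python sketch of the date and time conventions follows. It is illustrative only: the text gives the YYMMDD and 24 hour forms and mentions left zero filling, but how an unrecorded time is written on the card is not stated, so the four blanks used here are an assumption, as is the function name.

```python
def date_time_fields(idate: int, itime: int):
    """Return zero-filled (date, time) text fields for one observation."""
    month, day = (idate // 100) % 100, idate % 100
    if not (0 <= idate <= 991231 and 1 <= month <= 12 and 1 <= day <= 31):
        raise ValueError(f"not a YYMMDD date: {idate}")
    date_field = f"{idate:06d}"                            # left zero filled
    time_field = "    " if itime < 0 else f"{itime:04d}"   # blank: assumed
    return date_field, time_field


if __name__ == "__main__":
    print(date_time_fields(720601, 930))
    print(date_time_fields(50102, -1))
```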

   3.   Call STOCDE (IP,V,R)

       This call transfers to the subroutine the data which
   is to be stored. The previous STOOBS call provides the
   station, date, and time for storage. When five parameters
   have been stored, it  dumps the card into the STORET
   storage deck.

       (IP) is the STORET parameter code. It is an integer
   value, which the subroutine provides with any necessary
   leading zeros.

       (V) is the  floating point value to be  stored for the
   specified  parameter code.  No more than 4 significant
   digits are accepted by the  subroutine; so  if there are
   more than 4, the subroutine rounds the value off. Values
   must be in the  range  E-10 to E+9. Zero is not allowed.
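
The rounding rule can be shown with a short Python sketch. It is a paraphrase rather than the subroutine's own arithmetic, and it assumes that the range E-10 to E+9 applies to the magnitude of the value; the function name is hypothetical.

```python
import math


def round_to_4_significant(value: float) -> float:
    """Round to 4 significant digits, rejecting zero and out-of-range values."""
    if value == 0:
        raise ValueError("zero is not allowed")
    magnitude = abs(value)
    if not 1e-10 <= magnitude <= 1e9:
        raise ValueError("value outside the range E-10 to E+9")
    exponent = math.floor(math.log10(magnitude))   # position of leading digit
    return round(value, 3 - exponent)


if __name__ == "__main__":
    print(round_to_4_significant(123456.0))
    print(round_to_4_significant(0.0123456))
```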

      (R) is the one character remark. A list of remark
   codes can be found on page 7-7 of the STORET manual.
   Usually this would be  a blank (1H  ); but, for example,
   if the user wanted to indicate that  an answer was an
   estimate, he would use 1HO.

   4.  Call STODMP

      Up to five  parameters may be stored on each stor-
  age card. STOCDE calls transfer the information on each
  parameter to the subroutine. When all parametric values
  have been stored for a given  station, STODMP is called.
  It must be called once for each STOOBS call. STODMP
  dumps any incomplete (less than 5 parameters) cards
  and reinitializes the subroutine for the next STOOBS
  call.
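
The interplay of STOOBS, STOCDE and STODMP amounts to a five-slot buffer, which the Python sketch below imitates. It is a model of the calling sequence only: the class name and its methods are hypothetical, and each "card" is kept as a tuple rather than in the real STORET card layout, which is not given in the text.

```python
class StorageDeck:
    """Collects up to five parameters per card, as the calls above describe."""

    def __init__(self):
        self.cards = []
        self._obs = None
        self._pending = []

    def stoobs(self, station, depth, idate, itime):
        self._obs = (station, depth, idate, itime)

    def stocde(self, code, value, remark=" "):
        self._pending.append((code, value, remark))
        if len(self._pending) == 5:        # a full card is written at once
            self._dump()

    def stodmp(self):
        if self._pending:                  # write any incomplete card
            self._dump()
        self._obs = None                   # ready for the next STOOBS call

    def _dump(self):
        self.cards.append((self._obs, tuple(self._pending)))
        self._pending = []


if __name__ == "__main__":
    deck = StorageDeck()
    deck.stoobs("PMN005", -1, 720601, 930)
    for code in range(1, 8):
        deck.stocde(code, float(code))
    deck.stodmp()
    print(len(deck.cards))
```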
  5.   Call STODEL (IP)

      STODEL creates  a record  to delete one parameter
  for a given station-date-time. As with the STOCDE call,
 the station, date, and time are provided by the previous
 STOOBS call.


     (IP) is the STORET  parameter code in integer
 form.

 RUNNING A PROGRAM USING STORE

     A simple way of using STORE is to copy it onto
 the end  of  your  program and  use  the  following
 WYLBUR commands:

      ?USE (your file) ON (your volume)
      ?COPY ALL TO END FROM &CNA907.KVB.STORE ON TS0004
      ?SAVE (your name) ON (your volume).

     When you execute the program, you could use the
 following JCL  to create a  SYSOUT DATA  set with
 STORET card images for checking:

     //job card
     // EXEC EDSIN,NAME=(your filename),DISK=
        (your volume)
     // EXEC FORTGCLD
     // FORT.SYSIN DD DSN=&INPUT,DISP=(OLD,
        DELETE)
     // GO.SYSIN
        your cards
     // GO.FT08F001  DD SYSOUT=A.

     If you want to make a STORET storage run directly
with your program, replace the //GO.FT08... card
with the following:

     // GO.FT08F001  DD DSN=&STORET, DISP=
        (NEW,PASS),UNIT=SYSDA,
     // SPACE=(TRK,(10,10), RLSE),DCB=(BLKSIZE=
        3520,LRECL=80,RECFM=FB)
     // EXEC STWQJ
     // DIST.CARDF DD DSN=&STORET,DISP=(OLD,
        DELETE).

     For  more information,  call Ken Byram,  at  FTS
503-752-4385, or commercial at 503-752-4211 ext 385.
This writeup will be maintained and  modified if program
changes are made. It is available on
&CNA907.KVB.STORE.DOCUMENT ON TS0004.

-------
                           THE EPA SCIENTIFIC APPLICATIONS SOFTWARE STUDY

                                            By Elijah L. Poole
BACKGROUND

     The Environmental Protection Agency (EPA) uses
several  computer facilities in  fulfilling  its functions.
These  facilities encompass minicomputers as well as
intermediate and large-scale digital computers. Many of
the small  computer systems are  special  purpose com-
puters that service a single laboratory. However, most of
the large computer  systems have on-line remote term-
inals and serve a large community of users. The scientific
applications software  systems  implemented  on these
computers are both numerous and varied. It is the large-
scale computers that this study is designed to cover.


     There have been  several  recent studies of  EPA
computing systems requirements  performed under con-
tract to EPA. One  of these, performed by Computer
Sciences Corporation (CSC), resulted in an inventory of
EPA information and  Administrative  Systems. It ex-
cluded, however, systems which were considered "purely
scientific."


     Another study, carried out by the General Electric
Corporation (GE), surveyed all ADP requirements in the
Agency  with a  view toward determining the need for
large-scale  computing centers.

     It was decided that a scientific applications soft-
ware study should be performed to help in fulfilling the
Agency's needs in  this area. So a task was negotiated
with Systems Architects, Incorporated to provide level
of effort support for ten months beginning September 9,
1974.

SCOPE

     This task requires that a survey be conducted of the
scientific software packages implemented on the major
computer systems utilized by EPA. The computer
systems to be included in the survey are specified below:
         Organization                      Computer Systems

Washington Metropolitan Area
  Optimum Systems Incorporated            IBM 370/155,158
  National Institutes of Health           IBM 370/165
                                          PDP-10

Research Triangle Park, N.C.
  NERC-RTP                                UNIVAC 1110
                                          IBM 360/50


METHODOLOGY

     The approach and basic functions of the study are
as follows:

     1.  Review  EPA requirements as documented in
the CSC and GE studies and as augmented by interviews
and other available documentation. Make recommenda-
tions  to  acquire  additional  scientific  applications
software, or  to  take any other measures that  would
improve the Agency's use of this type of software.

     2.  Conduct a survey of what is presently available
on  the computer systems described above in terms of
scientific  applications  software packages such  as
mathematical  packages,  statistical packages, modeling
systems (GPSS, GASP, CSMP, etc.),  biomedical com-
puter programs,  and plotting routines. Poll users to
determine if  there are additional software  needs and
implement such software where feasible.

     3.  Develop a user's manual that would describe
the capabilities,  advantages  and disadvantages  of the
available  software, give instructions on how to  use it,
and provide criteria to determine which software system
should be used for a given application.

     4.  Identify and document user training needs for
scientific applications software. Provide a recommended
schedule for training  and provide training courses to the
extent possible.

-------
                           BIOMED, SAS, AND OTHER TOOLS OF THE STATISTICIAN

                                            By Robert Kinnison, Ph.D.
 STATISTICAL PACKAGES

     Presently  there are  two  widely  used  statistical
 packaged systems: the Statistical Analysis System (SAS)
 and the Biomedical Analysis Programs (BIOMED). These
 analytical tools, differing essentially only in their details
 and emphasis, may be considered to  be in competition
 for  the  user community.  A historical  perspective is a
 convenient  means  of comparison. The early series of
 BIOMED was the first widely disseminated and accepted
 program package available and was first released in 1965.
 An  immediately  apparent handicap for the experienced
 user (but definitely  not  for the  student or  neophyte
 user) was the rigidly structured fixed format specifica-
 tion of input. The first widely known attempt to relax
 this structure was the Statistical Package for the  Social
 Sciences (SPSS)  which  had an English  language type
 input specification  language. Designed for data descrip-
 tion only, SPSS suffered from a lack of completely free
 field input and a very limited statistical capability.

     SAS was the  next  major improvement in widely
 accepted  packages.  It included a basically  free  field
 input, time sharing orientation, excellent  data editing
 capability and an expanded statistical capability. Not to
 be outdone, the BIOMED programmers also developed a
 soon to be  released update  of their package,  called
 BMDP,  that has many features not found in  previously
 released systems. (This package has actually been selec-
 tively released for field evaluation.) This BMDP series of
 programs has free field input, a data editing capacity
 that is equivalent to SAS (although SAS is easier to use
 in this respect) and a much more extensive statistical
 capability than any previously released packages. It also
 has a wide range of output options for data description.
 Its basic capabilities that extend beyond those of other
 packages are: time series analysis (Fourier analysis, auto-
 correlation etc.), cluster and classification procedures,
 multivariate analysis procedures (beyond multivariate
 analysis of variance), contingency table analysis, many
 nonparametric tests, nonlinear regression, automatic
 hypothesis testing in the general linear hypothesis, life
 tables, survival rates, and biological assay.

      There are several other packages that are available
 but not so widely accepted. Most universities have their
 own unique systems, available only on their own com-
 puter, containing most of the more commonly used fea-
 tures of SAS and BMD. The National Bureau of
Standards  has  released  a  program  series called
OMNITAB, which is oriented to data screening and  is
weak on statistical analysis in comparison to  SAS  and
BIOMED. Additionally there is  a  wide variety of sub-
routines available  with the  capabilities of the program
packages, but they require computer programming abil-
ity since these subroutines must  be used in conjunction
with  user-provided  input  and output programs. Such
subroutines are found in IBM's Scientific  Subroutine
Programs (SSP), a  commercial  update of SSP called
International  Mathematical  and Statistical  Library
(IMSL), CDC's Statistical Subroutine Library, and Union
Carbide's Subroutine Library. Because of the  expertise
necessary to use these subroutines, they can be recom-
mended only for special purpose use.

     Because of  the  close  similarity  of  SAS   and
BIOMED and their  extensive capabilities, they can be
recommended over all others. The  choice between them
can be  based on personal preference. For  the nonstatis-
tically  trained person,  SAS is certainly adequate  and
easier to use. For the professional  statistician,  the extra
statistical capability of BIOMED favors its use. However,
an  overriding consideration would be the existence of
either   package  on  a readily available or  convenient
computer system.

UTILITY ROUTINES AND OTHER TOOLS

      Any statistician, of any level, will quickly find him-
 self in situations where some computer programming is
highly desirable for the execution  of his duties. He will
invariably start with a programmable calculator and soon
graduate to a generalized scientific computer language
such as FORTRAN. At this point a personal library of
subroutines,  functions, and algorithms  becomes an al-
most essential tool. In addition to the rather comprehen-
sive libraries,  such as SSP (mentioned previously), the
following is  this author's  selection of useful but  not
well-known  programs. Further details, original sources,
or program  copies and general comments will be gladly
provided to anyone contacting or writing this author.

    It  is most useful to have the capability of finding
exact  probability levels without consulting large statisti-
 cal tables of the normal, t-test, chi-square, or F-test
statistics. Special and efficient subroutines are available
for each of these  common statistics, and also for less
 common distributions such as the Gamma function. For
 other infrequent use, a convenient relationship exists
 that permits the computation of all common probability
 levels from only a subroutine for the probability of the
 F test. The following equalities use commonly accepted
 notation for degrees of freedom and significance level:
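
 As a sketch of what such equalities look like, the standard identities
 below express the normal, t, and chi-square critical values through the
 F distribution. The notation is assumed here and is not necessarily the
 author's: $\nu$ is the degrees of freedom, $\alpha$ the significance
 level, and $F_\alpha(\nu_1,\nu_2)$ the upper-$\alpha$ critical value.

```latex
\begin{align*}
  t_{\alpha/2}^{2}(\nu)   &= F_{\alpha}(1,\nu) \\
  z_{\alpha/2}^{2}        &= F_{\alpha}(1,\infty) \\
  \chi_{\alpha}^{2}(\nu)  &= \nu \, F_{\alpha}(\nu,\infty)
\end{align*}
```

 Equivalently, for any observed value c, $P(|t_\nu| > c) = P(F_{1,\nu} > c^2)$,
 so a single F-probability routine can serve for all of these statistics.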
                                                                   
-------
   World Health Organization reports, and Cornfield in the
   American medical journals.

        A somewhat specialized but highly efficient analy-
   sis of variance tool is available from Elliot Cramer,
   Acting Director of the Psychometric Laboratory at the
   University of North Carolina, Chapel Hill. This program
   package, called MANOVA, is a highly sophisticated
   multivariate analysis of variance system. It was not
   discussed with SAS and BIOMED because of its limited
   scope and acceptability. It should be noted that in every-
   day data analysis, multivariate analysis of variance prob-
   lems are almost unknown. The strength of this package is
   its ability to handle unequal sample sizes and to auto-
   matically formulate the more common hypotheses for
   analysis. It should also be noted that this program can
   handle any analysis of variance, analysis of covariance, or
   regression problem (except stepwise regression) as a
   subset of its overall capability.

        Another  common  statistical  problem  is  that  of
   selecting  those regression variables which truly  statis-
   tically explain the observed effects and of rejecting those
   variables which add only  noise to the system. The well-
   known procedures for this selection are classified  as the
   "Stepwise Regression Algorithms." Early users of these
   algorithms noted that their regression results depended
   upon peculiarities of the data and upon the details of the
   algorithm  implementation.  The  most widely  tested
   algorithm for avoiding these problems is to run a sepa-
   rate regression on each and every combination of re-
   gression variables and choose the one with the smallest
   residual mean square error; this, of course, takes ex-
   cessive computer time for most regression problems. An
   efficient compromise was reported by L. R. LaMotte
   and Hocking.* The implementation of this algorithm by
   LaMotte is called program SELECT.
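
   The exhaustive procedure described above (one regression per
   combination of variables, keeping the smallest residual mean square)
   can be sketched as follows. This is only an illustration of the
   brute-force search, not the SELECT program or the LaMotte and Hocking
   algorithm; the function names are hypothetical, and an intercept is
   assumed in every model.

```python
from itertools import combinations


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def residual_mean_square(xs, y, cols):
    """Fit y on an intercept plus the chosen columns; return RSS / (n - p)."""
    rows = [[1.0] + [row[c] for c in cols] for row in xs]
    p = len(rows[0])
    ztz = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    zty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(ztz, zty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    return rss / (len(y) - p)


def best_subset(xs, y):
    """Try every non-empty subset of columns; return the one with smallest RMS."""
    k = len(xs[0])
    candidates = (cols for size in range(1, k + 1)
                  for cols in combinations(range(k), size))
    return min(candidates, key=lambda cols: residual_mean_square(xs, y, cols))
```

   With k candidate variables the search fits 2^k - 1 regressions, which
   is the excessive computer time the text refers to.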

        A problem in numerical analysis that is encoun-
   tered by statisticians whenever a regression or analysis of
   variance procedure is implemented is that of matrix
   inversion. The commonly used techniques, such as
   Gauss-Jordan or pivotal condensation, frequently give
   the wrong answer whenever the design matrix is ill-
   conditioned (nearly singular, or has a determinant close
   to zero), or whenever the specification of restrictions on
   the parameters is incorrectly done or poorly conceived.
   A very general mathematical tool can be used in this
   statistical problem because of its inherent numerical
   stability. Known as the "generalized matrix inverse," it
 is a concept that specifies an inversion of any matrix or
 array regardless of its  properties.  This  concept is not
 restricted to square matrices or matrices of full rank
 which  are  typical  of statistical computation; however,
 when applied to these statistical computations, it offers
 a stable  and efficient algorithm that is not affected by
 the conditioning qualities of the matrix. Furthermore, in
 cases where the design matrix is truly singular (the deter-
 minant is identically zero), it is known that there is an
 infinite number  of answers, but the generalized matrix
 inverse gives that one answer that is of minimum norm;
 that  is, the one in which the sum of the squares of the
 parameter values is minimum. An implementation of this
 algorithm, as a subroutine, is available from the author
 as program MATGFV.
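
 The minimum-norm property can be written out as follows. This sketch
 assumes that the generalized inverse meant here is the Moore-Penrose
 inverse, written $Z^{+}$; the symbols $Z$, $y$, and $\beta$ are assumed
 notation for the design matrix, the observations, and the parameters.

```latex
\begin{align*}
  \hat{\beta} &= Z^{+} y, \qquad
  Z Z^{+} Z = Z, \quad Z^{+} Z Z^{+} = Z^{+}, \quad
  (Z Z^{+})' = Z Z^{+}, \quad (Z^{+} Z)' = Z^{+} Z, \\
  \hat{\beta} &= \arg\min \{\, \beta'\beta \;:\;
     \beta \text{ minimizes } (y - Z\beta)'(y - Z\beta) \,\}. \\[4pt]
  \text{Example: } & Z = \begin{pmatrix} 1 & 1 \end{pmatrix},\; y = 2
  \;\Rightarrow\;
  Z^{+} = \tfrac{1}{2}\begin{pmatrix} 1 \\ 1 \end{pmatrix},\;
  \hat{\beta} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}.
\end{align*}
```

 In the example every $(\beta_1, \beta_2)$ with $\beta_1 + \beta_2 = 2$
 fits exactly, and $(1, 1)$ is the one with the smallest sum of squares.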

     Nonlinear  regression  algorithms  have  received
 extensive  use in  some laboratories for  calculating  such
 quantities as  biological  or  radiological half-lives. Such
 problems  are inherently  based on systems of differential
 equations which  are handled in the  regression programs
 as sums of exponential  terms. From the standpoint of
 numerical analyses, sums of exponentials are notoriously
 prone  to  blow  up  computationally in  the commonly
 used algorithms such as the Gauss (used in BIOMED and
 SAS),  Marquardt's  gradient method, and the less well-
 known  available  ones  such  as SPIRAL. It should  be
 noted  that  for nonlinear regressions, not of  the expo-
 nential  form, these algorithms are very satisfactory. A
 recently discovered algorithm helps substantially
 overcome the problems associated with sums of expo-
 nentials; this class of algorithms and theory is known as
 Variable Projection methods. The name and methods are
 so new that this nomenclature is not firmly established
 and some authors may refer to it by other names.† This
 technique divides the regression parameters into two
 classes, linear parameters (in which the first derivatives
 of the regression function are independent of any of
 these parameters) and nonlinear parameters. Through
 some special transformations of the mathematical
 sample space, the algorithm is able to find unbiased esti-
 mates of the nonlinear parameters. After these estimates
 are known, the remaining parameters are easily found by
 simple linear regression. A distinct computational advan-
 tage is gained by working with small subsets of param-
 eters. Even though this algorithm will converge to a valid
 answer in situations for which the more common
 nonlinear programs break down, it should be emphasized
 that these situations are complex mathematical and
 statistical phenomena. Therefore, these powerful new
 algorithms are advised for use only by persons with a
 strong statistical background who are familiar with the
 implications of the algorithms. A well-programmed
 implementation is available at Stanford University Com-
 puter Center; it is called VARPRO and was developed by
 Professor Gene Golub of the Computer Sciences Depart-
 ment. This author has a slightly modified version of
 VARPRO that is easier to use but not so computa-
 tionally efficient.

   *L. R. LaMotte and Hocking, "Computational Efficiency in the Selection of Linear Regression Variables," Technometrics, Vol. 12, No. 1
   (February 1970).

   †For reference, see Guttman, Pereyra, and Scolnik, "Least Squares Estimation for a Class of Nonlinear Models," Technometrics, Vol. 12,
   No. 2
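
 The separation into linear and nonlinear parameters can be illustrated
 on the simplest case, a single exponential y = a * exp(-k * t). This is
 a minimal sketch and not the VARPRO code: the function names are
 hypothetical, and a golden-section search over the one nonlinear
 parameter k is substituted for whatever iteration a production program
 uses. For each trial k the linear parameter a has a closed-form least
 squares value, so the search runs over k alone.

```python
import math


def linear_amplitude(k, t, y):
    """Closed-form least squares amplitude a for a fixed decay rate k."""
    basis = [math.exp(-k * ti) for ti in t]
    a = sum(b * yi for b, yi in zip(basis, y)) / sum(b * b for b in basis)
    return a, basis


def reduced_rss(k, t, y):
    """Residual sum of squares after eliminating the linear parameter."""
    a, basis = linear_amplitude(k, t, y)
    return sum((yi - a * b) ** 2 for b, yi in zip(basis, y))


def fit_single_exponential(t, y, k_lo=0.01, k_hi=2.0, iterations=100):
    """Golden-section search on k (assumes one minimum inside the bracket)."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = k_lo, k_hi
    for _ in range(iterations):
        k1 = hi - g * (hi - lo)
        k2 = lo + g * (hi - lo)
        if reduced_rss(k1, t, y) < reduced_rss(k2, t, y):
            hi = k2
        else:
            lo = k1
    k = (lo + hi) / 2.0
    a, _ = linear_amplitude(k, t, y)
    return a, k
```

 The search bracket (k_lo, k_hi) is an assumption that must be chosen to
 suit the data; sums of several exponentials need a multidimensional
 search over the decay rates but eliminate the amplitudes the same way.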

      Perhaps the  most common simulation or modeling
 algorithm  used for environmental transport studies is
 systems  of linear differential equations. This type of
 model is also extensively  used in chemical kinetic studies
 and in biological metabolism investigations. The pre-
 viously mentioned nonlinear least squares algorithms are
 not capable of estimating the coefficients of these equa-
 tions from sample data unless the systems of differential
 equations are reformulated as systems of integral equa-
 tions; this reformulation is an essentially impossible task.
 However, because of the importance of this problem in
 drug  metabolism, the  Upjohn Company has developed a
 program package  that can estimate  the parameters of
 these differential  equations from  data. The technique
 used  can best be described as  brute force numerical
 integration and differentiation, yet it does work well and
 has been extensively tested. Their currently available
 package  of programs  also has general  nonlinear regres-
 sion capacity  and aids in determining confidence limits
 on the nonlinear  parameters and algorithms for  deter-
 mining optimum  sampling designs. The contact for this
 package is Dr. C. M. Metzler, Upjohn Company, Kalama-
 zoo, Michigan.
      A commonly desired  algorithm is one to find the
 maximum (or minimum) of some algebraic function; the
 most obvious example of its use is the need to find the
 optimum  conditions  for  environmental  sampling.
 Classically this type of problem has been  divided into
 problems of optimizing economic considerations, for
 which the techniques of operations research, linear pro-
 gramming, and nonlinear programming were developed,
 and of mathematical optimization with a large number
 
-------
    kinetics. It is fortuitous that it is also a natural language
    for environmental pollutant transport models. Currently
    there is a remarkable standardization of these program
    packages, at least as far as the user is concerned in
    setting up his problem (internally there are marked
   differences).  All  program  packages are based  on  a
   standard established and  maintained by a  professional
   group  called Simulation  Councils  and are called Con-
   tinuous Systems Simulation Languages (CSSL). The IBM
   implementation  is called Continuous Systems Modeling
   Program (CSMP)  and is available on all of their scientific
   machines. Two good implementations are available for
   Control  Data  Corporation (CDC)  hardware; Lehigh
   Analog Simulator (LEANS)  from Lehigh  University and
    DARE-IIIB from the University of Arizona. All these
   programs are written in FORTRAN and should be  easily
   adapted  to other hardware systems.  A somewhat  more
   general  simulation package,  called SIMSCRIPT,  is  avail-
   able as a compiler on CDC hardware. This simulation
   language allows categorical as well as continuous simula-
   tion variables and has been generally  accepted for  logis-
   tic problems only. It is a generalized simulation language
   that does include CSSL; its main drawback is that IBM
   chooses to ignore it.

       A  rather more specific problem of the chemistry
   profession  has elicited a group of programs  that  is
   closely  related  to  the CSSL  series. These  programs
   perform  a  CSSL type simulation  but  accept  problem
   definition as a series of chemical equilibrium equations
   rather  than a system  of  differential equations. These
   programs are often called translators since their  usual
   first  step is to translate  the system of chemical equations
   to the  equivalent set  of differential equations.  After
   that,  their  numerical  techniques  closely parallel the
  CSSL type  program; however, their output is chemistry
   oriented with automatic plots of chemical concentration
  versus  time. These  programs are discussed  in detail  in
   many of the Computer Programming for Chemists text-
   books. Two implementations are known by this author
   to be available, one from IBM, and the other from
   Michigan State University. Most university chemistry
   departments have their own versions of one of these
  programs.

       Another type of chemical  simulation algorithm is
  also available from the  RAND Corporation and is based
  on chemical equilibrium  expressed by  the  Gibbs  free
  energy  function.  This  algorithm is limited to  finding
  equilibrium  or steady state conditions; however, it is the
  only  technique capable  of solving  large systems of
  chemical equations.  It has  been successful in simulating
   the steady state conditions in chemical systems defined
 by several  hundred chemical equations. The algorithm
 can also be implemented for use by SUMT.
      A type of  problem that is becoming frequent  in
 environmental research is that  called biological assay.
 The techniques of probit analysis, normit analysis, and
 logistic analysis were developed almost forty years ago
 by biologists and have become so standard that, except
 for  implementation on computers, they are essentially
 unchanged  today.  The computational problem can be
 summarized by the following equation:
     $$E(P) = \int_{-\infty}^{y} f(z)\,dz$$

 where:   E(P) is the expected probability of a response,
          f(z) is the Gaussian or normal probability
          function,

    and   y is a mathematical function of the dose.

 Classically y=a + b * log (dose). From this equation, one
 can see  the origin of the common terms straight  line
 bioassay, and log-normal bioassay.
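
 The expected response for the classical straight-line case can be
 computed directly from the error function. This is a minimal sketch:
 the function names are hypothetical, and a base-10 logarithm of the
 dose is assumed, since the text does not state the base.

```python
import math


def normal_cdf(y):
    """Integral of the standard normal density from minus infinity to y."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))


def expected_response(dose, a, b):
    """E(P) for the classical model y = a + b * log10(dose)."""
    return normal_cdf(a + b * math.log10(dose))
```

 With a = -1 and b = 1, a dose of 10 gives y = 0 and therefore an
 expected response probability of one half.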

     The problem of biological assay is twofold: there is
 no analytical solution to the integration in the  above
 equation, and the straight line regression is a  weighted
 regression  in which  the weights depend upon the un-
 known values of parameters a and b. These problems are
 solved computationally by an iterative technique.  The
 above  formulation, while  useful in  laboratory  research,
 cannot generally be  applied to environmental problems
 because it permits the response to be dependent only
 upon  one drug or  environmental  variable. In  all in-
 teresting environmental problems the animals or  plants
 are exposed simultaneously to a variety of chemicals  and
 physical  conditions. There is no mathematical or statisti-
 cal reason why  the  quantity y  in  the above equation
 should be  a simple linear function  of the logarithm of
 the dose, but classically that  is  the  only situation of
 interest to biologists; the only published methodology is
 likewise  limited. An  additional feature that can be in-
 cluded in the mathematical formulation of the  problem
 is the inclusion of positive and/or negative control prob-
 abilities (the probability that an animal or plant will  not
 respond regardless  of the strength of the dose,  and that
 without  any dose there will be a response). All these
 concepts important to environmentalists are incorporat-
ed  into  a  computer  program written by this author
several years ago for the National Cancer Institute. This
program permits  any  linear or nonlinear function of the
doses of any number of chemicals to  determine the value
of y. A typical example is an analysis of variance struc-
ture where the categorical variables are the physical con-
ditions stressing the animal (plant) at the time of obser-
vation, and the regression variables represent the
concentrations of several chemical pollutants. A very
important feature of this example as implemented by
this author's program is that it permits the estimation of
interactions  among the several  stressing agents  and
conditions.

     Finally it would be beneficial to review two com-
puter programs recommended for documentation of new
programs that substantially ease the task of changing an
"old"  FORTRAN  computer  program  or  of  under-
standing such a  program  written  by someone else.
Several programs are available that automatically draw
flow charts of a FORTRAN program (similar programs
are available for COBOL, ALGOL, PL/I, and many other
languages).  Some rather elegant ones are  commercially
available for  IBM hardware,  but  they are relatively
expensive if  not  used extensively. CALCOMP has a
package that  produces graphic quality flow  charts. For
CDC equipment a FORTRAN program that flow charts
a FORTRAN program was produced several years ago by
the CDC users' group.  This program is no longer  sup-
ported, but it is available from this author. The program
does work  well, but its major disadvantage  is that the
flow chart it produces has a unique format. It should be
readily adaptable  to any FORTRAN system that can
accommodate  six alphanumeric  characters  per word,
several moderate sized arrays,  and one scratch tape. (On
16 bit  2 byte  per word or  32 bit  4 byte  per  word
machines,  this  can  be done with  double precision
declarations.) A major  problem with all flow charting
systems is that  they chart  program statements and not
procedural  steps as does a programmer when he is de-
signing a program. Nevertheless this program is valuable
for comparing the program logic to the programmer's
intended flow  chart,  for  finding  unsuspected  logical
loops that cause bugs, and for programmers who are not
willing to effectively document their own programs. A
generally available but much less powerful capability of
charting all logical program structure exists within many
FORTRAN compilers, in the  tables that appear at the
end of the compilation  of each routine. The tables pro-
duced by the better compilers contain a cross-reference
listing  of all variable  names and  statement  numbers
which can be used to determine  each use of a variable
and all references to a statement.

     Whenever a  program of any complexity  is pro-
duced, a major portion of the programming effort is
spent changing the initial concept of the program. This
results in a "patched-up" appearance of the program, a
confusing assortment of statement  numbers, and many
unused  statement numbers or even FORTRAN state-
ments. In large programs these characteristics may place
such a burden on the programmers that progress is ex-
tremely slow. There is also a FORTRAN program avail-
able to eliminate these problems. It is called TIDY and
was produced in 1966 at the Air  Force Weapons Lab-
oratory, Kirtland Air  Force  Base, New Mexico. It  is
available through the CDC  users' group and has a wide
variety of options that  may be chosen by the user. The
eight default options are:

         All statement  numbers  occur in increasing
         order

         Only  statement  numbers  referenced are
         retained

         Statement number references are updated to
         match consecutive ordering

         FORMAT statements are collected at the end
         of routines

         All FORMAT and CONTINUE statements not
         referenced are deleted

         Blanks are inserted and deleted  to  improve
         readability

         Comments are aligned, centered and bracketed
         by blank cards

         All cards are sequence-labeled with  unique
         letter-number combinations.

The output of TIDY is a new card deck, a listing of the
old and new versions of the program, and a tabulation of
the changes made.

     The variety of programs discussed in this paper can
only  be described  as a selection  from the  accumulated
program library  of one person. The programs have no
common theme  except that their availability, or their
existence,  should  be  better  known  among computer
users. This presentation has deliberately not been com-
prehensive  or even moderately detailed. Hopefully it is
enough to  make EPA  computer users aware of the tre-
mendous libraries  of  computer  routines  that  can  be
utilized for little more than the asking.

-------
                   MSP (Model Specification Program)-An Easy-to-Use Linear Statistical Analysis System

                                                 By Larry M. Male
 INTRODUCTION

      Most  linear statistical  programs limit  a user to
 analyses of a fixed set of statistical models. Programs for
 one-way,  two-way,  k-way,  and  factorial analysis of
 variance, covariance analysis,  simple  and multiple re-
 gression, etc., each assume a fixed model for  estimating
 parameters and a fixed set of hypotheses to test. A  user
 has no opportunity to specify hypotheses which may be
 of most interest to him. There do exist  "general linear
 model" programs which allow  the specification of arbi-
 trary models, but they require  that for each model, the
 user input  a matrix of coefficients the rows of which
 separately specify a model  for each observation in the
 experiment. For large data  sets or complicated experi-
 ments, this approach severely limits a program's  appli-
 cability. The statistical analysis program  described here
 allows  a  user  to easily specify  any number  of models
 that he may want to analyze.

      Model Specification Program (MSP) is designed to
 allow simple specification of the following class of
 models. If an observation from an experiment depends
 upon the levels of c categories, $i_1, \ldots, i_c$, and the
 values of v independent variables, $x_1, \ldots, x_v$, then
 the mean or expectation of any observation is given by:

     $$E\,y(i_1, \ldots, i_c, x_1, \ldots, x_v) = \cdots$$

 This means that the mean of an observation is a linear
 function of a number of parameters where the param-
 eters may be indexed by any combination of the c cate-
 gories. In addition, the model must be identifiable, i.e.,
 for fixed values of the expectations, Ey, there must exist
 unique values for the parameters. In most cases (ex-
 plained later), a reasonable reparameterization of an
 unidentifiable model will lead to one which is
 identifiable.

      The necessary and sufficient information needed to
 completely specify a model in this form is the associated
 categories and independent variable number for each
 additive term, i.e.:

           Categories          Independent Variable
              ...                       0
              ...                       0
              ...                      ...
MSP contains an algorithm which translates this informa-
tion into an appropriate design matrix. This algorithm is
constructed so that missing observations or missing cells
in an experimental design present no problem so long as
the resulting model remains full rank.

     Since  most of the concepts involved in using and
programming  MSP can  best  be  understood  by  the
generalization of an example rather than by resorting to
a  complicated  notation,  the  following example  is
supplied. It has a factual basis in the air pollution work
of NERL. Many plants produce a compound called
ethylene when subjected to stress. The amount produced
is a function of temperature, and the probability distri-
bution of ethylene concentration can be shown to be
approximately lognormal. A reasonable model for an
observation z of ethylene concentration at temperature t
is:

     $$z \sim \mathrm{Lognormal}\left(\alpha \exp\left[\frac{\gamma}{t+273}\right],\ \sigma\right)$$

The natural log of z thus has the normal distribution:

     $$y = \ln z \sim N(\mu + \gamma X,\ \sigma)$$

where

     $$\mu = \ln \alpha, \qquad X = 1/(t+273).$$

-------
     In an experiment three varieties of radish were used
and for each variety, two levels of stress were applied
(control versus inoculation with gall disease). At each
of these factor × level combinations, measurements of
ethylene concentration were made for a gradient of six
temperatures. The same range of temperature was used
for each factor × level combination, but the actual tem-
peratures were not identical. Six observations (plants)
were measured at each temperature × factor × level
combination. The fundamental model for the mean log
response may be written as

     Model 1

         $$E\,y(i_1, i_2, i_3, X) = \mu(i_1, i_2) + \gamma(i_1, i_2)\,X$$

     where
         $i_1 = 1, 2, 3$           radish variety
         $i_2 = 1, 2$              stress treatment
         $i_3 = 1, \ldots, 6$      replicate.

The $\gamma(i_1, i_2)$ parameters are related to the activation
energy of a specific chemical reaction while the $\mu(i_1, i_2)$
parameters are related to the asymptotic production rate
of ethylene at high temperature.
     Examples of alternative models that an experi-
menter might want to examine are given in Table I.

                                      Table I
                           Examples of Alternative Models

           Analytic Description                                   Interpretation

 Model 2   $Ey = \mu$                                             Response does not depend upon X or any
                                                                    factor × level combination.

 Model 3   $Ey = \mu(i_1,i_2) + \gamma(i_1)X$                     Activation energy does not depend upon stress
                                                                    treatment.

 Model 4   $Ey = \mu(i_1,i_2) + \gamma(i_2)X$                     Activation energy does not depend upon radish
                                                                    variety.

 Model 5   $Ey = \mu(i_1,i_2) + \gamma X$                         Activation energy does not depend upon variety
                                                                    or stress treatment.

 Model 6   $Ey = \mu(i_1,i_2)$                                    Activation energy is zero for all factor × level
                                                                    combinations.

 Model 7   $Ey = \mu(i_1,i_2) + (\gamma(i_1) + \rho(i_2))X$,      Difference in activation energy between varieties
             $\rho(1) = 0$                                          is invariant to stress treatment.
Note that if the restriction $\rho(1) = 0$ had not been made,
then Model 7 in Table I would have been unidentifi-
able. This can be seen as follows. Let

     $$\gamma^{*}(i_1) = \gamma(i_1) + \rho(1),$$

then

     $$\gamma(i_1) + \rho(2) = \gamma^{*}(i_1) + (\rho(2) - \rho(1)).$$

Thus all values of $\rho(1)$ and $\rho(2)$ for which
$(\rho(2) - \rho(1))$ is a constant will yield the same model. If we take
$\rho(1) = 0$ then $\rho(2)$ may be interpreted as the common
quantity you must add to the control for each radish
variety to get the activation energy for the stress treat-
ment. In general if parameters of the form
$\alpha(i_1, i_2, i_3, i_4, \ldots)$ appear in the model then MSP allows a user to
specify restrictions of the form $\alpha(i_1, 1, i_3, i_4, \ldots) = 0$.

-------
                                      Table I
                                    (Continued)

           Analytic Description                                   Interpretation

 Model 8   $Ey = \mu(i_1) + \gamma(i_1,i_2)X$                     Asymptotic response does not depend upon stress
                                                                    treatment.

 Model 9   $Ey = \mu(i_2) + \gamma(i_1,i_2)X$                     Asymptotic response does not depend upon variety.

 Model 10  $Ey = \mu + \gamma(i_1,i_2)X$                          Asymptotic response does not depend upon variety
                                                                    or stress treatment.

 Model 11  $Ey = \mu(i_1) + \rho(i_2) + \gamma(i_1,i_2)X$         Difference in asymptotic response between varieties
                                                                    is invariant to stress treatment.

 Model 12  $Ey = \mu(i_1) + \gamma(i_1)X$                         Activation energy and asymptotic response do not
                                                                    depend upon stress treatment.

 Model 13  $Ey = \mu(i_2) + \gamma(i_2)X$                         Activation energy and asymptotic response do not
                                                                    depend upon variety.

 Model 14  $Ey = \mu + \gamma X$                                  Activation energy and asymptotic response do not
                                                                    depend upon any factor × level combination.

 Model 15  $Ey = \mu(i_1) + \rho(i_2) + \gamma X$                 Assuming activation energy is constant for all factor
                                                                    × level combinations, are the differences in
                                                                    asymptotic response for variety invariant to
                                                                    stress treatment?
       In addition to these explicitly stated models, many
  hypotheses may be tested by the comparison of various
  models. The statistical comparison of the two models
  [equations illegible] provides a test of the hypothesis,
  "the average of asymptotic response over varieties does
  not vary with stress treatment."

       The next section provides some of the basic theory
  of statistical estimation and hypothesis testing which
  allow the construction of a general linear statistical
  system.

  BASIC THEORY OF LINEAR STATISTICAL
  INFERENCE

       Let $y$ represent a vector of the observations
  $y(i_1, \ldots, i_c, X_1, \ldots, X_v)$ arranged in an arbitrary order and
  let $\beta$ represent a vector of the parameters for a model.
  Since the expectation of each observation is modeled as
  a linear function of the parameters we may write:

       $$Ey = Z\beta$$

  where the nth row of $Z$ specifies the model for the nth
  observation. The matrix $Z$ is often called the design
  matrix. If, in our example, $Ey = \mu(i_1, i_2) + \gamma(i_1, i_2) X$
  and the vector of parameters is ordered as:

       Serial number    Parameter
             1          $\mu(1,1)$
             2          $\mu(1,2)$
             3          $\mu(2,1)$
             4          $\mu(2,2)$
             5          $\mu(3,1)$
             6          $\mu(3,2)$
             7          $\gamma(1,1)$
             8          $\gamma(1,2)$
             9          $\gamma(2,1)$
            10          $\gamma(2,2)$
            11          $\gamma(3,1)$
            12          $\gamma(3,2)$

  then the row of $Z$ corresponding to $y(3, 1, 4, X_n)$ will
  have a 1 in column 5 and $X_n$ in column 11; the rest of
  the entries in the row would be zero.
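
The row construction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not MSP's FORTRAN code: the function names `serial_numbers` and `design_row` are hypothetical, and the level counts (three levels of the first category, two of the second) are taken from the 12-parameter ordering above.

```python
# Hypothetical sketch of one design-matrix row for
# Ey = mu(i1, i2) + gamma(i1, i2) X, with i1 in 1..3 and i2 in 1..2.
LEVELS_I1 = 3
LEVELS_I2 = 2


def serial_numbers(i1, i2):
    """Return the 1-based serial numbers (j, k) of mu(i1, i2) and gamma(i1, i2)."""
    cell = (i1 - 1) * LEVELS_I2 + i2           # 1..6: position in the mu block
    return cell, LEVELS_I1 * LEVELS_I2 + cell  # the gamma block occupies 7..12


def design_row(i1, i2, x):
    """Build one row of Z: 1 in column j, x in column k, zeros elsewhere."""
    j, k = serial_numbers(i1, i2)
    row = [0.0] * (2 * LEVELS_I1 * LEVELS_I2)
    row[j - 1] = 1.0
    row[k - 1] = x
    return row


# Observation with i1 = 3, i2 = 1 and an illustrative covariate value of 2.5.
row = design_row(3, 1, 2.5)
```

For the observation with levels (3, 1), the sketch places the 1 in column 5 and the covariate in column 11, matching the text.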

     The elements of statistical estimation and hypothesis
testing are as follows. Let

     $\hat\beta = (Z'Z)^{-1} Z'y$          estimates of parameters

     $n(\beta)$                            number of parameters

     $SS = (Z\hat\beta)'(Z\hat\beta)$      sum of squares for model

     $SS_0 = y'y$                          total sum of squares.

Depending upon the probabilistic assumptions which can
be made about the random vector y, the properties of
these quantities will differ. Three  cases will be con-
sidered in Table II.

                     Table II
            Properties of Estimates Under
               Various Assumptions

     Case   Assumptions                                  Properties

      1     [illegible]                                  [illegible]

      2     $\mathrm{var}(y) = \sigma^2 I$
            (observations are uncorrelated with
            equal variance)                              [illegible]

      3     $y \sim N(Z\beta, \sigma^2 I)$
            (y has a spherical multivariate normal
            distribution)                                [illegible]

The quantity

     $$\hat\sigma^2 = \frac{SS_0 - SS}{n(y) - n(\beta)}$$

is an estimate of $\sigma^2$. Under case 3, $(n(y) - n(\beta))\,\hat\sigma^2/\sigma^2$ has
a chi-square distribution with $m = n(y) - n(\beta)$ degrees
of freedom.

An estimate of the covariance between two estimates of
parameters is given by

     $$\widehat{\mathrm{cov}}(\hat\beta_i, \hat\beta_j) = \hat\sigma^2 \left[(Z'Z)^{-1}\right]_{ij}$$

Under cases 2 and 3, $a'\hat\beta$ is the best estimate of $a'\beta$ (a
linear combination of the parameters) and its variance
can be estimated by

     $$\widehat{\mathrm{var}}(a'\hat\beta) = \hat\sigma^2 \, a'(Z'Z)^{-1} a$$

Under case 3, a $(1-\alpha) \times 100\%$ confidence interval for $a'\beta$
is given by

     $$a'\hat\beta \pm t_{m,\alpha} \sqrt{\widehat{\mathrm{var}}(a'\hat\beta)}$$

where $t_{m,\alpha}$ is the $(1-\alpha)$th percentage point of the
Student's t distribution with $m$ degrees of freedom. The
program MSP calculates the quantities $\hat\beta$, $n(\beta)$, $SS$, and
$(Z'Z)^{-1}$ for each model specified.
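
As a rough illustration of these quantities, the following Python sketch solves the normal equations for a small straight-line model and computes the parameter estimates, the model sum of squares, the total sum of squares, and the estimate of the error variance. It is a sketch under stated assumptions, not MSP itself: the helper names `solve` and `fit` and the four data points are made up for the example, and plain Gauss-Jordan elimination stands in for whatever inversion routine MSP uses.

```python
def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(z, y):
    """Return (beta_hat, SS, SS0, sigma2_hat) for the linear model Ey = Z beta."""
    p = len(z[0])
    ztz = [[sum(row[i] * row[j] for row in z) for j in range(p)] for i in range(p)]
    zty = [sum(row[i] * yn for row, yn in zip(z, y)) for i in range(p)]
    beta = solve(ztz, zty)                       # (Z'Z)^{-1} Z'y
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in z]
    ss = sum(f * f for f in fitted)              # sum of squares for model
    ss0 = sum(v * v for v in y)                  # total sum of squares
    sigma2 = (ss0 - ss) / (len(y) - p)           # estimate of sigma^2
    return beta, ss, ss0, sigma2


# Made-up data: intercept column of ones plus one covariate.
z = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 2.0, 2.0, 3.0]
beta, ss, ss0, sigma2 = fit(z, y)
```

Confidence intervals are left out of the sketch because they also need the inverse matrix and a percentage point of the t distribution.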
     Hypothesis testing is essentially the statistical
comparison of two models. If

     Model 2     $Ey = Z^{(2)} \beta^{(2)}$

and

     Model 3     $Ey = Z^{(3)} \beta^{(3)}$

are two models such that model 3 is a submodel of
model 2 (i.e., the subspace spanned by the columns of
$Z^{(3)}$ is a subset of the subspace spanned by the columns
of $Z^{(2)}$), then under case 3 the quantity $(SS_2 - SS_3)/\sigma^2$
has a noncentral chi-square distribution with
$n(\beta^{(2)}) - n(\beta^{(3)})$ degrees of freedom. If model 2 is a
submodel of model 1 (the fundamental model), as it
should be, then the statistic

     $$F = \frac{(SS_2 - SS_3) / \left(n(\beta^{(2)}) - n(\beta^{(3)})\right)}{(SS_0 - SS_1) / \left(n(y) - n(\beta^{(1)})\right)}$$

has a central F distribution when model 3 is true and a
noncentral F distribution when model 3 is false but
model 2 is true. The F statistic thus provides a test of
the hypothesis

     $H_0$ :  model 3 is true

     $H_1$ :  model 2 is true.

The test procedure is given by:

     decide in favor of model 2 if

     $$F > F_\alpha\left(n(\beta^{(2)}) - n(\beta^{(3)}),\; n(y) - n(\beta^{(1)})\right)$$

where $F_\alpha(u, v)$ is the $(1-\alpha)$th percentage point of the
central F distribution with $u$ degrees of freedom in the
numerator and $v$ degrees of freedom in the denominator.
     MSP allows the comparison of models and will perform
the calculation for an F test. When a user indicates
to MSP that two models should be compared then the
following quantities are computed.

     Degrees of freedom for hypotheses

        $= n(\beta^{(2)}) - n(\beta^{(3)})$

     Sums of squares for hypotheses

        $= SS_2 - SS_3$

     Mean square for hypotheses

        $= (SS_2 - SS_3) / \left(n(\beta^{(2)}) - n(\beta^{(3)})\right)$

To construct the F test for an hypothesis, the user specifies
the two models to be compared and the two models
to be used to compute the error mean square (usually
$(SS_0 - SS_1)/(n(y) - n(\beta^{(1)}))$, but for subsampling experiments
this might not be the case).

     MSP also employs a routine for calculating the
central F distribution function. Thus for each test of
hypotheses the probability of a Type I error (rejecting
the hypothesis when it is true) is calculated.
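
The arithmetic behind a model comparison and its F test can be sketched as follows. This is a hypothetical illustration only: the function name `f_test`, its argument names, and the sums of squares and parameter counts in the example are invented, and the probability of a Type I error is omitted because it needs an F distribution function that the Python standard library does not provide.

```python
def f_test(ss2, n2, ss3, n3, ss0, ss1, n1, n_obs):
    """Compare model 3 (submodel) with model 2, using model 1 for the error term.

    ss2, ss3, ss1: model sums of squares; n2, n3, n1: numbers of parameters;
    ss0: total sum of squares; n_obs: number of observations n(y).
    """
    df_hyp = n2 - n3                 # degrees of freedom for hypothesis
    ss_hyp = ss2 - ss3               # sum of squares for hypothesis
    ms_hyp = ss_hyp / df_hyp         # mean square for hypothesis
    df_err = n_obs - n1              # error degrees of freedom
    ms_err = (ss0 - ss1) / df_err    # usual error mean square
    return {
        "df_hypothesis": df_hyp,
        "ss_hypothesis": ss_hyp,
        "ms_hypothesis": ms_hyp,
        "df_error": df_err,
        "ms_error": ms_err,
        "F": ms_hyp / ms_err,
    }


# Made-up values chosen only to make the arithmetic easy to follow.
result = f_test(ss2=120.0, n2=6, ss3=100.0, n3=4,
                ss0=150.0, ss1=130.0, n1=12, n_obs=32)
```

The resulting F value would then be compared with the chosen percentage point of the central F distribution with `df_hypothesis` and `df_error` degrees of freedom.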
ALGORITHM FOR CONSTRUCTING THE DESIGN
MATRIX

     The main algorithm in MSP which makes the program
easy to use is the one which constructs the design
matrix from the minimal input needed to specify a
model. The technique used is most easily understood by
the generalization of an example. Consider model 1 of
the example,

     $$Ey = \mu(i_1, i_2) + \gamma(i_1, i_2) X$$

The design matrix will have $n(y)$ (number of observations)
rows and $n(\beta) = 12$ (number of parameters)
columns. In any row the number of nonzero entries will
be equal to the number of additive terms in the model;
two in this example. The nth observation will be associated
with one parameter for each additive term in the
model. If we have constructed a serial ordering for the
parameters then the elements of $Z$ are given by

     $$Z_{nj} = 1, \qquad Z_{nk} = X_n, \qquad Z_{nm} = 0 \quad \text{for } m \neq j, k$$

where $j$ and $k$ are the serial numbers of the parameters
associated with the first and second additive terms for
the nth observation.

     MSP first constructs a serial ordering for the parameters
and then consecutively constructs each row of the
design matrix by examining the associated levels of the
categories for each observation. An index vector keeps
track of which parameters are not used. If any parameters
are not used, then the corresponding columns are
deleted from the design matrix. Thus if there are missing
observations or missing cells in an experimental design
only those parameters which can be estimated from the
available data will be included. If parameters such as
$\alpha(i_1, i_2, i_3, i_4)$ occur in the model with a restriction such as
$\alpha(i_1, 1, i_3, i_4) = 0$, then this information is stored in the
index vector and the appropriate columns are dropped
from the design matrix.

     In actual practice MSP does not store the design
matrix but instead accumulates the appropriate cross
products for the matrix $Z'Z$ and the vector $Z'y$ as
each row of the design matrix is constructed. Since
many of the elements of a row in the design matrix may
be zero, only those elements of $Z'Z$ and $Z'y$ which will
change are accumulated. Referring to the example, only
the following replacement statements are executed:

     $$(Z'Z)_{im} = (Z'Z)_{im} + Z_{ni} Z_{nm} \quad \text{for } im = jj,\ jk,\ kk$$

     $$(Z'y)_i = (Z'y)_i + Z_{ni}\, y_n \quad \text{for } i = j,\ k$$

Since $Z'Z$ is a symmetric matrix only the upper triangle
is stored.
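
The accumulation step can be sketched in Python as below. This is an illustrative sketch under stated assumptions rather than MSP's FORTRAN: the function name `accumulate`, the use of dictionaries keyed by index pairs for the upper triangle, the 0-based column indices, and the three observations are all hypothetical choices made for the example.

```python
from collections import defaultdict


def accumulate(ztz, zty, j, k, x_n, y_n):
    """Add one observation's cross products without ever storing its row of Z.

    The row has only two nonzero elements, Z_nj = 1 and Z_nk = x_n, so only the
    (j,j), (j,k), (k,k) entries of Z'Z and the j, k entries of Z'y change.
    """
    entries = ((j, 1.0), (k, x_n))
    for pos, (i, z_ni) in enumerate(entries):
        zty[i] += z_ni * y_n
        for m, z_nm in entries[pos:]:                  # index pairs jj, jk, kk
            ztz[(min(i, m), max(i, m))] += z_ni * z_nm  # upper triangle only


ztz = defaultdict(float)   # upper triangle of Z'Z keyed by (row, column)
zty = defaultdict(float)   # Z'y keyed by column index

# Made-up observations: (j, k, x_n, y_n) with 0-based column indices.
observations = [(0, 6, 2.0, 5.0), (0, 6, 3.0, 7.0), (4, 10, 1.5, 4.0)]
for j, k, x_n, y_n in observations:
    accumulate(ztz, zty, j, k, x_n, y_n)
```

With two additive terms per observation, each observation touches three entries of the stored triangle and two entries of the right-hand side, regardless of how many parameters the model has.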

     If an experiment contains several dependent vari-
ables for which the same analysis is desired, then  it is a
waste of time  to construct the same design matrix for
each separately. For  this reason  MSP calculates  the
parameter  estimates  and sums of  squares for  each
dependent variable using only one construction of the
design  matrix.

SUMMARY OF FEATURES OF MSP

     1.  MSP handles experiments defined by any
 number of categories, any number of independent variables,
 and any number of dependent variables. The
 maximum size or complexity of an experiment is limited
 only by the maximum amount of core storage available.

     2.  Required input to the program is:

              Raw data and user supplied format
              specification.

              Number of observations, categories,
              independent variables and dependent
              variables.

              Model specification cards. Each model
              requires one card for each additive term.
              On each of these cards the user specifies
              the independent variable number, the
              category numbers which index the
              parameters, and information on
              restrictions of the form
              $\alpha(i_1, 1, i_3, i_4) = 0$.

              Model comparison cards. The models are
              serially numbered so that a user merely
              specifies a pair of integers for each model
              comparison card.

              F-test cards. The user specifies two pairs
              of model numbers: one for the numerator
              and one for the denominator of
              the F statistic.

     3.  Output from the program is:

              For each model specified an analytic
              expression of the model is written on the
              output. For example:

                   EY = A(I1, I3) + B(I2) * X1 + C(I1) * X2

              For each model the estimates of the
              parameters for each dependent variable
              are printed in the form given below:

                   Symbol         Values
                   A(1,1)         XXX    XXX    XXX
                     .             .      .      .
                   B(1)           XXX    XXX    XXX
                     .             .      .      .
                   C(1)           XXX    XXX    XXX
                     .             .      .      .

              For each model the number of parameters
              and the sums of squares for the
              model are printed.

              If the user asks for the $(Z'Z)^{-1}$ matrix, it
              is printed in the form:

                   COV (1,1) = XXX
                   COV (1,2) = XXX
                     .
                   COV (n(β), n(β)) = XXX

              Multiplication of these values by the
              user's selected estimate of $\sigma^2$ yields the
              estimates of the variances and covariances
              of the parameters.

              Model comparison table which presents
              the degrees of freedom, sum of squares,
              and mean square for each hypothesis.

              Tables which present the results of the
              F-tests. Included are numerator and
              denominator degrees of freedom,
              F statistic, and probability of a Type I
              error.

     4.  Techniques such as normalizing the $Z'Z$ matrix
 before inverting are used to minimize rounding error.

     5.  The programming language is FORTRAN IV
 for the Oregon State University CDC 3300.

     Additional features such as automatic confidence
 intervals for parameters, three decision-decision procedures,
 and even more simplified input are contemplated
 for the near future. The program should be
 completed along with a user's manual by the end of
 1975.

-------
                         CONCEPTS VERIFICATION TESTING OF THE ENVIR SYSTEM

                                              By John W. Scotton
 PLANNING AND COORDINATION

      The contract with Gulf Universities Research Consortium
 (GURC) involved joint Environmental
 Protection Agency/Office of Research and Development
 (EPA/ORD) and GURC planning and coordination. It
 also involved the selection  of three categories of EPA-
 generated data considered representative of, and suitable
 for, demonstration/evaluation. The following categories
were selected  in agreement with  the  program par-
ticipants, and a firm plan for delivery of data tapes to
GURC was developed. At the same time, priorities were
established  and continuing coordination arranged. By
personal visit to the GURC facility and  through active
participation in the creation of a data base out of Data
Category 1  and thereafter  personal query/response  of
the  first "Dynamic File" on UNIVAC 1108 at the Uni-
versity  of Houston,  a  firm  project  plan was
implemented.

     Presentations, discussions, work sessions and dem-
onstrations  were organized  and coordinated  to allow
maximum visibility and exposure to EPA divisions; em-
phasis  was  placed  upon  interagency  liaison.  Dem-
onstrations  were scheduled at  Waterside Mall  and  at
Research Triangle Park. The need for evaluation of those
demonstrated  applications  to  the  Agency  required
followup  discussion,  debriefing  and continuing con-
sultation with the GURC scientists.
     These demonstrations/evaluations were, in this
pilot program, limited to application of the
Environmental Information Retrieval (ENVIR) system.
This  ENVIR  system is  the primary module  in  the
Environment Dependent  Management,  Process  Auto-
mation  and Simulation (EDMPAS) system. As a con-
sequence of extensive briefings, discussions and training
sessions,  EPA/ORD  staff have  acquired  sufficient
in-depth knowledge of the total systems concept to pro-
vide evaluation  comment.
DATA CATEGORY I

National Soils Monitoring Program Data

     Mr. Elgin Fry, Office of Pesticides Monitoring, pro-
vided a magnetic tape to GURC containing the National
Soils Monitoring Program Data for the year 1971. This
data tape consisted of data related to place, time, crops,
pesticides, method of application, amounts, and so forth
to a total of 117 fields. The existing transaction-oriented
type printout was also provided.

     This data tape was reformatted by GURC in accor-
dance with the methodology reported and, in addition,
the fields were entirely  restructured so that pesticides
became the operative element in each record rather than
the location at which the pesticide was monitored. This
restructuring of the file  rendered the data more useful
insofar as the new structure coincides with the primary
monitoring requirement. This results in ENVIR records
containing 25 fields. While providing all of the  original
information, such records could have been punched on
almost half the original number of cards. Data entry,
involving staff and materials, is considered important in
terms of system cost effectiveness.

     The restructuring of this file, effectively  accom-
plished through simple  interface software,  permits an
unlimited number of pesticides and crops to  be in-
corporated in  the data base. The existing restriction of
15 pesticides and 3 crops, limitations imposed  by  the
current system, need no longer be tolerated.

     The  GURC team  produced  computer  printout
exactly duplicating the original information contained in
the conventional  output. Additional queries, pertinent
to pesticides monitoring, were demonstrated. All queries
requested by EPA staff through batch-mode  terminal
were scrutinized and found totally responsive.

     In brief, this demonstration indicated that:

         Data entry is more efficient

         Query capability is unlimited

         Hierarchical printout  makes the information
         more meaningful

         Information produced through certain queries
         had not been available before.

     A typical comment made by one staff member was
that: "The information produced from his query,
costing 23 cents, would have taken two days of staff
time for response."

       The complete file, dictionary structure, dimension
  statements and typical queries are illustrated in a bound
  volume  of computer output  with a  report on the
  methodologies applied and the system User Manual. The
  computer output was duplicated on UNIVAC 1100
  series and IBM 370 series computers.

  DATA CATEGORY 2

  Federal Water Pollution Control Surveillance Data

       Mr. Lee  J. Manning, Office of Water Quality, pro-
  vided a  magnetic  tape and printout  of Water Quality
  Data (STORET) containing Potomac  River Basin  data.
  For purposes of this pilot program demonstration, it was
  considered more cost  effective to repunch  rather than
  reformat the tape. From the cards, punched in free field
  ENVIR format, a new tape was built. The Dynamically
  Structured Data Information File relevant thereto was
  created on UNIVAC 1108 at the University  of Houston
  for  purposes of checkout  and thereafter recreated on
  IBM 370/165  at  Bethesda,  Maryland  through
  batch-mode terminal.

      This methodology demonstrated the ENVIR skip
  and dupe capability to reduce data entry. The ENVIR
  facility for repetitive data entry (correction statements)
  was  also applied.  The  complete  Boolean  algebraic
  capability permits the partitioning of data, thus allowing
  the application of a correction statement specific to a set
  or subset of data.

      GURC staff considers that a record should consist
  of tests of water characteristics at a particular TIME and
  PLACE. This permits partitioning of the Data Base with
  a single Boolean Expression so that all localities where a
  pollutant level is in violation, even for only  part of the
  year or month, may be extracted.
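
The idea of partitioning TIME and PLACE records with a single Boolean expression can be illustrated with a short Python sketch. Everything specific here is an assumption made for the example: the field names, the station names, the measured values, the two limits, and the `PROVISIONAL` flag are invented, and the code does not reproduce ENVIR's query syntax.

```python
# Made-up records: one set of water-quality tests per TIME and PLACE.
records = [
    {"PLACE": "Station A", "TIME": "1973-06", "DO": 3.8, "COLIFORM": 200, "PROVISIONAL": False},
    {"PLACE": "Station A", "TIME": "1973-07", "DO": 6.1, "COLIFORM": 150, "PROVISIONAL": False},
    {"PLACE": "Station B", "TIME": "1973-06", "DO": 7.0, "COLIFORM": 2400, "PROVISIONAL": False},
    {"PLACE": "Station C", "TIME": "1972-06", "DO": 2.9, "COLIFORM": 300, "PROVISIONAL": False},
    {"PLACE": "Station D", "TIME": "1973-08", "DO": 3.0, "COLIFORM": 100, "PROVISIONAL": True},
]

DO_MIN = 4.0           # hypothetical lower limit for dissolved oxygen
COLIFORM_MAX = 1000    # hypothetical upper limit for coliform count


def in_violation(rec):
    """Single Boolean expression combining AND, OR and NOT over one record."""
    return (rec["TIME"].startswith("1973")
            and (rec["DO"] < DO_MIN or rec["COLIFORM"] > COLIFORM_MAX)
            and not rec["PROVISIONAL"])


# A locality is reported if any of its records for the year is in violation,
# even when other months at the same place are within limits.
violating_places = sorted({rec["PLACE"] for rec in records if in_violation(rec)})
```

Because each record carries its own time and place, one expression over the record fields is enough to pick out localities that were in violation for only part of the year.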

       In addition to the demonstration, a work session
  discussion was held with Mr. Sam Conger. It is evident
  that, if interfaced with STORET, ENVIR offers
  capabilities to expand the query and output facilities.

      The  data base, dictionary structure, dimensions,
 typical query responses and appropriate reports for this
 Data Category 2 reside with the writer.
 DATA CATEGORY 3

 Emission Inventory Point Source Data

     Mr. Gerald Nehls, Office of Air Quality Planning
 and Standards at Research Triangle Park (RTP), North
 Carolina, provided  information related to existing data
 banks at RTP (e.g., SAROAD, Emission Point Source
 and Area Source) and allowed GURC to select a data
 tape. GURC responded with a request for Point Source
 data, data pertinent to the EPA mission, representative
 of a broad  geographical area, and a reasonable expanse
 of time. RTP  therefore dispatched a  tape  covering
 various time frames and  containing representative data
 from Hawaii to Maine and Florida to Alaska.

     The computer programs presently used at  RTP for
 Point  Source  data  retrieval appeared to consist of six
 separate records for each Point Source. GURC therefore
 requested that the tape contain the data such that each
 set of six records could be considered as one. RTP com-
 plied.  As a  consequence, this reformatted Point Source
 data provided a logical structure permitting the full capa-
 bilities of ENVIR   to  be  demonstrated.  Prepared
 front-end software  was employed to convert the tape,
 IBM/EBCDIC  code to UNIVAC FORTRAN readable. At
 the same time,  an  additional routine to convert fixed
 field to ENVIR readable was applied.

      In this same UNIVAC Run Stream, an ENVIR
 Dynamic File was built containing 85 fields (descriptors).
 Fifty fields were automatically excluded from
 this Dynamic File. Forty-eight of these fifty fields were
 excluded because they were essentially redundant or
 private to  the  existing  system (e.g.,  card numbers,
 actions, etc.). ENVIR has no need for such fields. They
 could, nevertheless, be accommodated in ENVIR but
 they add nothing to the query capability. The remaining
 two fields were extracted and  a separate file  created
 simply to demonstrate  that, when necessary, the com-
 puter can be used to provide a "book."

     This main ENVIR Dynamic File of scientific data
 contains  all  of the  original spatial, temporal and pol-
 lution  data in  the original six file cards.  The Dynamic
 File created on UNIVAC 1106 at Mississippi  State
 University was read directly into the UNIVAC  1110  at
 RTP for purposes of demonstration. By sample query
 and response, it was demonstrated that the system can
 be used to predict ambient air quality design sampling
 standards, control strategies, pollution violations, stack
 characteristics of regions, and so forth. Subsequently,
 this same Dynamic File was queried through batch-mode
 terminal at the Waterside Mall line connected to the
 UNIVAC 1110 at RTP.

     The data base, the dictionary structure, discussions
and  typical query  responses,  and appropriate reports
were delivered to EPA/ORD for this Data Category 3.

DEMONSTRATION

     On Monday, June 3, 1974, the GURC Information
Staff provided a three-hour presentation of the EDMPAS
system concepts to representatives of EPA at RTP. By
prearrangements with GURC, complete sets of Xerox
copies of viewgraphs and slides were bound together
with  general narrative to provide  a meaningful docu-
ment.  A presentation agenda was agreed  upon and
distributed prior  to this visit.

     Following the presentation, RTP arranged for a full
demonstration of the Dynamic File created out of Point
Source Emissions data. Using the standard terminal with
the CRT photographed and  line-connected  to  several
standard T/V outlets  through the audience in the main
conference room, it was possible for any number of the
audience to  write or call for queries, input through tele-
typewriter  by an  EPA staff member, and  to  receive
immediate response to query by CRT. As a consequence,
it  was possible to  conduct a  practical systems demon-
stration  to  a  large  audience  with  full  audience
participation.

     This same Dynamic File, left resident on disc for a
few days, was again  queried  through the batch-mode
terminal  in  Waterside Mall at the Office of Pesticides
Monitoring.  In  parallel  Dynamic Files  mounted on
IBM 370/165  at Bethesda,  Maryland were similarly
queried.  In this  manner, it was possible for a represen-
tative group from several divisions within the agency to
participate directly. In addition, guests from  other U.S.
Federal Agencies were invited.

THE ENVIR SYSTEM  CHARACTERISTICS  AND
CAPABILITIES  DEMONSTRATED  AND
EVALUATED

     The following summary of characteristics and
capabilities included within the GURC intermediate reports
was carefully scrutinized and used for evaluation. For a
potential user to check any or all of these requires some
brief study of the User Manual. A perusal of this manual
coupled  with  a document  entitled  "ENVIR.  An
Illustration Check Deck,"  will demonstrate  the  sim-
plicity  of  application  and  operation;  it  will  also
demonstrate the flexibility of this ENVIR module in the
creation  of dynamically structured data/information
files of multidisciplinary data.

    To fully understand and appreciate the new  con-
cept in information management offered in the design of
the EDMPAS system, of which ENVIR  is the primary
module,  requires study of "The  Systems Brochure"
which describes the system in  its entirety from a user
standpoint.

    The three data categories selected and converted to
Dynamic Files (for which complete computer runs are
available  for inspection) fully demonstrate most of the
characteristics of  the system projected by GURC. The
EPA/ORD  evaluation was  conducted in  conjunction
with the  divisions represented in an entirely satisfactory
manner.  Furthermore, staff participation in this entire
process permits the expression of the view that all of the
following were fully demonstrated with the system:

         Accommodates and  readily adapts  to  the
         storage  and retrieval of data records consisting
         of  essentially  unlimited  numbers  of de-
         scriptors and  states  (ranges,  resolution) of
         variable bit length

         Operates as a completely free  field  system
         wherein no card column requirements are
         imposed;  i.e., data  can  be  received  and
         automatically integrated into the files without
         any predetermined fixed card column format
         requirements
         Accepts data  which is alpha, alphanumeric,
         numeric or coded without restrictions

         Accepts data  regardless of  its managerial or
         scientific disciplinary content

         Dictionaries for English/binary/English translation
         are built automatically from input data;
         the user is not  required to  furnish a pre-
         determined  and  inflexible thesaurus in
         advance

         Permits any data entry in the dynamic file to
         be corrected, any record to be deleted, and
         more data to be added anytime during the life
         of that dynamic file; new descriptors may be
         added to the dynamic file anytime, and
         disparate batches of data punched in
         different formats may be combined

 Full Boolean capability is available, and all
 three Boolean operators AND, OR, NOT may
 be used while retrieving data

 All the descriptors in the dynamic file are re-
 trievable

 Permits the selective retrieval of any or all records
 in the dynamic file without instructional
 software expansion, without search and compare
 procedures, and without the inversion of
 any descriptors or descriptor-states in the file
 and, therefore, without associated file
 expansion

 Operational  simplicity: permits direct access
 to the file by noncomputer professionals  and
 their utilization of the data for  analytical, in-
 terpretative, documentation, and management
 functions

 Several options to format the query output in
 the  desired  hierarchically  indented   and
 classified manner

 Fast binary  sort of the selectively retrieved
 data

 The user has complete control of designing the
 structure of the dynamic file  to best  suit his
 needs; he decides what descriptors are to be
 incorporated into the dynamic file, and how
 each descriptor is to be handled

 Simple procedures to allocate core to the
 various arrays used by ENVIR, thereby adjusting
 the program dimensions to fit the particular
 data bank and optimizing program performance

 In the ENVIR compressed file, compression of
 the data is achieved by storing data in binary
 form and structuring the files such that every
 bit in core assigned to data  storage is available
 for that purpose

 Operation speed: fast query response was
 shown during the demonstrations on
 EPA hardware.
      It should be emphasized that ENVIR does not use
 the conventional binary codes in the Dynamic File.
 Rather, each unique field state is assigned an ordinal
 binary number representation. Thus, a data field with
 two distinct states uses only one bit, in the form of 0 or
 1. Consequently, a geometrical progression is created so
 that 4 unique states use 2 bits; 8 states, 3 bits; 16 states,
 4 bits; ...; a million-plus states, 20 bits. Each distinctive
 state is assigned a code in this fashion whether it is 1 or
 126 characters long. Each distinctive alpha character
 state is stored once in a conventional binary coded file.
 Numerical states are assigned only ENVIR
 (BIT-VECTOR) codes.
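
 The progression of code widths described above can be checked with a
 short Python sketch. The function names (bits_needed, build_dictionary)
 and the sample field values are assumptions made for illustration; they
 are not taken from ENVIR, whose internal file layout is not given here.

```python
def bits_needed(n_states: int) -> int:
    """Smallest number of bits able to distinguish n_states ordinal codes."""
    if n_states < 1:
        raise ValueError("a field needs at least one state")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 2, in exact integers.
    # Assumption: a single-state field is still given one bit.
    return max(1, (n_states - 1).bit_length())


def build_dictionary(values):
    """Give each distinct field state an ordinal code, in order of first appearance."""
    codes = {}
    for value in values:
        if value not in codes:
            codes[value] = len(codes)
    return codes


# Hypothetical two-state field: each entry costs one bit, however long the text is.
field = ["PRESENT", "ABSENT", "PRESENT", "PRESENT", "ABSENT"]
codes = build_dictionary(field)
width = bits_needed(len(codes))
encoded = [format(codes[value], f"0{width}b") for value in field]
```

 In this sketch the text of each state is kept once in the dictionary,
 and the data file holds only the ordinal codes at the computed width.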

     The system is written in FORTRAN V and is therefore
 essentially machine-independent. Note the
 availability of both IBM and UNIVAC versions of this
 ENVIR system. Files created on UNIVAC 1108 were
 directly transferred to UNIVAC 1110. The dictionaries
 are separated on disc from the compressed data files and
 thus offer computer system independence. It is possible,
 by front-end software, to translate the dictionary while
 transferring the compressed file directly from machine
 to machine. While these system facilities were not
 checked as part of this evaluation, it can be said that this
 ENVIR module of the EDMPAS system appears to
 offer:

          Machine independence

          Computer system independence

          Interlinguistic possibilities through use of
          correction statements.

 THE EDMPAS SYSTEM

     The total EDMPAS system is described in a
 brochure available for issue. To conduct an evaluation of
 ENVIR, the primary module in this EDMPAS system, it
 was considered necessary to acquire a full understanding
 of the total systems concept. It will be noted that, as a
 stand-alone system, ENVIR has the capability to create
 dynamically structured data information files to the
 user's specifications and to provide file content addressability
 through unrestricted Boolean query. This permits system
 application in a cybernetic mode.
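
 As an illustration of what an unrestricted Boolean query over
 descriptor states looks like, the following Python sketch combines
 AND, OR, and NOT freely. It evaluates each record in turn and makes no
 attempt to reproduce ENVIR's own retrieval mechanism; the field names,
 sample records, and helper functions are all hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Predicate = Callable[[Record], bool]


def eq(field: str, state: str) -> Predicate:
    """True when the record's descriptor `field` holds the given state."""
    return lambda record: record.get(field) == state


def AND(*preds: Predicate) -> Predicate:
    return lambda record: all(p(record) for p in preds)


def OR(*preds: Predicate) -> Predicate:
    return lambda record: any(p(record) for p in preds)


def NOT(pred: Predicate) -> Predicate:
    return lambda record: not pred(record)


def retrieve(records: List[Record], query: Predicate) -> List[Record]:
    """Return every record that satisfies the query."""
    return [record for record in records if query(record)]


# Hypothetical environmental records.
records = [
    {"station": "S1", "medium": "water", "year": "1972"},
    {"station": "S2", "medium": "air", "year": "1972"},
    {"station": "S3", "medium": "water", "year": "1973"},
    {"station": "S4", "medium": "soil", "year": "1973"},
]

# (medium is water OR air) AND NOT (year is 1973)
query = AND(OR(eq("medium", "water"), eq("medium", "air")), NOT(eq("year", "1973")))
hits = [record["station"] for record in retrieve(records, query)]
```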

    To  take full advantage of this research mode, the
 ENVIR system at Mississippi State  University is directly
interfaced to display modules  and processing modules of
the  EDMPAS system. The User Manual demonstrates the

-------
commands permitting direct input to modules. For purposes
of continuity in this report, a summary of
EDMPAS and a system diagram are included (Figures 1
and 2).

     Those presentations provided  by  GURC and sub-
sequent work sessions and consultation coupled with
participation  in  file  creation  at  the  University  of
Houston have provided EPA/ORD with all necessary
background  information  appropriate  to  a  general
evaluation.

     A study of the documentation of the system and an
examination of the reports based  upon system appli-
cations already undertaken and completed clearly de-
monstrate the applicability of this system to agency
needs. The EDMPAS system offers a capability to the
agency whereby multiinstitutional, multiagency, multidisciplinary
data can be combined to produce data bases
that can be correlatively analyzed by scientists, with the
resulting information synthesis conducted by EPA
scientists in conjunction with agency and institutional
scientists external to EPA. In this way, scientific
credibility and public credibility of management decisions
can be projected and demonstrated.

THE EDMPAS SYSTEM: HOW TO ACHIEVE AN
IN-DEPTH SYSTEMS ANALYSIS

     The complete  system as previously described  is
mounted and operational  at  the UNIVAC Computing
Center  at  Mississippi  State  University  at  Starkville,
Mississippi. It should  also be noted  that the primary
module  ENVIR is mounted at the University of Houston
UNIVAC 1108 and the Rice University IBM 370 system.
During these demonstration evaluations, dynamic files
were created on UNIVAC and directly read into the RTP
UNIVAC 1110.
                To conduct an in-depth analysis of the  EDMPAS
           system requires only that:

                    Dynamic Files of environmental sciences data
                    be created out of systems data tapes.

                    Such dynamic files contain multidisciplinary
                    data with a  large  number  of fields  and
                    thousands   of  records  with spatial  and
                    temporal coordinates included.

                    The data should include descriptors and states
                    appropriate to (1) objective analysis, (2) graph
                    clustering  analysis,  and  (3) Charanal
                    processing, etc.

                    These files on tape can then be transported to
                    Mississippi State University where they can be
                    read into the EDMPAS  system for in-depth
                    analysis by  the GURC  staff  in conjunction
                    with the Agency scientists.
CONCLUSION

     In line with the GURC objectives, this evaluation of
ENVIR  conducted  by  EPA/ORD  was  designed  to
establish the applicability of the ENVIR system to the
collation, sorting, correlative analysis and interpretation
of selected categories of EPA data. As a consequence of
successful  demonstration/evaluation,  it  is  considered
appropriate to recommend in-depth systems analysis and
application of the EDMPAS system.

     GURC has developed this information management
system to  that degree where, in conjunction  with a user
agency, a  systems design  can be jointly determined for
immediate application to Agency management needs.

-------
                         Figure 1
      A Summary of Environment-Dependent Management
        Process Automation and Simulation (EDMPAS)
         Selective Retrieval & Classifying Functions
                        Figure 2
           EDMPAS System Development Status

-------
                                   SEAS - THE STRATEGIC ENVIRONMENTAL
                                             ASSESSMENT SYSTEM

                                              By Edward R. Williams
 INTRODUCTION

     The system to be discussed is the Strategic Environ-
 mental Assessment System (SEAS). SEAS is a compre-
 hensive  analysis  and  impact  forecast  system (see
 Figure 1). As such, it can serve as a coarse-grain, early
 warning system that will allow the tracing of impacts of
 policy actions in  economic, energy and environmental
 areas. Design goals of the system include:

          Projection of demographic and economic vari-
          ations to derive impacts on pollutant residual
          levels

          Estimation  of treatment costs for major end-
          cost sectors

          Prediction of secondary effects and reactions
          due to various environmental quality levels.

     For  the set of goals outlined in Figure 2, the ex-
 pected characteristics  of the system are detailed as:

          National forecasts  for  a  period  of  10  to 20
          years

          Forecasts disaggregated to regions, such  as
          states

          Forecasts that include  economic levels with
          different  abatement  levels  and  timing
          assumptions

          Comprehensive forecasts to include industrial,
          consumption,  transportation and disposal
          sources in order to be  a comprehensive state
          of emission  report

          Computer-based, allowing rapid parametric or
          alternative assumption assessments.

     Figure 3 provides the original concept of the SEAS
system as designed in the  fall of 1972 and serves as a
basic model for discussing present as well as future devel-
opment plans.  The highest line on  the general form of
the model shown in Figure 3 reveals a diverse and large
data base and a significant interpretive capability (shown
as expert opinion). Most of the data changes to develop
an alternate set of scenarios require analyses  for engi-
neering expectations plus  feasibility and timing checks.

     The major line  of interest is in  the center of
Figure 3. From left to right the concept calls for:

         The inputs, or change agents, for a particular
         scenario. These are factors that would impact
         the environment and its management. They
         include such data as trends in population,
         industrial growth and shifts, energy conservation
         measures, and technologic changes in goods
         production or abatement procedures.

         The processes shown in Figure 3 are the
         activities of the socioeconomic system that
         respond to the change agents. These processes
         include extraction, production, distribution,
         consumption, and disposal. Thus the full set
         of man-emplaced processes is represented for
         several years' forecast levels, providing both a
         measure of economic supply and demand
         and a basis for projecting gross environmental
         pollutants.

         SEAS produces annual emission levels for the
         set of environmental residuals shown in
         Figure 3. These residuals are provided by
         industry and consumer activity at several levels
         of interest, including gross or untreated
         residuals, net residuals (left after treatment),
         secondarily produced residuals, and recycled
         materials. About 125 types of pollutants are
         represented in the general areas of air, water,
         solid waste, pesticides, and radiation residuals.

         From the residuals output there are several ef-
         fects:  costs  associated  with abatement pro-
         cedures,  ambient  levels,  and  other  benefits.
         This concept area  and the following one are
         concept areas of SEAS that remain only par-
         tially  developed at  the present time.

         In addition,  there  are reactions  due to expo-
         sure to residuals or other  anticipated impacts.
         This area has had little development to date

-------
           although some analysis of regional economic
           distribution change is now under study.

           Finally, of course, SEAS would generate many
           reports  with each element to include adjust-
           ment  feedbacks to other elements. This com-
           pletes the conceptual procedure.

  DEVELOPMENT HISTORY

       From the fall of 1972 until the spring of 1973, the
  study group  assembled a  test procedure, using sub-
  modules  where  available, discovering where design ob-
  jectives were  unfeasible  or  the proposed  product still
  under development, and  generally  rating the problem
  areas. A working computer system for testing ideas and a
  plan for a reduced-scope operational model were devel-
  oped; a set of surveys of available theory and supporting
  data was produced. From these results,  a set of plans
  detailing procedures for the  early development of a pro-
  duct to be used in EPA analyses was completed.

       The results of the first phase were constructive
  enough to allow an operational model, called the
  Prototype, to be tested in January 1974; with refinement,
  this Prototype SEAS is now being used for several
  studies in EPA and other agencies. Much of this article is
  based on the Prototype SEAS; however, note that
  development is continuing to expand the scope of the
  forecast tool as well as refine its present components.
  The Prototype of March 1974 has been updated to in-
  clude three new modules; and by next summer, the third
  phase of development  will be completed to include all
  improvements planned at this time. It should be em-
  phasized  that Phase III is not only a time for devel-
  opment but also is the first major phase of user appli-
  cations. The results of these analyses will probably direct
  the  areas  of  further  SEAS  development  after next
  summer.

  THE PROTOTYPE SYSTEM

      The following discussion briefly describes the Pro-
  totype and its parts. Figure 4  shows that the system is
 complex and is designed  to  have a great deal of inter-
 action and feedbacks among its program modules. Its
 major areas consist of the following:

          Economic input/output models
          Energy budgets
          Environmental residuals
          Abatement costs
          Consumption residuals
          Spatial assignment and so forth.

 By judicious treatment of scenario input data, the
 previous areas can be assessed. However, any model of this
 size also requires a good deal of iteration and adjustment
 to fully balance effects; thus, the operating analyst must
 expect to spend a fair amount of time in analysis of
 inputs to properly structure those change agents that
 were called for in the concept.

     Figure 5 is  a pictorial representation of the SEAS
 Prototype system. The first major  module takes macro-
 demographic and macroeconomic  inputs and,  using an
 input/output procedure, develops  a national economic
 forecast for each year to 1985. The economic model,
 called INFORUM, was developed by Clopper Almon at
 the University of Maryland and represents the national
 economy as 185 product sectors. Even this level of detail
 is not sufficient to fully represent those industrial pro-
 cesses  that are  needed  to  show environmental  and
 energy-related effects; therefore, further detail is added
 to give an additional 100 production subcategories. This
 part of SEAS provides dollar volume outputs that can be
 associated with  physical noneconomic outputs for the
 industrial segment of national activities.
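
 The general idea of an input/output procedure can be sketched with a
 static, two-sector Leontief calculation in Python. This is only a toy
 illustration under stated assumptions: INFORUM itself is a far larger
 forecasting model, and the coefficient matrix, demand figures, and
 function name below are invented for the example.

```python
def leontief_output(A, demand, iterations=200):
    """Approximate total output x satisfying x = A x + demand.

    A[i][j] is the amount of sector i's product needed per unit of
    sector j's output. Repeated substitution converges when each round
    of intermediate demand is smaller than the one before it.
    """
    n = len(demand)
    x = list(demand)
    for _ in range(iterations):
        x = [demand[i] + sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x


# Hypothetical technical coefficients and final demand (dollar volumes).
A = [[0.2, 0.3],
     [0.1, 0.4]]
final_demand = [100.0, 50.0]
total_output = leontief_output(A, final_demand)
```

 The resulting dollar outputs by sector are the kind of quantity to
 which physical coefficients, such as residuals per unit of output, can
 then be attached.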

     The next block in Figure 5, RESGEN, provides the
 set of environmental residuals associated with industry.
 The first purpose of this module is to determine the
 gross residuals that are produced by the economic
 process. Next, it associates those environmental abatement
 processes with the gross residuals to modify the levels
 and forms of pollutants that are released to the carrier
 media. An obvious need is to represent information,
 such as time phasing of  emplacing processes, to meet
 environmental regulations  such as BPT and BAT. Notice
 that transformations can  be  intermedia; that  is, treat-
 ment of an air residual could give both water and solid
 waste residuals.

     Two bookkeeping  routines  are  part of SEAS.
 Energy demands  are associated with the economic  out-
 put in order to develop an energy forecast by nine major
 fuel  types for  each  year. The  second bookkeeping
 module  uses the economic level and the degree of envi-
 ronmental treatment in  each  sector to calculate capital
 costs  and  annual  O&M  costs  due to the   level of
abatement.

     All of the data generated thus far are at the
 national level; as an option, the SEAS Prototype can
regionalize all industry data to federal regions and to

-------
states. The major  mechanisms used are base year data
from county-city  data  bases  and the Department of
Commerce OBERS regionalization  shift-share  forecast
system. While these are the state-of-the-art forecast pro-
cedures, overrides  can be  introduced where better data
are known.
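
     The role of overrides in regionalization can be shown with a
simplified Python sketch. It allocates a national total by fixed
base-year shares and rescales the regions that are not overridden so
that the national total is preserved. This is a plain share allocation,
not the OBERS shift-share procedure, and the region names, shares, and
rescaling rule are assumptions made for illustration.

```python
def regionalize(national_total, base_shares, overrides=None):
    """Split a national total across regions, honoring known regional values."""
    overrides = overrides or {}
    unknown = set(overrides) - set(base_shares)
    if unknown:
        raise KeyError(f"override for unknown region(s): {sorted(unknown)}")
    remaining = national_total - sum(overrides.values())
    if remaining < 0:
        raise ValueError("overrides exceed the national total")
    # Share mass of the regions that are still driven by base-year data.
    free_share = sum(s for region, s in base_shares.items() if region not in overrides)
    result = {}
    for region, share in base_shares.items():
        if region in overrides:
            result[region] = overrides[region]      # better data are known
        else:
            result[region] = remaining * share / free_share
    return result


# Hypothetical base-year shares for three regions.
shares = {"A": 0.5, "B": 0.3, "C": 0.2}
default_split = regionalize(1000.0, shares)
with_override = regionalize(1000.0, shares, overrides={"B": 400.0})
```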

     Turning from the industrial drivers of SEAS,  three
other residual producing procedures are derived at state
levels and these are summed to provide national totals.
These are transportation, space and water conditioning,
and  consumer solid waste  disposal. In each case a set of
residuals based  on alternative abatement strategies  is
forecast  and costs of abatement, or  of  disposal, are
developed.

     This description completes the system overview ex-
cept for two modules. The first is a macroforecast of
land use  acreage  based  on the industrial production
levels.  The  second is  the report generator that  sum-
marizes all of the data of all assessment modules. These
data are in terms of national and regional totals.

     The environmental residuals generator for industrial
processes is presented in Figure 6. RESGEN is driven by
two  files, both representing output levels on an annual
basis from the economic  input/output models.  For a
given base year, the system is loaded with a set of multi-
pliers that projects the gross level of residuals that would
be produced  at  the given  production level according to
the environmental  state of the  art then in place. A set of
other coefficients  is provided to reflect  the  environ-
mental  abatement processes  for  that  base year.  For
example, for  a water effluent the industry treatment can
be divided into the fractions that represent fully internal
(onsite)  treatment, pretreatment  onsite  followed  by
municipal plant treatment, and  no treatment process.
Associated with each of the treatment options for 1971
is a  coefficient that represents the amount that is cap-
tured and transformed, leaving the amount of residual
that  escapes in its  original form to media. Additionally,
the materials that  are captured are subdivided into two
groups:  material that is recycled to the economic process
and  material  that  is transformed into different, or sec-
ondary, environmental residuals.
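
     The base-year bookkeeping described above can be sketched in Python
as follows. The class and function names, the coefficient values, and
the assumption that captured material not recycled becomes secondary
residual on a one-for-one mass basis are all illustrative; they are not
taken from RESGEN.

```python
from dataclasses import dataclass


@dataclass
class Treatment:
    share: float          # fraction of gross residual routed to this option
    capture: float        # fraction of that share captured and transformed
    recycle: float = 0.0  # fraction of captured material returned to production


def base_year_residuals(output, gross_coeff, treatments):
    """Apply gross and abatement coefficients to one sector's output level."""
    if abs(sum(t.share for t in treatments) - 1.0) > 1e-9:
        raise ValueError("treatment shares must sum to 1")
    gross = output * gross_coeff
    net = captured = recycled = 0.0
    for t in treatments:
        routed = gross * t.share
        caught = routed * t.capture
        net += routed - caught          # escapes in original form to media
        captured += caught
        recycled += caught * t.recycle
    return {"gross": gross, "net": net,
            "recycled": recycled, "secondary": captured - recycled}


# Hypothetical water effluent for one sector.
options = [
    Treatment(share=0.5, capture=0.8, recycle=0.25),  # fully onsite treatment
    Treatment(share=0.3, capture=0.9),                # pretreatment plus municipal plant
    Treatment(share=0.2, capture=0.0),                # no treatment
]
result = base_year_residuals(output=1000.0, gross_coeff=0.5, treatments=options)
```

With these invented figures, gross residuals split into a net amount
released to media, a recycled amount, and a secondary residual amount
that together account for the whole gross total.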

     An  examination  of  the right-hand  column of
Figure 6 reveals additional data elements considered  in
future years. First, there  may be economic technology
changes. Such changes would modify the gross emission
coefficient; an example would be more efficient use of
mercury in a manufacturing process, resulting in a
reduction of the level that is vaporized as part of the
economic production system. Additionally, any of the
abatement factors may be modified over time through
such means as treatment, improved efficiencies of
amount of residual captured, increased recycling, and so
forth. This is obviously where the data are placed to
reflect meeting goals of BPT, BAT, and so on. The
system is flexible enough to allow a full representation
of time series changes and to  allow easy modification of
the years when requirements are to be met. With the
introduction  of  these  changes  in  the  residual  co-
efficients, the  calculation procedures are much the same
as the base years.
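
     One way to picture time-phased coefficients is as a step schedule
keyed by the year in which each requirement takes effect. The Python
sketch below is an assumption about how such a schedule might be held;
the years, capture values, and function names are hypothetical and do
not describe the actual SEAS data files.

```python
from bisect import bisect_right


def coefficient_for(year, schedule):
    """Return the coefficient in force in `year`.

    `schedule` is a list of (effective_year, value) pairs sorted by year;
    each value applies from its year until the next entry takes over.
    """
    years = [y for y, _ in schedule]
    i = bisect_right(years, year) - 1
    if i < 0:
        raise ValueError("year precedes the base year of the schedule")
    return schedule[i][1]


def shift_requirement(schedule, old_year, new_year):
    """Move one requirement to a different compliance year."""
    return sorted((new_year if y == old_year else y, v) for y, v in schedule)


# Hypothetical capture efficiencies: base year, then two tighter levels.
capture = [(1971, 0.30), (1977, 0.85), (1983, 0.95)]
delayed = shift_requirement(capture, 1977, 1979)
```

Changing a compliance year then only alters the schedule; the residual
calculation itself is unchanged and simply looks up the coefficient for
each forecast year.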

     Figure  7  is a list  of the  primary  residuals;  the
number of secondaries is slightly less than this list. Of
course, many  of the industrial sectors produce similar
residuals so that the number of residual coefficients is on
the order of thousands from industry sources alone.

SEAS COMPUTER CHARACTERISTICS

     Figure  8 is  helpful in discussing the SEAS total
system from  a computer operations sense. In  Figure 1,
SEAS was represented as a set of independent modules
connected by shared  data bases. Figure 8 contains  the
list of programs in  order of  operation. Of the 17 pro-
grams, the main constraints are contained in UMINFOR,
the input/output program, and in DISAG, the
assignment of economic residuals to states. UMINFOR
requires the largest core, a constraint addressed in Phase III by
overlaying the program down to about 175K bytes.
DISAG has the longest  running time and is the only
program requiring data from tapes. The 15-minute run-
ning time required  now is for regionalized data at state
level. We are planning to go to AQCR's and SMSA's by
next June; this detail could triple the running time. For-
tunately, the  need  to regionalize occurs in only  about
one-tenth of the analyses.

     Each program can be operated on call, but other
options have been set up for users, including one called
SUPERSTREAM. It does everything except DISAG,
using 850 lines of JCL.

     SEAS is in  FORTRAN IV and resides on private
disc  packs  on  the Optimum Systems, Inc. 370/158

-------
  computer. It has several users, including two government
  agencies outside of EPA.

  SEAS PHASE III

       Figure 9 shows SEAS Phase III scheduled to be op-
  erating in June  1975. It follows the same patterns as the
  present system but also  includes:

           An improved  macroeconomic driver

           Expanded regionalization  to  SMSA, AQCR,
           and river basins

           Expanded cost modules

           Expanded energy and scarce materials budgets

           A first set of damage functions

           An improved management system to better
           automate  feedbacks  and  internal  data
           assistance.

  EXAMPLE OUTPUTS

       SEAS is set up with a default data base representing
  the historical  projection  data  base overlapped  with
  known legislative  effects.  Some of  the  major assump-
  tions are presented in Figure 10. Using this set of change
  agents, a set of primary pollutant emission levels pre-
  dicted by  SEAS is depicted in Figure 11. The first nu-
  merical column gives the absolute value for the base year
  1971. The next three columns give annual growth  rates
  for  the  specified  three-year periods. There are quite
  different patterns  for the time periods representing the
  timing and input of the various legislation actions. Most
  of the air emissions  and all of the  water pollution levels
  drop in  effectiveness from  1977 to 1980, reflecting the
  input assumptions that  the first level treatment facilities
  are in place by 1974 and that the second level facilities
  are negligible prior to 1980.
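
       Annual growth rates over a multi-year period, of the kind
  tabulated in Figure 11, are conventionally computed as compound rates.
  The Python sketch below shows that calculation under the assumption
  that compounding is intended; the emission figures are invented and
  the function name is hypothetical.

```python
def annual_growth_rate(start_level, end_level, years):
    """Compound annual rate that carries start_level to end_level in `years` years."""
    if start_level <= 0 or end_level <= 0 or years <= 0:
        raise ValueError("levels and period length must be positive")
    return (end_level / start_level) ** (1.0 / years) - 1.0


# Hypothetical emission levels three years apart.
rising = annual_growth_rate(200.0, 266.2, 3)    # growth of about 10 percent a year
falling = annual_growth_rate(200.0, 102.4, 3)   # decline of about 20 percent a year
```

  A negative rate indicates that emissions fall over the period, as
  would be expected when abatement takes effect faster than output grows.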

       A more graphic presentation of these effects is
  presented in Figure 12 through a set of graphs. The solid
  line is the data for the base case extended to 1985. The
  broken line indicates the result if industrial process
  abatement effectiveness and transportation controls are
  held at the effectiveness levels and the installation
  incidence given for the base year 1971. Notice in the first
  three graphs concerning air emissions that there is a
definite inflection in the solid line where the industrial
abatement processes reach full effect in 1977. The first
two emissions are primarily from industrial sources, and
from 1977 to 1985, they are directly in proportion to
the projected industrial growth. The line representing
carbon monoxide is much smoother since the full effect
is not realized until older automobiles are phased out.
Note that for the first few years the dashed line drops
due to abatement techniques begun in 1968.

     Figure 13 provides a set  of 13 energy conservation
measures and the  expected static savings of energy in
Btu's. Additionally a change  in electrical power gener-
ation  toward  greater use of coal is assumed. Totaling the
expected savings provides a  fair representation of the
energy  conservation  half-and-half  plan.  This  set  of
measures is considered speculative and on the outer
bounds of economically feasible levels.

     Figure 14 details some of the impacts expected
from energy conservation measures. The value for each
scenario and each statistic in 1971 is set at 100. The
light bars give the statistical data for 1985 for the base
case scenario. The dashed bars are the results of the
energy conservation measures. The actions to reduce
Btu's are reflected here; the amount of the dashed level
above the 100 line is about half that of the light bar. The
impact on the economy is minor if one accepts two
percent differences in GNP and employment as minor.
The impact on environmental emissions is more
dramatic, at least for air emissions. For water effluents,
little effect is noted, less than the economic change
values.

SEAS USERS AND USES


     Figure 15 presents a list of SEAS users and uses as
well as available documentation of the Prototype
system. A number of applications have been completed.
Within EPA, support is underway to OPE, OSWMP and
the Cost of Pollution Control Reports. Support of RTP
and Water Programs with analyses is expected over the
next few months. Outside EPA, major support and
development collaboration are in progress with the
groups listed in Figure 15.

    The general SEAS documentation plan and pro-
ducts  are  presented in  Figure 16.  Each volume  is
complete  by   itself and   fully  documents  SEAS
information for a particular viewpoint.

-------
SEAS IS A COLLECTION OF INDEPENDENTLY OPERATING MODELS
AND ACCOUNTING PROGRAMS WHICH FORECAST THE FUTURE OF
THE ECONOMY AND ITS ENVIRONMENTAL AND RESOURCE IMPACTS.

                          Figure 1
               The Strategic Environmental Assessment System

   PROVIDE  A COMPREHENSIVE MEANS  FOR EVALUATING THE
   LONG-RANGE IMPACT OF TRENDS, PROCESSES, ACTIVITIES, AND
   POLICIES ON THE ENVIRONMENT.

   •   NATIONAL/REGIONAL LEVEL

   •   MULTIMEDIA POLLUTION GENERATORS

   •   TEN- TO TWENTY-YEAR TIME HORIZON

   •   COMBINED MAN/MACHINE CAPABILITY

   •   ALTERNATIVE SCENARIOS

   •   STATE-OF-THE-ENVIRONMENT REPORT
                           Figure 2
                        SEAS Objectives

-------
                           Figure 3
   (Concept diagram. Legible labels: Change Agents, Processes, Stocks,
    Feedback, Residuals, Effects, Reactions, Reports, Data Base,
    Inf. System.)

-------
                           Figure 4
   (System flow chart. Column headings: DATA SOURCES, PROCESSES,
    DETAILED OUTPUT FILE, SUMMARY OUTPUT. Legible block titles include
    INFORUM DATA BASE, STANDARD INPUTS, REGIONAL DATA, TRANSPORTATION
    DATA BASE, INFORUM (NATIONAL), RESGEN (NATIONAL), INDUSTRY ABATEMENT
    COSTS, COST FOR ABATING AUTOMOBILE POLLUTION, SOLID WASTE MATERIALS,
    LAND USE RESERVES SUMMARY, NATIONAL SUMMARY, and SUMMARY COST OF
    ABATEMENT.)

-------
                           Figure 5
          Pictorial Representation of SEAS Prototype System

   Legible diagram labels:

   RESIDUALS; TREATMENT PROCESSES; TOTAL DISCHARGE (Air, Land)
   Inventories
   AIR & WATER COSTS; ASSOCIATED ABATEMENT COSTS
   DISAGGREGATION

   TRANSPORTATION (Regional)
     Mileage forecasts
     Freight: Truck, Rail, Water, Air, Pipe
     Passenger: Auto, Air, Rail, Rapid Transit

   SPACEHEAT (Regional)
     Fuel & Residuals for: Multi-Family, Single-Family, Commercial
     % treatment for various control techniques

   CONSUMER SOLID WASTE (Regional)
     Solid Waste Forecasts of Material
     Average Disposal Costs per ton for Incinerator, Land Fills, Other

   LAND USE RESERVES (Regional)
     Projection of Acres of Land in each Use Category:
     Cropland Index, Forestland Index


-------
   Inputs: INFORUM Sector Output; INFORUM Sub-sector Output

   Base Year                              Future Year n

   Gross Residual per unit                Technology Changes
   output in base year                    Gross Residual per unit output
                                          in year n

   Total Gross Residuals output           Total Gross Residual output
   by sector in base year                 by sector in year n

   Extent and Efficiency of               New Abatement; Improved
   Abatement in base year                 Extent & Efficiency of
                                          Abatement in year n

   Net Emitted Residuals in               etc.
   Base Year
   Captured Residuals:                    etc.
     Unrecycled (Transformed)
     Recycled (Transformed)

                               Figure 6
                           Residuals Framework
                                                                             77

-------

Industrial/Electrical
  AIR:          TSP, SOX, NOX, CO, HC; 10 particulates plus Mercury
  WATER:        BOD, COD, SS, DS, two Nutrients, Phenols,
                9 dissolved solids, 6 others
  SOLID WASTE:  6 types of combustible material;
                6 types of noncombustible material;
                2 types of mining waste
  PESTICIDES:   Herbicides, Insecticides, Fungicides, Misc.
  RADIATION:    4 Radionuclides to Air; 8 Radionuclides to Water;
                Radionuclides to Land

Transportation
  AIR:          TSP, SOX, NOX, CO, HC; Lead

Residential
  AIR:          TSP, SOX, NOX, CO, HC
  SOLID WASTE:  8 types of combustible material;
                5 types of noncombustible material

                                                     Figure 7
                                            Pollution by Major Sources

-------
Program     Descriptive Title                                       Core          Approximate
Name                                                                Requirement   CPU Time
                                                                    (bytes)       (MINS)

UMINFOR     Inter-Industry Forecasting Model (INFORUM, Modified)    315K           3.0
ENERGY      National Energy Forecasts                               130K           0.5
INSIDE      Sector Disaggregation By Side Equations                 130K           0.5
RESUPDT     RESGEN Coefficient Update                               130K           0.3
RESCOPRT    RESGEN Coefficient Report                                80K           0.25
RESGEN      Generation of National Industrial Residuals             100K           1.5
RESPRT      RESGEN Report                                           105K           1.5
WCSTABT     Water Pollution Abatement Cost                          100K           0.5
ACSTABT     Air Pollution Abatement Cost                             72K           0.5
DISAG       National Industrial Pollution Disaggregation            216K          13.0
DISAGPRT    National Industrial Pollution Disaggregation Report     100K           1.0
PTRANS      Passenger Transportation Residuals                      126K           0.6
FTRANS      Freight Transportation Residuals                         84K           0.4
SPACHEAT    Non-Industrial Space Heating Residuals                   76K           0.3
CONRES      Consumer and Commercial Waste Residuals                  75K           0.2
LANDUSE     Land-Use Reserves                                       100K           0.2
JOSTCOMP    SEAS Post Processor                                     190K           2.5
SEAS                                                                               4

          Figure 8
SEAS Program Characteristics

-------
[Flow diagram; caption not legible. Recoverable labels: USER INPUTS; SEAS PRE-PROCESSOR; NEW 500; NATIONAL ECONOMY MODEL CHASE-INFORUM; MOD 200; "UMINFOR" & "INSIDE"; "WCSTABT" & "ACSTABT"; FEEDBACK (two paths).]

-------
ASSUMPTIONS:
     ENVIRONMENTAL STANDARDS:
          WATER PER CURRENT SCHEDULE
          INDUSTRIAL AIR EMISSIONS FULLY ENFORCED IN 1977
          AUTO EMISSIONS: INTERIM STANDARDS USED FOR 75-76
                          MODELS; FINAL STANDARDS USED FOR
                          77 MODELS
     POPULATION:
          • TOTAL, LABOR FORCE AND HOUSEHOLDS = SERIES E
            PROJECTION (DEPARTMENT OF COMMERCE)
          • STATE SHARES SET BY OBERS
     ECONOMIC:
          • BASED ON PRE-ENERGY CRISIS DATA FOR: DISPOSABLE INCOME,
                                                 UNEMPLOYMENT LEVELS
          • STATE SHARES SET BY OBERS
     ENERGY USE/TECHNOLOGICAL CHANGE:
          • PROJECTED FROM HISTORICAL TRENDS
                                Figure 10
                            The Basic Data Scenario

-------
                                              GROWTH RATE - ANNUAL
PARAMETER                  VALUE 1971*      71-74      71-77      71-80

AIR
  PARTICULATES                  20.50      -10.1%     -18.0%     -12.1%
  SOX                           30.66       -1.3%      -7.2%      -4.5%
  NOX                           19.91       +4.1%      +2.1%      +1.0%
  HC                            29.78       -3.2%      -4.8%      -4.7%
  CO                           125.70       -4.6%      -7.5%      -8.9%

WATER
  BOD                            8.13      -11.7%     -23.0%     -15.6%
  COD                           10.01      -11.1%     -21.7%     -14.7%
  SUSPENDED SOLIDS               9.16      -11.9%     -22.8%     -15.5%
  DISSOLVED SOLIDS              25.30       -1.5%      -3.7%      -2.2%
  NUTRIENTS                       .06       -7.0%     -38.9%     -27.2%
  ACIDS                           .58      -11.9%     -15.5%     -10.3%
  BASES                           .11      -17.1%     -66.4%     -51.3%
  PROCESS WATER                 22.79        0.6%      -3.4%      -1.8%

SOLID WASTE
  COMBUSTIBLE                  725.80        1.8%       2.2%       2.1%
  NON-COMBUSTIBLE               63.03        1.1%       3.1%       2.[illegible]%
  MINING                      3316.19        3.4%       3.2%       3.0%

PESTICIDES                       .136        1.8%       2.0%       2.0%

RADIATION (POWER PLANTS)                    41.8%      40.2%      34.2%
  AIR                            4578
  WATER                             7
  LAND                      1,150,000

* RESIDUALS IN MILLION TONS EXCEPT: WATER IN TRILLION GALLONS,
                                    RADIATION IN THOUSAND CURIES

                                          Figure 11
                                Summary: Pollution Data Output
                                  (National - All Sources, Net)

-------
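[Line charts covering 1971 to 1985, one panel each for PARTICULATES, SULFUR OXIDES, CARBON MONOXIDE and BOD. Two cases are plotted: CURRENT LEGISLATION and HELD AT 1971. The caption and plotted values are not legible.]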

-------
   Improved Insulation - savings of 25% in new housing (space          1.23 quads
                         conditioning & water heating - 3[illegible]
                         housing stock)

   Old Stock Housing - 10% savings (comprise 70% of housing stock)     1.15 quads

   Substitution to Gas & Petroleum Products                            2.54 quads

   Reduction of non-all-electric housing stock by conversion to         .196 quads
     multi-family dwellings - 4.5% savings

COMMERCIAL
   Better Insulation - savings of 35%                                  1.76 quads

   Retrofitting of existing buildings - savings 15%                    1.76 quads

   New Stock with total energy systems - savings of 30%                 .50 quads

INDUSTRIAL

   Better Housekeeping - 25% savings                                   6.76 quads

   Recycling & other Housekeeping measures                             2.81 quads

TRANSPORTATION

   Lighter autos, less specific power, lower acceleration,             7.10 quads
   improved driver procedures, reduction in non-essential
   driving caused by higher fuel costs - 35% savings

   Reduction of traffic by 1/2 from car-pooling                        1.20 quads

   Long-haul trucking replaced by train - 25% savings of fuel           .68 quads

   Reduction of short-haul air travel by 40% by substitution            .90 quads
   of bus and rail - 35% savings

-------
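[Chart; caption not legible. Vertical scale marked 100 and 200. Plotted quantities are labeled GNP, JOBS, BTU, INDUSTRIAL BTU, PARTICULATES, SOX and HC. Plotted values are not recoverable.]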

-------
 USERS AND USES
 EPA

    Cost of Clean Air,  Water - 1975
    AQMA Regional Definitions
    Industrial Waste Stream Analysis
    Energy Conservation Analysis

 National Commission on Water Quality

    Fall - 1974    Economic Analysis
    Fall - 1975    Final Report

 CEQ

    Environmental Quality 1974 Report


 Water Resources  Council

    Regional Water Effluent Forecast Data
    Alternate  Scenario  Outcomes
NIH (RTP)

   Long Range Economic/Environmental  Projections
   (Benefits Module)
Bureau of Labor Statistics

   Environmental I/O Data
   (Review of Productivity Data)
                     Figure 15
                 SEAS Users and Uses

-------
[Flow diagram. Box labels: DESIGN DOCUMENTATION; WORKING PAPERS; DESIGN PAPERS; PROGRAM DESIGN SPECS; IMPLEMENTATION PROCESS; PROGRAM IMPLEMENTATION AND VALIDATION; FINAL DOCUMENTATION; PROGRAMMERS MANUAL; DATA SPECS MANUAL.]

                           Figure 16
                     SEAS Documentation Plan

-------
                             A COMPUTER CONTROLLED DATA ACQUISITION AND
                        RETRIEVAL SYSTEM FOR USE IN AIR MONITORING PROGRAMS

                             By Marvin Hertz,* Kirby Kyle, Anne Duke, and George Ward
 INTRODUCTION

     Rockwell  International Science Center  and the
 Human Studies Laboratory of the  Environmental Pro-
 tection Agency (EPA), have designed and developed an
 automated data acquisition  system  for use in the Com-
 munity Health Air Monitoring  Program (CHAMP). The
 purpose of the CHAMP system is to acquire reliable air
 quality data for use in  the Community Health Envi-
 ronmental Surveillance System  (CHESS). The technical
 objective  of the CHESS program is to correlate the qual-
 ity of  the air  with the  health of  the  population. An
 in-depth health effects study requires highly reliable
 data; the quantities of data required from each of the
 CHESS neighborhoods indicated a need for automated
 acquisition, processing, and validation of the physical data.

     To meet the program objectives, the CHAMP net-
 work will have 23 remote stations in  six geographical
 areas around the country  monitoring the  air quality.

     Changing  program  requirements and advances in
 aerometric instrumentation made flexibility in the acqui-
 sition  system mandatory; capability for growth in the
 program suggested  use of intelligent controllers at the
 remote stations. In order to avoid the loss of large quan-
 tities of data, rapid retrieval and diagnosis of data valid-
 ity to alert remote station operators of failures or poten-
 tial malfunctions in the instrumentation is required.

     To satisfy  various  program objectives, a central
 computing facility  at  Research Triangle Park, North
 Carolina,  with telephonic communication with  each of
 the remote acquisition systems and data retrieval capability
 was included in the program plan. A Digital Equipment
 Corporation (DEC) PDP-8 minicomputer was selected
 to control functions at the remote stations, and
 PDP-11/05 and PDP-11/40 computers operating in a
 dual processor mode were chosen for the central
 controller (Figure 1).

REMOTE

     The PDP-8 minicomputer in the remote station
serves as an interface between the analyzers and associated
system, magnetic tape data storage, the remote
 field service operator, and the telecommunications net-
 work.

     The  PDP-8  data acquisition  software  package,
 DAX, handles all of the remote functions of acquisition
 and storage of data on magnetic tape, limited hardware
 control, interaction with the  field  operator and tele-
 communications. An innovation included in the software
 is the  concept of an "instrument handler" to allow com-
 plete flexibility in controlling instrumentation with  dis-
 similar operating characteristics.  Each instrument han-
 dler is a sequence of code specifically pertaining to the
 functional peculiarities of the sensor; the handler's
 capabilities include actuation of appropriate valving,
 detection of conclusion of sensor cycle time, and
 observation of settling of a valve. The "instrument" is defined
 as the group of associated analog signals and status
 signals required for validation of the value detected from
 an aerometric sensor.

     The analog signals  are switched  through  a 48
 channel multiplexor  and digitized using a 41 microsec/
 word analog to digital converter. The analog signal from
 each channel is sampled once a second; values for each
 channel are averaged by the  computer over a selectable
 period  of  time  (nominally 5 minutes).  In addition,
 96 bits of digital information representing system  status
 conditions are sampled. At the end of the averaging  pe-
 riod, the detected voltage and  status signals relevant to
 the validity of that datum are placed in a buffer in core.
 When  the buffer fills, its  contents are  written to mag-
 netic tape to await transmission on request from  the
 central computer.
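
The sampling and buffering cycle described above can be sketched as follows. This is a minimal illustration, not the DAX code, which ran on a PDP-8: the class name `ChannelAverager`, the `buffer_capacity` value and the use of a Python list as a stand-in for the magnetic tape are all assumptions. The status bits stored with each average are simply those present at the last sample of the period.

```python
from dataclasses import dataclass, field
from statistics import fmean


@dataclass
class ChannelAverager:
    """Average once-per-second samples and buffer the results for tape."""

    period_s: int = 300        # nominal 5-minute averaging period
    buffer_capacity: int = 12  # hypothetical number of records per tape write
    samples: list = field(default_factory=list)
    buffer: list = field(default_factory=list)
    tape: list = field(default_factory=list)  # stand-in for magnetic tape

    def sample(self, volts: float, status_bits: int) -> None:
        """Called once per second with the digitized voltage and status word."""
        self.samples.append(volts)
        if len(self.samples) < self.period_s:
            return
        # End of the averaging period: keep the mean voltage and status bits.
        self.buffer.append((fmean(self.samples), status_bits))
        self.samples.clear()
        if len(self.buffer) == self.buffer_capacity:
            # Buffer full: write one block to tape and start a new buffer.
            self.tape.append(list(self.buffer))
            self.buffer.clear()
```

With `period_s=300`, one record is produced per channel every five minutes; a shorter period can be used to exercise the logic quickly.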


     One of the several unique features of the CHAMP
 system is the extensive auxiliary information  which is
 acquired and transmitted with  the aerometric data. Of
 the 48 analog signals, 20 are presently allocated for
 direct  ambient  measurements (primary  channels), and
 the remainder  determine  system operating conditions
 (secondary channels) (see Table I).

     This information is  transmitted   to  central  for
 machine validation and is available to the field operator
 via  local teletype to  aid in  daily maintenance of the
*Speaker

-------
                                                  Table I
                               Analog Signals, Status Checks and Validity Criteria

Name                                 Instrument   Analog     Validity Criteria
                                                  Slot No.

Wind speed                                1           0      Limit check (0 volts would assume offset); 15 volt power supply
Wind direction                            2           1      Limit check (0 volts => invalid); 15 volt power supply
Outside temperature                       3           2      Limit check (0 volts => invalid); 15 volt power supply
Relative humidity                         4           3      Limit check (0 volts => invalid); 15 volt power supply
Barometric pressure                       5           4      Limit check (0 volts => invalid); 15 volt power supply
Inside temperature                        6          14      Limit check (0 volts => invalid); 15 volt power supply
Particulate manifold impactor flow        7          16      Limit check (0 volts => invalid); Deviation from nominal
Particulate manifold filter flow                     17      Limit check (0 volts => invalid)

Carbon monoxide                                      11      Ambient = 0 volts => invalid; Range; Calibrate; Power;
                                                               CO flow indicator; Intake manifold
                                                             Calibrate = Above + Instrument 13

Ozone; Ozone flow; Ethylene flow                   35; 33    0 volts => invalid; Range; Calibrate; Power;
                                                               Sample test point on; Flow out of limits; Sample manifold
                                                             Sample flow in limits; Mass flow test
                                                             Ethylene flow in limits; Ethylene flow value; Mass flow test;
                                                               Ozone air override
                                                             Calibrate = Above + Instrument 13

PAN                                      10          18      PAN air flow; PAN calibration value; Sample manifold
Hydrogen flow                                        29      H2 flow within limits; PAN H2; Mass flow test

-------
                                                  Table I
                                                 Continued

Name                                 Instrument   Analog     Validity Criteria
                                                  Slot No.

Sulfur dioxide                           11          38      0 volts => invalid; Range; Flame Out; Calibrate; Power;
                                                               Sample flow test; Sample manifold
                                                             Flow within limits; Sample flow test; Mass flow test
                                                             Flow within limits; H2 flow off; Mass flow test;
                                                               Flame override; H2 pressure out of limits
                                                             Calibrate = Above + Instrument 12

NO                                       12           6      0 volts => invalid; Range
NO2                                                   7      Range
NOx                                                   8      Range
                                                  For all    Calibrate; Power; Sample test port - sample manifold
NO-NOx flow; Converter temperature                   36      Flow within limits; Mass flow test; Sample test port
                                                     37      Flow within limits; Mass test value; O[illegible] value;
                                                               O[illegible] pressure out of limits
                                                     31      Within limits
                                                     45      Within limits; NOX-NO/NO [illegible]
                                                             Calibrate = Above + Instrument 13

-------
                                                  Table I
                                                 Continued

Name                                 Instrument   Analog     Validity Criteria
                                                  Slot No.

Dilution air - Range 1                   13          41      0 volts => invalid
Dilution air - Range 2                               42      All Word 7
Dilution air - Range 3                               43      Heatless air dryer; Air pressure out of limits; Humidifier
Calibration flow                                     34      NO on; H2S on; CO on; Mass flow test; Gas to manifold on
Oxygen flow                                          40      O3 flow measure; NO2 flow measure; SO2 flow measure;
                                                               Mass flow test; SO2 span gas; NO2 span gas; O3 span gas
Ozone generator                                      28      O3 generator on
SO2 permeation - tube temperature                    44      Within limits

Combustion alarm                         14          46      Within limits; LEL; Exhaust manifold; 24 volt power supply;
                                                               Illegal entry; H[illegible] ext
Reference voltage                                    32      Within limits

-------
 instrumentation. The local operator may include supplementary
 information as "manual entry data" or "journal
 entries"; this information is recorded on magnetic tape
 and transmitted to central as separate data for use in the
 quality  assurance reports and  area  engineer  data
 validation.

     Since each instrument may have  a  different aver-
 aging or cycle time, the capability for asynchronous han-
 dling had to be implemented. An interrupt is generated
 by the  system clock at a fixed rate and an instantaneous
 voltage is  acquired  from each sensor; the handler for
 each on-line  instrument is called in the acquisition mode
 and all  necessary operations of control, acquisition, and
 storage are performed. The instrument handler may also
 be called in three other modes: 1) the initialize mode,
 which allows the field operator to specify operating
 parameters; 2) the reset mode, which resets cycles and
 initializes the time sequencing; and 3) the calibrate mode,
 which  allows  the  field  operator to perform  the cali-
 bration  procedures interactively through  the console
 teletype.
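
A minimal sketch of the handler dispatch described above follows. It assumes one handler object per instrument and a dictionary of instantaneous voltages keyed by instrument name; the names `Mode`, `InstrumentHandler` and `clock_interrupt` are illustrative only and do not come from the DAX package. Valve actuation and console interaction are omitted.

```python
from enum import Enum, auto


class Mode(Enum):
    ACQUIRE = auto()
    INITIALIZE = auto()
    RESET = auto()
    CALIBRATE = auto()


class InstrumentHandler:
    """One handler per instrument, each with its own cycle length."""

    def __init__(self, name: str, cycle_ticks: int):
        self.name = name
        self.cycle_ticks = cycle_ticks  # clock ticks per averaging cycle
        self.readings = []              # voltages seen in the current cycle
        self.averages = []              # one value per completed cycle
        self.calibration = []           # readings taken in calibrate mode

    def call(self, mode: Mode, value=None) -> None:
        if mode is Mode.INITIALIZE:
            self.cycle_ticks = int(value)   # operator-specified parameter
        elif mode is Mode.RESET:
            self.readings.clear()           # restart the time sequencing
        elif mode is Mode.CALIBRATE:
            self.calibration.append(value)
        else:  # Mode.ACQUIRE
            self.readings.append(value)
            if len(self.readings) == self.cycle_ticks:
                self.averages.append(sum(self.readings) / len(self.readings))
                self.readings.clear()


def clock_interrupt(handlers, voltages):
    """On each clock tick, call every on-line handler in acquisition mode."""
    for handler in handlers:
        handler.call(Mode.ACQUIRE, voltages[handler.name])
```

Because each handler counts its own ticks, instruments with different cycle times complete their averages independently even though all are driven by the same clock.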

     The  DAX program is organized around a priority
 interrupt  scheme  coupled to foreground/background
 mode of operation; data acquisition occurs at the highest
 priority level to guarantee no loss of data. Background
 functions are handled as processor time is available.

     DAX handles the data transmission  task in  fore-
 ground  mode. When polled from central, the  magnetic
 tape is backed up to the position on the tape where the
 last transmission terminated. Data is read into core and
 transmitted record by record with an acknowledgement
 protocol at  the conclusion of the transmission of each
 record.

 TELECOMMUNICATIONS

     Data is retrieved on the request of the central computer
 system from each of the remote stations via a dial-up
 phone line at two-hour intervals. The central and
 remote computers converse via a telecommunications
 system consisting of modems operating in full duplex
 mode at the rate of 1200 baud from remote to central
 and 150 baud in the reverse direction. Polling is under
 complete control of the central controller and is the
 primary function of one of the two central processors. A
 file on disk contains the phone number of each station
 in the format required by the Automatic Calling Unit
 (ACU) interfaced to the PDP-11 system. An alterable
 polling sequence queue is also disk resident. A rigid
 protocol has been established to guarantee accurate
 transmission and retrieval of data. Central makes several
 attempts to establish contact with a remote station before
 abandoning the attempt and placing the station at the
 bottom of the polling queue. A hardware carrier detect
 protocol establishes the communications link. Each
 frame is checked for parity and framing errors by the
 modem controller. Checksums are computed for each
 512 frame record and compared by the computer. An
 acknowledge character is exchanged indicating correct
 receipt of the record. Should any of the tests fail, several
 transmission retries are made. Communications are
 terminated by receipt of a character from the remote
 indicating the end of data or by failure of the remote to
 transmit in the requisite period of time.
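
The record-level checksum, acknowledgement and retry steps can be sketched as below. The text does not give the checksum algorithm, the acknowledge character or the retry count, so the 16-bit additive checksum, the `ACK`/`NAK` byte values and `max_retries=3` are assumptions chosen for illustration; `line` is a hypothetical callable standing in for the modem link.

```python
RECORD_FRAMES = 512            # frames (characters) per record, per the text
ACK, NAK = b"\x06", b"\x15"    # assumed acknowledge / reject characters


def checksum(record: bytes) -> int:
    """Assumed 16-bit additive checksum over the record."""
    return sum(record) & 0xFFFF


def make_message(record: bytes) -> bytes:
    """Append the two-byte checksum to the record."""
    return record + checksum(record).to_bytes(2, "big")


def receive(message: bytes) -> bytes:
    """Receiving side: recompute the checksum and answer ACK or NAK."""
    record, sent_sum = message[:-2], int.from_bytes(message[-2:], "big")
    ok = len(record) == RECORD_FRAMES and checksum(record) == sent_sum
    return ACK if ok else NAK


def transmit(record: bytes, line, max_retries: int = 3) -> int:
    """Send one record over `line`, retrying until it is acknowledged.

    Returns the number of attempts used.
    """
    for attempt in range(1, max_retries + 1):
        if receive(line(make_message(record))) == ACK:
            return attempt
    raise ConnectionError("record not acknowledged; abandoning transmission")
```

A caller would pass each 512-frame record read from tape to `transmit` and move on to the next record only after an acknowledgement, which mirrors the record-by-record protocol described above.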

 CENTRAL CONTROLLER

     The central controller for the CHAMP network is a
 dual processor system with a full  complement  of input,
 storage,  and display peripherals.  The  heavy burden on
 processor time placed by  the telecommunications and
 real-time processing of the large quantities of data antic-
 ipated justified the choice of the dual processor system.
 A PDP-11/40 was selected to perform the tasks associated
 with management of the large data base to be
 generated by the network. The telecommunications and
 real-time processing tasks are handled by a
 PDP-11/05 computer. The two processors are interconnected
 by a device called the UNIBUS window, which
 takes advantage  of the  unified asynchronous data path
 architecture of the 11 system. The window allows each
 processor to address core and peripherals on the other
 processor as if it  were its own. In addition,  the DEC
 Memory Management hardware option was added to the
 PDP-11/40 to handle addressing above 32K in the 16-bit
 system.  An  extensive  complement of peripherals in-
 cluding two  cartridge type disks, three tape drives, an
 electrostatic  printer-plotter, line printer, and CRT dis-
 play   were  initially   selected;  the  rapid  retrieval
 requirements for large quantities of historical data sug-
 gested  addition  of  a  2316 type  disk  file  and
 implementation of an extensive file management system
 after the start of the program.


     One  of the program requirements is timely diag-
 nosis of system faults at central; the need to perform a
 cursory check on all data  as  received was apparent.
 Hence, the PDP-11/05 processor was dedicated to the
 telecommunications and real-time checking of values
against preset emergency  and instrument malfunction
 limits, and storing of data on a magnetic tape as received.
 The Level 1 tapes so created are an image of the
 tapes recorded at the remote station, but with the data
 interleaved by station in the polling order. The 11/40

-------
 processor is devoted primarily  to data  validation  and
 quality assurance tasks. Although the DEC supplied Disk
 Operating System (DOS) is suited to data reduction,
 analysis and display, the real-time applications and pro-
 blem  of shared  peripherals  required  additional  super-
 visory  software.   A  Real-Time  Monitoring System
 (RTMS), developed  as  a Rockwell proprietary package,
 is used in  the  CHAMP network  to allow the two pro-
 cessors to  act in a single foreground/background  mode.
 RTMS controls peripherals access, implements memory
 management using the  core in the 11/40 not accessible
 by DOS, and protects  the  core memory allotted to each
 task  from  destruction  by other  tasks. RTMS supports
 the  dual modes of  operation of the 11 system,  kernel
 and user; RTMS in the 11/40 runs the kernel mode  for
 system protection with the  nonreal-time work  in  the
 background under DOS running in user mode.

     A  fundamental  system objective  is  to provide
 machine  validation  of data   to  ensure  reliability
 (Figure 2). Each primary  channel datum is  checked  for
 nonzero voltage, normal setting of all valves, power to
 the instrument and digitally measured tolerances for
 proper ambient sampling. Related secondary channel an-
 alog signals  are tested  to  verify  that values are  within
 tolerances  and status  conditions  necessary for  correct
 sampling are checked. Information referring to the gen-
 eral system operation is checked  on a  five-minute basis.

     Once  a datum is determined by the machine to be
 valid, the engineering unit value is computed by applying
 the  calibration constants  to the  voltage; this value is
 stored on tape and disk. This Level 2 (validated)  data is
 used in further processing and, when accepted by  the
 area engineering,  is the  end product  of the CHAMP
 system.
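
The validation and conversion steps just described might look like the following sketch. It assumes a single related flow reading per datum and a linear calibration (slope and intercept); the real system checks many more status conditions per instrument (see Table I), and the names `Datum`, `is_valid` and `level2_value` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Datum:
    volts: float      # averaged primary-channel voltage
    valves_ok: bool   # status: valves in the normal sampling position
    power_ok: bool    # status: power to the instrument
    flow: float       # related secondary-channel reading


def is_valid(d: Datum, flow_limits: tuple) -> bool:
    """Machine validation: nonzero voltage, status bits, flow in tolerance."""
    low, high = flow_limits
    return (d.volts != 0.0 and d.valves_ok and d.power_ok
            and low <= d.flow <= high)


def to_engineering_units(volts: float, slope: float, intercept: float) -> float:
    """Apply calibration constants; a linear response is assumed here."""
    return slope * volts + intercept


def level2_value(d: Datum, flow_limits: tuple, slope: float, intercept: float):
    """Return the engineering-unit value for a valid datum, else None."""
    if not is_valid(d, flow_limits):
        return None
    return to_engineering_units(d.volts, slope, intercept)
```

Only data that pass every check are converted, so a `None` result marks a datum that would be excluded from the Level 2 set.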

     Since  a crucial  factor in obtaining accurate data is
 the calibration of the instrumentation, the same in-depth
 validation procedure is applied to the calibration
 operation to verify proper flows from the calibration gases
 and correct valve setting on the flow system. If an
 improper calibration is detected at central, the field
 operator is contacted and advised to repeat the
 operation. The calibration constants are computed from the
 known concentrations of the source gases; the constants
 are checked  for goodness  of fit to the measured  points
 and for linearity where applicable.
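
One way to compute calibration constants and a goodness-of-fit figure is an ordinary least-squares line through the measured points, sketched below. The text does not state which fitting method or fit statistic CHAMP uses, so the straight-line model and the use of R squared are assumptions; `fit_calibration` is a hypothetical name.

```python
def fit_calibration(volts, concentrations):
    """Least-squares line: concentration = slope * volts + intercept.

    Returns (slope, intercept, r_squared). Assumes at least two distinct
    voltages and at least two distinct concentrations.
    """
    n = len(volts)
    mean_v = sum(volts) / n
    mean_c = sum(concentrations) / n
    sxx = sum((v - mean_v) ** 2 for v in volts)
    sxy = sum((v - mean_v) * (c - mean_c)
              for v, c in zip(volts, concentrations))
    slope = sxy / sxx
    intercept = mean_c - slope * mean_v
    # Goodness of fit: share of the variance explained by the line.
    ss_res = sum((c - (slope * v + intercept)) ** 2
                 for v, c in zip(volts, concentrations))
    ss_tot = sum((c - mean_c) ** 2 for c in concentrations)
    return slope, intercept, 1.0 - ss_res / ss_tot
```

A calibration whose R squared falls below some chosen acceptance level could then be flagged so that the field operator is asked to repeat it; the acceptance level itself is not given in the text.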

     Initially, the validation procedure was envisioned as
 a  daily  procedure involving  processing of  the  Level  1
 tapes  from the previous  day to  create a Level  2 tape
 containing only machine validated data. Addition of the
large random access disk impacted the procedure for val-
idation; data will be recorded on  the disk as polled. It
became apparent  that validation  could be done as  a
batch operation at regular intervals or on demand.
Hence, a user could have access to validated data at any
time; tape need be  used only as a secondary storage
medium for archival  or temporary storage purposes. The
validation  procedure will reveal  system problems and
alert cognizant personnel at central; hence, more fre-
quent  validation will contribute to timely anticipation
and prevention or correction of system difficulties.

     Several additional  features have been  included in
the background  software  to assure prompt diagnosis of
system problems at central. The system status condition
and analog values may be displayed on the CRT as each
station is polled. Secondly, a user may "demand" poll or
alter the  polling sequence to retrieve all  data from a
particular station since the last polling. Thirdly, a user at
the central facility may  "view" the real-time operation
of a remote station. The  one-second values, as sampled
at a remote station, may be transmitted to the central
facility and displayed on the CRT within the limitations
of the transmission rate.

     A quality assurance file is maintained on a separate
disk unit. The  maintenance  operations  performed at
each remote site are encoded and entered as  "manual
entry" data, transmitted to central and  recorded in this
file during validation procedure. Data  acquired during
the calibration  procedure is  saved.  Plots of  the cali-
bration values over an extended period of time  may
indicate a steady degradation in  a sensor which would
not be detected during the normal validation procedure.
Correlations  between parameters or stations  may in-
dicate  a problem which could escape the validation tests.
Tabulations and  displays of the above, in addition to
reports of failures and other anomalies, are printed on  a
regular basis and are  available to the user on demand.

     A statistical package which fully  utilizes the periph-
eral equipment will  be  available  to users. The program
will collect the requested parameters  from the data base
and perform the statistical summary desired. Included in
the  repertoire will be  calculation  of frequency dis-
tributions, correlations, threshold  exceedances, and
minima, maxima and mean values. Reports may be tab-
ulated  or plotted; a  software  goal  is to  make all
summaries device independent, allowing the user to se-
lect the  output  media at run time. The statistical sub-
routines will be of particular value to the field manager,
quality assurance personnel, and the area engineers at
EPA.
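
Several of the summaries listed above (frequency distribution, threshold exceedances, minimum, maximum and mean) can be sketched in a few lines. The function name `summarize`, the fixed-width binning scheme and the returned dictionary layout are assumptions made for illustration; correlations and plotting are left out.

```python
from collections import Counter
from statistics import fmean


def summarize(values, threshold, bin_width):
    """Summarize one parameter's validated values.

    The frequency distribution is keyed by the lower edge of each
    fixed-width bin; exceedances count values strictly above `threshold`.
    """
    bins = Counter(int(v // bin_width) * bin_width for v in values)
    return {
        "min": min(values),
        "max": max(values),
        "mean": fmean(values),
        "exceedances": sum(1 for v in values if v > threshold),
        "frequency": dict(sorted(bins.items())),
    }
```

Returning plain data rather than printing keeps the summary independent of the output device, in the spirit of the device-independence goal stated above.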

-------
     Final acceptance of validated data is the responsibility
 of the area engineer. Data editing capability is
 provided prior to the production of the final data base
 tape for the health effects study. It is anticipated that
 his subjective evaluation and verification of data quality
 will entail heavy usage of the statistical and display tools
 available to him.

     The features of the automated CHAMP system
 represent a significant advance over previous systems, both
 in terms of quantity and reliability of data.

-------
[Flow diagram. Recoverable labels: ANALOG VOLTAGE; REMOTE; CENTRAL.]

                                        Figure 1
                                   CHAMP Data Flow
[Flowchart. Recoverable box labels: LEVEL 1 DATA; VALVES SET CORRECTLY; REJECT IF NOT; CHECK FLOW RATE(S); REJECT IF OUT OF TOLERANCE; CALIBRATION CONSTANTS; CONVERT FROM VOLT TO ENG. UNITS; NO; FLAG AS POSSIBLY INVALID.]

                                        Figure 2
                                   Machine Validation

-------
                               AN AQUATIC ECOSYSTEM SIMULATOR AND VS/8

                                                By David M. Cline
 INTRODUCTION
      The Aquatic  Ecosystem Simulator (AEcoS) is a
 large-scale simulation chamber in which a variety of envi-
 ronmental parameters may be controlled and monitored
 over a wide dynamic range. AEcoS will  be utilized to
 verify aquatic ecosystem mathematical models with bio-
 logical experiments conducted  under controlled condi-
 tions. This approach will  provide insight into aquatic
 ecosystems that cannot be found in bench experiments
 or in the real world.

      A minicomputer system was chosen  to control and
 monitor the AEcoS, as well as to process the acquired
 information. The system has since evolved from a min-
 imum configuration, which was specified to perform cal-
 ibration of the many transducers, to a fairly sophis-
 ticated computer system that allows full  control of the
 environmental  parameters,  data  acquisition  of trans-
 ducers as well as analytical  instruments, and background
 processing.

     Considerable  attention  was  given to the  area of
 background processing and, as a result, a  real-time soft-
 ware operating system was developed to allow multiple
 independent  background tasks  to execute  in a virtual
 memory mode. Since the software was not available
 from the manufacturer, a decision was made to imple-
 ment  the  real-time  virtual memory  operating system
 in-house.

 AQUATIC ECOSYSTEM SIMULATOR

     AEcoS is the implementation of an  idea spawned
 by  Dr.  Walter M.  Sanders,  III,  Chief,  Freshwater
 Ecosystem Branch, Southeast Environmental Research
 Laboratory, National Environmental  Research Center,
 Corvallis, Oregon. The concept, initially conceived in the
 early 1960's, has been transformed into a completed facility
 that was formally dedicated in March 1973. Briefly, AEcoS is
 a chamber 72 feet long, 12 feet wide, and 9 feet high and
 houses an experimental stream 64 feet long, 1.5 feet wide,
 and 2 feet deep. The artificial stream consists of eight
 8-foot Teflon-lined Plexiglass sections; thus, any of the
 8-foot sections may be altered without altering the other
 sections.

     Air and influent water temperature can be controlled
 with an accuracy of one-half of 1°C over the full
 dynamic range of 0° to 40°C. Independent of temperature,
 relative humidity can be controlled to within 2
 percent over the dynamic range of 20 to 95 percent and
 can be changed by a maximum of 60 percent in one
 hour.

     The influent water supply can provide up to a max-
 imum of 2000 gallons per day through the use of four
 30-gallon per hour  stainless steel deionizer-distillation
 units, each of which has  a  storage capacity of 500
 gallons. Dissolved gases, nutrients, and pollutants may
 be introduced  into the water for  preparation  of  a tai-
 lored experimental design.


     Radiant energy simulation within the chamber is
 accomplished by combining lamps of different colors
 and intensities. The  radiant  energy system consists of a
 light feedback control system, 833 fluorescent lamps, and
 100 infrared lamps, all of which consume a maximum of
 200 kilowatts.

      Instrumentation for AEcoS includes 48
 temperature sensors at various depths along the water
 channel. Sensors are also installed at each eight-foot
 section to measure pH. Nine chemical analyses are
 performed at each of nine locations by hydraulically
 multiplexing fluid samples with a series of solenoid
 operated valves and routing them into only nine
 Technicon autoanalyzers.


 COMPUTER HARDWARE


     A  Digital  Equipment Corporation  PDP-8/E mini-
 computer was  specified initially in lieu of hard-wired
 logic to act  as a calibration system for the AEcoS instru-
 mentation system.  The  decision  was based  on cost
 effectiveness and  flexibility of the minicomputer versus
 hard-wired logic. Subsequently, a management level deci-
 sion was made  to upgrade the minicomputer in order to
 provide control of the  environmental  parameters (air
 temperature, water temperature, and relative humidity),
 the water flow, and the radiant energy system. In addi-
 tion, the capability to perform preliminary data analysis
and  the capability of exporting the acquired data in a
 form compatible with  EPA's computer facilities were re-
quired.

-------
     To this end, the following hardware was added to
upgrade the minicomputer:

         Hardware multiply/divide unit

         12K of core memory

         Power fail/auto restart

         128 analog inputs

         4 digital-to-analog converters

         5 digital input/output units

         9-track industry  compatible  magnetic  tape
         drive

         1.6 million word disk cartridge unit.

COMPUTER SOFTWARE

     In  conjunction  with the  decision to upgrade the
minicomputer  hardware, a decision was also made to
upgrade the software for the minicomputer system.

     The initial real-time software executive used with the
AEcoS computer system was  obtained from the Oak
Ridge National  Laboratory (ORNL), which had devel-
oped the software for use by many of its minicomputer
systems. The software was core resident and supported a
single teletypewriter, a real-time clock, and a maximum
of 4K of core memory. The  executive supported the
scheduling and execution of seven periodic tasks in a
strict priority manner.

     Since the executive was coded for a predecessor of
the PDP-8/E, several modifications were necessary to
take advantage of the available hardware. Using the
initial scheduling algorithm as a base, algorithms were
modified or added to provide support for a real-time
programmable clock, a digital I/O unit, an extended
arithmetic element, an industry compatible magnetic
tape unit, and an analog multiplexer. The resulting
real-time executive has performed satisfactorily and is
currently being used on another SERL computer system.

     Also, a disk subsystem and additional memory were
obtained for the storage of data and programs, and the
autoanalyzers and the gas chromatographs were interfaced
to the AEcoS computer system. Computer terminal
support was also desired in the AEcoS control room
and in the AEcoS chemical support laboratory to permit
small computer programs to be executed on demand. A
conceptual design of a new real-time virtual memory
operating system was therefore defined.

    The resulting specifications required that the ability
to  perform  program  preparation, assembling  or
compiling,  program  loading (absolute  or  relocatable),
program  execution,  device independent I/O, memory
management of 32K  core memory, and file management
be provided while real-time tasks were  being scheduled
and executed. It  was also specified that these require-
ments be fulfilled at each of the seven teletypewriters
supported  by  the   system, which  would demand  a
PDP-8/E with a total of 224K of memory exclusive of
the monitor  and  real-time task memory requirements.
However, the PDP-8/E can only be expanded to 32K of
memory. An investigation of the hardware  revealed that
the minicomputer could  be made to interrupt the soft-
ware executive system whenever a hardware request was
made to change from one memory field to another. Use
of this  fact  made  the  implementation of the virtual
memory system (VS/8) possible while maintaining soft-
ware compatibility with existing PDP-8 programs.
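
The arithmetic behind the 224K figure, and the general idea of trapping field changes, can be sketched as follows. This is a simplified model under stated assumptions, not the VS/8 implementation: the class name `FieldMapper`, the least-recently-used replacement rule, and the swap log are all hypothetical, and the source does not say how VS/8 chooses which field to move to disk.

```python
from collections import OrderedDict

FIELD_WORDS = 4096          # one PDP-8 memory field is 4K words
FIELDS_PER_TASK = 8         # each virtual task sees fields 0-7 (32K)
TERMINALS = 7               # teletypewriters supported by the system

# Seven terminals, each with a full 32K virtual machine: 7 x 32K = 224K.
TOTAL_VIRTUAL_K = TERMINALS * FIELDS_PER_TASK * FIELD_WORDS // 1024

class FieldMapper:
    """Maps (task, virtual field) pairs onto a small pool of physical fields."""

    def __init__(self, physical_fields):
        self.free = list(physical_fields)   # physical fields not yet assigned
        self.resident = OrderedDict()       # (task, vfield) -> physical field, oldest first
        self.swapped_out = []               # fields written to disk (kept for illustration)

    def on_field_change(self, task, vfield):
        """Called when a request to change memory field interrupts the executive."""
        if not 0 <= vfield < FIELDS_PER_TASK:
            raise ValueError("virtual field out of range")
        key = (task, vfield)
        if key in self.resident:
            self.resident.move_to_end(key)  # mark as most recently used
            return self.resident[key]
        if self.free:
            phys = self.free.pop(0)
        else:
            # Assumed policy: evict the least recently used resident field.
            victim, phys = self.resident.popitem(last=False)
            self.swapped_out.append(victim)
        self.resident[key] = phys
        return phys
```

In this model a program keeps addressing fields 0-7 as it would on a real 32K machine, which is the property that preserves compatibility with existing PDP-8 programs; only the executive knows which physical field currently holds each virtual one.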

    Since  the  implementation  of the  entire  VS/8
system would be a  time-consuming effort, a working
system, without the device independence and file man-
agement  attribute, was developed to provide the  capa-
bility of using a higher  level language for background
processing. The  hardware manufacturer  provided  an
interpretative  language,  with its  own  program editor,
which was modified to make use of the VS/8 supervisory
calls. The resulting software was utilized immediately to
provide interaction with the chemists in  the chemical
analysis laboratory in conjunction with the automation
of the analytical  instruments,  and to provide graphical
representation of the AEcoS system parameters in real
time. This version of VS/8 is release 3.0 and will be used
and supported until release 4.0  is completed.

     The capabilities of VS/8,  release 4.0 are as follows:

     1.  The system permits 8 levels of priority scheduling.
Within each level n tasks (where n is dynamically
alterable) are executed in a round robin fashion. All
tasks at any given priority will execute before any tasks
of lower priority.

     2.  A virtual task has 8 core fields (32K) unique to
that task,  which correspond logically to the physical
fields 0-7 of a hypothetical 32K machine. Greater region
sizes would  be  possible by sacrificing direct compat-
ibility with a nonvirtual  PDP-8/E.

-------
     3.  Any given task may have a maximum of 4 files
open simultaneously within a system-wide maximum of
32 opened files. Files shared by several tasks count as
one open file for the system limitation. Files are identified
by a device name, a file number or file name, and
account name dependent on the type of device on which
they reside. Direct access files may specify optional
protection features such as write lock, and may select the
degree of access permitted to other users.


     4.  Supported devices are subdivided into classes
and are  too numerous to list but include essentially any
I/O device that may be interfaced to the hardware.
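
The scheduling rule in item 1 (strict priority between levels, round robin within a level) can be sketched as below. This is a minimal model, not VS/8 code: the class and method names are hypothetical, and treating level 0 as the highest priority is an assumption.

```python
from collections import deque

PRIORITY_LEVELS = 8   # level 0 is treated here as the highest priority (an assumption)

class Scheduler:
    """Strict priority between levels, round robin within each level."""

    def __init__(self):
        self.queues = [deque() for _ in range(PRIORITY_LEVELS)]

    def add(self, task, level):
        """Make a task ready at the given level; n per level may change at any time."""
        self.queues[level].append(task)

    def remove(self, task, level):
        """Remove a task that has finished or is no longer ready."""
        self.queues[level].remove(task)

    def next_task(self):
        """Return the next task to run, or None if no task is ready."""
        for queue in self.queues:        # scan from highest to lowest priority
            if queue:
                task = queue.popleft()
                queue.append(task)       # rotate: round robin within the level
                return task
        return None
```

With tasks "a" and "b" at level 2 and task "c" at level 5, repeated calls to `next_task()` alternate between "a" and "b"; "c" is returned only after both higher-priority tasks have been removed.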

     The system has the following requirements:

     1.  Minimum configuration for a solely real-time
system includes 12K of memory, any real-time clock,
and appropriate I/O and control devices.
     2.  Minimum  configuration  for  a solely  virtual
system includes 16K of memory, any real-time clock, at
least 256K  disk  memory, and one to  seven terminal
devices.

     3.  Minimum  combined system  includes the re-
quired  devices for the stand alone systems with 20K of
memory.

CONCLUSION

     The viable simulation system presented includes the
simulator and a support computer system. Without the
computer system, the AEcoS would be difficult, if not
impossible, to utilize because of the problems inherent
in manual acquisition of data, control of the chamber,
and analysis of collected data. With in-house manpower,
a  hardware/software computer  system has been de-
veloped  which not only supports AEcoS, but also
provides a small in-house computer capability.

-------
                         AN ON-LINE PDP-12 SPIROMETRY PACKAGE THAT SUPPORTS
                                    BOTH ONSITE AND REMOTE STATIONS

                                               By Sam D. Bryan
PROJECT BACKGROUND

     The project described below is an ongoing project
of the Clinical Studies Branch in the Human  Studies
Laboratory at the  Research Triangle Park, North Caro-
lina NERC. The mission of the Clinical Studies Branch is
to study the effects of pollution, primarily air pollution,
on human health in a clinical laboratory setting where
quantitative measurements are made.

     The lung is one of the main organs of interest. One
of the lung projects began about two years ago when the
branch set out to determine what effect air pollution
might have on the growth of lung function in children.
An answer to this implied that the pattern of lung
growth in a clean air environment first be determined, a
requirement that led to the computer system described
here. A suitable population of preteenage children was
located at a school in Chapel Hill, North Carolina.

     Although many other  lung function tests exist, the
investigators chose to concentrate on spirometry  tests.
Spirometry  tests measure  how effective the lung  is
simply as a machine that moves air back and forth. The
measures  relate  largely  to  how much air  the lung can
store and how fast it can move air during different stages
of the respiratory cycle. The subjects are exhorted by
the attending technician to perform at their best effort;
sometimes many trials are required. For a detailed
description of spirometry, see Comroe.*

     The next question was how to carry out the measurements.
The project team had access to a PDP-12 computer, which
they installed at the school, and located a PDP-12
spirometry package developed at the University of Chicago
Biomedical Computation Facility by Drs. Earle and Domizi.†
Digital Equipment Corporation (DEC) is marketing a PDP-8
system based on the University of Chicago package, which
they call the Pulmonary Testing System.†† The initial task was to
modify the  package for the project's configuration and
to redesign it  for use within our experimental setting,
since its original use was within a patient care setting.
Several measurements that were of particular interest
were also added.

     The team rejected the approach of performing the
measurements  in the old manual way since there was an
automated  system available to use more or less off-the-
shelf. The  automated approach is more  accurate  and
many times faster  than the manual  way. Speed  was
especially important since the children involved in the
study would  often have to repeat the breathing trials
many times in  order to  produce usable results.  The
on-line approach  made  it  possible to  know  within
seconds if further testing was required.

COMPUTER CONFIGURATION

     In relation to other PDP-12's, the  project team has
a rather large configuration. It has 16K of core, floating-point
processor, 1.2-million word cartridge disk, 9-track
industry compatible magnetic tape, 300 line-per-minute
line printer, flatbed plotter, as well as the usual LINC
tapes, CRT scope, A/D converter and relays.

     The  configuration  required  for  the spirometry
package is considerably smaller and is shown in Figure 1.
The  programs run  in 8K and  use the LINC tapes for
program and data storage. Team members can copy the
LINC tape  data to 9-track tape for processing on a large-
scale computer such as the Univac 1110.


MEDICAL  INSTRUMENTATION

     The transducer used to convert the rate of flow of
the subject's  breath into an electrical signal is the Med-
science 465 Hi-Fi spirometer. The subject breathes into a
tube which is connected to a bellows  whose back end
moves back and forth on a track as the subject breathes
out and in.

 *J.H. Comroe et al., The Lung: Clinical Physiology and Pulmonary Function Tests, 2nd ed. (Chicago: Year Book Medical Publishers, Inc., 1963).
 †D.B. Domizi and R.H. Earle, "On-Line Pulmonary Function Analysis: Program Design," DECUS Proceedings (Fall 1970).
 ††Digital Equipment Corporation, Pulmonary Testing Systems: Spirometry (Maynard, Massachusetts: Digital Equipment Corporation, 197H

-------
  A potentiometer is moved in the process and, in turn,
  serves as part of a resistor network in an amplifier
  that produces an output voltage proportional to flow.
  This voltage is sampled by the computer's A/D converter
  80 times per second under control of a programmable
  clock. The analog sample in the range ±1 volt is
  converted into a digital value in the range ±511.
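
The sampling and conversion just described can be modeled in a few lines. This is a sketch under assumptions: a linear mapping with rounding and clipping at full scale is assumed, and the function names are hypothetical; the actual converter hardware may behave differently at the limits.

```python
FULL_SCALE_VOLTS = 1.0     # analog input range is +/-1 volt
FULL_SCALE_COUNTS = 511    # digital output range is +/-511
SAMPLE_RATE_HZ = 80        # samples per second under programmable clock control

def volts_to_counts(volts):
    """Quantize a voltage to a signed count, clipping at full scale (assumed behavior)."""
    clipped = max(-FULL_SCALE_VOLTS, min(FULL_SCALE_VOLTS, volts))
    return round(clipped * FULL_SCALE_COUNTS)

def sample_times(n):
    """Times in seconds of the first n samples at the stated sampling rate."""
    return [i / SAMPLE_RATE_HZ for i in range(n)]
```

At 80 samples per second, the 10 seconds of flow data analyzed for each maneuver correspond to 800 stored samples.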

  EXPERIMENTAL  PROTOCOL  AND  PROGRAM
  LOGIC

       Before subject testing is begun, the system must be
  calibrated so that the computer correctly interprets the
  digital values it receives from the A/D converter in terms
  of the corresponding flow rate of air in the spirometer.
  By using the teletype, the technician informs the
  computer of the occurrence of a zero flow (baseline) and
  of a 10-liter/second flow calibration signal provided by a
  special amplifier circuit. The two digital values observed
  can then be used by the program in a simple interpolation
  formula to relate any observed digital value to the
  corresponding flow rate. For example, assume that the
  baseline signal yields a digitized value of 100 and that
  the 10-liter/second flow calibration signal yields a
  digitized value of 500. Then, if a value of 300 is observed,
  we would know that it was caused by a 5-liter/second
  positive flow in the spirometer.
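
The two-point interpolation in this example can be written out as a short sketch. The function name `make_flow_converter` is hypothetical; the numbers are the ones used in the text.

```python
def make_flow_converter(baseline_count, calibration_count, calibration_flow=10.0):
    """Build a function that converts a digitized value to flow in liters/second.

    baseline_count is the reading at zero flow; calibration_count is the
    reading for the calibration signal (10 liters/second by default).
    """
    span = calibration_count - baseline_count
    if span == 0:
        raise ValueError("calibration and baseline readings must differ")

    def to_flow(count):
        # Linear interpolation between the two calibration points.
        return (count - baseline_count) * calibration_flow / span

    return to_flow

to_flow = make_flow_converter(baseline_count=100, calibration_count=500)
print(to_flow(300))   # prints 5.0
```

The same converter gives 0.0 for a reading of 100 and 10.0 for a reading of 500, and a reading below the baseline yields a negative flow.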

       After  calibration the system is ready for subject
  testing. The  subject is "logged in"  by the  technician,
  who  uses  the  teletype  to  provide certain  identifying
  information that will subsequently be recorded on LINC
  tape  along with the digitized flow data. The  identifying
  information includes the date, the subject's ID code, his
  height, weight  and a symptom code if  he has a cold,
  cough, etc.

       The subject is then instructed to place his mouth
  over the mouthpiece leading to the spirometer, to take a
  deep breath, and then to breathe out as quickly and
  completely as possible. The flow signal is monitored on
  the CRT screen as a left-to-right oscilloscope type sweep.
  If the subject appears not to be performing the breathing
  maneuver properly, he is encouraged to do better. Once
  a successful maneuver has been completed, as observed
  on the CRT, the technician strikes a key on the teletype
  and thereby causes the computer to analyze the last
  10 seconds of flow data. The first part of the analysis
  entails the detection of several events of the respiratory
  cycle, including beginning and end of inspiration and
  beginning and end of expiration. These events (points in
  time) are then indicated on the scope as hash marks
  superimposed on the analyzed flow signal. If the
  technician sees that the computer has made the wrong
  determination, the analysis is rejected and the subject is
  instructed to begin a new trial.
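
The text does not say how the package locates these events. Purely as an illustration, one simple approach is to take the longest run of samples whose flow stays above a small threshold as the expiration. The function name, the threshold value, and the convention that expired flow is positive are all assumptions of this sketch.

```python
def find_expiration(flows, threshold=0.1):
    """Return (start, end) sample indices of the longest run with flow above threshold.

    end is exclusive. Returns None if no sample exceeds the threshold.
    Assumes expired flow is positive and threshold is not negative.
    """
    best = None
    start = None
    for i, f in enumerate(list(flows) + [0.0]):   # trailing 0.0 closes a run at the end
        if f > threshold and start is None:
            start = i                             # a run begins here
        elif f <= threshold and start is not None:
            if best is None or i - start > best[1] - best[0]:
                best = (start, i)                 # keep the longest run seen so far
            start = None
    return best
```

A result that looks wrong on the display would still be caught by the technician's inspection step described above, which is why a simple rule of this kind can be workable in practice.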

     Once the  pattern  recognition part  is performed
successfully, the second stage of analysis, the measurement
stage, is begun. One measurement, for example, is
the  maximum  expiratory  flow  rate.  Since  the  last
10 seconds of flow data are stored  in core and since the
computer  knows  the  beginning  and  end expiration
points, a simple search  of the corresponding part of the
data storage buffer yields the maximum expired  flow
value and the time it occurred. To compute the capacity
(technically speaking the "forced vital  capacity")  mea-
sure of the lung, the flow values of the expiration part of
the data buffer are integrated to  provide a volume. The
program computes other measurements in a similar  way.
A complete list of measurements provided as expressed
in conventional  pulmonary function abbreviations and
terminology is given in Table I.
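
The two measurements described above can be sketched as a buffer search and a numerical integration. The function names are hypothetical, and the trapezoidal rule is an assumption; the text says only that the flow values are integrated.

```python
SAMPLE_INTERVAL = 1.0 / 80   # seconds between samples at 80 samples per second

def peak_expiratory_flow(flows, start, end):
    """Return (peak flow, time of peak in seconds from buffer start) in flows[start:end]."""
    idx = max(range(start, end), key=lambda i: flows[i])
    return flows[idx], idx * SAMPLE_INTERVAL

def forced_vital_capacity(flows, start, end):
    """Integrate flow (liters/second) over flows[start:end] to get a volume in liters.

    Uses the trapezoidal rule, which is an assumption of this sketch.
    """
    segment = flows[start:end]
    return sum((a + b) / 2 * SAMPLE_INTERVAL for a, b in zip(segment, segment[1:]))
```

Here `start` and `end` stand for the beginning and end expiration points found by the pattern recognition stage, with `end` exclusive.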

                         Table I
                Spirometry Measurements

Forced Vital Capacity in liters                               FVC
Forced Expiratory Volume in one second in liters              FEV1
Peak Expiratory Flow Rate in liters/second                    PEFR
Maximum Midexpiratory Flow Rate in liters/second              MMFR
Forced Vital Capacity Time in seconds                         FVCT
Inspiratory Capacity in liters                                ICAP
Flow at 50% expiratory volume remaining in liters/second      V50
Flow at 25% expiratory volume remaining in liters/second      V25
(V25 • V50)/FVC
Rise Time between 10% and 90% total expiration in seconds

     The technician can display this list of measurements
on the CRT under teletype control. If the values
indicate that the subject did not make his maximal
effort, he is again encouraged to do better. The forced
vital capacity measurement is usually the one of greatest
interest.

     The  above sequence of: 1) maneuver by subject,
2) technician inspection of the computer pattern recog-
nition results,  and  3) technician  inspection  of the
computed measurements is repeated until the technician
is satisfied with the results. He then instructs the
computer to save the digitized flow data and derived
measurements on LINC tape for subsequent playback or
further processing.

-------

     Other displays which can be requested by the technician
under teletype control include the volume curve
(i.e., the integrated flow curve) and a curve produced by
plotting expiratory flow against volume. Normally the
technician also requests a teletype listing of the subject
identification information along with the derived
measurements for a permanent hard copy record.

THE NEED FOR REMOTE SERVICE

     The system described  above served the lung func-
tion project well  for two years.  It became increasingly
apparent, however, that the PDP-12 computer could be
more advantageously applied to a diversity of biomedical
projects  that  were  being  conducted at another  site
several miles away  from  the school. The question  was
how to continue to support  the important  growth of
lung function project remotely.

ALTERNATIVES CONSIDERED

    The  alternative of  transporting the test  subjects
from the school was rejected primarily because it would
have greatly increased the time required for testing.

    The alternative of providing the school with its own
dedicated Central Processing Unit (CPU) was rejected on
the basis of cost considering the following facts:

          A minimally configured PDP-12 costs $37,000

          DEC's PDP-8 based PTS system costs $25,000

          A PDP-11 or other non-DEC equipment would
          cost approximately $20,000  for hardware and
          would create prohibitive reprogramming costs.

ALTERNATIVE CHOSEN

     The  final choice was to provide  the pulmonary
function  laboratory in the school with a storage-tube
graphics terminal and a telecommunications link between
the terminal and the PDP-12. Another telecommunications
link transmits the analog signal from the spirometer
amplifier to the PDP-12's A/D converter. The remote support
configuration is shown in Figure 2. The purchase costs of
the Tektronix 4010 graphics terminal and the DEC DP-12
interface totaled around $5,400.

     The 603A analog modems are adequate for purposes of the
project. They can transmit signals within a voltage range
of ±2 volts and within a frequency range of 0-100 Hertz.
Some noise problems were experienced, but they were solved
satisfactorily by adding a simple low-pass filter circuit
in the line going to the A/D converter.

     The 202R modems operate at 1200 baud. This rate
allows the display of the spirometer flow signal in real
time at the Tektronix terminal.

     Functionally,  the remote system  behaves exactly
like the onsite system. The only noticeable difference is
that  several scans of flow  data  are  displayed before
erasing  the screen.  When a  "processed"  curve is dis-
played, it  takes 3 or  4 seconds to be painted rather than
the few  milliseconds required on the original refresh
display.  The technicians have  had  no  trouble  in
switching systems; in fact, when they have had a choice
of systems, they have preferred the new one since it does
not involve the  noisy teletype, and  the keyboard and
display screen are positioned in the same field of vision.

     One desirable feature of the system is that all dis-
plays appearing on the Tektronix  screen also appear  on
the display of the home based PDP-12. This provides an
effective means  of allowing the medical staff at the
home site to view what is happening at the school and to
offer assistance over  the telephone if required. It  is also
convenient for demonstration purposes.

FUTURE PLANS

     Currently statistical analysis of the data is performed
off-line on the Univac 1110. The data is
communicated to the Univac in the form of punched
cards prepared from  forms  hand transcribed from the
PDP-12 teletype listings. The project team is considering
bypassing this transcription  step by copying  the LINC
tapes to 9-track magnetic  tape, which can then be read
directly by the Univac.

     One  of  the highest priority  future plans is to  re-
place the very high-priced  ($8 3/month total for the four
modems)  Bell-leased equipment  with purchased  equip-
ment  from  an  independent  manufacturer.  Bell  was
selected initially so as to reduce the number of vendors
to be dealt with in the event of system problems. This
conservative  philosophy proved successful since  it was
possible to solve all the problems that did arise.

-------
[Figure 1. Onsite Configuration. Diagram labels: subject; spirometer transducer and amplifier; A/D converter; PDP-12 CPU with 8K core and clock; VR12 CRT display; ASR33 keyboard printer.]

-------
[Figure 2. Remote Site Configuration. Diagram labels: remote site; home base; spirometer transducer and amplifier; 603A analog modem transmitter; 2-wire dial-up telephone line; A/D converter; 202R modem, 1200 baud, full-duplex; 4-wire dedicated telephone line; PDP-12 CPU with 8K core and clock; DP-12 Dataphone interface; Tektronix 4010 keyboard graphics storage tube.]

-------
                                       AN ON-LINE REAL-TIME MULTI-USER
                                       LABORATORY AUTOMATION SYSTEM

                               By William L. Budde,* Edward J. Mime, and Jack Teuschler
 INTRODUCTION

      Before 1965 it was unthinkable to put an electronic
 digital computer in an analytical chemistry laboratory.
 Since that time the steady decreases in the cost of
 computers and the increases in their reliability have
 brought about another revolution in analytical chemistry.
 Digital computers that were once in the $10,000
 to $1,000,000 class now sell for $500 to $100,000 and
 have far more computational power. The analytical
 chemical instrumentation that has all but completely
 replaced the buret and filter paper is rapidly becoming
 computerized instrumentation.

      Why  is the digital  computer invading the labora-
 tory? As our technology and society have become more
 complex,  the  demand for more chemical  analyses in
 many fields  has increased exponentially. Along with this
 increased demand there are the requirements for more
 accuracy, better precision, higher sensitivity, more
 timely results, greater selectivity, and of course all of
 these at a lower cost per analysis. A good example of
 this is the health field. At one time the family
 physician's stethoscope was one of the few routinely used
 diagnostic tools. Today a large clinical lab must do
 literally thousands of blood cholesterol and urine sugar
 analyses every day.


      In the environmental field the measurement of
 specific air and water pollutant chemicals is the basis for
 the whole environmental movement. Until reliable
 measurements were made and correlated with undesirable
 health or wildlife population effects, environmental
 concern was mostly limited to those concerned with
 purely aesthetic values. Currently there is considerable
 emphasis on setting standards for acceptable air and
 water quality, issuing permits for discharge of wastes
 into rivers and oceans, monitoring these effluents to
 ensure compliance with permit limitations, and conducting
 enforcement actions when violations occur. All
 of these activities are increasing the demand for more
 and better chemical environmental analyses.

      Better analyses embody the ideas of accuracy and
 precision, and require extensive use of analytical quality
 control techniques. Quality control is often deleted in
 analytical laboratories because of its cost and time
 requirement. With this deletion, the meaningfulness of the
 measurements decreases substantially. There is nothing
 more worthless than the wrong answer. With a computerized
 system, quality control is easily possible without a
 high price. Another aspect of better analyses is the desire
 for new kinds of measurements that are more revealing
 about the state of environmental pollution than traditional
 measurements. These more revealing measurements
 are often more complex and simply cannot be
 accomplished economically or at all without on-line
 computerization.

      In response to the needs of the Environmental Protection
 Agency (EPA), the Methods Development and
 Quality Assurance Research Laboratory (MDQARL) of
 the National Environmental Research Center (NERC) in
 Cincinnati and the Computer Services and Systems Division
 of NERC-Cincinnati began a program to satisfy
 these needs. The primary mission of MDQARL is to
 develop, improve, and validate methodology for the
 collection of physical, chemical, radiological,
 microbiological, and biological water quality data by EPA
 Regional, Office of Enforcement and General Counsel,
 Office of Air and Water Programs, and other
 NERC-Cincinnati organizations. This laboratory also advises
 EPA and other laboratories concerning the development
 of monitoring and quality control programs for water
 quality activities. These functions in methodology
 development and quality control are clearly highly relevant to
 the development of a laboratory automation system.

                                                             PROJECT GOALS

                                                                 In late 1972 an interagency  agreement  was estab-
                                                             lished with the Lawrence Livermore Laboratory (LLL)
                                                             of the  Atomic Energy  Commission  (AEC) which is
                                                             operated  for AEC by the University of California. This
                                                             organization was chosen  to  assist in the project because
                                                            of its long record of solid accomplishments in this field.
                                                            Some of the earliest successful applications of computers
                                                            in a chemical laboratory were developed in this organization's
                                                            general chemistry division.† A thorough systems
                                                            analysis was performed to define precisely the needs of

†See J. W. Frazer, [source title illegible], July 1968, p. 41. Also see J. W. Frazer, Anal. Chem. 40, 26A.

EPA, to establish the exact goals of the project, to write
detailed specifications for hardware and software, and to
develop an implementation plan. Several goals that were
defined are:

     1.  To develop a laboratory automation system
that would incorporate presently owned chemical analysis
instrumentation widely used at MDQARL and many
other EPA laboratories for measurements of water quality
parameters.

     2.  To develop this methodology in a fashion that
will  permit the adaptation of the technology to other
EPA laboratories with very significant savings in cost and
time. In particular, the designs for hardware interfaces
between the instruments and the computer and the
custom software would become EPA property and could
be used in any EPA laboratory without further cost.

     3.  To develop an open ended design that would
permit the attachment of many additional instruments
that  might be  used for a variety of measurements in-
cluding nonwater parameters.

     4.  To write as many of the computer programs as
technically feasible  in a very  flexible high-level modern
programming language. This would support the  ease of
modification and improvement of the software by scientific
personnel and the transfer of technology to other
laboratories.

     5.  To design the system  with sufficient flexibility
so it  is applicable to methods development research as
well  as the production atmosphere.  In a methods re-
search program, automation   makes  careful  testing of
new  procedures possible by allowing independent varia-
tion  of a large number of method  variables with  a statis-
tically significant number of samples.

INSTRUMENTATION TO BE AUTOMATED, PHASE  1

     In the first phase of the MDQARL pilot project,
five instruments were selected for automation. These
were a Perkin-Elmer model 503 atomic absorption spectrometer,
a Varian model AA-5 atomic absorption spectrometer,
a Jarrell-Ash 3.4 meter electronic readout
emission spectrometer, a Technicon AutoAnalyzer II,
and a Beckman total organic carbon analyzer.

     The  first  two instruments were  chosen because
atomic absorption (AA) spectrometry is the most widely
used method for  measuring trace metal concentrations.
The  two manufacturers represent a high percentage of
the total market  for this equipment. EPA laboratories
utilize a large number of AA spectrometers with few, if
any, automated for on-line data acquisition.

     Although EPA uses only a few emission spectrom-
eters for trace  metal  analyses,  this instrument was in-
cluded   because  it  has  simultaneous,  multielement
capability and should be a method of choice in mon-
itoring  for  many  elements  when large numbers of
samples are involved. Emission spectrometry will be used
more in the future as newer and more sensitive techniques
are developed. The Technicon AutoAnalyzer is
widely  used for  measurements  of many  pollutants
including the important  nutrients.* The total organic
carbon instrument is widely used to measure the organic
load in many waste streams.

Atomic Absorption Spectrometers

     These  units measure trace metal concentrations by
optical spectrometry using  samples obtained by one of
two  methods. Water  samples  may be  aspirated into a
flame or gaseous samples may be swept from a graphite
furnace. Water samples can be retrieved from a computer-controlled,
40-position circular sample vial holder.
Sampling modes available with this unit include semiautomatic
sampling, where the operator moves the
sample holder by entering commands from his terminal;
fully automatic sampling, where the computer automatically
moves the sample holder; and fully automatic
bracketing of the unknown with appropriate standards.
The computer  program  that  reads the data is an as-
sembly language subroutine that is  called from the high-
level BASIC language. Data reduction to concentration
can be accomplished,  at the user's  option, by interpola-
tion or by  first  or second  degree linear least squares
fitting. The operator can enter a wide range of informa-
tion about  his samples from  his  terminal. This infor-
mation  can also be output along with  the analytical
results  in  a  variety of formats to suit  the particular
situation.
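
The least squares data reduction described above can be illustrated with a minimal sketch. It is written in Python rather than in the BASIC and assembly language of the pilot system, and it assumes an inverse calibration in which concentration is fitted directly as a polynomial in instrument response. The function names `fit_polynomial` and `evaluate` are hypothetical and are not taken from the MDQARL software.

```python
def fit_polynomial(responses, concentrations, degree):
    """Least squares polynomial coefficients [a0, a1, ...] such that
    concentration ~ a0 + a1*response + a2*response**2 + ...

    Illustrative only: solves the normal equations directly, which is
    adequate for the first or second degree fits named in the text.
    """
    n = degree + 1
    # Augmented normal-equation matrix (X^T X | X^T y).
    rows = []
    for i in range(n):
        row = [sum(x ** (i + j) for x in responses) for j in range(n)]
        row.append(sum(y * x ** i for x, y in zip(responses, concentrations)))
        rows.append(row)
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n + 1):
                rows[r][c] -= factor * rows[col][c]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        acc = rows[i][n] - sum(rows[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = acc / rows[i][i]
    return coeffs


def evaluate(coeffs, response):
    """Concentration predicted by the fitted polynomial for one response."""
    return sum(a * response ** k for k, a in enumerate(coeffs))
```

With hypothetical standards at responses 0.0, 0.1, 0.2, and 0.4 for concentrations 0, 1, 2, and 4, a first degree fit gives a concentration close to 3 for a response of 0.3. A second degree fit needs at least three standards with distinct responses.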

Emission Spectrometer

     The automation  of this instrument  is somewhat
different from the other  four instruments since  the
presently existing analog to digital  converter that drives
a nixie tube digital  display  will be  utilized. In the digital
 display system there is a high degree of operator-

 *Methods for Chemical Analysis of Water and Wastes, Environmental Protection Agency, 1974.

  instrument interaction. The  operator  manually incre-
  ments a sequential readout from  the 23 data channels,
  copies digits  from the display onto a  raw data report
  form, conducts calculations, makes judgments about the
  result, and begins a new sequence using the results of the
  judgments. This  very desirable  interactive feature  is
  retained in the automated system,  but the operator is
  released   from  the  manual  incrementing,  reading,
  copying, and  calculating functions. He retains judgment
  functions  based on  reduced  data that  is dynamically
  displayed  at  the  instrument  terminal.  The automated
  system  greatly improves instrument calibration, back-
  ground  correction, quality control, and development of
  analysis reports. As in the other  instruments,  informa-
  tion about the samples may be entered from the opera-
  tor's terminal keyboard and included in part or in whole
  on the analysis report.

  Technicon AutoAnalyzer

       This  system results in a digital computer readout of
  the  Technicon AutoAnalyzer output  as  it   aspirates
  standards  and  samples from the  existing automatic
  sampler module.  The  computer measures the  time
  between several  standard  peak  heights and, using this
  time, looks for absorption peaks  above background at
  appropriate intervals  as samples pass through the color-
  imeter. A  slope measurement is made to ascertain  that
  the readings start on the upside of a peak. The maximum
  value of these readings is selected  as the datum point to
  be associated  with that particular sample.
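
The peak selection logic described above can be sketched as follows. This is an illustration in Python, not the pilot system's BASIC or assembly code. It assumes evenly spaced readings held in a list, a search window of fixed half-width around each expected peak position, and a simple two-point slope test. The names `estimate_period` and `pick_peaks` and all parameters are hypothetical.

```python
def estimate_period(standard_peak_indices):
    """Mean spacing, in readings, between successive standard peaks."""
    gaps = [b - a for a, b in zip(standard_peak_indices, standard_peak_indices[1:])]
    return sum(gaps) / len(gaps)


def pick_peaks(readings, first_peak, period, n_peaks, half_window, background):
    """Return one datum per expected peak, or None where no peak is found.

    For each expected position the window must start on a rising slope
    and its maximum must exceed the background level (both are
    simplifying assumptions made for this sketch).
    """
    peaks = []
    for k in range(n_peaks):
        center = round(first_peak + k * period)
        lo = max(center - half_window, 0)
        hi = min(center + half_window + 1, len(readings))
        window = readings[lo:hi]
        rising = len(window) >= 2 and window[1] > window[0]
        top = max(window) if window else None
        if rising and top is not None and top > background:
            peaks.append(top)   # maximum reading is the datum for this sample
        else:
            peaks.append(None)  # no acceptable peak at this position
    return peaks
```

A production routine would also need noise filtering and drift correction, which this sketch omits.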

       Unknown concentrations are  reported as the analy-
  sis proceeds.  Unknowns out of maximum range of the
  instrument cause  the computer to  ring a  bell  and the
  operator may add  a  diluted sample at the end of the
  series. The computer looks for these samples automati-
  cally. Dilution factors are entered at the completion of
  all samples prior to obtaining a formal analysis report.

      Provisions  are  made  for quality  control checks
  using  standards and  spiked samples.  The  pattern for
  quality control checks is predetermined and the cycle of
  repetition  entered by  the operator. Quality control chart
  values for acceptance are stored. The computer assigns
  automatic sampler position numbers on the basis of the
  number of standards and the sequence of the unknown
  identification information entered. Calculations of
  concentrations are done by first, second, or third degree
  least squares fitting of the standard curve or by simple
  interpolation as predetermined by the operator. The
  type of fit may be selected at the beginning of the analysis
  or after standards have been read.
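
One way to picture the assignment of sampler positions is the short Python sketch below. The layout rule is an assumption made for illustration: standards occupy the first positions, unknowns follow in the order entered, and a quality control check is inserted after a fixed number of unknowns. The function name `assign_positions` and the labels `STD` and `QC` are hypothetical; the source does not describe the actual pattern used.

```python
def assign_positions(n_standards, unknown_ids, qc_every):
    """Return (position, label) pairs for the automatic sampler tray.

    Assumed layout (not from the source): standards first, then the
    unknowns in entry order, with a QC check after every `qc_every`
    unknowns.
    """
    tray = [f"STD{i + 1}" for i in range(n_standards)]
    for count, sample_id in enumerate(unknown_ids, start=1):
        tray.append(sample_id)
        if count % qc_every == 0:
            tray.append("QC")
    # Sampler positions are numbered from 1.
    return list(enumerate(tray, start=1))
```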

 Total Organic Carbon Analyzer

     This system results in a digital computer analysis
 readout of the Total Organic Carbon Analyzer data as
 samples or standards are injected into the furnace. The
 computer searches for a peak height or integrates under
 a curve as the CO2 from the sample or standard is swept
 through  the  infrared detector. Replicate injections are
 averaged and after each injection, the  new average is
 reported.  Individual  results  may  be replaced  without
 losing  prior  information   or, alternatively,  an  entire
 sample or standard run may be deleted.
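
The two measurement options and the running average of replicate injections can be sketched in Python as shown below. This is an illustration under stated assumptions: the detector signal is a list of evenly spaced readings, the baseline is constant, and the area is computed by the trapezoidal rule. The names `peak_height`, `peak_area`, and `ReplicateRun` are hypothetical.

```python
def peak_height(signal, baseline=0.0):
    """Maximum detector reading above a constant baseline."""
    return max(signal) - baseline


def peak_area(signal, dt, baseline=0.0):
    """Trapezoidal area above a constant baseline; dt is the reading interval."""
    corrected = [s - baseline for s in signal]
    return sum((a + b) / 2.0 * dt for a, b in zip(corrected, corrected[1:]))


class ReplicateRun:
    """Replicate injections of one sample or standard."""

    def __init__(self):
        self.results = []

    def add(self, value):
        """Record a new injection and return the updated average."""
        self.results.append(value)
        return self.average()

    def replace(self, index, value):
        """Replace one result without disturbing the others."""
        self.results[index] = value
        return self.average()

    def average(self):
        return sum(self.results) / len(self.results)
```

Deleting an entire run corresponds to discarding the `ReplicateRun` object and starting a new one.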

     Allowances are  made  for the  introduction of stan-
 dards as checks and also for samples spiked with known
 amounts. Quality control chart values are stored. Calcu-
 lations of concentrations  are  done by  first or second
 degree least squares  fitting of the  standards data  or by
 simple interpolation  using  the two closest  standards as
 predetermined by the operator.
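
Simple interpolation using the two closest standards can be sketched as follows. The Python function `interpolate` is hypothetical and assumes that each standard is stored as a (response, concentration) pair and that the two closest standards have different responses.

```python
def interpolate(standards, response):
    """Linear interpolation between the two standards whose responses
    lie closest to the measured response.

    standards: list of (response, concentration) pairs (assumed layout).
    If the two closest standards do not bracket the response, the same
    formula extrapolates along the line through them.
    """
    closest = sorted(standards, key=lambda s: abs(s[0] - response))[:2]
    (r0, c0), (r1, c1) = sorted(closest)
    return c0 + (c1 - c0) * (response - r0) / (r1 - r0)
```

For hypothetical standards (0, 0), (10, 5), and (30, 20), a response of 20 falls between the second and third standards and gives a concentration of 12.5.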

 TYPE OF COMPUTER SYSTEM

     A fundamental  decision that  had to be made was
 the  choice between a system of dedicated minicom-
 puters  or microcomputers, an in-laboratory  time sharing
 computer, or a remote time sharing system. The remote
 time sharing  concept  was easily  eliminated as this was
 extensively tested during the late 1960's and was proved
 a failure. In laboratory systems, a central processing unit
 (CPU)  response time of the  order of microseconds to
 milliseconds is absolutely essential  to avoid  loss of data
 from instruments. Response times of this speed are
 simply well beyond the current capabilities of remote time sharing. In addition, the
 difficulties  and cost  of transmission of analog data and
 the control of the computer by individuals  not familiar
 with laboratory operations led to a totally unacceptable
 level of reliability.

     A system of minicomputers or microcomputers was
 eliminated because of the cost of  equipping each with
 sufficient  memory and peripheral  devices. Also  small
 systems invariably must  be  programmed in assembly
 language which precludes fast modification and updating
 of programs. Specifications for a medium-scale in-laboratory
 time sharing system were developed and are summarized
 below.

Computer Hardware Specifications Summary

    A 12-18 bit word  length  processor  equipped with
floating point hardware, integer multiply  and divide
hardware, and with capabilities necessary to interface a
fixed head high-speed swapping disk, a moving head
cartridge disk, industry standard magnetic tape, a fast
paper tape reader, a medium-speed line printer, a console
keyboard-printer, remote cathode ray tube (CRT) terminals,
and multiplexed A/D and D/A converters.

Computer Software Specifications Summary


     A  well-documented  and  field  tested  operating
system which, in a time slicing sense, can serve up to ten
terminal  users simultaneously.  The operating system
must  support  an  extended  version  of  the standard
Dartmouth  BASIC  interpreter  language. The BASIC
language must be capable of acquiring data in real time,
controlling instruments in real time, processing character
strings, manipulating data files, supporting insertion of
custom  input-output device handlers, supporting  inser-
tion  of machine language subroutine calls, chaining to
other programs, and spooling output.

     In  an  open competitive bidding process, the Data
General Corporation was the successful bidder with a
NOVA  model 840 computer and associated peripherals.
The  hardware configuration is  shown  in  Figure 1.
Figure 2 gives a general overview of the organization of
the laboratory user software. Figure 3 is the organization
of the  Data General  real-time  disk operating system
which  meets  all  specifications.  Figure 4 gives  a  more
detailed outline of the foreground core memory where
the BASIC language and assembly language instrument
handlers reside. Figures  5  and 6 show the significant
features of BASIC and a comparison with FORTRAN.
All of  these were  discussed in greater detail during the
presentation and subsequent sections of this report.

-------
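[Figure 1. Hardware configuration of the laboratory automation system. Diagram not reproduced; the only legible labels are the moving head disc and the Technicon AutoAnalyzer.]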

-------
[Figure 2 diagram not reproduced. Legible blocks: sample login, management control, sample czar program, sample login program, unanalyzed and semi-analyzed samples, priority program, operator's consoles, Technicon AutoAnalyzer program, Beckman TOC program, Jarrell-Ash emission spectrometer program, report format program, and non-automated data input.]

                               Figure 2
              Software Organization for the NERC-Cincinnati
              MDQARL Laboratory Automation Pilot System

-------
[Figure 3 diagram not reproduced. The foreground and background partitions each contain, from the top of the address space down: program NREL, overlay area(s), overlay directory, TCB pool, UST, and program page zero (logical addresses). The RDOS partition contains system buffers, resident RDOS, and RDOS page zero (physical addresses). The diagram is labeled USER AND RDOS SPACE (MAPPED).]

                                   Figure 3
                       Organization of Core Memory in the Data General
                         Real Time Disk Operating System (RDOS)
                          with Hardware Memory Management

-------
[Figure 4 diagram not reproduced. Foreground address space (logical addresses, top at 32K), from the top down: user program area, line tables, A/D data buffer, assembly language programs, BASIC program, overlay area, overlay directory, TCB pool, UST, and program page zero. The intermediate address marks are not reliably legible.]

                       Figure 4
            Organization of Foreground Core Memory Under
              RDOS in the NERC-Cincinnati MDQARL
               Laboratory Automation Pilot System

-------
      FEATURES OF EXTENDED BASIC
 1)  MATRIX MANIPULATION COMMANDS
 2)  STRING COMMANDS
 3)  FILE I/O HANDLING
 4)  ASSEMBLY LANGUAGE CALLS FROM THE BASIC LANGUAGE
 5)  CHAINING CAPABILITY
 6)  TIME SHARING CAPABILITY

                    Figure 5
               Significant Features of Data General
            Multi-Tasking Extended BASIC Programming Language

-------
                PROGRAMMING LANGUAGES

BASIC                                    FORTRAN
FIND SOURCE FILE                         FIND SOURCE FILE
CALL EDITOR                              CALL EDITOR
INSERT: I2 = A*B                         INSERT: ISTREAK = A*B
                                         REAL ISTREAK
INSERT: PRINT "THE ANSWER IS";I2         INSERT: WRITE (3,4) ISTREAK
                                      1  FORMAT (15H THE ANSWER IS ,F5.2)
END EDIT                                 END EDIT
                                         REQUEST COMPILATION
RUN                                      RUN COMPILED CODE

[Figure 6. Comparison of BASIC and FORTRAN; original caption illegible.]

-------
                            COMPUTERIZED CHROMATOGRAPHY-MASS SPECTROMETRY

                                        By William L. Budde and D. Craig Shew*
  BACKGROUND
      Identifying and  measuring specific organic  com-
  pounds that contaminate the environment has concerned
  environmental  research scientists for  many years. The
  huge number of potentially harmful or toxic compounds
  that are theoretically possible has discouraged the development
  of a specific analytical method for each. Instead,
  general approaches based on slight differences in solubility,
  chromatographic behavior, and spectroscopic
  measurements were developed and used widely to gain
  information about  the  types of compounds present in
  various samples.

      Unfortunately, many of these standard techniques
  of chemistry, which provided very  valuable information
  in flagrant pollution situations, were either slow, expen-
  sive  in  terms of skilled  manpower, or  relatively  insensi-
  tive  to  very small quantities. In addition, most of  these
  techniques frequently failed to generate really definitive
  information.  As  a result,  environmental  researchers
  began to turn to a new multidisciplinary technique:
  computerized (automated data processing) gas chroma-
  tography/mass spectrometry (ADP/GC/MS).

      This  sophisticated technique was  developed during
  the 1960's primarily to meet the demand of basic research
  workers in medical, biochemical, flavor, odor,
  medicinal, and geochemical research. Basic research supported
  by the National Institutes of Health, the National
  Science Foundation, and the National Aeronautics and
  Space Administration played the major role in the development
  of the concept. Since one of the basic problems
  in each  of the above areas of research is the identifica-
  tion  and  measurement  of  trace quantities of a  great
  variety of organic  compounds,  the application  of the
  technique  to  environmental  research  was obvious but
 slow to come about.

      Like any automated method, ADP/GC/MS does not
 reduce the need for skilled manpower; in fact, it requires
 highly skilled, multidisciplinary teams of spectroscopists,
 analytical chemists, electronics engineers, and laboratory
 minicomputer specialists. The major advantage of this
 method is that it enormously increases the capacity of a
 fixed or limited staff to make large numbers of unambiguous
 identifications of specific organic compounds in
 environmental samples. Identification of noxious pollutants
 at the part-per-billion level with a high degree of
 confidence in the result has become nearly routine in
 more than a few U.S. Environmental Protection Agency
 (EPA) laboratories. What was once a nearly impossible
 task for a staff of 100 working for 6 months can sometimes
 be accomplished by several people in a few hours.
 The ADP/GC/MS system is by far one of the most outstanding
 examples of the successful integration of
 laboratory-scale, analytical instrumentation with a dedicated
 data processing system.

 *Speaker

STATE OF THE ART

     EPA has  made a major commitment  to ADP/GC/
MS during the last few  years by installing about 23
complete systems valued at about $2.5  million in  labora-
tories across the country. The systems are  employed in
eight regional surveillance and analysis laboratories, both
National  Field  Investigations  Centers, two  National
Environmental  Research  Center  (NERC)-Cincinnati
laboratories,  one  NERC-RTP  laboratory,  six NERC-
Corvallis laboratories,  and two Office  of Pesticide  Pro-
grams laboratories. The annual total costs  dedicated to
these  systems should be at least  equivalent to the  pur-
chase price.

     Notwithstanding the many different missions of the
various EPA laboratories across the country, ADP/GC/MS
has proved to be a highly successful technique capable
of accomplishing a wide variety of tasks. In many
cases involving the identification  of organic pollutants,
ADP/GC/MS is not only the method of choice, it is,
indeed,  the only method  available. Typical of the wide
variety of uses are the following:

         Identification of organic compounds in indus-
         trial  effluents  for  background  data  and/or
         enforcement action

         Characterization of oil spills

         Identification of organic compounds as the
         cause of taste and odor in drinking water
         supplies

         Evaluation of the effectiveness of treatment
         systems designed to remove organic pollutants

         Studying the accumulation of organic pollutants
         in aquatic life

          Determining the causes of fish kills.

SYSTEM OPERATION

     Eighteen  of  the systems  mentioned  above  are
nearly identical Finnigan quadrupole mass spectrometers*
computerized with Digital Equipment Corporation
PDP-8 minicomputers (Figure 1). The three
functional parts of the system are as follows:

     1.  The gas chromatograph is very well established
as a powerful tool  for the separation of complex  mix-
tures  of volatile organic  compounds, while  the sample
enrichment  device  provides an effective interface be-
tween the gas chromatograph and mass spectrometer.

     2.  The mass spectrometer is the only GC detector
used,  and like conventional GC detectors, it is very sensi-
tive. In contrast to the single-channel response of con-
ventional  GC detectors, the  mass spectrometer provides
a multichannel  response that carries a great deal of infor-
mation  about the molecular structure and composition
of organic compounds. This information may be dis-
played graphically by a fast cathode-ray  tube or a  slow
(pen and  ink) plotter, printed in digital form, or trans-
mitted over conventional voice-grade telephone circuits
to other data-handling systems.

     3.  The minicomputer controls the operation  of
the mass spectrometer and processes  the generated data.
In noncomputerized systems, mass spectra generally are
not acquired continuously during the GC run. However,
with a computerized system, the minicomputer simulta-
neously controls the operation of the mass spectrometer
and continuously acquires mass spectral data. This  large
volume of MS  data is processed by  the computer and
stored temporarily on a magnetic disk or magnetic tape
before plotting, printing, or transmission. Personnel who
previously devoted  long, unproductive hours to manual
data processing are freed to do productive tasks such as
data interpretation.

Interpretation of Data

     The  necessary  prerequisite for effective data inter-
pretation  in ADP/GC/MS is the availability of valid  mass
spectra. Assuming that the ADP/GC/MS is well-tuned
and in good operating condition, the only other require-
ment  for valid mass spectra is the clean separation of the
individual  components  of  the mixture  on  the GC
column.  This is, of course, the classical  problem of
chromatography, but it is made somewhat more manageable
with the MS detector. An experienced user can
frequently ascertain that the separation is clean and free
of overlaps  by an  examination of  the internal consis-
tency of the mass spectral data. This is not possible with
the simpler, conventional GC detector.
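
One possible form of such an internal consistency check is sketched below in Python. It rests on an assumption that is not spelled out in the text: for a single pure component, the relative abundances of the ions should stay roughly constant from scan to scan across the GC peak, whereas an overlapping component changes them. Each scan is represented as a dictionary from mass to intensity, and the names `normalized` and `is_clean_peak` and the tolerance value are hypothetical.

```python
def normalized(scan):
    """Scale a scan so that its most intense ion has abundance 1.0."""
    base = max(scan.values())
    return {mass: value / base for mass, value in scan.items()}


def is_clean_peak(scans, tolerance=0.1):
    """Return True if relative abundances agree across all scans.

    scans: list of {mass: intensity} dictionaries taken across one GC
    peak. The first scan serves as the reference; any ion whose relative
    abundance differs from the reference by more than `tolerance`
    marks the peak as a possible overlap.
    """
    reference = normalized(scans[0])
    for scan in scans[1:]:
        current = normalized(scan)
        for mass in set(reference) | set(current):
            if abs(reference.get(mass, 0.0) - current.get(mass, 0.0)) > tolerance:
                return False
    return True
```

The judgment of an experienced user described above involves more than this single test; the sketch only shows how the idea could be expressed numerically.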

     There are two fundamentally different approaches
to the identification of specific organic pollutants from
valid  mass spectra. Interpretations  may  be attempted
based on the theory of mass spectroscopy and the rules
of fragmentation  of organic   ions in the gas  phase.
Unfortunately, this approach is frequently  slow and te-
dious  and  requires extensive  training and experience.
However, it  is generally  far more  definitive  than  the
equivalent  approach with other types of spectrometry
(e.g., infrared  spectra). This approach  is clearly justified
when  all else fails in important situations such as an
enforcement action or fish kill.

     The alternative is the purely empirical method of
searching a file of reference mass spectra to find a similar
or exact  match of the experimental  spectrum.  While
manual searching of printed data is rather slow, subject
to human error, and intellectually  unstimulating, com-
puterized searching overcomes these problems. Two
experimental  remote systems, which  are markedly dif-
ferent in their approach to the problem, are currently in
use by EPA and are described briefly below. Perhaps the
simplest  approach would  be to utilize the PDP-8 mini-
computer with the disk  or tape storage available on the
MS data system  to search a data base of reference spec-
tra. This may be the most  economical method, but it has
not been fully developed and is not  currently used by
EPA.
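
The empirical search of a reference file can be illustrated with a minimal Python sketch. It does not reproduce the matching method of either system described here, which the text does not specify. Instead it uses cosine similarity between spectra, a swapped-in technique chosen for simplicity. Spectra are dictionaries from mass to intensity, and the names `cosine_similarity` and `search_library` and the library entries in the usage note are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two spectra given as {mass: intensity}."""
    masses = set(a) | set(b)
    dot = sum(a.get(m, 0.0) * b.get(m, 0.0) for m in masses)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_library(unknown, library, top_n=3):
    """Return the top_n (name, score) pairs, best match first.

    library: {name: spectrum} with placeholder names; a score of 1.0
    means the spectra are proportional, 0.0 means no ions in common.
    """
    scored = [(cosine_similarity(unknown, spectrum), name)
              for name, spectrum in library.items()]
    scored.sort(reverse=True)
    return [(name, score) for score, name in scored[:top_n]]
```

For a placeholder library `{"ref_a": {43: 100, 58: 30}, "ref_b": {77: 100, 51: 40}}`, an unknown of `{43: 50, 58: 15}` ranks `ref_a` first with a score close to 1 and `ref_b` last with a score of 0. Returning a ranked list rather than a single answer leaves the final judgment with the user.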

     One system  used by EPA was  developed  by the
Battelle Memorial Institute with funding from the
Southeast  Environmental Research Laboratory compo-
nent  of NERC-Corvallis. It  is  intended to be used by a
laboratory technician who  needs  to  know the  correct
buttons  to operate and  the correct  commands to enter.
MS data are  transmitted in batches  directly from the
ADP/GC/MS data system to a large Control Data Corp-
oration model 6400 computer  at Battelle-Columbus via
standard  voice-grade telephone  lines.  The  system  is
highly automated, rather sophisticated, and makes good
*Mention of commercial products does not imply endorsement by the EPA.
   use of the data. It is simple to use because it either finds
   a match or it does not; however, if no match is found,
   there are no other alternative modes of search. Since the
   system does require the ADP/GC/MS data system, use of
   the  ADP/GC/MS is  precluded  during  the   matching
   operation.

       The other  system used by EPA  was  developed by
   the National Institutes of Health (NIH) and may be used
   by any EPA laboratory under an existing EPA-NIH interagency
   agreement. This system runs on a large Digital
   Equipment Corporation model PDP-10 computer at the
   NIH computer center in Bethesda, Maryland. The user
   enters MS data via a standard keyboard/printer terminal
   connected to a standard voice-grade telephone line and
   does not utilize the resources of the ADP/GC/MS data
   system. The matching system is highly interactive and
   conversational, and the user must impose his judgment
   in entering data. This system is oriented to the experienced
   user who sits in the interactive loop with the
   computer program and has many optional approaches to
   probing the data base and searching for a match. This
   approach makes less efficient use of computer resources
   and depends more on an operator's judgment to make
   good use of the available data. However, it is a more
   flexible method, especially when an exact match is not
   available. For economy and additional flexibility, these
   two experimental systems are in the process of being
  combined to use the same data base on the same time-
  sharing computer.

  EPA-MS USERS GROUP

       In the spring of  1972, an EPA-MS users group was
  organized for the purpose of promoting the informal
 exchange of technical information among the EPA laboratories
 using the 23 computerized GC/MS systems. The
 group is designed to be especially helpful to laboratory
 personnel (e.g., regional surveillance and analysis chemists)
 who are just beginning to acquire the skills necessary
 to utilize this sophisticated equipment effectively.
 In addition, the group keeps all involved personnel up-to-date
 on the latest instrumental improvements, new
 techniques, new equipment additions, and new software
 programs and capabilities.


 CONCLUSION
     It has been clearly demonstrated that ADP/GC/MS
can provide extremely high quality and highly significant
information about organic environmental pollutants to
environmental scientists and enforcement officials.
However, the achievement of this capability requires a
substantial laboratory management commitment.


     The ADP/GC/MS is not the kind of equipment that
can stand idle for weeks and then be used effectively by
occasional users. Users require in-depth knowledge, and
the equipment  requires  routine  maintenance,  adjust-
ment,  and utilization. Furthermore,  the commercial
packages are not perfected and will require updating in
the future. Nevertheless, it is believed that the commit-
ment is worthwhile and that ADP/GC/MS routine analy-
ses should replace, in many EPA laboratories, some of
the traditional,  environmental  pollution methodology
that generates less relevant and very much less significant
data.

-------
                         Figure 1
             Configuration of the ADP/GC/MS System Widely Used in EPA

   [Block diagram. Components shown:
      Gas chromatograph (isothermal or temperature programmed)
      Sample enrichment device
      Quadrupole mass spectrometer (integer resolution to 750 amu)
      Minicomputer (data acquisition, reduction, and control)
      Magnetic disk or tape (program and data storage)
      Slow plotter (high quality graphics)
      Slow printer with keyboard (alphanumeric data)
      Cathode ray tube with keyboard (CRT) (fast graphics and alphanumerics)
      Fast hard copy of CRT display
      Dial-up telephone to large computer (data base search)]

-------
                                             CALCOMP SOFTWARE
                                              By Theodore R. Harris
       CALCOMP software leased  by the Environmental
  Protection Agency (EPA) was discussed, and  its avail-
  ability on the  two major computer resources  was por-
  trayed.  Utilizing FORTRAN,  CALCOMP  software
  provides  the  capability  to  produce graphic  outputs.
  These outputs commonly are in the form of off-line pen
  plots.

       CALCOMP provides three general classifications of
  graphic software. The highest level is made up of applica-
  tions programs. These programs are as follows:

           GPCP - General Purpose Contouring Package
           (Figure 1)

           THREE D - A Perspective Three-Dimensional
           Package (Figure 2).

       The second type of graphic software is classified  as
  Functional Software. Functional  programs  or  subpro-
  grams perform plotting functions frequently  used  in
  many different applications and are as follows:

           FLOCT - Flow Charting Package (Figure 3)

           FORGN - Form Design Package (Figure 4)

           DRAFT - Drafting Subroutine Package
           (Figure 5)

           SCIEN - Scientific Subroutine Package
           (Figures 6 and 7)

           GENRL - General Subroutine Package
           (Figures 8 and 9)

           BUSNS - Business Subroutine Package
           (Figure 10)

           DRIVER - Curve Fitting Subroutine Package
           (Figures 11 and 12).

 The functional subroutines are shown in Figure 13.

     The third classification is the Basic Software, which
 is the lowest level for controlling the plotting hardware.
 These routines are as follows:

          PLOT - Direct control of pen movement

          SYMBOL - Produces annotation

          NUMBER - Produces number conversion for
          annotation

         SCALE - Determines  range and scaling  factor
         of data

         AXIS - Produces X and Y axis elements

         LINE - Produces line plots.
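
The job described for SCALE, determining the range and a scaling factor for a set of data, can be illustrated with a short sketch. This is not the CALCOMP routine (which is called from FORTRAN); it is a Python illustration under stated assumptions. The function names `scale` and `to_page`, the list of preferred step sizes, and the choice of plotting units are all hypothetical.

```python
import math

# Hypothetical set of "round" step multipliers; not taken from the source.
NICE_STEPS = (1.0, 2.0, 4.0, 5.0, 8.0, 10.0)

def scale(values, axis_length):
    """Return (first_value, delta) so that all values fit on an axis.

    first_value is the data value at the start of the axis and delta is
    the number of data units per unit of axis length.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo, 1.0  # degenerate range: any positive delta works
    raw = (hi - lo) / axis_length
    exponent = math.floor(math.log10(raw))
    while True:
        for m in NICE_STEPS:
            delta = m * 10.0 ** exponent
            first = math.floor(lo / delta) * delta
            # Accept the first round step whose axis still reaches the maximum.
            if first + delta * axis_length >= hi:
                return first, delta
        exponent += 1

def to_page(value, first, delta):
    """Convert a data value to a distance along the axis."""
    return (value - first) / delta

first, delta = scale([3.2, 17.5, 9.1], 5.0)
print(first, delta, to_page(17.5, first, delta))
```

With a starting value and a step in hand, an axis routine can label the axis at round values and a line routine can convert each data point to a pen position with the same two numbers.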

The software is currently installed at Research Triangle
Computing Center and is being installed at Optimum
Systems Inc. (OSI). A training seminar will be
announced in the "Data Systems Newsletter."

-------
                                              Figure 1
                                 Flexible Mode Sample Problem Output

-------
                                            Figure 2
                                      Sample Perspective View
                                 Q = 2, THETA = 40°, GAMMA = 50°
                                      Hidden Lines Eliminated

-------
                      Figure 3
                Illustration of FLOCT Facilities

   [Flow chart sample. Symbols shown: NT = 1, rectangle; NT = 6, card;
   NT = 7, printed page; and a hexagon.

   Supplementary NT = 9 codes:
   NCD = 1-988    allows that number of free-form comment lines.
   NCD = 989      moves with pen up to X,Y.
   NCD = 990      plots solid line from current position to X,Y.
   NCD = 991-999  plots .1-.9 dashes respectively from current
                  position to X,Y.]

-------
                      Figure 4
        Sample of Multiple Form Segments on One Form

-------
                           Figure 5
                  Sample of Drafting Subroutines Package

-------
                              Figure 6
                      Samples of Scientific Subroutines Package

   [Two plots: a curve of Y as a function of X drawn using the CURVX
   subroutine, and the curve X = Y**3.78 - 6Y**2.52 + 9Y**1.26 drawn
   using the CURVY subroutine.]

-------
                         Figure 7
                 Samples of Scientific Subroutines Package

   [Two plots: temperature versus altitude drawn using the SCALG, LGAXS,
   and LGLIN subroutines, and the polar curve RADIUS = 2*(1-COS(ANGLE))
   drawn using the POLAR subroutine.]

-------
                                     Figure 8
                           Sample of General Subroutines Package

   [Panels illustrating the CIRCL, ELIPS, FIT, DASHP, POLY, GRID, and
   DASHL subroutines.]

-------
                       Figure 9
               Sample of General Subroutine Package

   [Plot drawn using the SHADE subroutine: net sales and cost of sales
   by quarter (APR, JUL, OCT, JAN). The shaded region represents gross
   profit.]

-------
                            Figure 10
                    Samples of Business Subroutines Package

   [Two plots: a monthly chart (JAN, FEB, MAR 1969) drawn using the
   AXISB, AXISC, and BAR subroutines, and a quarterly chart (JAN, APR,
   JUL, OCT 1968) drawn using SCALG, LGAXS, and LGLIN.]

-------
                       Figure 11
                Sample of Business Subroutines Package

   [Plot drawn using the FLINE and SMOOT subroutines; horizontal axis:
   service time.]

-------
                         Figure 12
                  Sample Plot Using Data Cards
                     and CRVPLT Driver

   [Horizontal axis: time (months).]

-------
                  Figure 13
       CALCOMP Functional Plotting Subroutines

-------
                                         HARVARD GRAPHICS PACKAGES

                                                 By Curtis S. Lackey
   PREFACE

       This paper consists of  excerpts of  the  systems
   manuals of the subject programs. Permission has been
   obtained to reprint these excerpts from the Laboratory
   for Computer Graphics and Spatial Analysis at Harvard
   University  and  from  Professor Elliot E. Dudnik at  the
   University of Illinois at Chicago Circle.

       Interested  parties should  obtain  a  copy  of
   LAB-LOG from  the Laboratory. Address inquiries to:

    Laboratory for Computer Graphics & Spatial Analysis
       Graduate School of Design, Harvard University
                     520 Gund Hall
                     48 Quincy Street
            Cambridge, Massachusetts 02138

   LAB-LOG summarizes the materials available from the
   Laboratory and details ordering procedures.

   INTRODUCTION

       The Laboratory for Computer Graphics was
  established in the Spring of 1965 with a grant from the
  Ford Foundation to the Graduate School of Design of
  Harvard University. Under the direction of Howard T.
  Fisher, and with the advice and strong support of Dean
  Jose Luis Sert and Professor William Nash, the
  Laboratory has developed programs for high-speed
  electronic digital computer mapping and new techniques
  for graphic display that utilize the accuracy,
  thoroughness, speed, and low cost of computers. Originally
  based in the Department of City and Regional Planning,
  the Laboratory during the past two years has become a
  service organization to all departments in the Graduate
  School of Design. Recent projects have included faculty
  and students from each of the departments and
  programs, and currently every department shares in the
  Laboratory's activities through joint appointments.

      Research in the Laboratory is essentially of two
  types. The first is a continued investigation into the uses
  of graphical analysis, and computer graphics in
  particular. Graphics is often a desirable mode of analysis,
  synthesis, and communication, and in the past, manually
  executed display has been adequate for many situations.
  Today's problems, however, are so complex that many
  elements of great diversity must be weighted in order to
  consider even the most rudimentary of them. Graphic
  display must be capable of a new level of sophistication
  and reliability in order to provide a sound basis for
  planning decisions of all kinds. Parenthetically, it is
  important to note that the Laboratory has not been
  concerned with the development of computers as
  machines. Rather, its aim has been to develop
  applications for existing, widely available equipment.


      The Laboratory's past work in graphics was built
 largely on the computer mapping program developed at
 Northwestern University's Technological Institute by
 Howard T. Fisher. The technique, called the Synagraphic
 Mapping System, or SYMAP, is capable of composing
 spatially distributed data of wide diversity into a map, a
 graph, or other visual display. The Laboratory initially
 built its work on the SYMAP program because of its
 great adaptability and because it uses the most widely
 available kinds of computer equipment. However,
 investigation has also been directed to other types of
 programs and to applications of cathode ray tubes and
 line plotters.


     The second type of research undertaken by the
 Laboratory is pure research in the framework of general
 systems theory and spatial patterns. Much of the spatial
 analysis research constitutes work in what might be
 called "general spatial systems" theory, undertaken in
 relation to architecture, landscape architecture, city and
 regional planning, and urban design, with emphasis on
 the roles of computers in programming, design,
 simulation, and evaluation. In addition, a strong research
 effort has been organized in theoretical geography,
 supported at present by the Office of Naval Research.
 Here, there are three particular lines of endeavor: the
 theory of surfaces as related to spatial structure and
 spatial processes for geographical phenomena; the
 macrogeography of social and economic phenomena;
 and central place theory. Needless to say, these are not
 mutually exclusive categories, and already several major
 syntheses have been produced. The work is obviously
 undergoing a period of continued mathematization.
 Heavy use of the various geometries seems assured.
 Furthermore, the distinctions between various
 systematic branches of spatial study diminish at the
 theoretical level as common spatial properties are

-------
recognized and as spatial solutions increasingly cut
across traditional subject matter.

     As a service organization to the Harvard Graduate
School of Design, the Laboratory maintains data banks
for several courses and assists in the establishment of
computer programs and in the design of models for
decision making. The Laboratory staff currently teaches
nine courses within the School of Design and
participates frequently in seminars. In addition, the
Laboratory maintains out-of-house services to other
universities, government agencies, and planning
organizations. A prime policy of the Laboratory has been
to disseminate its programs for the widest possible use;
correspondence training in the use of SYMAP and
OTOTROL has been and continues to be offered, and
several intensive teaching sessions have been held at
various universities and other organizations. The
Laboratory has held two conferences, one for planners
and geographers at Harvard in the Spring of 1967, the
second, primarily for architects, at the Center for
Continuing Education, University of Chicago, June 1968. A
series of luncheon seminars dealing with many aspects of
the Laboratory's interests was held at Harvard in the
Springs of 1966, 1967, and 1968. Speakers were the
Laboratory's staff and invited guests. Finally, a number
of demonstration contracts have been undertaken, the
largest current project being to show the advantages of
computer mapping programs in air pollution studies.

     The Laboratory was reorganized on July 1, 1968,
as the Laboratory for Computer Graphics and  Spatial
Analysis  and  made a part of the Harvard Center for
Environmental Design  Studies  within the  Graduate
School of Design. This move was a result both of the
Laboratory's larger role as a computer consultant to the
Graduate  School of Design and of an expanded concept
of its intellectual goals. William Warntz became Director
of the Laboratory in July 1968.

     In the immediate future, the Laboratory plans to
continue its dual research and development emphases in
computer graphics and spatial analysis. It hopes to
further these objectives by strengthening the program of
joint appointments within the Graduate School of
Design and by providing more opportunities for graduate
and undergraduate students to become involved in
computer-assisted research. It further hopes to build
upon the wider contacts already made with planners,
designers, statisticians, systems analysts, economists,
computer technicians, programmers, etc., both inside
and outside the university.

THE SYMAP PROGRAM

     SYMAP is a computer program for the production
of maps and diagrams which graphically depict spatially
disposed  quantitative and  qualitative information. It is
suitable for a broad range  of applications in a variety of
disciplines and is provided with numerous options to
meet widely varying requirements.

     The term "synagraphic" is used to designate the
general type of computer mapping to be described. It is
derived from a combination of the words "synactic" and
"graphic," the former an English word meaning acting
together, cumulative in effect (from the Greek word
"synagein," to bring together). The first syllable of
SYMAP is pronounced as in symbol.

     The  program  permits raw data  of  many  kinds
(physical, social, economic, etc.) to be related, manip-
ulated, weighted, and  aggregated in a variety  of ways
subject  only to the user's needs or requirements. By
assigning values to  the  coordinate  locations  of data
points or data zones, one or more of three basic types of
map may then be produced, as specified by the user.

     The  concept, overall  design,  and  mathematical
model for the SYMAP  program were developed  in the
autumn  of 1963  by  Howard  T.  Fisher,  working at
Northwestern Technological Institute. Programming was
carried out  by Mrs. O.  G. Benson  of the Northwestern
University  Computing Center.  Since that time  many
others have contributed ideas and cooperated in bringing
the program to its present state. Much of the work has
been  accomplished  at  the Laboratory for Computer
Graphics and Spatial Analysis  established in  1965 at
Harvard University. The present expanded program,
SYMAP, Version 5, is the product of this facility, with
particular debts owed to Robert A. Russell and Donald
S. Shepard.

GENERAL DESCRIPTION

     The SYMAP program is written in FORTRAN IV.
The source program consists of a main program and
49 subroutines and is approximately 5000 cards in
length. At present, the program requires either a
200K partition if implemented without overlays or a
108K partition with overlays. The program has been
successfully implemented on numerous computers ranging
from the IBM 709 to the IBM 370. Comparable CDC,
UNIVAC, XDS, or HONEYWELL equipment has been
or can be used with little or no modification.

-------
TYPES OF MAPS

     The three types of maps most commonly produced
by the SYMAP program are:

     1. Contour (based on the use of contour lines,
each of which represents a uniform value throughout its
length; B-DATA POINTS package). The contour (or
isoline or isopleth) map consists of closed curves known
as contour lines which connect all points having the
same numeric value or height. Contour lines emerge
from a datum plane at selected levels which are
determined from the scale of the map and the range of the
data. Between any two adjacent contour lines, a
continuous variation or slope is assumed. Therefore, the use
of contour lines should be restricted to the
representation of spatially continuous information, such as most
topography, rainfall, and population density.

     2. Conformant (based on conformance to the
boundaries of a data zone; A-CONFORMOLINES
package). The conformant (or choropleth) map is best suited
for data, either qualitative or quantitative, whose areal
limits are of significance, and whose representation as a
continuous surface is inappropriate. Each data zone is
enclosed by a boundary "conformant" to some
predefined spatial unit. The entire spatial unit is given the
same value, and symbolism is assigned according to its
numeric class. Local variation of the data within the
boundary will not be apparent, but on the average, will
be correct.

     3. Proximal (based on proximity to a data point;
B-DATA POINTS package and electives 31, 36 and 37).
The proximal map is very similar in appearance to the
conformant map. However, the spatial units are defined
by nearest neighbor methods from point information.
Each character location on the output map is assigned
the value of the data point nearest to it. Boundaries are
assumed along the line where the values change, and
conformant mapping is applied.
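
The nearest neighbor rule behind the proximal map can be sketched in a few lines. This is an illustration only, not SYMAP code (SYMAP is a FORTRAN IV program driven by packages and electives): the function name `proximal_map`, the (row, column, value) layout of a data point, and the use of straight-line distance are assumptions made for the example.

```python
def proximal_map(points, n_rows, n_cols):
    """Assign to every character location the value of its nearest data point.

    points is a list of (row, col, value) tuples; the result is a list of
    rows, each a list of values, one per character location.
    """
    grid = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            # Squared straight-line distance is enough to rank the points.
            # On a tie, min() keeps the point that appears first in the list.
            nearest = min(points, key=lambda p: (p[0] - r) ** 2 + (p[1] - c) ** 2)
            row.append(nearest[2])
        grid.append(row)
    return grid

# Two data points on a 2 x 6 map of character locations.
for line in proximal_map([(0, 0, "A"), (1, 5, "B")], 2, 6):
    print("".join(line))
```

The boundary between the two zones falls where the assigned value changes from one location to the next, which is the line the text describes.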

-------
                           Figure 1
                           SYMAP

   [Two maps from The Hudson River Water Quality Study: dissolved
   oxygen and coliform.]

-------
                                          Figure 2
                                          SYMAP

   [Mississippi-Yazoo Delta. Two panels: U.S.G.S. map with grid
   overlay, and average rain during growing season.]

-------


                      Figure 3
                      SYMAP

   [Puget Sound Region Study: carbon monoxide motor vehicle emissions,
   Cities and Corridors Plan, System 35.]

-------
 THE SYMVU PROGRAM

      SYMVU is a computer graphics program written for
 the purpose of generating three-dimensional line-drawing
 displays of data. The SYMVU program is usable by
 persons with very little programming experience. Only
 three control cards are necessary for the generation of
 the graphic displays. Thirty-two electives or options are
 built into the program, allowing for considerable
 flexibility in generating the displays of data. However, only
 seven of these options are absolutely necessary for the
 production of a single display.

      SYMVU is most commonly used for quantitative
 geographic mapping purposes. Unlike contour mapping,
 SYMVU illustrates the absolute values of spatially
 continuous data. Contour maps (or isoline maps) use
 rounded intervals for mapping displays of quantitative
 information. In the case of SYMVU, absolute values of a
 continuous surface are illustrated. The SYMVU program
 also has the capabilities of conformant and proximal
 mapping using data generated by the SYMAP program.

      One of the distinct features of this program is that
 it accomplishes the task of deciding which parts of the
 object being viewed are seen and which are hidden from
 view.
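
The source does not say how SYMVU decides what is hidden. As an illustration of the general idea only, the sketch below uses a simple "floating horizon" test, a stand-in technique chosen for brevity and not SYMVU's documented method. It assumes the surface is a grid matrix whose rows run from nearest the viewer to farthest, that the view is straight along the columns, and that each farther row is drawn `rise` units higher on the page; the names `visible_mask` and `rise` are hypothetical.

```python
def visible_mask(heights, rise=1.0):
    """Mark which grid points of a surface are seen in an oblique view.

    heights[r][c] is the surface height; row 0 is nearest the viewer.
    A point is visible if it is drawn higher on the page than everything
    already drawn in front of it in the same column.
    """
    n_cols = len(heights[0])
    horizon = [float("-inf")] * n_cols  # highest page position so far, per column
    mask = []
    for r, row in enumerate(heights):
        row_mask = []
        for c, z in enumerate(row):
            page_y = z + r * rise        # farther rows are shifted up the page
            seen = page_y > horizon[c]
            row_mask.append(seen)
            if seen:
                horizon[c] = page_y      # this point now hides what lies behind
        mask.append(row_mask)
    return mask

surface = [
    [0, 5, 0],   # front row with a peak in the middle column
    [0, 0, 0],
    [0, 0, 9],   # back row with a taller peak on the right
]
for row in visible_mask(surface):
    print(row)
```

In this example the front peak hides the two points directly behind it, while the tall peak in the back row remains in view.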

      The program is written in FORTRAN IV and is
 currently being operated on the IBM 360/65 using
 220K memory. The output device that draws the
 illustrations is a CalComp plotter. Runs have been
 successfully completed on both an 11-inch and a 30-inch
 plotter, although, because of a number of size
 constraints within the program, a 30-inch plotter
 presently has little advantage.

 BASIC PRINCIPLES

     The program utilizes basic data which is in the
 form of a grid matrix. This matrix information can be
 generated in either of two ways.

     The first and most common alternative is to utilize
 data generated by the SYMAP program as a
 two-dimensional graphic display. SYMAP interpolates
 data from point or area locations so as to produce
 spatially continuous surfaces organized in the form of a
 grid matrix. This data can then be utilized by the
 SYMVU program for producing three-dimensional
 graphic displays.

     The second alternative involves providing data
 directly in a matrix form, either on cards or tape.

-------
                                        Figure 1
                                        SYMVU

   [St. Louis Region Study: St. Louis Interstate Air Pollution Study
   Area, SO2 surface in oblique view, December 1964, 4 and 6 a.m.]

-------
   [Figure: 1960 Population Density by Counties Within the Northeast
   Corridor, Boston, Massachusetts to Norfolk, Virginia.]

-------
     GRID PROGRAM

     GRID is a computer program which has been
designed to provide a highly efficient means for graphic
display of information collected on the basis of a
rectangular coordinate grid. The GRID program is designed
for use by persons with very little programming
experience. However, it is usually necessary for the user to
specify his own data formats in subroutine FLEXIN, and
this requires an elementary knowledge of
FORTRAN IV.

     The program is written in FORTRAN IV and is
currently being operated on an IBM 360/65 computer at
Harvard using 150K bytes. With small programming
changes it can be operated on the IBM 7094 with
32K memory. It is possible to operate this program on a
smaller machine with a memory of at least 12K words.

BASIC PRINCIPLES

     Each data value is assumed to be associated with a
cell on a grid. It is essential that the values be
processed in the correct order, since the program accepts
the data in the order in which it prints the map. By the
standardized printing process, the program starts at the
top of the map and processes the data horizontally row
by row and from left to right in each row. Thus the
numbers below represent the order in which thirty data
values on a 6 x 5 grid will be processed and printed:

      1    2    3    4    5    6

      7    8    9    10   11   12

      13   14   15   16   17   18

      19   20   21   22   23   24

      25   26   27   28   29   30
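
The processing order shown above is ordinary row-by-row order, and the arithmetic that ties a value's position in the input stream to its grid cell is short. The sketch below is a Python illustration, not part of GRID; the function names and the 1-based numbering of rows and columns are assumptions made to match the numbered layout above.

```python
def cell_position(index, n_cols):
    """Return the (row, column) of the index-th data value, all 1-based."""
    return (index - 1) // n_cols + 1, (index - 1) % n_cols + 1

def to_rows(values, n_cols):
    """Split a stream of data values into map rows, top row first."""
    return [values[i:i + n_cols] for i in range(0, len(values), n_cols)]

values = list(range(1, 31))          # thirty values on a grid six cells wide
print(cell_position(14, 6))          # the 14th value read
for row in to_rows(values, 6):
    print(row)
```

A data deck punched in any other order (for example, column by column) would have to be reordered before being read, since the position in the stream is the only thing that places a value on the map.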

     In the mapping process, the actual data values are
generalized into groups, each group having a unique
graphic symbol associated with it. The groups into which
data are to be placed and the associated symbols may be
specified by the user. Two types of symbolism are
available: a grey scale of symbolism from light to dark,
which must be specified by the user, or a dot map in
half-inch square cells. The coordinates of each grid cell
location may also be printed. Most of the SYMAP
electives for scale generation are available on GRID.
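
Generalizing data values into groups, each with its own symbol, amounts to looking up each value in a list of class limits. The sketch below illustrates that step in Python; it is not GRID's FORTRAN code. The class limits, the five printer characters standing in for a light-to-dark grey scale, and the function names are all hypothetical choices for the example.

```python
import bisect

def classify(value, breaks):
    """Return the group number (0-based) for a value.

    breaks is an ascending list of class limits; a value equal to a
    limit falls in the higher group.
    """
    return bisect.bisect_right(breaks, value)

def symbol_map(rows, breaks, symbols):
    """Replace every data value by the symbol of its group, row by row."""
    return ["".join(symbols[classify(v, breaks)] for v in row) for row in rows]

breaks = [10, 20, 30, 40]     # four limits define five groups
symbols = ".+XO#"             # light to dark, one symbol per group
data = [[5, 15], [35, 45]]
for line in symbol_map(data, breaks, symbols):
    print(line)
```

The number of symbols must be one more than the number of limits, since values below the first limit and values at or above the last limit each need a group of their own.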

-------
                                                 Figure 1
                                                  GRID

   [Map of predominant visual character. Legend:
        1 = Water
        2 = Swamps, wet lands and/or level
        3 = Forest and/or hilly
        4 = Agriculture
        5 = Institutions or public
        6 = Low intensity residential
        7 = High intensity residential
        8 = Commercial
        9 = Industrial]


-------
 THE CALFORM PROGRAM

      CALFORM is a computer program designed to
 produce conformant maps on a pen (or CRT) plotter. A
 conformant map is one in which symbolism (usually
 shading) represents the values of data attributed to data
 collection units (data zones) such as census tracts,
 municipalities, counties, and so forth.

      Three steps are necessary to prepare a conformant
 map. These involve the definition of locations, values,
 and map options. Locations need be defined only once
 for a series of maps which portray various subjects for
 the same data zones.

      The location of each data zone must be described
 as a series of straight line segments. Curved lines may be
 approximated by several straight line segments in order
 to preserve the degree of detail desired. Straight line
 segments are defined in terms of the x-y coordinates at
 their end points. The resulting description of zone
 boundaries is called a computer readable base map or
 geographic base file. This data is punched onto cards and
 organized in the form of several functional packages.
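
Describing a zone by straight line segments, and approximating a curve with more or fewer of them, can be shown with a small sketch. This is a Python illustration rather than CALFORM's card format, which the excerpt does not give; the function names and the use of a circle as the curved boundary are assumptions for the example.

```python
import math

def boundary_segments(vertices):
    """Return a closed zone boundary as segments ((x1, y1), (x2, y2))."""
    n = len(vertices)
    return [(vertices[i], vertices[(i + 1) % n]) for i in range(n)]

def circle_vertices(radius, n_segments):
    """Approximate a circular boundary by n_segments straight segments."""
    return [
        (radius * math.cos(2 * math.pi * k / n_segments),
         radius * math.sin(2 * math.pi * k / n_segments))
        for k in range(n_segments)
    ]

def boundary_length(vertices):
    """Total length of the straight-segment boundary."""
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in boundary_segments(vertices)
    )

square = [(0, 0), (4, 0), (4, 3), (0, 3)]
print(boundary_segments(square))
print(boundary_length(circle_vertices(1.0, 4)),
      boundary_length(circle_vertices(1.0, 16)),
      2 * math.pi)
```

The printed lengths show the trade-off the text mentions: the boundary with more segments comes closer to the true length of the curve, at the cost of more coordinates to punch.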

     The CALFORM program is written in
FORTRAN IV and is currently operated on the Harvard/
MIT IBM 370/165 using 128K bytes of core memory.
The illustrations in this paper were produced on a
CalComp Model 780, 11-inch incremental drum plotter.

     The program was written initially by  Robert S.
Cartwright, Jr., an applied mathematician and computer
programmer for the Laboratory for  Computer Graphics
and  Spatial  Analysis, Harvard University. The original
development of  this program by  Mr. Cartwright  was
sponsored by the  Environmental  Health Surveillance
Center, University of Missouri in  the summer of 1969.
Since that time, the program and manual have been ex-
tensively  revised  at   the  Laboratory  for  Computer
Graphics and Spatial Analysis. Individuals who have con-
tributed to this work include Duncan Hughes, Kathleen
Reine, David  Sheehan, Geoffrey  Dutton and Nicholas
Christman.

-------

 [Figure: CALFORM map, "Population Density by State,
 Conterminous United States: 1970 U.S. Census." Key to
 symbolism (people per square mile): 192.06 - 943.04;
 79.93 - 192.05; 47.16 - 79.92; 19.34 - 47.15;
 3.37 - 19.33.]
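
 Assigning a data zone to one of the five classes in the map key amounts to a lookup against the class boundaries. The sketch below is a hypothetical Python illustration, not CALFORM code; numbering the classes from 1 (lowest interval) to 5 (highest) is an assumption made here for convenience.

```python
from bisect import bisect_right

# Lower bounds of the five classes in the map key (people per square mile).
LOWER_BOUNDS = [3.37, 19.34, 47.16, 79.93, 192.06]


def shading_class(density: float) -> int:
    """Return 1..5 for the class containing density, or 0 if below every class."""
    return bisect_right(LOWER_BOUNDS, density)


example_class = shading_class(50.0)  # falls in the 47.16 - 79.92 interval
```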

-------
 [Figure: map titled "Growth Rate: 1966-70 Increase in All
 Jobs, Equalized by Land Area." Source: Mass Div of
 Employment Security. Totals: MHFA Market Areas. Class
 intervals: 28.34 - 101.16; 6.54 - 28.34; 2.99 - 6.54;
 -0.34 - 2.99; -26.96 - -0.34.]

-------
   THE POLYVRT PROGRAM

       Recent years have witnessed an upsurge in the use
   of digital computers for geographical analysis and
   display, particularly in cartographic applications.
   Throughout industry, government, and academia, spatial data is
   being  encoded,  manipulated and displayed  for an in-
   creasing  variety  of applications  such as street address
   encoding, land use information retrieval, market analysis
   and  cartographic display  of social  data.  Due to  the
   relative novelty and the diversity  of uses of such applica-
   tions, these computerized geographic base  files (GBFs)
   currently  exist  in a  variety of  special  formats, each
   suited  to  certain uses, and each  requiring different
   computer software for its use. It may be many years,
   if indeed ever, before geographic  information is encoded
   in a standardized fashion; some  recording  schemes  are
   oriented  toward fast  selective retrieval of spatial  data,
   other  systems  provide efficient  storage  for  mapping
   purposes,  while  still  others  may sacrifice internal effi-
   ciency  for ease of input and/or editing by inexperienced
   computer users. The lack of any common coding system
   for GBFs makes itself felt by restricting the exchange
   and use of existing data bases. This is particularly
   unfortunate in the case of massive base files, such as DIME
   files, which have been produced  at  great expense, and
  which may gather dust on users' shelves merely because
   the  format is incompatible with  the software required
   for  the users' desired  applications. And  even when the
   file  in question is not  particularly cumbersome, it is still
  annoying at the least to have to recode  the information
  merely because one wishes,  for  instance, to go from
   producing maps  of the file  on the  printer to  making
  plotted versions of those maps.

       Without hoping  for a  universal GBF format to
  exorcise such bottlenecks, there is still no need for GBF
  users to recode their files by  hand; this is the sort of job
  that  computers  have   been  designed to do,  quickly,
  tirelessly  and exhaustively.  The  missing link between
  different  GBF  formats exists  now in  the program
  POLYVRT (Polygon Converter), which can translate
  from any of several GBF types into any other of their
  formats, while at the same time allowing  editing of the
  file, selective polygon retrieval, filtering out of unneeded
  line  detail,  coordinate  transformations and  scaling, and
  annotated  plotted output of files as well as output in
  machine-readable form. POLYVRT does not translate
  between all different formats directly, but through a
  common data structure called a chain file, which
  preserves all input information and also includes
  information generated by POLYVRT for translation,
  such as left and right polygon designations, which are
  lacking in most file formats. GBFs in chain file form
 may be written out in binary or formatted records and
 are acceptable input to the POLYVRT program itself.
 POLYVRT chain files may be created from, or converted
 into, the following GBF formats:

          DIME files (county, state, metropolitan)
          World Data Bank I chain files
          SYMAP Conformolines (output only)
          CALFORM Points and Polygons Packages.

 More file types will be accommodated as the demand  for
 them becomes known  or the user may define his own
 file types to the program.
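
 The idea of a chain carrying left and right polygon designations can be pictured with a small sketch. It is written in Python for brevity; the record layout and field names are hypothetical and are not POLYVRT's actual chain file format. Two unit squares, A and B, share one boundary, which is stored once and labeled with the polygon on each side.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

Point = Tuple[float, float]


@dataclass
class Chain:
    points: List[Point]  # ordered vertices from the start node to the end node
    left: str            # polygon on the left when walking start -> end
    right: str           # polygon on the right


def chains_of(polygon: str, chains: List[Chain]) -> List[Chain]:
    """All chains that form part of the boundary of the named polygon."""
    return [c for c in chains if polygon in (c.left, c.right)]


def neighbors(polygon: str, chains: List[Chain]) -> Set[str]:
    """Polygons found on the other side of each chain bounding the named polygon."""
    return {c.right if c.left == polygon else c.left
            for c in chains_of(polygon, chains)}


chains = [
    Chain([(1, 0), (1, 1)], left="A", right="B"),                    # shared edge
    Chain([(1, 1), (0, 1), (0, 0), (1, 0)], left="A", right="OUT"),  # rest of A
    Chain([(1, 0), (2, 0), (2, 1), (1, 1)], left="B", right="OUT"),  # rest of B
]
```

 Because the shared edge appears only once, a polygon outline for a format such as CALFORM would be rebuilt by collecting the chains that name that polygon on either side.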

     Although the chain file structure combines the  ad-
 vantages of many base  file formats, it is not intended to
 replace them, except possibly as a medium for exchange
 of base files from one user's installation to another. Each
 input  and output file  type  has  its  own strengths and
 special  applicability, and through POLYVRT these may
 be most effectively  exploited.  Since the program edits,
 selects, checks for consistency and topological closure,
 generalizes overly detailed input, and allows visual
 inspection of files through plotted output, in addition to
 its  conversion capacity,  GBF  users  can be sure  of
 obtaining with it the  exact base  file they  need  for
 virtually any application.

     POLYVRT has a user-oriented command language
 organized into independent packages using fixed-field  in-
 struction  formats.  By  the inclusion  and  exclusion  of
 packages, all or only some of POLYVRT's capabilities
 can  be invoked for any given execution of the program.
 Data may be read in with the input deck or from any
 sequential storage device. Some of the chain file features
 which  POLYVRT  produces are automatically stored
 off-line in temporary files,  and  this enables the program
 to manipulate files  of  nearly unlimited  size in  points.
 Using a carefully constructed list structure of pointers,
 core requirements are modest.

     The  POLYVRT program  is the  product of two
 years of design and  development, during which  time it
 has grown from a few routines to over 60 as new
 functions have been built into the system. It is not,
 however, the complexity of the programming, but the
 clarity of the data structures, that makes POLYVRT an
 important new tool for computer cartographers and
 spatial analysts.

     The data structure for Dual Independent Map
Encoding (DIME) files arose out of the Census Use Study
by the U.S. Bureau of the Census in preparation for the
 1970 Census of Population and  Housing and was de-
veloped  under the direction of Caby  Smith. The DIME
file  (of  U.S.  County  Boundaries)  supplied  with
POLYVRT is a duplicate of the county DIME file
distributed by the Bureau of the Census, reformatted, but
not altered in structure.*

     The central concept of POLYVRT's structure is the
recognition of unbroken  boundary units (chains) as the
basic element of a geographic base file. World Data
Bank I,+ the product of Warren Schmidt of the U.S.
Central Intelligence Agency, provided a basis for the
Laboratory's development. The need for easy access to a
variety of data files formulated the requirements for a
topological data structure. Most of the development of
POLYVRT's file structure and the bulk of the code was
the work of Nicholas Chrisman. James Little, author  of
the  UPDATE package,  also participated  in  the  pro-
tracted testing and design process.

     A  particular debt is owed  to  David Douglas  of
Simon Fraser University, who with Dr. Thomas Peucker
developed  the  detail  filtering algorithm  used  in
POLYVRT's F-GENERALIZE package, enabling
superfluous or inconsequential detail in users' GBFs to be
suppressed in files output from POLYVRT. The
Douglas-Peucker algorithm was expanded by the
Laboratory to allow a hierarchy of detail levels to be stored
and retrieved by POLYVRT.
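
 For readers unfamiliar with the detail filtering method, the basic Douglas-Peucker procedure can be sketched as below. This is the textbook recursive form in Python, not the F-GENERALIZE code, and it does not show the Laboratory's extension for storing a hierarchy of detail levels; the function names and the tolerance parameter are illustrative.

```python
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]


def distance_to_line(p: Point, a: Point, b: Point) -> float:
    """Perpendicular distance from p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = hypot(dx, dy)
    if length == 0.0:
        return hypot(px - ax, py - ay)
    return abs(dx * (py - ay) - dy * (px - ax)) / length


def simplify(points: List[Point], tolerance: float) -> List[Point]:
    """Keep the end points; keep an interior point only if it lies farther
    than tolerance from the line joining the end points, then recurse."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, farthest = 0, 0.0
    for i in range(1, len(points) - 1):
        d = distance_to_line(points[i], first, last)
        if d > farthest:
            index, farthest = i, d
    if farthest <= tolerance:
        return [first, last]
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right  # the split point appears in both halves


nearly_straight = simplify([(0, 0), (1, 0.05), (2, 0)], 0.1)
sharp_corner = simplify([(0, 0), (1, 1), (2, 0)], 0.1)
```

 A larger tolerance removes more points, which is how superfluous detail is suppressed for small-scale output.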

     Constructive criticism and comments from many
persons contributed to  the process of programming
POLYVRT, but  specific acknowledgement to all might
prove tedious to the reader. Special thanks are given to
Allan Schmidt, Tom Peucker, Kathy Reine and Bob
Fowler.  Documentation  was  prepared  by  Geoffrey
Dutton,  carefully  scrutinized  and  corrected by  Nick
Chrisman and Allan Schmidt.

REFERENCES

Dudnik, Elliott E., SYMAP User's Reference Manual,
     Chicago, University of Illinois.

Schmidt, Allan H., LAB-LOG, Cambridge, Harvard
     University, 1973.

Sinton, David and Steinitz, Carl, GRID Manual,
     Cambridge, Harvard University, 1971.

-----, CALFORM Manual, Cambridge, Harvard University,
     1973.

-----, POLYVRT User's Manual, Cambridge, Harvard
     University, 1974.

-----, The Red Book, Cambridge, Harvard University,
     1970.

-----, SYMVU, Cambridge, Harvard University, 1971.

*Copies of the county DIME files on tape, as well as copies of the metropolitan DIME files currently available, may be obtained from the
Service Staff, Bureau of the Census, Washington, D.C. 20233, for $70 per reel of tape. Certain metro DIME files require more than one
reel.
+The World Data Bank File (version I, April 1972, in original format) is currently available from the National Technical Information
Service, Springfield, Virginia 22151, as PB-223 178, on magnetic tape for $97.50 ($122.50 for foreign orders). Documentation for this file
may be found in Geographical Location Codes, prepared by the General Services Administration of the Federal Government, Federal
Supply Service number 7601-926-9078, available from the Superintendent of Documents, U.S. Government Printing Office, Washington,
D.C. 20402, for $2.75.

-------
                                EPA'S REGIONAL AIR POLLUTION STUDY (RAPS)

                                               By Robert B. Jurgens
   DESCRIPTION OF THE REGIONAL AIR POLLUTION
   STUDY

       The  concept of a Regional Air Pollution Study
   (RAPS) was developed in the late 1960's. With direct
   White House support, RAPS became a reality when Con-
   gress agreed to allocate between $22 and $26 million to
   be  spent  over  a  5-year  period  beginning  in  fiscal
   year 1973. The  overall  focal  point and rationale for
   RAPS is the development of a series of Air Quality Sim-
   ulation Models (AQSM) which can be used in least cost
   air pollution control strategies relative to EPA's current
   standards. These models could, for example, help in de-
   signing state pollution control implementation plans.

      The criteria for choosing a region included having a
   city with  significant air pollution,  being isolated from
   influx of air pollution from surrounding areas and having
  a relatively simple topography. Of many candidate cities,
  St. Louis,  Missouri, was chosen. The St. Louis region is a
   20- by  14-kilometer area defined in Universal Transverse
  Mercator (UTM) coordinates.

       The principal elements of RAPS are shown in
   Figure 1. As mentioned previously, the goal of RAPS is
  the development and evaluation of AQSM. Four func-
  tional modules comprise the sophisticated AQSM:

           An  emissions model  describing  the  location
           and temporal variations of emission rates of
           pollutant sources

          Atmospheric transport and diffusion

           Atmospheric chemical reactions of primary
          and secondary sources

          Removal processes such as rain, dry deposition
          or absorption by vegetation.

     The RAPS emission inventory  will encompass 12
 separate  source categories, the principal  ones being
 illustrated. The individual methodologies are being
 designed for collecting emission inventories on an hourly
 basis. The data handling system for the RAPS emission
 inventories is scheduled to become operational in
 April 1975.
      Paralleling the development of an emissions inven-
 tory is the establishment of a comprehensive aerometric
 data bank. The St. Louis Regional Air Monitoring Sys-
 tem  (RAMS) is  a network of 25 fixed meteorological
 and air quality data acquisition sites. A Digital
 Equipment Corporation (DEC) PDP-8 is used for onsite
 process control and as a communications interface to the
 central facility. A PDP-11/40 at the central facility pro-
 duces one 1200-foot  reel of tape per day on which are
 recorded  instrument calibration and  status information
 as well as minute values and hourly averages of the aero-
 metric data. The aerometric measurements that are  made
 are shown in Table I.

                       Table I
            RAMS Network Measurements

 Meteorology
   Wind speed                         WS
   Wind direction                     WD
   Turbulence                         [illegible]
   Temperature                        T
   Temperature gradient               [illegible]
   Pressure                           P
   Dew point                          DP
   Aerosol scattering [illegible]     [illegible]
   Global solar radiation
   [illegible] solar radiation
   Terrestrial longwave radiation

 Air Quality
   Sulfur dioxide                     SO2
   Total sulfur                       TS
   Hydrogen sulfide                   H2S
   Ozone                              O3
   Nitrous oxide                      NO
   Nitrogen dioxide                   NO2
   Methane                            CH4
   Total hydrocarbons                 THC
   Total suspended particulates       TSP
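
 The relation between the minute values and the hourly averages recorded on each daily tape can be pictured with a short sketch. It is a hypothetical Python illustration, not the RAMS software; in particular, the rule that at least 45 valid minute values are needed before an hourly average is reported is an assumption made here, not a RAPS specification.

```python
from statistics import mean
from typing import List, Optional


def hourly_average(minute_values: List[Optional[float]],
                   min_valid: int = 45) -> Optional[float]:
    """Average the valid minute values of one hour.

    None marks a missing or invalid reading. The min_valid threshold is a
    hypothetical completeness rule; below it no average is reported.
    """
    valid = [v for v in minute_values if v is not None]
    if len(valid) < min_valid:
        return None
    return mean(valid)


# Hypothetical hour of wind speed readings: 30 minutes at 1.0, 30 at 3.0.
wind_speed_hour = [1.0] * 30 + [3.0] * 30
average = hourly_average(wind_speed_hour)
```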


     Augmenting the RAMS network are seasonal field
expeditions where concentrated efforts are made in such
areas as plume studies, wind field analysis, gas chroma-
tography, elemental  analysis, radiation and energy bud-
get,  aerosol  programs,  photochemistry  and  temper-
ature-humidity effects on light scattering.

     In addition  to  the ground based RAMS network,
upper air measurements are routinely made.
Meteorological balloon releases, pibals, and rawinsondes yield
horizontal wind fields and vertical temperature profiles.
During  the field  experiments, helicopter flights spiral
down above certain  field sites  collecting a complex of
aerometric data similar to that  at the RAMS site below.
Thus, a three-dimensional data bank for model devel-
opment and evaluation is obtained.  RAPS will be pro-
viding the most comprehensive  emissions inventory ever
amassed. It will be collecting  the most comprehensive
and  best ambient  air quality data ever  obtained. The
meteorological conditions will be better described than
ever before for an urban area, and all this data will be
readily accessible.

     This  leads  us directly to data management, the
fourth element of RAPS. The 24 hours' worth of RAMS
network data which has been written onto magnetic tape
at the St.  Louis central facility is sent to the data man-
agement  unit  of the  Meteorology   Laboratory  at
Research Triangle Park.

     A quick look survey program checks data consis-
tency, reformats the  data for archiving, creates  mass
storage data files of hourly data for inclusion  into the
RAMS S2K data base and produces a one-page summary
printout. This summary information which can also be
stored in a report file on the Univac  1110 computer in-
cludes station status information, diurnal meteorological
conditions and key pollutant concentrations for selected
rural and urban sites. This information will be displayed
in the form of X-Y printer plots.

     Access to minute values of RAMS data will  be via
sequential tape files where data for about 10 days can be
stored on one  2400-foot reel at  1600 bytes per inch.
Access to hourly data will be  either via S2K immediate
access mode  for  those having formal S2K  training or
through  FORTRAN (or  COBOL)  programs.  For
browsing through the data base from a demand terminal,
the immediate  access mode would  be  used;  whereas,
FORTRAN programs, for instance, will interface the
data base to the graphics package.
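
     The tape figures quoted above can be checked with a little arithmetic. The sketch below (Python, illustrative only) ignores inter-record gaps and labels, so the capacity is an upper bound. Converting bytes to 36-bit words assumes the UNIVAC 1110 word length and a packing of 8-bit bytes with no waste, which the paper does not state.

```python
REEL_FEET = 2400         # reel length quoted in the text
BYTES_PER_INCH = 1600    # recording density quoted in the text
DAYS_PER_REEL = 10       # "about 10 days" per reel
WORD_BITS = 36           # UNIVAC 1110 word length (packing is an assumption)

# Upper bound on reel capacity, ignoring inter-record gaps.
raw_capacity_bytes = REEL_FEET * 12 * BYTES_PER_INCH

# Implied volume of minute data per day, in bytes and in 36-bit words.
bytes_per_day = raw_capacity_bytes / DAYS_PER_REEL
words_per_day = bytes_per_day * 8 / WORD_BITS
```

     Under these assumptions one reel holds at most about 46 million bytes, or roughly 4.6 million bytes (about one million 36-bit words) per day, which is of the same order as the total daily RAPS volume of close to one million words cited in the graphics discussion.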

     Data from the upper air network and from certain
field experiments, which  together with the RAMS data
compose  the RAPS data  bank, will  similarly be  acces-
sible via tape or S2K data base.

RAPS GRAPHIC SYSTEM

     The maximum daily volume of hourly RAMS data
is 12,000 words. The total RAPS volume per day is close
to one million words. When dealing with a scientific data
base of the size of RAPS, effective  computer  graphics
become  the  primary  mode  of  visual analysis  and
presentation of results.

     The principal hardware/software support for  graph-
ics  at Research Triangle Park is the UNIVAC 1110 facil-
ity  which will  support both off-line Calcomp  plotting
and interactive  graphics  terminals  such  as the Tek-
tronix 4010. The Meteorology  Laboratory has purchased
two 4010's and associated hard copiers. With the aid of
the Data Systems Division, the Laboratory has
eliminated almost all bugs in the UNIVAC 1110/Tektronix
interface. Transmission rates are now 300 baud but are
expected to go to 1200 baud when some UNISCOPE 100
ports are freed. The Meteorology Laboratory owns a
DEC PDP-11/40 disk operating minicomputer with a
refresh graphics terminal. Data transmission rates
for this cathode ray tube (CRT) are
9600 baud, thus permitting the possibility of movie gen-
eration at a later time.

     Two task orders entitled "Computer Graphics Plan-
ning  and  Support" and "Plan  for Interface between
Graphics Display and RAPS Data Base" have been com-
pleted. The implementation phase is now in progress.

     The structure  of the interface  between Data Base
and  Graphics Display is illustrated in Figure 2. The key
to the  system is the command program. The command
language will be  developed for both graphics and data
requests. The same command  language will be used on
all RAPS data and graphics systems. Consequently, if the
same set of commands are entered on  punched cards or
typed  into  a remote typewriter terminal, the results
should be the same.

     The description of the command language, the class
of commands to be developed, and  an example request
illustrating the hierarchical command structure are given
in Appendix A. These are from Dr. R. H. Allen's "RAPS
Data Base and Graphics Interface Plan."

     The RAPS Graphics Specifications are included in
Appendix B and are taken from  a Request for Proposal
(RFP) of Graphics System.


CURRENT STATUS

     The  RAMS network  is about  2  months from the
30-day  acceptance period.  The  upper  air network  is
scheduled  to become operational  November 1,  1974.
The  RAPS graphic system should be  developed by the
end  of the year.

APPENDIX A - THE COMMAND PROGRAM

User and Command Program Interface

     The command program is the FORTRAN main pro-
gram that communicates with the user. All user requests
are entered into the command program and all results
come back to the user through the main program. The
   main program has the following functions:

            Log user on

            Keep use statistics

            Prompt user

            Accept requests

            Give command error message

            Call  data,  graphics,  utility, and library
           subroutines

            Return results

           Generate displays

            Keep status.

       The standard interface  between the user and the
  command  program is the command language. The same
  command  language is used for all input devices and all
  output devices.  Of course some input  devices cannot
  generate some input requests and some output devices
  cannot display some results. For example in the card
  input batch  mode the  user  cannot take action on an
  unexpected response to a request. Also interactive graph-
  ics requests cannot take place when line printer sim-
  ulated plots are being generated. The command language
  is considered in the next section.

       The user  inputs are entered on the following media:

          Punched cards

          Typewriter keyboard

          Position input on a graphics terminal

          Digitizing tablet input.

      The  results   are  returned  to  the  user on  the
 following media:

          Line printer
          Typewriter
          Graphics terminal
          Electrostatic printer/plotter
          Calcomp Plots.
Command Language

     A common command language should be developed
for both graphics and data requests. The same command
language should be used on all RAPS data and graphics
systems so  that if the same set of commands are entered
on  punched cards or typed into a remote typewriter
terminal, the results should be the same. The format and
procedures  for  data or graphics requests should be the
same.

     The command  language should have the following
characteristics:

         Hierarchical

              Full instructions
              Prompting
              Short form

         Forgiving

             Checking

         Well-documented

             Users
             Programs
             Examples

         Common language

             All input devices
             All output devices

         Graphics language subset of data language

         Any command can be given at any time

         Always store status so that program or com-
         mand  can  be restarted (RSTR) in event of
         failure or interrupt.

    The classes of commands should include:

    1.   Log

        a.   On
        b.   Off
        c.   Terminal characteristics

    2.   Information
    3.   Control

         a.   Access rights
         b.   Mode

    4.   Status

         a.   Default

    5.   Menu

    6.   Graphic

    7.   Data

    8.   Reporting

    9.   Movies and videotape.

    The  data and the graphics commands should have
the following form:

            xxxx - command description.

The four-character mnemonic should be chosen to be
meaningful and unique. Typically the first four
characters of the command description can be used. Other
mnemonic formation rules include dropping one letter of
a double letter, dropping silent letters, and dropping
vowels.
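
    The mnemonic formation rules can be sketched as a fallback sequence that is tried until an unused mnemonic is found. The Python below is a hypothetical illustration, not part of the RAPS plan: the order in which the rules are tried is an assumption, the silent-letter rule is omitted because it needs a pronunciation table, and multi-word descriptions (as in CNVT) are not handled.

```python
import re
from typing import Iterator, Set

VOWELS = set("AEIOU")


def candidates(description: str) -> Iterator[str]:
    """Yield candidate mnemonics of up to four characters, in rule order."""
    word = re.sub(r"[^A-Z]", "", description.upper())
    if not word:
        return
    yield word[:4]                                 # first four characters
    collapsed = re.sub(r"(.)\1", r"\1", word)      # drop one of each double letter
    yield collapsed[:4]
    consonants = collapsed[0] + "".join(
        ch for ch in collapsed[1:] if ch not in VOWELS)  # drop vowels after first
    yield consonants[:4]


def make_mnemonic(description: str, taken: Set[str]) -> str:
    """Return the first candidate not already in use."""
    for candidate in candidates(description):
        if candidate not in taken:
            return candidate
    raise ValueError("no unique mnemonic found for " + description)


status = make_mnemonic("STATUS", set())
report = make_mnemonic("REPORTING", {"REPO"})
```

    With these rules STATUS yields STAT directly, while REPORTING falls through to the vowel-dropping rule and yields RPRT when REPO is already in use.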

    The  command format should be free form. Com-
mands and parameters should be separated  by spaces
and/or commas. The order of a command should  be:

    1.   Command
    2.   Input parameters
    3.   Output parameters.

    On interactive terminals, prompting characters
should be displayed when waiting for input:
REQ>. A command should be read in 80A1 format and
scanned in a conversion routine to avoid conversion
errors. Scan from the end of the line to find the last
character in the line. Convert numbers by comparison
with a list of 10 digits and special characters. Break the
buffer into parameters in the main program.
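
    The scanning procedure described above can be sketched as follows. The plan calls for a FORTRAN routine reading an 80A1 card image; this Python version is only an illustration of the same steps, with hypothetical function names, and it handles unsigned integers only (the special characters mentioned in the text are not treated).

```python
from typing import List, Optional, Tuple, Union

DIGITS = "0123456789"


def last_nonblank(card: str) -> int:
    """Scan from the end of the line; return the index of the last
    non-blank character, or -1 if the line is blank."""
    for i in range(len(card) - 1, -1, -1):
        if card[i] != " ":
            return i
    return -1


def to_number(token: str) -> Optional[int]:
    """Convert by comparing each character with the list of 10 digits.
    Return None if any character is not a digit."""
    if not token:
        return None
    value = 0
    for ch in token:
        d = DIGITS.find(ch)
        if d < 0:
            return None
        value = value * 10 + d
    return value


def parse_command(line: str) -> Tuple[Optional[str], List[Union[int, str]]]:
    """Split a free-form card image into a command and its parameters,
    which may be separated by spaces and/or commas."""
    card = line.ljust(80)[:80]           # fixed 80-column buffer
    end = last_nonblank(card)
    tokens, current = [], ""
    for ch in card[: end + 1]:
        if ch in " ,":
            if current:
                tokens.append(current)
                current = ""
        else:
            current += ch
    if current:
        tokens.append(current)
    if not tokens:
        return None, []
    params = []
    for token in tokens[1:]:
        number = to_number(token)
        params.append(token if number is None else number)
    return tokens[0], params


command, parameters = parse_command("CNVT SO2, 2  Y4 D144 H14 H17")
```

    Doing the conversion by table lookup, rather than by a formatted read, means a mistyped character produces a recoverable result instead of a run-time conversion error.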

    The  full instruction mode for an example  request
might take the following form:
THE CLASSES OF COMMANDS ARE:

    LOG -...

    INFO -...

    CONT -...

    STAT -...

    MENU

    GRAF - MENU OF GRAPHICS COMMANDS

    DATA

    RPRT -...

    MOVI

ENTER REQUEST>GRAF

    WFLD -...

    WSTA -...

    WROS -...

    CNVT - PLOT CONCENTRATION VERSUS TIME
    OF A POLLUTANT AT A STATION FOR A TIME
    RANGE

ENTER REQUEST>CNVT

THE POLLUTANTS ARE CO, CO2, SO2,...

ENTER REQUEST>SO2

THE STATIONS ARE NUMBERED 1 to 25. SEE MAP.

ENTER REQUEST>2

etc.

    The prompting form might take the following form:

-------
   CLASSES - LOG, ..., GRAF, ...

   REQ>GRAF

   GRAF COMMANDS - WFLD, WSTA, WROS, CNVT, ...

   REQ>CNVT

   POLLUTANTS - CO, CO2, SO2, ...

   REQ>SO2

   STATION 1 TO 25

   REQ>2

   START TIME Y D H

   REQ>Y4D144H14

   END TIME

   REQ>H17

       The short  form might  take  the following form:
   REQ>CNVT SO2 2 Y4 D144 H14 H17.

       The command plots the concentration versus time
   of SO2 from station 2 starting at 2 p.m. on the 144th
   day of 1974 and ending at 5 p.m. on the same day.
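
       How the time fields of the short form resolve to a start and end time can be sketched as below. This is a hypothetical Python illustration, not the RAPS command program: reading Y4 as 1974 assumes a fixed decade, and letting the end time inherit the year and day of the start time is an inference from the example. Only separated fields are handled, not the run-together form Y4D144H14.

```python
from datetime import datetime, timedelta
from typing import List, Optional


def parse_time_fields(fields: List[str],
                      default: Optional[datetime] = None,
                      decade: int = 1970) -> datetime:
    """Interpret Y (last digit of year), D (day of year) and H (hour) fields.
    Fields that are omitted are inherited from default, if one is given."""
    year = default.year if default else None
    day = default.timetuple().tm_yday if default else None
    hour = default.hour if default else 0
    for field in fields:
        key, value = field[0].upper(), int(field[1:])
        if key == "Y":
            year = decade + value
        elif key == "D":
            day = value
        elif key == "H":
            hour = value
        else:
            raise ValueError("unknown time field: " + field)
    if year is None or day is None:
        raise ValueError("year and day of year are required")
    return datetime(year, 1, 1) + timedelta(days=day - 1, hours=hour)


start = parse_time_fields(["Y4", "D144", "H14"])
end = parse_time_fields(["H17"], default=start)
```

       For the example request, day 144 of 1974 is May 24, so the plot would run from 14:00 to 17:00 on that date.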

   APPENDIX B - RAPS GRAPHICS SPECIFICATIONS

       As a supplement to these specifications, refer to the
  "EPA RAPS Graphics Plan," by Flow  Research  Inc.,
  written as part of a RAPS task order. This document
  shall be used as a guide in developing the plan specified
  in the Scope of Work and as  a reference in augmenting
  these specifications.

      The  contractor shall  develop, test, and install  a
  package which includes software  at the following levels:

       1. Command Program.-Contractor shall design
  and develop a command language and program
  compatible with the RAPS data management software and
  existing EPA facilities.

      2. Basic Routines.-Contractor  shall use existing
  Calcomp, Tektronix, and other software available within
  EPA where possible. As an example, a program  to dis-
  play  wind roses  on a Calcomp plotter  has been  de-
  veloped by the Meteorology Laboratory. This should be
 incorporated  into the package. All routines shall be
 FORTRAN callable and  be  cataloged in the command
 program.


      3.   System Modifications.-Contractor  shall
 identify system requirements and work with EPA com-
 puter specialists in resolving conflicts or in implementing
 and testing modifications to the data base software.
     Provision shall be made for displaying the graphics
 on a variety of existing and planned EPA computer pe-
 ripherals. The EPA Univac 1110 computer system is the
 focal point of all  RAPS data base activities. Supported
 graphics devices include a  Calcomp 900/1136 Plotting
 System (off-line), a Tektronix 4010 series interactive de-
 mand  terminal, and  standard  printers for  simulated
 plotting. These devices, because of their speed, are suited
 primarily to producing a limited number of high-quality
 picture  frames. They  will  be  used  to satisfy  over
 95 percent of the RAPS graphics requests. There are ap-
 plications requiring an extremely  large number of time
 sequenced frames,  namely  movie and videotape  pro-
 duction. In these instances, it will be necessary to trans-
 fer portions of the data base and graphics software to a
 computer system which supports faster graphical output
 devices.  A planned PDP-11/40 minicomputer facility
 with a high-speed interactive GT-40 CRT and
 electrostatic printer/plotter will be used for movie generation.
 Where  possible, graphics routines shall be compatible
 with both computers  and  all peripheral devices.  The
 command programs, however, shall be implemented on
 the Univac 1110 system only.

     The preceding paragraphs enumerate general spec-
 ifications  regarding the  type of software and facilities
 involved. Tables B-l and B-2 contain detailed break-outs
 of the  displays  to be developed  and their associated
 specifications.

     The graphics system shall be readily applicable to a
 wide variety of data. Foremost is the application to the
 RAPS. After accessing the data base, the user shall be
 able  to apply  the desired graphics program in a straight-
 forward  manner.   The contractor shall  design  the
 software so that  the graphics is semiautomatically scaled
 to the data to be displayed.
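
     One common way to scale a graph to its data is to round the data range outward to "nice" axis limits. The sketch below is a hypothetical Python illustration of that idea, not the method the specifications require; the 1-2-5 step rule and the default of ten intervals are assumptions.

```python
from math import ceil, floor, log10
from typing import Sequence, Tuple


def nice_step(span: float, max_intervals: int = 10) -> float:
    """Smallest step of the form 1, 2, 5 or 10 times a power of ten that
    covers span in at most max_intervals intervals."""
    raw = span / max_intervals
    magnitude = 10 ** floor(log10(raw))
    for multiple in (1, 2, 5, 10):
        if raw <= multiple * magnitude:
            return multiple * magnitude
    return 10 * magnitude


def axis_limits(values: Sequence[float],
                max_intervals: int = 10) -> Tuple[float, float, float]:
    """Return (axis minimum, axis maximum, tick step) enclosing the data."""
    low, high = min(values), max(values)
    if low == high:              # constant data: open a unit-wide window
        low, high = low - 0.5, high + 0.5
    step = nice_step(high - low, max_intervals)
    return floor(low / step) * step, ceil(high / step) * step, step


limits = axis_limits([3.37, 19.33, 47.15, 943.04])
```

     The user could still override the computed limits, which is one reading of "semiautomatically" in the specification.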

     During the  course  of this project, the contractor
shall frequently  consult with the EPA  Project Officer
and his designated representatives. These people will be
readily  available to  the contractor so that system incom-
patibilities can be resolved in an efficient manner.

-------
                                                        Table B-1
                                               Displays to Be Developed
                     Display Types
                      Examples
Titles

Text

Plots and Graphs

   Graph of a function of one variable, f(x)

   Family of curves of a function versus one variable at a list
      of values of a second variable, f(x,y)

   Multiple functions of one variable, f(x) and g(x)

   Contour plots, f 
-------
[Figure 1. Principal elements of RAPS. Diagram; box labels as
recovered: Data Management; Model Development and Evaluation;
Aerometric Measurements; Emissions Inventories; System Validation;
RAPS Data Bank; Data System Facilities; Chemical Transformation
Models; Meteorological Models; Air Quality Models; RAMS; Field
Expeditions; Airborne; Point Sources; Line Sources; Area Sources;
Airports.]

-------
                              Figure 2
                     RAPS Data Base and Graphics
                         System Interfaces

[Diagram; box labels as recovered: User; Computing Equipment;
Command Program; Computing System Library Routines; Command
Program Routines; Graphics; Models; Reporting; Analysis; Data
Management Routine; Data Base.]

-------
                                 CAPABILITIES OF TEKTRONIX SOFTWARE

                                             By David M. Cline
  INTRODUCTION

      Graphical display computer terminals have been in
  use for a  number of years. Until  recently, the display
  units were used essentially in  a refresh mode and were,
  by  necessity, connected directly to the computer that
  utilized them. With the advent of a low-cost graphic
  display storage unit, the use of graphics in a time sharing
  mode became very cost effective. The number of graphi-
  cal  display computer terminals that the Environmental
  Protection Agency (EPA) has acquired over the last two
  years indicates a commitment from the user community
  that graphical support is required.

     As early  as 1972, EPA  personnel were utilizing
  Tektronix 4010 graphical display computer terminals at
  the Boeing Computer Services (BCS) IBM  360/67 com-
  puter facility at  Wichita, Kansas,  and at  the National
  Institutes of Health (NIH) PDP-10 computer facility at
  Bethesda,  Maryland.* Subsequently,  the EPA contract
 with BCS was  terminated in the spring of 1973, and it
 was not until late July 1973 that graphical support was
 available at the Optimum  Systems Corporation (OSI)
 facility.  The  OSI graphic facilities were chosen because
 they are available to all EPA computer users. The
 PDP-8/E graphic  capabilities are discussed as they are
 available to a  number of EPA installations.

 GRAPHICS AT OSI

     Graphical  support at  OSI  is  available on  the
 IBM/360 model 155  and may be  accessed via  IBM's
 Time Sharing Option (TSO). The  OSI systems  group
 made modifications  to the  Telecommunications Access
 Method  (TCAM) that were necessary  for graphical sup-
 port. Southeast Environmental  Research  Laboratory
 (SERL) personnel installed the Terminal Control
 System, the Advanced Graphing II, and the Calcomp
 Preview packages. The subroutine TINPUT, which allows
 graphical input, was in error, as were several of the
 Advanced Graphing II subroutines; however, all were
corrected before being made available as public files.

TERMINAL CONTROL SYSTEM

     The Terminal Control System (TCS) package is the
most fundamental of the packages  and  is  used by  the
 other graphics packages. It is  written completely  in
 FORTRAN IV in  the  form of 59 subroutines and is
 described  in  detail   in  the  Tektronix  document,
 062-1474-00,  entitled,   "Terminal  Control  Sys-
 tem - 4010, User's Manual."  The subroutines are saved in
 a cataloged dataset, CNA324.DMC.TCS.

      It is assumed that  the user has a knowledge of TSO
 to the extent that a FORTRAN dataset may be created
 and  compiled. During  the  LINK or  LOAD step, the
 library of subroutines may be accessed by the following
 type of command:

      LINK your-dataset-name LIB('CNA324.DMC.TCS') FORTLIB

 To utilize the subroutines, the FORTRAN
 files FT05F001 and FT06F001 must be allocated to
 your terminal. This is accomplished by issuing the
 following commands to TSO:

      ALLOCATE DA(*) F(FT05F001)
      ALLOCATE DA(*) F(FT06F001)

 or by executing the following TSO CLIST:

      EXEC 'CNF324.DMC.FT5.CLIST' LIST (See Appendix.)

     A TSO CLIST dataset is available to perform the
 process of compiling, linking, and executing a
 FORTRAN dataset containing calls to the TCS package.
 It may be copied to your account by issuing the following
 TSO command:

     COPY 'CNF324.DMC.TCS.CLIST' TCS.CLIST
     (See Appendix.)

 The CLIST is used as follows:

     EXEC TCS 'dataset  name' LIST

where the dataset name is the name of the FORTRAN
dataset to be compiled, without the FORT extension. If
the dataset to be compiled and executed is DRAW.FORT,
issue the following command:

    EX TCS 'DRAW' LIST (See Appendix.)

*Mention of commercial products does not necessarily constitute endorsement by EPA.

-------
 ADVANCED GRAPHING II

     The Advanced Graphing II (AGII) package is a
 collection of FORTRAN subroutines that allow
 graphing and labeling of data with little
 knowledge of the TCS package. The subroutines and
 examples for using them are described in the Tektronix
 document 062-1530-00, entitled "Advanced Graphing
 II, User's Manual." The subroutines are saved in a
 cataloged dataset, CNA324.DMC.AGII.

     To use the AGII package, one must allocate the
 FORTRAN I/O files FT05F001 and FT06F001 as
 described above. A TSO CLIST is available to support the
 sequence of commands required to compile, link, and
 execute a FORTRAN dataset containing subroutine calls
 to the AGII package. It is used as follows:

     EXEC AGII 'dataset name' LIST (See Appendix.)

 CALCOMP PREVIEW PACKAGE

     The Calcomp Preview Package consists of a series of
 FORTRAN subroutines that are used in lieu of the
 standard Calcomp plotter subroutines to give the user the
 ability to preview the output of a Calcomp plotter on
 the Cathode Ray Tube (CRT). Thus, the user may view
 his proposed graph immediately with the interactive
 Tektronix software. This capability should prove to be
 of tremendous benefit to Calcomp plotter users in terms
 of decreased program development time. The subroutines,
 with examples for using them, are described in the
 Tektronix document 062-1526-00, entitled "Preview
 Routines for Calcomp Plotters, User's Manual."
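
 The idea of substituting one set of plotting routines for another can be illustrated outside FORTRAN. The Python sketch below is an analogy only: the user's program calls a routine named plot with Calcomp-style arguments (x, y, pen), and the back end it is given decides where the strokes go. The class and function names, the pen convention (2 = pen down, 3 = pen up), and the recorded stroke format are assumptions made for illustration; they are not the Tektronix or Calcomp interfaces.

```python
class PreviewBackend:
    """Hypothetical stand-in for preview routines: records visible strokes."""

    def __init__(self):
        self.position = (0.0, 0.0)
        self.strokes = []  # list of ((x0, y0), (x1, y1)) visible segments

    def plot(self, x, y, pen):
        # Assumed convention: pen == 2 draws to (x, y); pen == 3 only moves.
        if pen == 2:
            self.strokes.append((self.position, (x, y)))
        self.position = (x, y)


def draw_box(backend, size):
    """User program: written once, independent of the output device."""
    backend.plot(0.0, 0.0, 3)    # move to the origin with the pen up
    backend.plot(size, 0.0, 2)   # bottom edge
    backend.plot(size, size, 2)  # right edge
    backend.plot(0.0, size, 2)   # top edge
    backend.plot(0.0, 0.0, 2)    # left edge, back to the origin


if __name__ == "__main__":
    screen = PreviewBackend()
    draw_box(screen, 2.0)
    print(len(screen.strokes), "segments recorded for preview")
```

 Because draw_box only depends on the plot call, a different back end with the same method could be supplied without changing the user's program, which is the property the preview routines rely on.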
     As with the other Tektronix packages, the
 FORTRAN I/O files must be allocated before using the
 Calcomp Preview routines. A TSO CLIST is available
 which contains the sequence of commands required to
 compile, link, and execute a FORTRAN program
 containing subroutine calls to the Calcomp replacement
 routines.
     It may be copied by issuing the following TSO
 command:

    COPY  'CNF324.DMC.CAL.CLIST'  CAL.CLIST
    (See Appendix.)

Line 40 of this dataset must be modified to reflect the
name of the dataset containing the Calcomp subroutines
normally used for plotting. Line 40 of the dataset
CAL.CLIST is listed below. The name of the dataset that
must be replaced by the name of the user's dataset is
underscored:

    00040 LINK &NAME. LIB('CNF324.DMC.CALPREV.LOAD','CNF324.DMC.CALMIC.LOAD','CNA324.DMC.TCS') FORTLIB

    Then, to  execute a FORTRAN  program that  the
user may wish to preview, issue the following command:

    EXEC CAL 'dataset name' LIST (See Appendix.)
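
The three CLISTs differ mainly in the libraries named on the LINK command. As a rough illustration, the Python sketch below builds the LINK command text for each package from the dataset names quoted in this paper. The function name link_command and the table PACKAGE_LIBS are hypothetical helpers introduced here; they are not part of TSO or of the Tektronix software.

```python
# Libraries named on the LINK command for each package, as quoted in the text.
PACKAGE_LIBS = {
    "TCS": ["CNA324.DMC.TCS"],
    "AGII": ["CNA324.DMC.TCS", "CNA324.DMC.AGII"],
    "CAL": [
        "CNF324.DMC.CALPREV.LOAD",
        "CNF324.DMC.CALMIC.LOAD",
        "CNA324.DMC.TCS",
    ],
}


def link_command(dataset, package):
    """Return the text of a TSO LINK command for a compiled FORTRAN dataset."""
    libs = ",".join("'" + lib + "'" for lib in PACKAGE_LIBS[package])
    return "LINK " + dataset + " LIB(" + libs + ") FORTLIB"


if __name__ == "__main__":
    for package in ("TCS", "AGII", "CAL"):
        print(link_command("DRAW", package))
```

Note that every package lists the TCS library, which reflects the statement that TCS is used by the other graphics packages.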

GRAPHICS ON A PDP-8/E

    The PDP-8/E  minicomputer system that supports
the Aquatic Ecosystem Simulator (AEcoS) at the Southeast
Environmental Research Laboratory (SERL) is
equipped with a Tektronix 4010 computer display term-
inal, the manufacturer's  OS/8 software system, and a
FORTRAN IV software system. A Calcomp plotter for
the computer system has been ordered to complement
the graphics effort at SERL. Compiling the TCS subrou-
tines and writing two  machine dependent subroutines,
TINPUT and TOUTPUT, were all that was necessary to
implement the TCS package on the PDP-8/E computer.
Upon arrival of the Calcomp plotter and associated soft-
ware, it will be possible to preview Calcomp plots on the
4010 by simply linking to the Calcomp software replace-
ment  routines in much the same manner as on the OSI
facility.
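
The division of labor described here, portable drawing logic on top of two machine-dependent I/O routines, can be sketched as follows. The example is in Python rather than FORTRAN and is not the TCS source. It assumes the 4010 graphic-mode conventions as commonly documented: the GS control character (decimal 29) enters graph mode, US (decimal 31) returns to alpha mode, each 10-bit address is sent as four characters (high Y, low Y, high X, low X), and the first address after GS is an invisible move. The function names are hypothetical; the toutput argument only mirrors the role of TOUTPUT.

```python
GS = 29  # control character assumed to enter graph mode
US = 31  # control character assumed to return to alpha mode


def encode_point(x, y):
    """Encode one 10-bit (x, y) address as four graphic characters."""
    if not (0 <= x <= 1023 and 0 <= y <= 1023):
        raise ValueError("addresses run from 0 to 1023")
    return [
        32 + (y >> 5),  # high Y: upper five bits
        96 + (y & 31),  # low Y: lower five bits
        32 + (x >> 5),  # high X: upper five bits
        64 + (x & 31),  # low X: lower five bits
    ]


def draw_line(toutput, x0, y0, x1, y1):
    """Portable part: build the character codes and hand them to toutput."""
    codes = [GS] + encode_point(x0, y0) + encode_point(x1, y1) + [US]
    toutput(codes)


if __name__ == "__main__":
    # Machine-dependent part: here it simply collects the codes in a list.
    sent = []
    draw_line(sent.extend, 0, 0, 1000, 700)
    print(sent)
```

Only the function passed as toutput would change from one machine to another; the encoding logic stays the same, which is consistent with the small porting effort reported above.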

     The TCS and Calcomp Preview routines that execute
on the PDP-8/E can be easily exported to other
PDP-8/E's;  but  the  OS/8 and FORTRAN software
cannot, since they  were purchased on a licensing agree-
ment on a per machine basis. As the work load at SERL
permits, an  attempt will be  made to install the AGII
routines on the PDP-8/E.

-------
                          APPENDIX - LISTING OF TSO GRAPHICS CLISTS
  Listing of FT5.CLIST

     00010 ALLOCATE DA(*) F(FT05F001)
     00020 ALLOCATE DA(*) F(FT06F001)
     00030 END

  Invoking FT5.CLIST

     EXEC FT5 LIST
     ALLOCATE DA(*) F(FT05F001)
     ALLOCATE DA(*) F(FT06F001)
     END

  Listing of TCS.CLIST

     00010 PROC 1 NAME
     00020 FORT &NAME. NOPRINT
     00030 WHEN SYSRC(GE 8) END
     00040 SCRATCH &NAME..LOAD
     00050 LINK &NAME. LIB('CNA324.DMC.TCS') FORTLIB
     00060 WHEN SYSRC(GE 8) END
     00070 SCRATCH &NAME..OBJ
     00080 CALL &NAME.(TEMPNAME)
     00090 END

  Invoking TCS.CLIST

     EXEC TCS 'DRAW' LIST
     FORT DRAW NOPRINT
     G1 COMPILER ENTERED
     SOURCE ANALYZED
     PROGRAM NAME = MAIN
     *  NO DIAGNOSTICS GENERATED
     WHEN SYSRC(GE 8) END
     SCRATCH DRAW.LOAD
     LINK DRAW LIB('CNA324.DMC.TCS') FORTLIB

     WHEN SYSRC(GE 8) END
     SCRATCH DRAW.OBJ
     CALL DRAW(TEMPNAME)

  Listing of AGII.CLIST

     00010 PROC 1 NAME
     00020 FORT &NAME. NOPRINT
     00030 WHEN SYSRC(GE 8) END
     00040 SCRATCH &NAME..LOAD
     00050 LINK &NAME. LIB('CNA324.DMC.TCS','CNA324.DMC.AGII') FORTLIB
     00060 WHEN SYSRC(GE 8) END
     00070 SCRATCH &NAME..OBJ
     00080 CALL &NAME.(TEMPNAME)
     00090 END

  Invoking AGII.CLIST

     EXEC AGII 'WILL' LIST
     FORT WILL NOPRINT
     G1 COMPILER ENTERED
     SOURCE ANALYZED
     PROGRAM NAME = MAIN
        NO DIAGNOSTICS GENERATED
     WHEN SYSRC(GE 8) END
     SCRATCH WILL.LOAD
     DATA SET WILL.LOAD NOT IN CATALOG
     LINK WILL LIB('CNA324.DMC.TCS','CNA324.DMC.AGII') FORTLIB

     WHEN SYSRC(GE 8) END
     SCRATCH WILL.OBJ
     CALL WILL(TEMPNAME)
     END

  Listing of CAL.CLIST

     00010 PROC 1 NAME
     00020 FORT &NAME. NOPRINT
     00030 WHEN SYSRC(GE 8) END
     00040 SCRATCH &NAME..LOAD
     00050 LINK &NAME. LIB('CNF324.DMC.CALPREV.LOAD','CNF324.DMC.CALMIC.LOAD','CNA324.DMC.TCS') FORTLIB
     00060 WHEN SYSRC(GE 8) END
     00070 SCRATCH &NAME..OBJ
     00080 CALL &NAME..LOAD(TEMPNAME)
     00090 END

  Invoking CAL.CLIST

     EXEC CAL 'DEMO' LIST
     FORT DEMO NOPRINT
     G1 COMPILER ENTERED
     SOURCE ANALYZED
     PROGRAM NAME = MAIN
     001 DIAGNOSTICS GENERATED. HIGHEST SEVERITY CODE IS 4
     WHEN SYSRC(GE 8) END
     SCRATCH DEMO.LOAD
     LINK DEMO LIB('CNF324.DMC.CALPREV.LOAD','CNF324.DMC.CALMIC.LOAD','CNA324.DMC.TCS') FORTLIB

     WHEN SYSRC(GE 8) END
     SCRATCH DEMO.OBJ
     CALL DEMO.LOAD(TEMPNAME)

-------
                     AUTOMATED LABORATORY MANAGEMENT SYSTEM IN REGION V

                                            By David Rockwell
INTRODUCTION

     The Laboratory Data Management System (LDMS)
was developed to process data resulting from compliance
monitoring requirements of EPA legislation. Our objective
was to complete a compliance monitoring report on
each survey within 30-35 days after the completion of
the survey field study.

     On the average, each compliance monitoring study
takes three days. Data are collected from a specific
station for a 24-hour period. The types of samples are
"grab samples"; twelve grab samples are integrated over
a 24-hour period.

     In Region V these compliance monitoring studies
are performed by four district offices (DO) located in
different states. Each DO collects, analyzes, and
transmits water samples to the Central Regional Laboratory
(CRL) in Chicago.

 CHARACTERISTICS OF  LDMS

     This project provides a beginning stage in the quality
control of data going into the Storage and Retrieval
(STORET) system. A principal advantage of LDMS is the rapid
exchange of analytical data between the CRL and DO.
This capability is very useful in emergency situations. It
also allows ready reproduction of data values for reports
prior to storage of the data in STORET as well as a
simpler means for correcting data values. All data from a
survey are organized and available in one place at all
times during the study period. This data availability
assists laboratory managers, chemists, field engineers,
and ADP personnel in performing their jobs more effectively.
Quality control of data values is provided by
means of the SIDES and QUALITY CONTROL
programs. Better quality assurance of data handling by
knowledgeable DO and CRL personnel is possible, as
contrasted with the keypunch approach used previously.
With the LDMS, CRL has experienced fewer reanalysis
problems (30 to 40 percent fewer samples must be
analyzed) in studies because sample descriptions are now
always complete. The CRL can better schedule and plan
its incoming work load when studies are identified in
LDMS.

    The disadvantages of the system are the heavy
dependence on a central computing facility, the need to
train an uninformed user community in a new skill, the
use of computer terminals and computers for data entry,
and the strain on the organization to run two data
handling systems with the same personnel during
development and implementation of the LDMS.

     Common to all automated systems is the problem
of inputting the data. This problem has not been
satisfactorily solved in spite of the use of LOAD and GO
programs designed to reduce the need for advanced ADP
skills.

DATA FLOW

     In carrying out a survey, the DO engineer plans a study
to monitor a discharger. He selects station sites,
STORET water quality parameters, and CRL log
numbers prior to the survey. In LDMS, this information is
given to the DO ADP staff and is input into the
LABORATORY program.* Figure 1 shows the programs and
procedures which comprise LDMS; Figure 2 shows the
overall steps in the process of entering data.

     LABORATORY is run and produces LABEL. The
DO ADP staff enters the LABEL name in the DO
LABORATORY DATA CONTROL SYSTEM† (LDCS) data
set (see Figure 3). The important columns comprising
Figure 3 are numbered, and the columns can be
identified as follows:

    (1) LABEL data set name

    (2) Disk volume number on Optimum Systems
Incorporated where LABEL data set name is found

    (3) Compliance monitoring study area description

 *Programs LABORATORY, INTERFACE, STORET, and the file LABEL were created by Mr. Jon Abraytis, Senior Programmer, Region V,
 EPA Data Systems Branch, Management Division.

 †This LDCS was created by Richard Shekell, Indiana District Office, Region V, EPA.

-------
      (4) Date when DO completes their data entry

      (5) Date when CRL completes their data entry

      (6) Date when data are processed through SIDES

      (7) Date when data are stored in STORET water
   quality file

      (8) Date when storage of data in STORET is
   verified, after which LABEL and related data sets are
   scratched from OSI disks.

   This LDCS is used by the CRL to retrieve the various
   LABEL data sets indicated on Figure 3 in order to plan
   work schedules for anticipated, incoming field studies.
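
   The eight LDCS columns amount to a small status record for each study. The Python sketch below is one hypothetical way to model such a record and report how far a study has progressed. The field names, the use of None for a date not yet entered, the status strings, and the sample values are illustrative assumptions; they do not reproduce the actual LDCS layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LdcsEntry:
    label_dsn: str                        # (1) LABEL data set name
    volume: str                           # (2) disk volume number at OSI
    study_area: str                       # (3) study area description
    do_entry_done: Optional[str] = None   # (4) date DO completes data entry
    crl_entry_done: Optional[str] = None  # (5) date CRL completes data entry
    sides_run: Optional[str] = None       # (6) date processed through SIDES
    stored: Optional[str] = None          # (7) date stored in STORET
    verified: Optional[str] = None        # (8) date storage is verified


def status(entry):
    """Return the latest completed stage, checking the last stage first."""
    stages = [
        ("storage verified", entry.verified),
        ("stored in STORET", entry.stored),
        ("processed through SIDES", entry.sides_run),
        ("CRL entry complete", entry.crl_entry_done),
        ("DO entry complete", entry.do_entry_done),
    ]
    for name, date in stages:
        if date is not None:
            return name
    return "planned"


if __name__ == "__main__":
    # Made-up values, for illustration only.
    entry = LdcsEntry("LABEL01", "VOL001", "example study area",
                      do_entry_done="740904")
    print(status(entry))
```

   A record with no dates corresponds to a planned study, which is the case the CRL uses when scheduling anticipated work.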

      Each field study is run by the DO field engineer.
   The DO laboratory analysis of time-dependent parameters
   (e.g., BOD and phenols), together with field
   parameters (e.g., pH and temperature), are entered into
   LABEL by the DO ADP staff. Figure 4 shows the
   LABEL data set after the parameter values have been
   entered. The various parts of this data set are identified
   as follows:

     (1) Number of parameters in LABEL

     (2) Number of CRL log numbers assigned to study

     (3) STORET agency code of stations used

     (4) STORET station types code description

     (5) Expected study date

     (6) Expected sample arrival date at CRL

     (7) Expected analysis due date

     (8) Compliance monitoring study area description

     (9) CRL log numbers assigned to study

     (10) STORET station numbers

     (11) STORET sample time and type descriptions in
  STORET code

     (12) Sample site descriptions

     (13) STORET parameters grouped in sets of ten

     (14) DATA values from study analyses.


  LABEL can handle a maximum of 60 water quality
  parameters and 100 CRL log numbers. This data set uses
  a maximum of eight tracks of 3300 disk space. Region V
  LABEL data sets can usually be stored on two tracks.
  After the DO ADP staff has entered a date for
  completion of its work in the LDCS, the major responsibility
  for inputting data into LABEL is passed to the CRL
  ADP staff. The CRL analyzes the samples for metals
  (e.g., mercury and lead), organics (e.g., pesticides), and
  inorganics (e.g., nutrients). The results of these analyses
  can be entered on blank copies of the LABEL and given
  to CRL ADP staff for entry into LABEL. After entry is
  complete, the CRL ADP staff enters a date in the LDCS
  under the CRL data completion column.
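
  The stated LABEL limits and the grouping of parameters in sets of ten can be expressed compactly. The following Python sketch is a hypothetical illustration of those rules only; the function names and the made-up parameter codes are assumptions, and nothing here reflects the actual LABEL file format.

```python
MAX_PARAMETERS = 60    # maximum water quality parameters stated in the text
MAX_LOG_NUMBERS = 100  # maximum CRL log numbers stated in the text
GROUP_SIZE = 10        # parameters are grouped in sets of ten


def group_parameters(codes):
    """Split a list of parameter codes into consecutive sets of ten."""
    if len(codes) > MAX_PARAMETERS:
        raise ValueError("LABEL holds at most 60 parameters")
    return [codes[i:i + GROUP_SIZE] for i in range(0, len(codes), GROUP_SIZE)]


def check_log_numbers(log_numbers):
    """Raise an error if a study exceeds the CRL log number limit."""
    if len(log_numbers) > MAX_LOG_NUMBERS:
        raise ValueError("LABEL holds at most 100 CRL log numbers")
    return len(log_numbers)


if __name__ == "__main__":
    # Made-up five-digit codes, for illustration only.
    codes = ["%05d" % n for n in range(25)]
    for group in group_parameters(codes):
        print(len(group), "parameters in this set")
```

  With 25 parameters the sketch yields three sets, the last one partly filled; a full LABEL of 60 parameters would yield six sets.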


      Data in a complete LABEL data set are then ready
  for the editing and quality assurance programs SIDES
  and QUALITY CONTROL (QC)-SCREEN, which are
  executed consecutively. SIDES flags any format errors to
  achieve correct sample identification codes and formats
  (see Figure 2). The QC program eliminates physically
  impossible data, such as a dissolved value exceeding the
  total value for any parameter, and prevents their being
  stored into STORET. It flags improbable data for the CRL
  quality assurance officer and DO field engineer to
  investigate. Positive action must be taken to prevent
  storage of "improbable" data into STORET.
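
  The two kinds of QC outcome described above can be sketched in a few lines. The Python example below is a hypothetical illustration, not the QC program itself: the function names, the idea of a single acceptable range per parameter, and the numeric limits in the usage lines are assumptions. It does follow the text in two respects: a dissolved value exceeding the total value is treated as impossible and never stored, and improbable data are stored unless someone acts to block them.

```python
def qc_screen(dissolved, total, low, high):
    """Classify one dissolved/total pair for a single parameter.

    'impossible'  the dissolved value exceeds the total value
    'improbable'  the total lies outside the assumed range [low, high]
    'ok'          neither condition applies
    """
    if dissolved > total:
        return "impossible"
    if not (low <= total <= high):
        return "improbable"
    return "ok"


def may_store(result, blocked=False):
    """Decide whether a screened value goes on to storage."""
    if result == "impossible":
        return False        # never stored
    if result == "improbable":
        return not blocked  # stored unless positive action is taken
    return True


if __name__ == "__main__":
    # Made-up values and limits, for illustration only.
    result = qc_screen(dissolved=1.0, total=500.0, low=0.0, high=100.0)
    print(result, may_store(result), may_store(result, blocked=True))
```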


      The LDMS system is replacing a manual system for
  filling out the Region V data form. Page one of this form
  (Figure 5) contains study and sample identification
  information. Page two (Figure 6) is the first of 17 forms
  for data value entry.


 ACKNOWLEDGMENTS

      This report is based on the interim results of a
  development team effort. The active persons
  involved are too numerous to mention; however, the
  following contributed significantly to the present
  system: Mr. Jon Abraytis, Mr. Dave Barrow, Dr. Billy
  Fairless, Mr. Jim Gangler, Dr. Wayne Ott, and Mr.
  Richard Shekell.

-------
   START              USER COMMUNITY CONCEIVES STUDY
   PGM LABORATORY     CREATES LABEL FILE FOR INPUT OF LABORATORY DATA
   LABEL FILE         DATA VALUES ENTERED BY USER COMMUNITY
   PGM INTERFACE      REFORMATS LABEL AFTER ENTRY OF ALL DATA FOR USE BY SIDES
   TEST A, TEST B     INPUT FILES FOR SIDES
   PGM SIDES          PERFORMS EDIT CHECK ON DATA AND REFORMATS FOR INPUT
                      INTO STORET WATER QUALITY FILE
   (FILES)            INPUT FILES FOR QUALITY CONTROL
   QUALITY CONTROL    PERFORMS DATA CHECKS FOR RANGES OF ACCEPTABLE DATA,
                      FLAGGING IMPROBABLE DATA
   STORET             STORES DATA INTO STORET
   DATA IN STORET WATER QUALITY FILE

   LISTINGS PRODUCED: EDIT LISTING, ERROR LISTING

   *PROGRAMMED BY DAVE BARROW OF ATHENS, REGION IV, EPA
   **PROGRAMMED BY JIM GANGLER, VITRO LABS, INC., WASHINGTON, D.C.
   ***PROGRAMMED BY JON ABRAYTIS, CHICAGO REGION V, EPA
                                           Figure 1
                             Laboratory Data Management System
                                      Structure Diagram

-------
[Flow diagram, LABORATORY DATA MANAGEMENT SYSTEM: START; FILL LABEL DISK FILE WITH DATA IN PRESCRIBED FORMAT; SUBMIT EDIT RUN; CHECK EDIT LISTING; if ERRORS, CORRECT and RESUBMIT EDIT RUN; SUBMIT QC RUN; CHECK QC LISTING; if ERRORS, RESUBMIT QC RUN; SUBMIT RUN; CHECK RUN; if ERRORS, CORRECT and RESUBMIT S.S. RUN; STORET WATER QUALITY FILE.]

-------
[Printed listing of the LDCS data set with numbered columns; the body of the listing is not legible in this copy. Legible column headings include DSN TESTA, DSN TESTB, DATE ENTR, VOL, DATE STORED, and DATE VERIFY; legible entries include N.A., DONE, and SCRATCHED.]

-------
[Printed listing of a LABEL data set with numbered parts; the body of the listing is not legible in this copy. Legible items include sample site descriptions (COSHOCTON STP INFLUENT, COSHOCTON STP EFFLUENT) and parameter headings (SULFATE, FLUORIDE, ALUMINUM, IRON, LEAD, MERCURY) with units of MG/L and UG/L.]

-------
FY 1975 District Office ________

Sampling Date ________   Lab. Arrival Date ________   Analysis Due Date ________
Account No. ________     Study ________

Sample Description:
    River-021113?4     Mun-Trt-Eff-03244240     Ind-Trt-Eff-03444240
    Lake-02111432      Mun-Eff-Riv-02211204     Ind-Eff-Riv-02411204
                       Mun-Raw-03212240         Ind-Raw-Inf-03412240

Mark (A) for samples with concentrations above the maximum limits, mark ( ) for
concentrations below maximum, mark (D) for District Office analysis.

Column headings:
    CRL Sample Log Number (each row prefilled "75-")
    Agency No. (8 digits)
    Station No. (6 digits)
    Grab Sample Collection Date or Beginning Composite Time
    Depth
    Type of Composite:
        a = average      c = continuous
        h = maximum      g = grab
        l = minimum      nn = # samples
        blank = entire composite

CM. 8/?/74

-------
FY 1975  District Office ________
Sampling Date ________     Lab. Arrival Date ________
Account No. ________       Analysis Due Date ________
Study ________

Mark (A) for samples with concentrations above the maximum limits, mark ( ) for
concentrations below maximum, mark (D) for District Office analysis.

Row headings:
    CRL Sample Log Number (each row prefilled "75-")
    Sample Description
    Maximum Concentration Limits
    Units
    Totals

Column headings:
    STORET No.    Parameter            Units
    00010         Water Temp           C
    00020         Air Temp             C
    00300         Dissolved Oxygen     mg/l
    [illegible]   Field pH             pH Units
    50060         Residual Chlorine    mg/l
    50064         Free Chlorine        mg/l
    00058         Flow                 gal/min

-------
            AUTOMATIC DATA PROCESSING AND REGIONAL LABORATORY QUALITY ASSURANCE

                                              By Bill Fairless, Ph.D.
     The Central Regional Laboratory (CRL), located in Chicago,
 Illinois, and providing Federal laboratory support for the states
 of Ohio, Indiana, Illinois, Michigan, Wisconsin, and Minnesota,
 was formed in February of 1973 with eleven positions. During the
 past 20 months, the staff has been expanded to 37 permanent,
 temporary, and Intergovernmental Personnel Act employees.
 Figure 1 shows the current organizational chart. The first branch
 established in our expansion was the Quality Assurance Branch,
 and the first position filled was Chief of that Branch. A CRL
 quality assurance program was, therefore, implemented
 approximately one year ago and has evolved continually since
 that time. It has resulted in a substantial improvement in the
 accuracy of the analytical results reported.

     It is believed that the Chicago CRL organizational structure,
 which formally separates the quality assurance function from all
 other analytical functions, is a major strength. The CRL has a
 Quality Assurance Branch which reports directly to the director
 of the laboratory and therefore provides an independent
 evaluation of all sampling, analytical, and data reduction and
 reporting procedures. The branch reviews all data to be reported,
 has access to all bench sheets, instrument log books, quality
 assurance raw data and data summaries, and approves all
 analytical methods and laboratory techniques.

     Cooperative efforts between the Quality Assurance, Biology,
 and Chemistry Branches have resulted in development of the
 quality assurance data summary, which should interest an ADP
 group as a problem needing an automatic data processing
 solution. Page 1 of the summary is shown in Figure 2.

     The first item specifies a parameter and its STORET number.
 It was thought that an analytical method could be associated
 with each STORET number; but since that is not the case, item 2
 is included as a reference only. Details of each procedure are
 given later in the summary. Item 3 is included to differentiate
 between older, and presumably more reliable, analytical methods
 and the recently implemented procedures. Item 4 identifies the
 dates enclosing the time period of data collection for this
 summary, and item 5 lists all instruments used to collect the
 data, with the location of the instrument log book and the
 individual responsible for assuring the proper operation of that
 instrument. Item 6 describes the concentration range that would
 be used on a routine basis to report data resulting from the
 procedure referenced in item 2 and the other results included in
 this summary. Data points outside the working concentration
 range are either not reported, reported with a qualification (do
 not place in STORET; explain why), or the sample is reanalyzed
 after appropriate experimental modifications. Item 7 gives a
 numerical detection limit, which is defined on page 3 for each
 summary. Item 8 describes the control limits used to flag all
 suspect data, and item 9 presents the precision. Item 10 is
 included only after analysis of reference samples that have been
 prepared by someone outside the administrative group completing
 the summary. Most often these samples are obtained from the
 Methods Development and Quality Assurance Research Laboratory
 (MDQARL) or Research Triangle Park (RTP). Items 11 and 12 are
 self-explanatory.
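
     The screening implied by items 6 and 7 can be sketched in a
 few lines of Python. This is only an illustration: the class and
 function names are hypothetical, the separate "below detection
 limit" label is an assumption, and the numbers are those given
 for cadmium in Figure 5. The action taken on a flagged value
 (withhold, qualify, or reanalyze) remains the analyst's decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MethodSummary:
    """Subset of page 1 of a data summary (hypothetical field names)."""
    parameter: str
    storet_no: str
    detection_limit: float  # item 7
    range_low: float        # item 6, lower end of working range
    range_high: float       # item 6, upper end of working range
    units: str


def screen_result(value: float, method: MethodSummary) -> str:
    """Return a screening label for one measured concentration."""
    if value < method.detection_limit:
        return "below detection limit"
    if method.range_low <= value <= method.range_high:
        return "within working range"
    # Outside the working range: not reported, reported with a
    # qualification, or reanalyzed, per the text above.
    return "outside working range"


# Values transcribed from the cadmium summary (Figure 5).
cadmium = MethodSummary("tot. Cd", "01027", 5.0, 10.0, 500.0, "ug/l")

for value in (3.0, 7.0, 120.0, 650.0):
    print(value, cadmium.units, "->", screen_result(value, cadmium))
```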

     Page 2 of the summary (Figure 3) is a tabular summary of the
 data used to arrive at answers for the items described
 previously and listed on page 1 of the form. The usual
 statistical parameters are listed on the left, and the chemist
 is required to fill in the starred heading at the top of each
 column. A slightly different form (Figure 4) is used for
 duplicates.
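
     As an illustration of the arithmetic behind one column of
 page 2, the following Python sketch computes the listed
 statistics from a set of determinations of a control standard.
 The function and field names are hypothetical and the input
 values are invented for the example. Two points are assumptions
 rather than statements of CRL practice: the variance uses the
 n - 1 denominator, and the 95% confidence range is taken as the
 mean plus or minus two standard deviations, following the "2
 standard deviations" entry on the example summaries.

```python
import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnSummary:
    n: int
    mean: float
    bias: float
    variance: float
    std_dev: float
    conf_low: float
    conf_high: float
    rel_std_dev_pct: float


def summarize(determinations, true_value, k=2.0):
    """Summarize single determinations of a standard of known value."""
    mean = statistics.mean(determinations)
    variance = statistics.variance(determinations)  # n - 1 denominator (assumed)
    std_dev = variance ** 0.5
    return ColumnSummary(
        n=len(determinations),
        mean=mean,
        bias=mean - true_value,            # mean found minus true value
        variance=variance,
        std_dev=std_dev,
        conf_low=mean - k * std_dev,       # k = 2 is an assumed convention
        conf_high=mean + k * std_dev,
        rel_std_dev_pct=100.0 * std_dev / mean,
    )


# Invented readings of a 1.00 mg/l control standard.
example = summarize([0.98, 1.02, 1.00, 1.04, 1.01], true_value=1.00)
print(example)
```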

     A quality assurance data summary is prepared for each unique
 analytical measurement being made at the CRL for which
 sufficient data are available. Figure 5 is a typical example
 describing the analysis of cadmium by flame atomic absorption
 spectroscopy. Figure 6 shows the supporting data for cadmium,
 and Figure 7 is an example of the terms that should be defined
 in order to use the numerical data presented in Figures 5 and 6.

     Figures 8, 9, and 10 show data for a "high level"
 nitrate-nitrite analysis, while Figures 11, 12, and 13 pertain
 to a "low level" procedure. The CRL staff has also designed
 equipment and procedures to analyze for even lower
 concentrations of nitrate in the samples collected from the
 Great Lakes.

     The staff performs a considerable number of arithmetical
 calculations on finished data; still more calculations are
 required for raw data. There has not been sufficient time to
 summarize such data as absorbance readings, extinction
 coefficients, instrument gain settings, and so forth, even
 though these variables are frequently used by CRL analysts to
 evaluate quality assurance data.

       Hopefully   these  examples  clearly  illustrate  the
   kinds  of internal quality assurance data that are  now
   collected and evaluated by hand.  The data management
   file of the  CRL minicomputer system will include pro-
   grams  and data  files to accomplish all of the above auto-
   matically. The system  is described in greater detail by
   Roman Bystroff in another paper of these Proceedings.
   It is expected that once these  programs are operational,
   better  estimates  of data quality will be provided. Such
 analyses will be made available quickly to other regional
 laboratories.
     These kinds of efforts by the regional laboratories
place increasing responsibilities on the managers of the
national data bases to ensure the quality of all the data
being placed in each data base. It is questionable that a
continued  effort  to  produce  more  accurate  measure-
ments is valuable, if those measurements are only to be
lost  in a data base containing values of all qualities. In
fact,  it is  predictable that many users will turn away
from the existing national data storage bases unless the
quality or integrity of the data is assured.

-------
[Organizational chart. Boxes shown: DIRECTOR; DEPUTY; QUALITY ASSURANCE; BIOLOGY; CHEMISTRY; ORGANIC; METALS; INORGANIC. The connecting lines are not recoverable.]

         Figure 1
       CRL Organizational Chart

-------
                                             REGION V                                Page 1
                               U.S. ENVIRONMENTAL PROTECTION AGENCY
                                   CENTRAL REGIONAL LABORATORY
                                 QUALITY ASSURANCE DATA SUMMARY
                 1.  Parameter	STORET No.
                 2.  Procedure used  	
                                                  Attachment Included /_/ Yes  /_/ No
                 3.   Procedure has been in use since
                 4.   Data included in this  summary was  collected from	to
                 5.   Instruments  used       Location  of instrument  log book  & individual
                     to collect data         responsible for  instrument maintenance
                     a.
                     b.
                     c.
                 6.   Concentration range reported	to	
                                                                                  units
                 7.   Detection Limit 	
                                                 units
                 8.   Control Limits                              Percent of data outside
                                                                control limits
                    a.
                    b.
                    c.
                    d.
                    e.
                    f.
                9.   The precision is	at a concentration of
                    using 	standard deviations  as  an estimate of precision.
               10.   Bias  	
               11.   Significant figures  reported including  correct  STORET units 	
               12.   Signature	  Date	
                    Immediate Supervisor	            Date
                                                 Figure 2
                                       Quality Assurance Data Summary

-------
                                                                         Page 2
                       STATISTICAL SUMMARY OF
                       QUALITY ASSURANCE DATA

                                  * ______    * ______    * ______    * ______
 1. No. Samples
 2. No. Determn/Sple.
 3. True Values
 4. Mean Value Determn.
 5. Bias
 6. Variance
 7. Average of differences
 8. Std. Deviation of
    differences
 9. Est. Std. Deviation
    of Values
10. 95% Confidence Range
11. Relative Std. Dev.
12. 95% Rel. Std. Dev.
    Range
13. Data Collected from
    to

(a)   *Blank - All  reagents except sample

(b)   Reference standard - An unknown supplied by an external audit agency
                          such as MDQARL

(c)   Control  standard   - Sample of known concentration prepared by CRL
                          staff.

(d)   Calibration standard -  Sample used to adjust instruments.

(e)   Spike  -  A measured amount of material added to a sample prior to
               the analysis.
(f)   Other
                                    Figure 3
                              Statistical Summary of
                              Quality Assurance Data

-------
                                                                                       Page 2A
                                      STATISTICAL SUMMARY OF
                                      QUALITY ASSURANCE DATA
                                   FOR DUPLICATE DETERMINATIONS

1. No. Samples
2. No. Determn/Sple.
3. True Values
4. Mean Value Determn.
5. Bias
6. Variance
7. Average of differences
8. Std. Deviation of
differences
9. Est. Std. Deviation
of Values
10. 95% Confidence Range
11. Relative Std. Dev.
12. 95% Rel. Std. Dev.
Range
13. Data Collected from
to

             *Blank - All reagents  except  sample

              Standard (a) reference - An  unknown supplied by an external audit agency
                                       such  as  MDQARL

                       (b) control  -  Sample of known concentration prepared by CRL
                                       staff.

                       (c) calibration - Sample used  to adjust instruments.

             Spike - Amount of material added to a  sample.

             Other       	
                                                 Figure 4
                                   Statistical Summary of Quality Assurance
                                       Data for Duplicate Determination
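
The duplicate rows of this form (average of differences, standard deviation of differences, estimated standard deviation of values) can be illustrated with a short Python sketch. The formulas are assumptions, since the form does not define them: differences are taken as signed (first minus second determination), and the standard deviation of a single value is estimated with the common duplicate-pair formula sqrt(sum(d^2) / (2n)). The function name and the input pairs are invented for the example.

```python
import math
import statistics


def summarize_duplicates(pairs):
    """Summarize duplicate determinations given as (first, second) pairs."""
    diffs = [first - second for first, second in pairs]
    n = len(diffs)
    return {
        "n_samples": n,
        "mean_difference": statistics.mean(diffs),
        "std_dev_of_differences": statistics.stdev(diffs),
        # Assumed estimator for the standard deviation of one value.
        "est_std_dev_of_values": math.sqrt(sum(d * d for d in diffs) / (2 * n)),
    }


# Invented duplicate results, in mg/l.
example_pairs = [(1.00, 1.02), (2.00, 1.98), (3.01, 3.00)]
print(summarize_duplicates(example_pairs))
```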

-------
                         REGION V                                 Page 1
            U.S. ENVIRONMENTAL PROTECTION AGENCY
               CENTRAL REGIONAL LABORATORY
              QUALITY ASSURANCE DATA SUMMARY

 1.  Parameter  tot. Cd          STORET No.  01027
 2.  Procedure used  Acid digestion and flame atomic absorption.
 3.  Procedure has been in use since  February 1973
 4.  Data included in this summary was collected from  7/73  to  5/74
 5.  All instruments used        Location of instrument
     to collect data             Log book & responsible individual
     a.  PE 306                  R. Whitworth/Metals
     b.  IL 453                  E. King/Metals
     c.
     d.
     e.
 6.  Working concentration range  10 ug/l  to  500 ug/l
 7.  Detection Limit  5 ug/l
 8.  Control Limits                          Percent of data outside
                                             control limits
     a.
     b.  91-109% Spike recovery
     c.  <10% Sample conc. in blank or
     d.  <10 ppb Cd
 9.  The precision is  9 ug/l  at a concentration of  100 ug/l
     using  2  standard deviations as the measure.
10.  Bias ________
11.  Units including correct No. Significant Figures  2 or 3, ug/l
12.  Signature ________  Date ________
     Immediate Supervisor  Ed Huff   Date  9/20/74
                                  Figure 5
                   Quality Assurance Data Summary - Analysis of
                Cadmium by Flame Atomic Absorption Spectroscopy

-------
                                                                                       Page 2
                                    STATISTICAL  SUMMARY OF
                                    QUALITY ASSURANCE DATA
                                      FOR UNIVARIABLES
                                                                           Cd

                                  *            *            *            *
                                Spike       Reference    Reference
 1. No. Samples                  111            1            1
 2. No. Det/Sample                 1            1            1
 3. True Value                  100%         16 ug/l      73 ug/l
 4. Mean Value of
    Determinations               99%         13 ug/l      68 ug/l
 5. Bias                         -1%         -3 ug/l      -5 ug/l
 6. Variance                    19.4
 7. Standard Deviation           4.4
 8. 95% Confidence Range       91-109
 9. Relative Std. Deviation      4.4
10. 95% Rel. Confidence Range  91-109
11. Data Collected from ... to  7/73 - 5/74

                *Blank - All reagents except sample
                *Standard (a) reference - An unknown supplied by an external audit agency
                                          such as MDQARL.
                          (b) control  -  Sample of known concentration prepared by CRL
                                          staff.
                          (c) calibration - Sample used to adjust instruments.
                *Spike - Amount of material added to a sample.
                *Other
                                                 Figure 6
                                        Supporting Data for Cadmium

-------
                 DEFINITIONS AND DETAILED DESCRIPTIONS
                    OF TERMS USED ON PREVIOUS PAGES
 Include the following:

      Test procedures, working concentration  range,  detection limit, control
      limits, precision, accuracy or bias, actions usually taken when data  is
      outside control limits.

 1.   The test procedure is the standard EPA approved procedure given in  "Methods
      for Chemical Analysis of Water and Wastes,"   U.S.  EPA,  1971,  pages 83-120.

 2.   The working standards used for instrument calibration are obtained  from
     commercial sources and diluted to the desired concentration with distilled-
     deionized water.

 3.   The working concentration range is established  by  the calibration standards.
     It begins at a concentration above the  detection limit  and  covers the
     remaining linear or near linear portion  of the  absorbance vs  concentration
     curve.

 4.   The detection limit is defined as that  concentration of metal that  gives a
      signal peak twice as large as the baseline noise.   It is evaluated  from
      standards that are approximately ten times the  detection limit using the
      equation:

      \[ \text{Detection Limit} = 2 \times (\text{standard deviation of baseline noise})
         \times \frac{\text{concentration}}{\text{instrument readout}} \]

 5.   Accuracy is estimated from the long term percent recoveries of spikes added
      to different kinds of samples as follows:

      \[ \%\ \text{Recovery} = \frac{[\text{Sample} + \text{Spike}] - [\text{Sample}]}{[\text{Spike}]} \times 100 \]

      If the spike concentration is less than one half the sample concentration
      the results are not used for control purposes.

 6.   The apparent or real concentrations of reagent blanks are measured and sub-
      tracted from the respective samples.  If a blank is contaminated so that it
      contains more than 10% of the sample concentration, the analyses are considered
      to be out of control.

 7.  Significant figures are determined by the bench chemist and are based on
     the day to day performance of the laboratory equipment.

 8.  All analyses that do not fall within the control limits are  investigated
      on an  individual basis.  Usually  the entire  series of samples is reanalyzed.
      In all cases the action taken is  recorded in the bench  log book.  All
      reported data are supported by Quality Assurance results that fall within the
      defined control limits or that have been approved by the CRL Quality Assurance
      Branch.

                                     Figure 7
                               Definitions and Detailed
                                Description of Terms
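
The calculations defined in items 4, 5, and 6 of Figure 7 can be sketched in Python as follows. The function names and the sample numbers are hypothetical and chosen only to make the arithmetic easy to follow; the formulas and the two screening rules (spike at least one half of the sample concentration, blank no more than 10% of the sample concentration) are the ones stated in the figure.

```python
def detection_limit(noise_std_dev, standard_conc, standard_readout):
    """Item 4: twice the baseline-noise standard deviation, converted
    to concentration units with the response of a known standard."""
    return 2.0 * noise_std_dev * (standard_conc / standard_readout)


def percent_recovery(spiked_result, unspiked_result, spike_conc):
    """Item 5: ([sample + spike] - [sample]) / [spike] x 100."""
    return 100.0 * (spiked_result - unspiked_result) / spike_conc


def spike_usable_for_control(spike_conc, sample_conc):
    """Item 5: spikes below half the sample concentration are not used."""
    return spike_conc >= 0.5 * sample_conc


def blank_out_of_control(blank_conc, sample_conc):
    """Item 6: a blank above 10% of the sample concentration is out of control."""
    return blank_conc > 0.10 * sample_conc


# Invented numbers: a 50 ug/l standard reads 10 units; noise s.d. is 0.5 units.
print("detection limit, ug/l:", detection_limit(0.5, 50.0, 10.0))
# Invented numbers: 50 ug/l sample, 100 ug/l spike, 148 ug/l found after spiking.
print("recovery, %:", percent_recovery(148.0, 50.0, 100.0))
print("spike usable:", spike_usable_for_control(100.0, 50.0))
print("blank out of control:", blank_out_of_control(6.0, 50.0))
```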

-------
                                       REGION V
                        U.S. ENVIRONMENTAL PROTECTION AGENCY
                            CENTRAL REGIONAL LABORATORY

                           QUALITY ASSURANCE DATA SUMMARY
                                                                          Page  1
 1.  Parameter  (NO2+NO3)-N, high level     STORET No.  00630
 2.  Procedure used  Technicon Industrial Method 158-71
     modified to account for H2SO4 preservation
 3.  Procedure has been in use since  7/1/73
 4.  Data included in this summary was collected from  7/1/73  to  8/12/74
 5.  All instruments used        Location of instrument
     to collect data             Log book & responsible individual
     a.  Technicon AA II         Nutrient Lab. - A. Jirka
     b.  Fisher Diluter             "       "
     c.  Balance #104331         Nutrient Lab. - M. Carter
     d.
     e.
 6.  Working concentration range  0.03  to  5.00
 7.  Detection Limit  0.03
 8.  Control Limits  3 std. dev.             Percent of data outside
                                             control limits
     a.  Blk:  0.00 - 0.04                      3%
     b.  1.00: 0.93 - 1.11                     12%
     c.  3.00: 2.85 - 3.09                      9%
     d.  5.00: 4.84 - 5.14                      7%
         recovery: 90-108%
 9.  The precision is  +/-0.12  at a concentration of  3.00 mg/l
     using ____ standard deviations as the measure.
10.  Bias  -0.01 average
11.  Units including correct No. Significant Figures  X.XX mg/l
12.  Signature  Andrea Jirka      Date  9/4/74
     Immediate Supervisor  Mark J. Carter   Date  9/18/74
                                              Figure 8
                              Data Summary, High-Level Nitrate-Nitrite Analysis

-------
                   STATISTICAL SUMMARY OF
                   QUALITY ASSURANCE DATA
                      FOR UNIVARIABLES
                                                                       Page 2

                               * Control     * C. Std.    * C. Std.    * C. Std.
                               Std. Blank    1.00 mg/l    3.00 mg/l    5.00 mg/l
 1. No. Samples                   58            75           40           57
 2. No. Det/Sample                 1             1            1            1
 3. True Value                   0.00          1.00         3.00         5.00
 4. Mean Value of
    Determinations               0.01          1.02         2.97         4.98
 5. Bias                        +0.01         +0.02        -0.03        -0.02
 6. Variance                   0.00015       0.00096       0.0037       0.0051
 7. Standard Deviation          0.012         0.031        0.061        0.071
 8. 95% Confidence Range       0-0.003      0.95-1.08    2.85-3.09    4.84-5.12
 9. Relative Std. Deviation       --           3.1%         3.0%         1.4%
10. 95% Rel. Confidence Range     --         95-108%      95-103%      97-102%
11. Data Collected from        7/1/73 to    7/1/73 to    7/1/73 to    7/1/74 to
    to                         8/13/74      8/13/74      8/13/74      8/13/74
 *Blank - All reagents  except  sample
  Standard  (a) reference  -  An  unknown supplied by an external audit agency
                            such as MDQARL.
            (b) control  -  Sample of known  concentration prepared by CRL
                            staff.
            (c) calibration  - Sample used to adjust Instruments.
* Spike  - Amount  of material added to a sample.
* Other
                                   Figure 9
                            Data Summary, High-Level
                             Nitrate-Nitrite Analysis
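
As a worked check on the table above, the short Python sketch below recomputes the standard deviation from the tabulated variance and the 95% confidence range as the mean plus or minus two standard deviations, for the 3.00 mg/l and 5.00 mg/l control standards. The two-standard-deviation convention is an assumption inferred from the "2 standard deviations" entry on the other example summaries; under it, the recomputed ranges agree with the printed entries for these two standards. The function name is hypothetical.

```python
import math


def confidence_range(mean, std_dev, k=2.0, ndigits=2):
    """Mean +/- k standard deviations, rounded for reporting."""
    return (round(mean - k * std_dev, ndigits),
            round(mean + k * std_dev, ndigits))


# (label, mean, variance, standard deviation) transcribed from Figure 9.
rows = [
    ("C. Std. 3.00 mg/l", 2.97, 0.0037, 0.061),
    ("C. Std. 5.00 mg/l", 4.98, 0.0051, 0.071),
]

for label, mean, variance, std_dev in rows:
    recomputed_sd = round(math.sqrt(variance), 3)
    low, high = confidence_range(mean, std_dev)
    print(f"{label}: s.d. from variance = {recomputed_sd}, "
          f"95% range = {low}-{high}")
```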

-------
                          STATISTICAL SUMMARY OF
                          QUALITY ASSURANCE DATA
                             FOR UNIVARIABLES
                                                                       Page 2

                               * C. Std.     *            *            *
                                 2.5         Spikes       Std. Cal.    Duplicates
 1. No. Samples                   25           47           26           22
 2. No. Det/Sample                 1            1            1            2
 3. True Value                    2.5         100%
 4. Mean Value of
    Determinations               2.48          99%          3.40
 5. Bias                        -0.02          -1%
 6. Variance                    0.0014         9.9%
 7. Standard Deviation          0.038
 8. 95% Confidence Range     2.40 to 2.56       --
 9. Relative Std. Deviation      1.5%          3.1%
10. 95% Rel. Confidence Range  96 to 102%    93 - 105%
11. Data Collected from         7/1/73        7/1/73
    to                          1/11/74       8/13/74

    Duplicates: Ave. of difference is 0.01 and Std. dev. is 0.016.
    The remaining Std. Cal. entries are not legible.

       *Blank - All  reagents except sample
        Standard  (a)  reference  - An unknown supplied by an external audit agency
                                 such as MDQARL.
                  (b)  control  - Sample of known concentration prepared by CRL
                                 staff.
                  (c)  calibration -  Sample used to adjust instruments.
     * Spike - Amount of material added to a sample.
     * Other
                                          Figure 10
                                    Data Summary, High-Level
                                     Nitrate-Nitrite Analysis

-------
                          REGION  V                                      Page 1
            U.S.  ENVIRONMENTAL  PROTECTION AGENCY
                CENTRAL  REGIONAL  LABORATORY
              QUALITY ASSURANCE DATA SUMMARY
                         (low level)
 1.  Parameter  (NO2 + NO3)-N     STORET No.  00630
 2.  Procedure used  Technicon Industrial Method No. 158-71W with or without
     modification to account for H2SO4 preservation.
 3.  Procedure has been in use since  7/1/74
 4.  Data included in this summary was collected from  7/1/74  to  8/13/74
 5.  All instruments used        Location of instrument
     to collect data             Log book & responsible individual
     a.  Technicon AA-II         Nutrient Lab. - A. Jirka
     b.  Balance #104331         Nutrient Lab. - M. Carter
     c.
     d.
     e.
 6.  Working concentration range  0.002  to  1.000
 7.  Detection Limit  0.002
 8.  Control Limits  3 std. dev.             Percent of data outside
                                             control limits
     a.  Blk: 0.000 - 0.003                     0
     b.  0.2: 0.195 - 0.205
     c.  1.0: 0.979 - 1.013
     d.  % rec: 88% - [illegible]              10%
 9.  The precision is  +/-0.003  at a concentration of  0.200
     using  2  standard deviations as the measure.
10.  Bias  -0.002 average
11.  Units including correct No. Significant Figures  X.XXX mg/l
12.  Signature  Andrea Jirka      Date  9/5/74
     Immediate Supervisor  Mark J. Carter   Date  9/18/74
                                    Figure 11
                             Data Summary, Low-Level
                              Nitrate-Nitrite Analysis

-------
                            STATISTICAL SUMMARY OF
                            QUALITY ASSURANCE DATA
                               FOR UNIVARIABLES
                                                                       Page 2

                               * C            * C            * C            * C
                               Blk            0.200          0.500          0.600
 1. No. Samples                  18             13              6              8
 2. No. Det/Sample                1              1              1              1
 3. True Value                  0.000          0.200          0.500          0.600
 4. Mean Value of
    Determinations              0.001          0.200          0.500          0.593
 5. Bias                        0.001          0.000          0.000         -0.007
 6. Variance                 0.00000047      0.0000026      0.0000002      0.000038
 7. Standard Deviation         0.00068         0.0016        0.00048         0.0062
 8. 95% Confidence Range     0.000-0.002    0.197-0.203    0.499-0.500    0.581-0.605
 9. Relative Std. Deviation       --            0.8%           0.1%            1%
10. 95% Rel. Confidence Range     --        98.5-101.5%    99.8-100.2%     97 - 101%
11. Data Collected from         7/1/74         7/1/74         7/1/74         7/1/74
    to                          9/5/74         9/5/74         9/5/74         9/5/74
        *Blank - All reagents  except  sample

         Standard (a) reference - An  unknown supplied by an external audit agency
                                  such as MDQARL.

                  (b) control  -  Sample  of known concentration prepared by CRL
                                  staff.

                  (c) calibration - Sample used to adjust instruments.

       *Spike - Amount of material added  to a sample.

       *Other
                                           Figure 12
                                    Data Summary, Low-Level
                                      Nitrate-Nitrite Analysis

-------
                       STATISTICAL SUMMARY OF
                       QUALITY ASSURANCE DATA
                    FOR DUPLICATE DETERMINATIONS
                                                                       Page 2A

                               1.000        Std. Cal.     % Rec.      Duplicates
 1. No. Samples                  10             2            8            31
 2. No. Determn/Sple.             1             1            1             2
 3. True Values                 1.000          --          100%
 4. Mean Value Determn.         0.996         3.20         100%
 5. Bias                       -0.004                        0
 6. Variance                  0.000031                      15
 7. Average of differences       --                         --           0.002
 8. Std. Deviation of
    differences                                             --           0.0026
 9. Est. Std. Deviation
    of Values                   0.0056
10. 95% Confidence Range     0.985-1.007                    --
11. Relative Std. Dev.           0.6%                       3.9%
12. 95% Rel. Std. Dev.
    Range                      98-101%                    92-108%
13. Data Collected from         7/1/74                     7/1/74
    to                          9/5/74                     9/5/74

*Blank - All reagents except sample
 Standard (a) reference - An unknown  supplied  by an external  audit agency
                          such as MDQARL
          (b) control  -  Sample of known concentration prepared by CRL
                          staff.
          (c) calibration - Sample used to adjust instruments.
Spike - Amount of material added to a sample.
Other	
                                    Figure 13
                             Data Summary, Low-Level
                              Nitrate-Nitrite Analysis

-------
                                  SAMPLE TRACKING DATA MANAGEMENT SYSTEM

                                    By G. C. Allison, M. J. Madsen, and R. N. Snelling*
    INTRODUCTION

        A  Sample  Tracking Data  Management  System
    (STDMS) is under development at the National Environ-
    mental Research Center in Las Vegas, Nevada (NERC-
    LV)  to provide  a  capability for laboratory  control
    (sample  tracking)  as  well   as  the  management  of
    environmental surveillance data.

        The  NERC-LV is  involved  in  a variety of envi-
    ronmental  surveillance  projects.  In  support of  these
    efforts, Laboratory Operations receives on the order of
    3,000 samples per month for specific stable chemical
    and/or radiochemical analysis. Sample types include air
    filters, gas  samples,  milk, water,  edible and nonedible
   vegetation, animal tissue, and  soil. The specific analyses
   performed vary with sample type and project objectives.
   Table 1 depicts the major analysis  "families" along with
   typical monthly sample loads and expected processing
   times. Anywhere from 1 to 27 specific analyses may be
   required for a given  sample. Expected processing times
   range  from  1  to 16 weeks. The indicated sample  load
   coupled with the wide range of analyses and processing
   times  results  in  approximately  7,500 sample-analysis
   pairs in the system at  any point in time.

   SAMPLE AND DATA FLOW

       The  primary flow of data and samples within the
   laboratory is depicted in Figure 1. This schematic is typ-
   ical of an organization not having ADP capability. Four
   functional activities are identified:

           Project  Management: the planning, direction,
           and  coordination of  project  activities and the
           ultimate analysis and  interpretation of data.

           Field Operations: the collection  of physical
           samples and in situ data.

           Sample  Control: the  receiving, recording,  and
           routing of incoming samples and data.

           Laboratory  Operations:  the  stable chemical
           and radiochemical analysis of samples and the
           subsequent generation of parametric data.

   *Speaker
  The progression of these activities is as follows. Project
  Management establishes the protocol for Field  Oper-
  ations and Laboratory Operations sample analysis. Field
  Operations collects samples and data and  delivers them
  to Sample Control. Sample Control serves essentially as
  the central  receiving  and accounting  point  for all
  incoming data  and samples. The samples  are routed to
  Laboratory Operations for analysis, with analysis results
  being reported  back to Project Management for review
  and interpretation.

  AUTOMATIC DATA PROCESSING FUNCTION

      The automatic data processing function is utilized
  to integrate all  these elements. A system which has been
  operational at the  NERC-LV for several years is shown
 conceptually  in Figure 2. The system provides basically
 for the storage and retrieval of data from a direct access
 master file. On the input side, sample identification and
 collection data are entered by Sample Control using an
 "H" card (sample header information) and an "L" card
 (location description information). A unique six-digit
 sample number is assigned to each physical sample and is
 used to identify all data records for that sample.

     A unique part of the system is that the laboratory
 analytical programs are interfaced directly with the
 master file for the purpose of retrieving needed sample
 collection information (such as collection date for use in
 radioactive decay calculations). Analytical result cards
 ("R" cards) are generated for direct reentry to the data
 management system after review by Laboratory Operations.
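
The decay calculation mentioned above depends on the collection date retrieved from the master file. A minimal sketch of such a correction follows, using the standard relation A0 = A * exp(lambda * t) with lambda = ln 2 / half-life. The function name and the sample values are hypothetical; the source does not describe how the NERC-LV analytical programs implement this step.

```python
import math
from datetime import date

def decay_corrected_activity(measured, collection_date, count_date, half_life_days):
    """Correct a measured activity back to the collection date.

    Uses A0 = A * exp(lambda * t), with lambda = ln(2) / half-life and
    t the whole days elapsed between collection and counting.
    """
    elapsed_days = (count_date - collection_date).days
    decay_constant = math.log(2) / half_life_days
    return measured * math.exp(decay_constant * elapsed_days)

# Hypothetical sample: counted 8 days after collection, 8-day half-life,
# so the activity at collection is twice the measured value.
corrected = decay_corrected_activity(100.0, date(1974, 5, 20), date(1974, 5, 28), 8.0)
```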

     On the output side of the system, data summaries
 may be generated directly from the master file. A typical
 report  is shown  in  Figure 3. In addition, working  files
 may be  created  for  the purpose of  utilizing special
  purpose application programs such as dose models, plotting,
  contouring, and statistical analysis. The weakness
  of the system as shown in Figure 2 is that there is no
  mechanism for ascertaining the analysis status of a given
  sample other than its presence or absence in the master
  file. With an average of 7,500 analyses in various stages
 of completion at any point in time, this does not provide

-------
                                  Table 1
                    NERC-LV Analysis Distribution, 1973

                                                     Sample Load    Expected Pro-
                                                     (Number per    cessing Time
Analysis Family         Specific Analysis            Month)         (Days)

Radiochemistry
  Gamma (NaI)           All present                   350              7
  Gamma (GeLi)          All present                    10             14
  Gross α, β            --                            120             21
  Strontium             89Sr, 90Sr                     50             45
  Tritium               3H                            165             21
  Tritium (enriched)    3H                             20             60
  Carbon                14C                             1             90
  Radium                226Ra, 228Ra                   23             60
  Iron                  55Fe                            1             60
  Plutonium             238Pu, 239Pu                   50             90
  Uranium               234U, 235U, 238U               32             90
  Thorium               228Th, 230Th, 232Th             5             90
  Polonium              210Po                           4             90
  Americium             241Am                           1             90
  Noble gases           Kr, Xe, Ar                     50             30
  Radon                 222Rn                  [illegible]             7
  Iodine                129I                            1             30
  Lead                  210Pb                           1            120

Stable Chemistry
  Ash                   Ash Weight                      2             45
  Major constituent     Ca, Sr, Li, Na, K, Mn,          6             21
    - Water             Mg, Al, Fe, Si, B, F,
                        Cl, SO4
  Nutrient Analysis     Alk, P-diss, P-tot,           820             21
    - Water             NH3-N, NO2 + NO3-N, TKN
  Selenium              Se                              1             21
  Cadmium               Cd                                            21
  Arsenic               As                                            21
  Molybdenum            Mo                                            21
  Vanadium              V                                             21
  Mercury               Hg                                            21
  Lead                  Pb                                            21
  Zinc                  Zn                                            21
  Thorium               Th                                            21

-------
   the degree of control or depth of information required
   for the management of Laboratory Operations.

   SAMPLE TRACKING

       The sample  tracking module is now being imple-
   mented  to provide a  tool to management for evaluating
  and controlling the Laboratory Operations function by
  providing  sample  status information. This system  is
  shown in Figure 4. A status file has been added which
  contains status statistics. Upon sample log-in, the analyses
  to be performed are defined, either by default or through the
  use of an "M" card, and a flag is set for each. As completed analyses
  are received ("R" cards), the flags are turned off. A variety of status
  reports may be obtained by an exception reporting tech-
  nique. These reports are  utilized by both laboratory and
  project management for assessing sample status.

  STDMS SYSTEM DESCRIPTION

      Figure  5 shows  a more detailed schematic of the
  STDMS.  The system  is organized around two files, the
  master file, containing all parametric and collection data
  for each  sample number, and the status file, containing
 status related data  for each sample number. A third file,
 the location description file, is used to  retrieve standard
 narrative  location descriptions and coordinates. Working
 files may also be created for special applications, such as
 statistical analysis, exposure  modeling, and interfacing
 with external data bases (STORET).

     The  system enters and updates data using four dif-
 ferent card types (Figure 6).  The "H" card is used to
 initiate a  sample number and  generate a data record on
 the  master  file and status file  with  each new sample.
 Analyses  to be performed and expected analysis times
 are established by default according to project code and
 sample type code. The analyses have been combined into
 related groups called families.

     The default family parameters may be overridden
 through the use of the "M" card. The "M" card is used
 to update the families requested and modify the report
due date for a given sample number on the status file.

     For samples which were collected at  locations not
on the location description file,  an "L" card is used to
  provide narrative location description information along
  with the station coordinates. Card type "R" is used to
  add results from a  specific analysis family to the master
  file and to update  the status file by turning off the flag
  for the completed analysis family.
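
The card handling described above can be sketched as follows. This is an illustration under stated assumptions, not the STDMS implementation: the status file is modeled as a dictionary keyed by sample number, the default-family table and all field names are hypothetical, and the "L" card (which feeds the location description file rather than the status file) is left out.

```python
# Hypothetical default families keyed by (project code, sample type code).
DEFAULT_FAMILIES = {(51, 3): {"GELI", "14C", "PO"}}

def apply_card(status_file, card):
    """Update the status file (a dict keyed by sample number) from one card."""
    number = card["sample_number"]
    if card["type"] == "H":
        # Log-in: flag the default families for this project and sample type.
        key = (card["project"], card["sample_type"])
        status_file[number] = {
            "login_date": card["login_date"],
            "pending": set(DEFAULT_FAMILIES.get(key, ())),
            "report_due": None,
        }
    elif card["type"] == "M":
        # Override the default families and, optionally, the report due date.
        record = status_file[number]
        record["pending"] = set(card["families"])
        record["report_due"] = card.get("report_due", record["report_due"])
    elif card["type"] == "R":
        # A completed analysis family: turn its flag off.
        status_file[number]["pending"].discard(card["family"])
    else:
        raise ValueError(f"unsupported card type: {card['type']}")
    return status_file

status = {}
apply_card(status, {"type": "H", "sample_number": 130557, "project": 51,
                    "sample_type": 3, "login_date": "05/20/74"})
apply_card(status, {"type": "R", "sample_number": 130557, "family": "GELI"})
```

After these two cards, sample 130557 still has the 14C and PO families flagged as outstanding.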

      The master file and status file are indexed sequential
  files designed to permit direct access through the use of
  multiple keys. The keys, defined at file creation time,
  allow retrieval to be made according to user reporting
  needs. The key definition generates levels of indices
  which contain pointers to the appropriate data records.
  Sample number, project code, and sample type are the
  primary data elements utilized to form a key.
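
The idea of several keys pointing into one set of records can be illustrated with an in-memory analogue. The class below is a hypothetical sketch only; it does not reproduce the on-disk indexed sequential organization, and the class and field names are assumptions.

```python
from collections import defaultdict

class KeyedFile:
    """Records stored once, with one index of record positions per key field."""

    def __init__(self, key_fields):
        self.records = []
        # Each index maps a key value to the positions of matching records.
        self.indices = {field: defaultdict(list) for field in key_fields}

    def add(self, record):
        position = len(self.records)
        self.records.append(record)
        for field, index in self.indices.items():
            index[record[field]].append(position)

    def find(self, field, value):
        """Return all records whose key field equals value."""
        return [self.records[p] for p in self.indices[field].get(value, [])]

master = KeyedFile(["sample_number", "project", "sample_type"])
master.add({"sample_number": 130557, "project": 51, "sample_type": 3})
master.add({"sample_number": 130035, "project": 20, "sample_type": 7})
master.add({"sample_number": 130558, "project": 51, "sample_type": 3})
```

A report for project 51 would then call `master.find("project", 51)`, while a single-sample lookup would use the `sample_number` index on the same records.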


      An array of constants is maintained defining the
  expected analysis time for each  family of analyses. Ex-
  pected analysis time is defined as the amount of time in
  days required to complete the longest analysis within a
  family.  When  a status report  is generated, the expected
 analysis time is added  to the log-in date and the resulting
 date is then compared to the  current date to determine
 an  overdue analysis.  The calculated analysis due  date
 may be  overridden with the report due date specified on
 the "M" card. Thus the report due date can be set prior
 to the calculated analysis due date.
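
The overdue test described above amounts to a date addition and a comparison. The sketch below uses three expected processing times from Table 1; the dictionary keys and function names are hypothetical, and treating an analysis as overdue only when the current date is later than the due date is an assumption.

```python
from datetime import date, timedelta

# Expected processing times in days, taken from Table 1; the key names are
# hypothetical labels, not codes used by STDMS.
EXPECTED_DAYS = {"GAMMA_NAI": 7, "STRONTIUM": 45, "PLUTONIUM": 90}

def due_date(login_date, family, report_due=None):
    """Return the report due date if one was given, else the calculated date."""
    if report_due is not None:
        return report_due
    return login_date + timedelta(days=EXPECTED_DAYS[family])

def is_overdue(login_date, family, today, report_due=None):
    """Assumes an analysis is overdue once today is later than its due date."""
    return today > due_date(login_date, family, report_due)

login = date(1974, 5, 20)
today = date(1974, 6, 1)
gamma_overdue = is_overdue(login, "GAMMA_NAI", today)      # due 5/27/74
strontium_overdue = is_overdue(login, "STRONTIUM", today)  # due 7/4/74
```

Passing `report_due` models the "M" card override, which replaces the calculated date rather than adjusting it.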

 STATUS REPORTS
     The analysis in process report (Figure 7) lists only
 those samples with outstanding analyses to be completed.
 Samples are listed in order by sample number
 within  project,  within  analysis  family. Samples which
 have exceeded their expected completion date are indi-
 cated with an asterisk. This format is intended for use by
 the laboratory analysis personnel. The analysis in process
 by program report lists samples  which have analyses to
 be performed  within  a project (Figure 8). Past due
 samples are  flagged. This report  is designed for middle
 management use. The status summary by project report
(Figure 9) is intended for project management. It summarizes
samples received, analyses requested, and status.
These  reports  allow the STDMS to provide NERC-LV
management  the  capability  of tracking  all samples
through their respective analyses to completion.
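
The ordering and overdue marking of the analysis in process report can be sketched as a sort followed by a flag test. The row fields, the sample values, and the output layout below are hypothetical and only approximate the printed report.

```python
from datetime import date

def analysis_in_process(rows, today):
    """Return report lines ordered by family, then project, then sample number."""
    ordered = sorted(rows, key=lambda r: (r["family"], r["project"], r["sample_number"]))
    lines = []
    for row in ordered:
        # An asterisk marks a sample past its due date.
        marker = "*" if today > row["due"] else " "
        lines.append(f"{marker}{row['sample_number']:06d}  {row['family']:<6}"
                     f"{row['project']:>4}  {row['due']:%m/%d/%y}")
    return lines

rows = [
    {"sample_number": 130035, "family": "PO", "project": 20, "due": date(1974, 6, 9)},
    {"sample_number": 129636, "family": "PO", "project": 56, "due": date(1974, 5, 5)},
    {"sample_number": 130557, "family": "GELI", "project": 51, "due": date(1974, 5, 3)},
]
report = analysis_in_process(rows, today=date(1974, 6, 1))
```

The by-program and by-project reports would use the same rows with a different sort key or a grouping step.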

-------
[Figure 1. Diagram of the primary flow of samples and data among Project
Management, Field Operations, Sample Control, and Lab Operations. Flow
labels: direction, planning; samples, sample collection data; direction;
samples.]

-------
[Figure 2. Conceptual diagram of the existing data management system,
showing Project Management, Field Operations, Sample Control, Lab
Operations, and the data analysis and presentation programs. Flow labels:
direction, planning; samples, sample collection data; direction; samples.]

     H = Header record (sample collection info.)
     L = Location description record
     R = Result record
     S = Comment record

-------
SAMPLE REPORT (HYPOTHETICAL DATA)                              PAGE 001
WASHINGTON                                            REPORTED 09/10/71

All samples: CHEHALIS WASH - CONSOLIDATED DAIRY  11 0005  DATE- 06 30 71 1200

SAMPLE    SIZE      90SR        131I        132TE        137CS
                    PCI/L       PCI/L       PCI/L        PCI/L

000001    3.50 L    NA          2.2E02      7.5E02       6.1E01
                                (4.4E01)    (1.4E01)     (1.0E01)

000002    .400 L    4.2E02      LT3E01      3.9E01       [illegible]
                    (1.5E01)                (2.7E01)     [illegible]

000003    3.50 L    LT3E01      7.6E02      4.7E01       8.6E01
                                (3.9E01)    [illegible]  (1.6E01)

000004    .400 L    LT3E01      1.0E01      1.1E01       [illegible]
                                (1.0E01)    (1.0E01)     (4.3E01)

000005    .400 L    LT4E01      NO          NO           NO

000006    .400 L    NA          NO          NO           NO

                           Figure 3
                Sample Report (Hypothetical Data)

-------
[Figure 4. Conceptual diagram of the system with the sample tracking
module added, showing Project Management, Field Operations, Sample
Control, Lab Operations, and the data analysis and presentation programs.
Flow labels: direction, planning; samples, collection data; direction.]

     H = Header record (sample collection info.)
     L = Location description record
     R = Result record
     S = Comment record
     M = Analysis to be performed record

-------
[Figure 5. STDMS system schematic. Elements shown: log-in form; card
types H, M, L; result cards R, S; analysis programs; analysis data;
location description file; master and status file update program; master
file; status file; data retrieval program; retrieval input data; status
report program; status reports; working file; data analysis and report
programs; data reports; STORET formatter; STORET data cards.]

-------
[Figure 6. Sample Control Form (card column layout not reproduced).

 FRONT: Sample identification; "Analysis to be performed - repeat col.
 1-10", with analyses grouped under radiochemistry, stable chemistry, and
 other; "Location description - repeat col. 1-10".

 BACK: Result ("R") and comment ("S") card lines, with the printed note:

     NOTE: ENTER RESULT IDENTIFIED LEFT JUSTIFIED
     ENTER 'LT' IN COL 21-22 FOR LESS THAN VALUES
     ANY OTHER NOTATION IN COL 21-22 WILL INDICATE
     THAT RESULT FIELD IS TO READ AS A COMMENT]

     Figure 6
Sample Control Form

-------
[Figure 7. Analysis in Process Report, PO analysis, page 1. Printout
columns: Sample No., Program No., Sample Type, Login Date, Analysis Due
Date, Report Due Date, No. of Days Since Log In. The sample listing is not
legible in this copy.]

                     * INDICATES ANALYSIS IS OVERDUE

-------
[Figure 8. Analysis in Process for Program 51, current date 6/1/74.
Printout columns: Sample Number, Analysis, Sample Type, No. of Days Since
Log In, Analysis Due Date, Report Due Date. The sample listing is not
legible in this copy.]

   * INDICATES ANALYSIS IS OVERDUE

-------
[Figure 9. Analysis Status Report for Program 51. Printout columns: Sample
Number, Sample Type, Login Date, No. of Days Since Log In, Report Due
Date, followed by one status column per analysis family. The status matrix
is not legible in this copy.]

                                                             I  - INCOMPLETE ANALYSIS

                                                             C  - COMPLETED ANALYSIS

                                                             O  - OVERDUE ANALYSIS

                                                             .  - NOT REQUESTED

-------
                                                      SIDES

                                                 By David Barrow
       The STORET Input Data Editing System (SIDES)
  was designed and built by several members of the
  Region IV staff of the Southeast Environmental Research
  Laboratory to overcome some difficulties in using
  STORET. These difficulties are outlined in Figure 1.
  Figure 2 gives a list of features that SIDES provides to
  alleviate these difficulties.

       Throughout the past two years, SIDES has proven
  to be a useful vehicle for preparing data for entry into
  STORET and for obtaining data printouts  for technical
  publications  much  sooner than they  could  have been
  obtained after entry of the data into  STORET.

       Figure 3 gives a list of users and uses of the SIDES
  system. In addition to its primary use for data entry, the
  edit program of SIDES is being used as a data editing/
  interface module in the transfer of the Ohio River Valley
  Sanitation Commission's (ORSANCO) water users' data
  from ORSANCO's data files into STORET. The SIDES
  edit program is also being used in the Office of Research
  and Development's Interim Laboratory Data Management
  System as the edit module and as an interface to
  STORET.

       Figure 4 shows the system block diagram for the
  SIDES system. It also shows the program Loadings and
  Statistics (LDSTAT) and its interface to the SIDES data
  files. While program LDSTAT is not officially a part of
  the SIDES system, it provides a sophisticated printout
  capability for data prior to entry into STORET. Program
  LDSTAT also may be used to print out data already
  stored in STORET.

      The  SIDES  system  block  diagram  begins  by
  showing two  card decks: the editing information deck
  (EID) and the data  decks. Figure 5 gives an example of
  these two decks. The EID consists of a  STORET agency
  card,  a  YEAR  card, a  parameter  (PAR) card for  each
  parameter that  may have data, and  a STATN card for
  each station that may have data. The function of the
  EID is simply to provide an easy reference for the edit
  program. The edit program checks much of the contents
  of the data decks against what was specified in the EID.

     The second card deck, the data decks, consists of
  one or more data decks. Each data deck in turn consists
 of one parameter (PAR) card and any number of sample
 identification cards, type B (SIDB), followed by
 corresponding data cards (DTA). The PAR card specifies
 the order and  parameter number of the data punched
 into the DTA cards that follow it.

     The SIDB card  contains the sample  identification
 information for each sample to be processed by SIDES.
 The following information may be specified on the SIDB
 card: STORET station number; the date  and time the
 sample  was taken; the ending date-time and the various
 composite  description  information  for composite
 samples; the depth at which the sample was taken; and
 the percent  from the right bank of the location at which
 the sample was  taken.

     The DTA card gives the actual values measured for
 each parameter specified on  the PAR card. The DTA
 card is  linked to  its  SIDB card by the sample number
 which  must  appear  on both. The DTA  card may be
 continued on up to nine additional cards.  At six param-
 eter values/card, this  gives a total of 60 parameters that
 may be processed through the SIDES system at one
 time.

     If the user has data for more than 60 parameters,
 he must separate the data into two or more groups of 60
 or fewer parameters each and process the groups in
 separate runs of the system. On the DTA card, a missing
 parametric value is indicated by a blank field, and the
 deletion of a value already in STORET is indicated by a
 solitary D in the first position of the proper field.
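
The 60-parameter limit and the field conventions just described can be sketched as follows. The limit follows from one DTA card plus nine continuation cards at six values each; the function names and the seven-character field width used in the examples are assumptions, since the source does not give the card column layout.

```python
VALUES_PER_CARD = 6
MAX_CARDS = 10  # one DTA card plus up to nine continuation cards
MAX_PARAMETERS = VALUES_PER_CARD * MAX_CARDS  # 60

def split_into_runs(parameters, limit=MAX_PARAMETERS):
    """Split a parameter list into groups small enough for one SIDES run."""
    return [parameters[i:i + limit] for i in range(0, len(parameters), limit)]

def interpret_field(field):
    """Classify one fixed-width DTA field.

    Blank means a missing value; a lone 'D' in the first position asks for
    deletion of the value already stored; anything else is kept as raw text.
    """
    if field.strip() == "":
        return ("missing", None)
    if field[0] == "D" and field[1:].strip() == "":
        return ("delete", None)
    return ("value", field.strip())

runs = split_into_runs([f"P{n:03d}" for n in range(130)])
run_sizes = [len(run) for run in runs]  # 130 parameters need three runs
```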

     The EID and data decks are processed through the
 merge/edit (M/E) phase of the system, during which
 several functions are performed. First, the SIDB
 and DTA cards are merged. Second, extensive editing is
 performed upon the cards of the data decks. All cards in
 both the EID and the data decks are printed on the M/E
 printout. If any errors are detected during the editing
 operation, appropriate error messages are printed out
 with the offending card on the M/E printout. Examples of
 M/E printouts appear in Figures 6, 7, and 8.

     In  the third step of the operation, the data in the
data decks are reformatted and entered into the SIDES
disk files. All samples are entered into the  file that goes
to the SIDES printout programs, but all samples having
errors are flagged. Only samples that passed all edit tests

-------
are entered into the file that goes to the STORET
storage program.
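
The merge and the two output files can be sketched as below. This is an illustration only: the two edit checks shown (a missing DTA card and a station absent from the EID) are hypothetical examples of edit tests, and the record layout and function name are assumptions rather than the SIDES design.

```python
def merge_and_edit(eid_stations, sidb_cards, dta_cards):
    """Merge SIDB and DTA cards by sample number and split the output files."""
    data_by_sample = {card["sample_number"]: card for card in dta_cards}
    printout_file, storet_file = [], []
    for sidb in sidb_cards:
        errors = []
        dta = data_by_sample.get(sidb["sample_number"])
        if dta is None:
            errors.append("no DTA card with this sample number")
        if sidb["station"] not in eid_stations:
            errors.append("station not listed in the EID")
        sample = {"sidb": sidb, "dta": dta, "errors": errors}
        printout_file.append(sample)      # every sample, with errors flagged
        if not errors:
            storet_file.append(sample)    # only samples that passed all tests
    return printout_file, storet_file

sidb_cards = [{"sample_number": 1, "station": "A01"},
              {"sample_number": 2, "station": "Z99"},
              {"sample_number": 3, "station": "A01"}]
dta_cards = [{"sample_number": 1, "values": ["7.2"]},
             {"sample_number": 2, "values": ["6.9"]}]
printout, storet = merge_and_edit({"A01"}, sidb_cards, dta_cards)
```

Keeping the rejected samples in the printout file matches the text: the printouts show everything, while the storage file carries only clean samples.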

     Once the M/E phase has been completed, the user
has many options. If errors were detected during the
M/E phase, he may decide to correct the errors and
rerun the M/E phase. He may decide to store the good
data, correct the erroneous data, and rerun it. He may
decide to obtain an original order or a station order
printout of the data to further assist him in validating
the data; or he may decide to get an LDSTAT printout to
obtain statistics, loadings, or a printout to go into a
technical report.

     An example of the original order printout appears
in Figures 9 and 10. An example of the station order
printout appears in Figures 11 and 12. In the original
order and station order printouts, the parameter values
are printed exactly as they were punched onto the DTA
cards. This is to facilitate checking the data. Note that
even the samples that were rejected in the M/E printouts
are included. The original order printout greatly facilitates
proofreading the data against the original input
documents. The station order printout enables chemistry
personnel to easily discover bad values that may have
occurred during transcription.

     An example of the LDSTAT printout is given in
Figures 13 and 14. In the LDSTAT printout, only
samples that have passed all edit tests are printed. The
parameter values are printed with decimal points aligned
as dictated by the STORET decimal point specification
for each parameter. The statistics are evaluated for one
station at a time. Note that the remarks K and L are
included in the minimum and maximum values. Statistics
are not printed for parameters having only one
value. Some parameters are not summarized at all, e.g.,
SAMPLE NUMBER.
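
The statistics listed on the LDSTAT printout can be reproduced with a short routine. The sketch below is an assumption-laden illustration rather than the LDSTAT program itself: the function name `ldstat_summary` is hypothetical, the variance is assumed to use an n - 1 divisor, COEF VAR is assumed to be a percentage, and LOG MEAN is assumed to be the geometric mean. With these choices the routine agrees with the rounded values printed for total hardness (13, 15, 15) in the example printout. Values are assumed to be positive, and remark codes such as K and L are not handled.

```python
import math


def ldstat_summary(values):
    """Summary statistics for one parameter at one station."""
    n = len(values)
    if n < 2:
        return None  # no statistics are printed for a single value
    total = sum(values)
    sum_sq = sum(v * v for v in values)
    mean = total / n
    # Assumed n - 1 divisor; clamp tiny negative round-off to zero.
    variance = max((sum_sq - total * total / n) / (n - 1), 0.0)
    std_dev = math.sqrt(variance)
    return {
        "NUMBER": n,
        "MAXIMUM": max(values),
        "MINIMUM": min(values),
        "SUM": total,
        "SUM SQ.": sum_sq,
        "MEAN": mean,
        "VARIANCE": variance,
        "STD.DEV.": std_dev,
        "STD.ERR.": std_dev / math.sqrt(n),
        "COEF VAR": 100.0 * std_dev / mean,  # assumed to be a percentage
        "LOG MEAN": math.exp(sum(math.log(v) for v in values) / n),
    }
```

Calling `ldstat_summary([13, 15, 15])` gives a sum of 43 and a sum of squares of 619, and a single-value parameter such as a lone coliform count returns `None`.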

-------
  Why SIDES was written:

            STORET input formats (Fixed Format and DIP) were difficult to code and to keypunch,
            thereby causing errors and wasted time.

            There were no easy-to-read printouts to facilitate validation of data prior to entry into
            STORET.

            The length of time required to enter data and corrections into STORET was unsatisfactory.

            It was very difficult to correct data once it had been entered into STORET.


                                         Figure 1
                                          SIDES


 SIDES offers the following features:

            An easy-to-use card format for entering data, particularly from the standpoint of
            keypunching.

            A straightforward method of entering sample identification information.

           A comprehensive pre-STORET editing system to allow user to catch keypunch and
           transcription errors prior to entering the data into STORET.

           Several data reports may be run from SIDES intermediate disk files to aid in data
           validation and to provide data printouts with statistics  for inclusion in technical
           reports prior to entry into STORET.

           A simplified method of correcting data already entered into STORET through SIDES.


                                        Figure 2
                                   Features of SIDES
SIDES users and uses:

           State of North Carolina for data entry

           State of South Carolina for data entry

           EPA, Region IV, S&A division, for:

               Data entry
               Technical report generation
               ORSANCO water users' data transfer

           EPA, Region IV, Escambia Bay Recovery Study for data entry

          EPA, ORD, Quality Assurance Division, for:

               Part of the Interim Laboratory Data Management System.


                                       Figure 3
                                      SIDES Uses

-------

            [Block diagram not reproduced. The Editing Information Deck (STORET agency card,
            year card, PARAM cards, STATN cards) and the Data Decks (PAR card, SIDB cards,
            DTA cards) are read by the Merge/Edit program, which merges the SIDB and DTA
            cards, performs extensive editing of the data card decks, and generates STORET
            input cards. Merge/Edit produces the Merge/Edit printout and places the data in
            disk files. The disk files feed the STORET storage program, which updates the
            STORET data files, and the SIDES printout programs, which produce the printout of
            data in original order, the printout of data in station order, and, through an
            interface, a pseudo STORET Type IV retrieval file read by program LDSTAT.]

                                  Figure 4
                          SIDES System Block Diagram

-------
            [Card listing not legible in this copy. The listing shows an editing information
            deck (agency, YEAR, STATN, and PARAM cards), introduced by the line "EDITING
            INFORMATION DECK FOLLOWS.", and one data deck (PAR, SIDB, and DTA cards),
            introduced by the line "ONE DATA DECK FOLLOWS."]

                                            Figure 5
                                     Example of Two Card Decks

-------
REPORT RFST01A                                                                     PAGE

EDITING INFORMATION FILE:

    10  [agency card, partly illegible]    SEWL ATHENS GA 404-546-3548
    20  YEAR     74
    30  STATN    WC-1     130000
    40  STATN    S-1      130005
    50  PARAM    31501
    60  PARAM    31616
    70  PARAM    00400
    80  PARAM    00300
    90  PARAM    00940
   100  PARAM    00900
   110  PARAM    00410
   120  PARAM    50060
   130  PARAM    00340
   140  PARAM    00680

-------
REPORT RFST01A                                                                     PAGE

FIELD STUDY DATA DECK FOLLOWS:

            [Listing not legible in this copy. Each card of the data deck (PAR, SIDB, and DTA
            cards) is printed with its line number. Accepted samples are marked with their
            sample type (GRAB or COMP-TIME) and samples that failed an edit test are marked
            REJECTED. Error messages printed beside the offending cards include:

                -->ILL.FIELD STA.NR.
                -->BAD SIDB CARD.
                -->ST.DA.OUT OF RANGE.
                -->ST.TIME OUT OF RANGE.
                -->FLD n ILL.CHAR.BET.FLDS.
                -->FLD n ILL.VALUE.
                -->FLD n ILL.BLANK.
                -->FLD n NOT LEFT JUSTIFIED.

            where n is the number of the field in error.]

-------
REPORT RFST01A                                                                     PAGE

EOF ON CFST01.
PROGRAM PFST01 RUN SUMMARY:

            40 CARDS READ FROM CFST01.
            30 CARDS WRITTEN ONTO DFST03 FOR STORAGE IN STORET.
            [count illegible] RECORDS WRITTEN ONTO DFST0[digit illegible].

         STORET INPUT HAS BEEN GENERATED FOR
                12 GRAB SAMPLES,
                 0 COMPOSITE SAMPLES W/O TIME OF DAY,
                 1 COMPOSITE SAMPLES WITH TIME OF DAY.
                 0 SAMPLE DELETIONS.

         AND DATA FOR    5 SAMPLES HAS BEEN REJECTED.
EXECUTION COMPLETED.

-------
            [Original order printout, first page (apparently Figure 9); the listing is not
            legible in this copy. Column headings: FIELD STA.NR.; STORET STA.NR.; FIELD
            SAMPLE IDENTIFICATION (START DATE, ST. TIME, COMP. TYPE, ENDING DATE, END. TIME,
            SAMPLE NUMBER); 00008 SEWL SAMPLE NUMBER; 00003 DEPTH; 00002 % FROM RIGHT BANK;
            31501 TOT COLI MFIMENDO /100ML; 31616 FEC COLI MFM-FCBR /100ML; 00400 PH SU;
            00300 DO MG/L. All 18 samples of the example data deck are listed in the order
            in which they were punched, including the rejected samples.]
-------
            [Original order printout, second page (apparently Figure 10); the listing is not
            legible in this copy. The station and sample identification columns are repeated,
            followed by: 00940 CHLORIDE CL MG/L; 00900 TOT HARD CACO3 MG/L; 00410 T ALK
            CACO3 MG/L; 50060 CHLORINE TOT RESD MG/L; 00340 COD HI LEVEL MG/L; 00680
            T ORG C C MG/L.]

-------
            [Station order printout, first page (apparently Figure 11); the listing is not
            legible in this copy. The columns are those of the original order printout
            (station and sample identification; 00008 SEWL SAMPLE NUMBER; 00003 DEPTH;
            00002 % FROM RIGHT BANK; 31501 TOT COLI MFIMENDO /100ML; 31616 FEC COLI
            MFM-FCBR /100ML; 00400 PH SU; 00300 DO MG/L), with the same 18 samples grouped
            by station.]
-------
            [Station order printout, second page (apparently Figure 12); the listing is not
            legible in this copy. The station and sample identification columns are repeated,
            followed by: 00940 CHLORIDE CL MG/L; 00900 TOT HARD CACO3 MG/L; 00410 T ALK
            CACO3 MG/L; 50060 CHLORINE TOT RESD MG/L; 00340 COD HI LEVEL MG/L; 00680
            T ORG C C MG/L.]

-------
                                                                                                               PAGE
STATION - 130000
                              00008     00940     00900     00410     50060     31501     31616     00400     00300
                              SEWL   CHLORIDE  TOT HARD     T ALK  CHLORINE  TOT COLI  FEC COLI        PH        DO
                             SAMPLE        CL     CACO3     CACO3  TOT RESD  MFIMENDO  MFM-FCBR
 DATE   TIME  DATE   TIME    NUMBER      MG/L      MG/L      MG/L      MG/L    /100ML    /100ML        SU      MG/L

 740521 0900  740521 1100         1                                             17000       790
              740521 0905         2       1.9        13         6     0.10K                           6.2       8.5
              740521 0945         3       2.0        15         7     0.10K                           6.3       8.3
              740521 1030         4       2.0        15         7     0.10K                           6.3       8.1

 740521
   NUMBER                                   3         3         3         3         1         1         3         3
   MAXIMUM                                2.0        15         7     0.10K                           6.3       8.5
   MINIMUM                                1.9        13         6     0.10K                           6.2       8.1
   SUM                                    5.9        43        20      0.30                          18.8      24.9
   SUM SQ.                               11.6       619       134      0.03                         117.8     206.7
   MEAN                                   2.0        14         7      0.10                           6.3       8.3
   VARIANCE                               0.0         1         0      0.00                           0.0       0.0
   STD.DEV.                               0.1         1         1      0.00                           0.1       0.2
   STD.ERR.                               0.0         1         0      0.00                           0.0       0.1
   COEF VAR                               2.9         8         9      0.00                           0.9       2.4
   LOG MEAN                               2.0        14         7      0.10                           6.3       8.3
 740521

-------
                                                                                                                                          PAGE
               STATION - 130005
                                              00008      00340      00680
                                              SEWL        COD      T ORG C
                                             SAMPLE     HI LEVEL      C
               DATE    TIME    DATE   TIME   NUMBER       MG/L       MG/L

                              740515 1000         5        144      100.0
                              740515 1006        11        154
                              740515 1007        12        148
                              740515 1008        13        154
                              740515 1009        14        152
                              740515 1011        15        154
                              740515 1012        16        136
                              740515 1013        17        152

               740515
                  NUMBER                                     8          1
                  MAXIMUM                                  154
                  MINIMUM                                  136
                  SUM                                     1194
                  SUM SQ.                               178492
                  MEAN                                     149
                  VARIANCE                                  41
                  STD.DEV.                                   6
                  STD.ERR.                                   2
                  COEF VAR                                   4
                  LOG MEAN                                 149
               740515

-------
                                   SELECTION OF PROBABILITY MODELS FOR
                      DETERMINING QUALITY CONTROL DATA SCREENING RANGE LIMITS

                                               By Wayne R. Ott, Ph.D.
 INTRODUCTION

      Environmental monitoring laboratories performing
 routine chemical analyses are using computers increasingly
 to process and store the data. If these laboratories
 process large quantities of data, errors may be introduced
 in the data handling phase, and there is some risk
 that those errors may go undetected. Such errors, which
 frequently result simply from keypunch mistakes, may
 be stored as "valid" data and may create serious problems
 when sample means or other statistics are calculated.
 The computer can be used, however, to automatically
 screen laboratory data as they are entered to
 determine if any value of a given parameter* is "unusual"
 or lies outside an "acceptable range." Such
 unusual values can be "flagged" by the computer, thereby
 bringing them to the attention of laboratory personnel,
 where errors can be corrected and new analyses
 performed if necessary.

 STATEMENT OF PROBLEM

      Typical environmental data processing and handling
 errors result from three sources: keypunch mistakes,
 failure to consider minimum detectable limits, and clerical
 mistakes or omissions. For example, when storing
 water quality data, a valid parameter value of pH = 5.5
 may be erroneously keypunched as pH = 55.0 simply
 because the decimal is entered in the wrong column.
 Without provision for "data screening," the computer
 will simply accept this value and store it as valid data. To
 screen the data, the value of each parameter is compared
 with preset limits which are also stored in the computer;
 values outside these limits are flagged. Values which are
 improbable can therefore be brought to the chemist's
 attention while impossible values can be automatically
 rejected.
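
The comparison against preset limits can be sketched as follows. This is only an illustration of the flag-or-reject rule described above: the function name `screen_value` is hypothetical, and the pH limits in the usage note are made up for the example, not taken from this paper.

```python
def screen_value(value, improbable, impossible):
    """Classify one parameter value against preset (low, high) limits.

    improbable and impossible are (low, high) pairs; values outside the
    impossible pair are rejected, values outside the improbable pair are
    flagged for the chemist, and all others are accepted.
    """
    impossible_low, impossible_high = impossible
    if value < impossible_low or value > impossible_high:
        return "rejected"
    improbable_low, improbable_high = improbable
    if value < improbable_low or value > improbable_high:
        return "flagged"
    return "accepted"
```

With illustrative pH limits of (4.0, 10.0) for the improbable range and (0.0, 14.0) for the impossible range, the mispunched value 55.0 is rejected, 11.2 is flagged, and the valid value 5.5 is accepted.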

      To implement a data screening system, an improbable
 range of values for each parameter must be determined;
 historical data can be used for this purpose. At
 first glance, two or three times the standard deviation†
 might appear to be a suitable choice for the quality
 control data screening "range limits." However, an
 examination of water quality parameters suggested that
 most parameters are not normally distributed, and the
 underlying distribution of each parameter may differ.
 Thus, use of $2\sigma$ or $3\sigma$ would result in range limits that
 have a different probability of occurrence for each parameter.
 It would be preferable for the range limits to
 always represent the same probability, regardless of the
 underlying distribution. Furthermore, it would be desirable
 to have a systematic procedure for arriving at these
 data screening range limits.

 METHODOLOGY

      The Environmental Protection Agency (EPA) is
 now undertaking a data analysis effort in which probability
 models are fit to historical water quality data and
 range limits calculated. These limits then are being built
 into a quality control data screening computer program
 which serves as an integral part of a computerized laboratory
 data management system. This data management
 system is designed to store, manipulate, and retrieve
 environmental data, ultimately transferring it to national
 environmental data archives.

 *In this paper, the term "parameter" is used to mean a variable which is used as a measure of the state of the environment. For example,
 both carbon monoxide measured by air monitoring stations and pH measured in rivers are environmental parameters.
 †The data estimate $\bar{x}$ of the arithmetic mean $\mu$ is calculated as follows:

 $$ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i $$

 The data estimate $s$ of the standard deviation $\sigma$ is calculated as follows: $s = \ldots$ [expression not legible in this copy]

-------
Probability Models

     Most environmental parameters, such as pollutant
concentration, never are less than zero and have essentially
unbounded upper limits. For environmental data,
eight probability models are of particular interest:*

          Exponential
          Normal
          Log-normal
          Weibull
          Gamma
          Beta
          Rayleigh
          Extreme Value.

Equations (1) and (2) give the probability density function
(PDF) and the cumulative distribution function
(CDF) for the Weibull probability model:

$$ \text{PDF:} \quad f_X(x) = \frac{\alpha}{\xi} \left( \frac{x}{\xi} \right)^{\alpha - 1} e^{-(x/\xi)^{\alpha}}, \qquad x > 0 \tag{1} $$

$$ \text{CDF:} \quad F_X(x) = 1 - e^{-(x/\xi)^{\alpha}} \tag{2} $$

This probability model has two parameters: a "shape
parameter," $\alpha$, and a "scale parameter," $\xi$. For different
values of $\alpha$, it takes on a variety of different shapes
(Figure 1). Because of this flexibility, the model is a
good candidate for fitting environmental data. When
$\alpha = 1$, the Weibull probability model becomes equivalent
to the exponential probability model:

$$ f_X(x) = \frac{1}{\xi} e^{-x/\xi} \tag{3} $$
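
As a quick numerical check of the reduction to the exponential model, the sketch below evaluates the Weibull density and distribution function in the shape-and-scale form given above. The function names and argument names are hypothetical, and the density is returned as zero for x <= 0, following the x > 0 restriction in equation (1).

```python
import math


def weibull_pdf(x, shape, scale):
    """Weibull probability density, equation (1)."""
    if x <= 0:
        return 0.0
    z = x / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))


def weibull_cdf(x, shape, scale):
    """Weibull cumulative distribution function, equation (2)."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


def exponential_pdf(x, scale):
    """Exponential probability density, equation (3)."""
    if x <= 0:
        return 0.0
    return math.exp(-x / scale) / scale
```

For any positive x, `weibull_pdf(x, 1.0, scale)` and `exponential_pdf(x, scale)` agree, which is the special case noted in the text.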
The other probability models differ in shape from each
other. Some, like the normal distribution, are symmetrical.
The characteristic shape is constant, and the parameters
determine the "location" of the curve and its
"spread" or dispersion.

Overall Approach

     A practical, convenient approach for fitting probability
models to environmental data is by the "method
of moments." In this approach, the estimates of the
moments calculated from the data are set equal to the
actual moments of the probability model. The overall
methodology for selecting and fitting models to historical
data uses this approach and is depicted in Figure 2.
At the left side of the figure, the raw historical data are
analyzed, by computer, and the histogram and estimates
of the moments are developed. Next, a probability
model is selected for trial. The relative magnitude of the
moments gives some clue as to the most appropriate
model. The cumulative distribution function of the
model is then compared with the cumulative frequencies
of the data. If the model does not have a reasonably
good fit, another model is selected. Once a "best-fit"
model is selected, it is used to calculate, within the range
of the data, the values of the parameter which correspond
to various extreme probabilities (P = .01, .001;
P = .99, .999). These values are used as the data screening
range limits.
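
The fit-and-compare step can be sketched for the one-parameter exponential model. The paper does not state how goodness of fit is judged, so the measure used here, the largest gap between the model's cumulative distribution function and the cumulative frequencies of the data, is an assumption chosen for illustration. The function names are hypothetical.

```python
import math


def fit_exponential_by_moments(data):
    """Method of moments: set the model mean equal to the sample mean."""
    return sum(data) / len(data)


def exponential_cdf(x, scale):
    """Exponential cumulative distribution function with mean `scale`."""
    return 1.0 - math.exp(-x / scale) if x > 0 else 0.0


def max_cdf_gap(data, cdf):
    """Largest absolute gap between cumulative frequencies and a model CDF.

    The cumulative frequency at the i-th smallest value (1-based) is i/n.
    """
    ordered = sorted(data)
    n = len(ordered)
    return max(abs((i + 1) / n - cdf(x)) for i, x in enumerate(ordered))
```

A trial model with a large gap would be set aside and another model tried; how large is too large is a threshold the analyst would have to choose.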

Selection of Probability Models

     The most important distinction between different
probability models is their shape. The third and fourth
moments provide measures of the shape of each model.
The third moment, $\mu_3$, is a measure of the distribution's
asymmetry or skewness, the extent to which it exhibits a
long tail to the left or to the right. The fourth moment is
a measure of the distribution's "peakedness" or kurtosis,
the relationship of the distribution's "height" to its
"width." For a given probability model, the second,
third, and fourth moments are obtained from the probability
density function $f_X(x)$ as follows, where $\mu_1'$ is
the mean:

$$ \mu_2 = \int_{-\infty}^{\infty} (x - \mu_1')^2 f_X(x)\, dx \tag{4} $$

$$ \mu_3 = \int_{-\infty}^{\infty} (x - \mu_1')^3 f_X(x)\, dx \tag{5} $$

$$ \mu_4 = \int_{-\infty}^{\infty} (x - \mu_1')^4 f_X(x)\, dx \tag{6} $$
The second moment is also known as the variance and is
the square of the standard deviation, $\mu_2 = \sigma^2$. Two
important quantities are the coefficient of skewness $\sqrt{\beta_1}$
and the coefficient of kurtosis $\beta_2$:

$$ \sqrt{\beta_1} = \frac{\mu_3}{(\mu_2)^{3/2}} \tag{7} $$

$$ \beta_2 = \frac{\mu_4}{\mu_2^{2}} \tag{8} $$

*Gerald J. Hahn and Samuel S. Shapiro, Statistical Models in Engineering (New York: John Wiley & Sons, Inc., 1967). See pages 120-134
 for summary of probability models for continuous random variables.

-------
Figure 3 shows the relative values of $\beta_1$ and $\beta_2$ calculated
by the author for a variety of probability models.*
The uniform, normal, Rayleigh, and exponential models
all have constant shapes; therefore they are represented
on this figure as points. For the normal distribution,
$\beta_1 = 0$ (indicating that the model is symmetrical) and
$\beta_2 = 3$. The Gamma model is represented as a straight
line, since a linear relationship exists between $\beta_1$ and $\beta_2$:

$$ \beta_2 = 1.5\,\beta_1 + 3 \tag{9} $$

On this figure, both the lines for the Gamma and Weibull
probability models intersect the point representing the
exponential probability model because the exponential
model is one special case of both of these distributions.
For the log-normal and Weibull probability models, the
lines, which were calculated by computer, exhibit slight
curvature. The beta distribution (not shown) would be
represented as a region on this figure rather than a line
since it takes on a multitude of shapes.

     For a given set of data, whose underlying distribution
may be unknown, it is possible to obtain estimates
of the first four moments. Estimates of the first and
second moments are calculated as the mean $\bar{x}$ and variance
$s^2$ in the manner given previously. Estimates of the
third and fourth moments about the mean are obtained
as follows:

$$ m_3 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^3 \tag{10} $$

$$ m_4 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^4 \tag{11} $$

Using these quantities, estimates of the coefficients of
skewness and kurtosis may be obtained by substituting
$m_2$ for $\mu_2$, $m_3$ for $\mu_3$, and $m_4$ for $\mu_4$ into equations (7)
and (8):

$$ \sqrt{b_1} = \frac{m_3}{(m_2)^{3/2}} \tag{12} $$

$$ b_2 = \frac{m_4}{m_2^{2}} \tag{13} $$

These quantities, which are easily calculated by computer
from the raw data, can assist in selecting the most
appropriate probability model. Once calculated, they
may be compared to the curves and points in the $(\beta_1, \beta_2)$
plane of Figure 3 to suggest which probability model is a
good choice. By establishing a "region of acceptance"
around each line and point in the $(\beta_1, \beta_2)$ plane, this
comparison can be done automatically by computer.
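
The sample coefficients are simple to compute from raw data. The sketch below is illustrative only: the function names are hypothetical, every sample moment (including the second) is assumed to use a divisor of n, and the tolerance that defines the "region of acceptance" around the Gamma line is an arbitrary value chosen for the example.

```python
def shape_coefficients(data):
    """Return (b1, b2), the sample skewness and kurtosis coefficients."""
    n = len(data)
    mean = sum(data) / n
    # Central sample moments; the divisor n is an assumption.
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    sqrt_b1 = m3 / m2 ** 1.5
    return sqrt_b1 ** 2, m4 / m2 ** 2


def near_gamma_line(b1, b2, tolerance=0.5):
    """True if (b1, b2) lies within `tolerance` of the line b2 = 1.5*b1 + 3."""
    return abs(b2 - (1.5 * b1 + 3.0)) <= tolerance
```

A point model such as the normal (0, 3) or the exponential (4, 9) could be tested the same way, by checking the distance from the computed pair to that point.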

 APPLICATION OF METHODOLOGY

      For the laboratory data management system, three
 classes of suspect data were selected:

      Improbable:
        $P(X < X_L \text{ or } X > X_H) = .01$

      Highly improbable:
        $P(X < X_L \text{ or } X > X_H) = .001$

      Impossible:
        Cannot physically occur.

The probability model is used to calculate $X_L$ and $X_H$ for
the first two classes. With the exception of pH, most of
the water quality parameters had impossible lower
ranges of 0.0 and no upper impossible range. Another
important criterion for impossible values was based on
interparameter comparisons, the physical and chemical
relationships among one or more environmental parameters;
but these factors will not be discussed here.
     To illustrate the method, atmospheric concentrations
of lead in cities throughout the United States
in 1968 will be used as an example. The computer-generated
histogram of these 4411 values is shown in
Figure 4. This histogram is greatly skewed to the right
with an arithmetic mean of 0.862 micrograms per cubic
meter (µg/m³). For these data $b_1 = 13.9$ and $b_2 = 25.8$.
In this case, the skewness of the raw data and the exponential
model differ, as does the kurtosis, but the
exponential model offers advantages due to its simplicity,
and we shall apply it in this instance for purposes
of illustration.

*This figure is patterned after the $(\beta_1, \beta_2)$ plane in E. S. Pearson, University College, London, cited by Hahn and Shapiro, op. cit., p. 197.
 All lines have been recalculated.

-------

     The exponential probability model has just one
parameter, as seen in equation (3), which is also its
mean. We use the method of moments to set this parameter
equal to 0.862 µg/m³, the mean calculated from the data.
The cumulative distribution function F_X(x) for this model
then becomes:

     F_X(x) = 1 - e^{-x/0.862}                          (14)

The plot of this probability model and the histogram of
the data are both shown in Figure 5. Using equation (14),
the range limits for atmospheric lead concentrations
are then calculated as the values for which
F_X(x) = 0.001, 0.01, 0.99, and 0.999. A sample calculation
for F_X(x) = 0.99 is as follows:

     e^{-x/0.862} = 0.01
     -x/0.862 = \ln(0.01)
     x = (-0.862)(-4.605) = 3.97

The resulting range limits are obtained in this same
manner:

                          F_X(x_L)   x_L (µg/m³)   F_X(x_H)   x_H (µg/m³)
     Improbable            0.01       0.0087        0.99       3.97
     Highly improbable     0.001      0.00086       0.999      5.95
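
     The range-limit calculation can be reproduced by inverting the exponential CDF, x = -mean · ln(1 - p). The sketch below assumes only the fitted mean of 0.862 µg/m³ and the tail probabilities quoted in the text; the function names are illustrative.

```python
import math


def exponential_quantile(p, mean):
    """Value x at which the exponential CDF 1 - exp(-x/mean) equals p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return -mean * math.log1p(-p)


def range_limits(mean, tail_probability):
    """Return (low limit, high limit) for one class of suspect data."""
    low = exponential_quantile(tail_probability, mean)
    high = exponential_quantile(1.0 - tail_probability, mean)
    return low, high


improbable = range_limits(0.862, 0.01)
highly_improbable = range_limits(0.862, 0.001)
```

     Rounded, these agree with the tabulated limits: about 0.0087 and 3.97 µg/m³ for the improbable class, and about 0.00086 and 5.95 µg/m³ for the highly improbable class.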
     These limits, which are based on national data, then
can be  programmed into the computer as part of the
laboratory's data management system and can be used to
flag unusual values of lead concentrations. Note that this
probability model does not necessarily fit the data in the
traditional statistical sense. That is, no  effort  has been
made to  examine  the question, "Is the original dis-
tribution from which the data arose truly exponential?"
Rather, this model has been selected  for the purpose of
decision-making  and concentrating on the  question, "Is
this  particular piece of data sufficiently unusual to be
flagged?"
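
     As an illustration of how such limits might be used to flag values, the following sketch classifies a single lead concentration against the limits derived above. The dictionary layout and function name are hypothetical, and treating any negative concentration as "impossible" is an assumption borrowed from the 0.0 lower bound described for the water quality parameters.

```python
# Range limits for atmospheric lead (ug/m3), taken from the worked example.
LEAD_LIMITS = {
    "improbable": (0.0087, 3.97),
    "highly improbable": (0.00086, 5.95),
}
IMPOSSIBLE_BELOW = 0.0  # assumption: a negative concentration cannot occur


def classify(value, limits=LEAD_LIMITS):
    """Return the most severe class the value falls in, or None if unflagged."""
    if value < IMPOSSIBLE_BELOW:
        return "impossible"
    # Check the wider (rarer) limits first so the stronger flag wins.
    for label in ("highly improbable", "improbable"):
        low, high = limits[label]
        if value < low or value > high:
            return label
    return None
```

     A value of 4.5 µg/m³, for example, would be flagged as improbable but not highly improbable.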
DISCUSSION

     A  principal advantage of this methodology is that
the entire process can be carried out automatically by
computer,  producing high-  and low-range limits as its
output. Currently, the first phase of a general computer
program to  perform these steps has been completed, and
a  contract has been written to apply this program to
selected water quality data files in Region V's Central
Regional Laboratory in  Chicago. The data  screening
range limits which  result from this effort will be in-
corporated into  the Region V Laboratory Data Manage-
ment System (LDMS) first  as part of the time shared,
remote  terminal  system and later as part of  the auto-
mated,  minicomputer system  to be  installed at this
laboratory.

-------
[Flowchart not reproduced. Legible box labels, in order: Data Computer Processing; Calculate Statistics; Generate Histograms; Select Probability Model (Gamma, Weibull, Normal, Uniform, etc.); Compare "CDF" of Model With Cumulative Frequencies of Data; Select Another Model or Accept; Compute Values Corresponding to Probabilities From Models (0.1, 0.01, 0.001); Quality Control Ranges.]



-------
                                                       Figure 3
                                            Relative Magnitude of β1 and β2
                                            for Different Probability Models

-------
 TOTAL NO. OF VALUES = 4411
 NO. OF VALUES LESS THAN 0.0 = 0
 DISTRIBUTION OF DATA (INTERVAL WIDTH = 0.2000)
 MEAN = 0.862002    VARIANCE = 1.09539986    STD. DEV. = 1.046614

 INTERVAL            NUMBER    PERCENT    CUM. FREQ.
 0.0000-0.2000         1012    22.943%     22.943%
 0.2000-0.4000          659    14.940%     37.883%
 0.4000-0.6000          665    15.076%     52.958%
 0.6000-0.8000          452    10.247%     63.206%
 0.8000-1.0000          361     8.184%     71.390%
 1.0000-1.2000          309     7.005%     78.395%
 1.2000-1.4000          213     4.829%     83.224%
 1.4000-1.6000          139     3.151%     86.375%
 1.6000-1.8000          100     2.267%     88.642%
 1.8000-2.0000           52     1.179%     89.821%
 2.0000-2.2000          106     2.403%     92.224%
 2.2000-2.4000           56     1.270%     93.493%
 2.4000-2.6000           39     0.884%     94.378%
 2.6000-2.8000           28     0.635%     95.012%
 2.8000-3.0000           16     0.363%     95.375%
 3.0000-3.2000           39     0.884%     96.259%
 3.2000-3.4000           24     0.544%     96.803%
 3.4000-3.6000           22     0.499%     97.302%
 3.6000-3.8000           14     0.317%     97.619%
 3.8000-4.0000            5     0.113%     97.733%
 4.0000-4.2000           16     0.363%     98.095%
 4.2000-4.4000           11     0.249%     98.345%
 4.4000-4.6000            6     0.136%     98.481%
 4.6000-4.8000            4     0.091%     98.572%
 4.8000-5.0000            3     0.068%     98.640%
 5.0000-5.2000           12     0.272%     98.912%
 5.2000-5.4000            6     0.136%     99.048%
 5.4000-5.6000            5     0.113%     99.161%
 5.6000-5.8000            2     0.045%     99.206%
 5.8000-6.0000            0     0.0  %     99.206%
 6.0000-6.2000            2     0.045%     99.252%
 6.2000-6.4000            5     0.113%     99.365%
 6.4000-6.6000            7     0.159%     99.524%
 6.6000-6.8000            2     0.045%     99.569%
 6.8000-7.0000            1     0.023%     99.592%
 7.0000-7.2000            5     0.113%     99.705%
 7.2000-7.4000            1     0.023%     99.728%
 7.4000-7.6000            4     0.091%     99.818%
 7.6000-7.8000            2     0.045%     99.864%
 7.8000-8.0000            0     0.0  %     99.864%
 8.0000-8.2000            3     0.068%     99.932%
 8.2000-8.4000            1     0.023%     99.954%
 8.4000-8.6000            0     0.0  %     99.954%
 8.6000-8.8000            0     0.0  %     99.954%
 8.8000-9.0000            1     0.023%     99.977%
 9.0000-9.2000            0     0.0  %     99.977%
 9.2000-9.4000            0     0.0  %     99.977%
 9.4000-9.6000            1     0.023%    100.000%
 9.6000-9.8000            0     0.0  %    100.000%
 9.8000-10.0000           0     0.0  %    100.000%

 NO. OF VALUES EQUAL TO OR GREATER THAN 10.0000 = 0 (0.0 %)

                                              Figure 4
                          Computer-Generated Histogram of 4411 Atmospheric
                      24-Hour Lead Concentrations Measured in U.S. Cities in 1968

-------
[Plot not reproduced. Horizontal axis: LEAD CONCENTRATION X, 0.0 to 5.0.]

                                     Figure 5
                Distribution of 24-Hour Average Lead Concentrations,
                   National Air Sampling Network, 1968, Along With
                  the PDF and CDF of the Exponential Probability Model

-------
                                        A SYSTEMATIC APPROACH TO
                                     WATER QUALITY TREND ANALYSIS

                                     By Merlin H. Dipert* and Jon A. Abnytis
INTRODUCTION

     A  systematic approach using empirical models is
proposed  to investigate  water quality data trends.  A
trend  analysis  procedure,  with  supporting computer
programs, is presented.

     For ease and  economy  of operations, three  pro-
grams are used. One  retrieves  the data from STORET;
another fits the data; and  a third plots the theoretical
curves, the  points, and their confidence limits. In order
to simplify  the mathematics, the graph is considered as a
quality control chart.

      The  function  is  assumed to be nonlinear in time;
however, since the  program takes its own derivative, the
model  can  be  changed  by changing  one card  in the
fitting program and one card in the plot program.

DISCUSSION

      In investigating  water  quality  trends, errors of
measurement, which  differ from random errors in the
process, should be considered. In this paper, however, it
is assumed  that the points (individual points, replicated
 means, periodic means, or some similar values with their
calculated errors) have been obtained by some standard
 method. This discussion is limited to the errors of fitting
 these points.

      Consider  the  general n-dimensional mathematical
 model,

               y =  L(t) + F(t) + G(x) + S,

 where L(t) is the linear trend line, F(t) is the nonlinear
 component of time, G(x) is a function  of other para-
 meters (e.g., latitude, ambient temperature, or depth of
 measurement), and  S is the  seasonal parameter in the
 same sense as used by Box-Jenkins.†

      For  preliminary investigations,  the project  staff
 selected  a  station  in the Ohio River near Cairo, Illinois,
 and, using the specific model,

               y =  A  + Bt + C sin (D + Et),
and setting the period equal to one year, a 98 percent
reduction of variance was obtained for temperature, a
28 percent reduction for dissolved oxygen, and a
30 percent reduction for chlorides.
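
     A minimal sketch of fitting this model is given below. It is not the authors' fitting program, which handles a general nonlinear function by taking derivatives. Instead it relies on the period being fixed at one year: with E known, C sin(D + Et) = p sin(Et) + q cos(Et), so the model is linear in A, B, p, and q and can be solved by ordinary least squares. The 365.25-day period and the definition of reduction of variance as 1 minus the ratio of residual to total sum of squares are assumptions.

```python
import math

PERIOD_DAYS = 365.25  # assumption: "one year" expressed in days


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_seasonal_trend(days, values, period=PERIOD_DAYS):
    """Least-squares fit of y = A + B*t + C*sin(D + E*t) with E fixed."""
    e = 2.0 * math.pi / period
    rows = [[1.0, t, math.sin(e * t), math.cos(e * t)] for t in days]
    k = 4
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(k)]
    a, b, p, q = solve(xtx, xty)
    fitted = [a * r[0] + b * r[1] + p * r[2] + q * r[3] for r in rows]
    mean = sum(values) / len(values)
    total = sum((y - mean) ** 2 for y in values)
    residual = sum((y - f) ** 2 for y, f in zip(values, fitted))
    return {
        "A": a,
        "B": b,
        "C": math.hypot(p, q),   # amplitude, since p = C cos D, q = C sin D
        "D": math.atan2(q, p),   # phase angle in radians
        "E": e,
        "reduction": 1.0 - residual / total,
    }
```

     Applied to synthetic data generated from known parameters, the fit should return those parameters and a reduction of variance close to 1.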

     Since the results are preliminary and there are no
plots of dissolved oxygen and chlorides, and since the
model should be  changed, values are not given.  The
values obtained  for  centigrade temperatures, with their
standard errors, are given in Table I.

                      Table I
      Sample of Values Obtained From the Model

Parameters                  Values
Intercept                   16.1 ± .31
Slope                       -.00077 ± .00031
Amplitude                   12 ± .23
Phase angle                 -2.05 ± .02
Phase angle days            119 ± 1.1
Degrees of freedom          28
Variance                    1.4
Reduction in Variance       98%

      The theoretical curves, with the means, the
 95 percent confidence limit lines for the monthly
 means, and the individual points are presented in
 Figure 1.

 CONCLUSION

      An approach which can be used to compare water
 quality parameters, determine linear trends over time,
 and  calculate any of the parameters of the equations
 required by the physical scientist has been presented in
 this  paper. When the mathematical models are the same,
 this  approach will give  the necessary transformation to
 make  the  curves coincide. Perhaps  its most important
 use is to determine environmental quality trends
 (changes in parameters over time) to discover if enforce-
 ment  policies have had  a real effect on environmental
 quality rather than merely  a "perceived" effect due to
 seasonal or other cyclical phenomena.
   *Speaker
   †G.E.P. Box and G.M. Jenkins, Time Series Analysis: Forecasting and Control (San Francisco: Holden-Day, 1970).

-------
[Figure 1. Plot not reproduced. Legible labels: PARM NO 00010; STATNO 160035; horizontal axis: DAYS.]

-------
                        SAMPLE FILE CONTROL FOR THE AUTOMATED LABORATORY

                                             By Roman I. Bystroff
INTRODUCTION

     During this year, a team at Lawrence Livermore
Laboratory  has  engaged in  feasibility  studies for a
number of water quality related laboratories in the En-
vironmental Protection  Agency  (EPA). All  of  these
laboratories are participating in the Cincinnati Pilot Pro-
ject, an  effort directed at developing minicomputer-
based  laboratory  automation,  which  will  be
transportable to other EPA laboratories.

     The team has definite ideas about laboratory auto-
mation, and  it seemed  sensible  and proper from  the
inception  that once sample data  were in the computer,
some form of sample  file control would be used. Ideas
about sample file control have since evolved, particularly
as the regional laboratories emphasized its importance
and benefits.

     Today functional  specifications for a sample  file
controller are nearing completion; this  should fit  the
needs  of laboratories  with  quite diverse operations,
namely  Region V  and  National  Field  Investigation
Center (NFIC), Cincinnati. This paper will  first describe
the  functional  characteristics that are  proposed  to be
implemented and also the value  judgments inherent in
the implementation.

FUNCTIONS OF THE SAMPLE FILE CONTROLLER

     The  central  file in  Figure 1 is referred  to as the
active status file (ASF) since it will contain all in-
formation about  samples  on  work  in  progress. A
laboratory supervisor  uses  this file as a log-book. On
receiving information about a sample or related samples
by groups, he enters this information and also indicates
possible procedures, e.g., the test procedures. The  pro-
gram modules LOGIN and EDIT create these entries and
aid  by prompting  him  for  the  required  information.
Blanks are left for late information and filled in later.

     Once samples are  logged-in, an operator at some
work  station queries the file and selects those samples
which he can handle  and which are intended for him.
 The WORK SCHEDULE program module aids him in
 this function by passing all information required about
 the samples into  his station file. The station controller is
an  applications  program for the performance  of an
automated analysis procedure, including the essential
quality control. The output of the station is one result
for each sample test procedure, which is passed back to
update the active  status file. In addition,  the station
passes information about quality control statistics on
each sample to another file, the primary data file (PDF).
The file  is archival; this information  is required  only
when legal cases are called  or  for quality  control au-
diting. Another archival file is the completed sample file
(CSF), which  can  be accessed for laboratory statistics,
cross-references to the  primary  data, and so forth. The
function of the  LOGOUT program module is to aid in
the  transfer of sample  data from an  "active"  to  a
"completed" archival file.

     A  report generator module  aids the operator by
producing  a listing of any specified data items or by
producing standard formatted reports from any logically
structured files, that is, the ASF, CSF and PDF.

     The final functional module is a STORET converter
which prepares the sample  data stream  in a  format
suitable for transmission to a national data system.

THE LOGICAL STRUCTURE OF FILES
     In Figure 2, a tree structure with up to three levels
 of branches defines the organization of any file in the
 sample file controller (which does not include station
 files). The file is structured to reduce redundancy. On
 the branches are data items associated with descriptors.
 The descriptors and their meanings, as employed in the
 water samples active status file at Region V, are given in
 Table I. Note in Figure 2 that there are as many data
 items associated  with  one  descriptor  as there are
 branches at any one level. For example, at level 2, there
 might be six  sample numbers which have the common
 descriptor (DESC 3) SAMPLE,  and exactly six  sample
 type items with the descriptor SAMPLETYPE (DESC 4).
 This somewhat minimal restraint on the allowed struc-
 tures of the tree simplifies a structure specification. The
 presence of  structure  code permits a rather straight-
 forward means of restructuring  the files. A PDF with a
 different structure  than the water ASF  is shown in
 Figure 3.
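
      The tree organization described above can be sketched as nested records, each holding data items keyed by descriptor. This is an illustration only: the class and function names are hypothetical, the sample values are invented, and the descriptors used (GROUP, SAMPLE, SAMPLETYPE, ANALYSIS, OPERATOR) are those named in the text rather than a full file definition.

```python
from dataclasses import dataclass, field


@dataclass
class Branch:
    """One branch of the tree: data items keyed by descriptor, plus sub-branches."""
    items: dict
    branches: list = field(default_factory=list)


def items_for(branches, descriptor):
    """Collect the data item stored under one descriptor across sibling branches."""
    return [b.items[descriptor] for b in branches]


def check_level(branches, descriptors):
    """True if every sibling branch carries exactly the given descriptors."""
    return all(set(b.items) == set(descriptors) for b in branches)


# Level 1: one group. Level 2: six samples. Level 3: one analysis per sample.
group = Branch({"GROUP": 101})
for number in range(1, 7):
    sample = Branch({"SAMPLE": number, "SAMPLETYPE": "WATER"})
    sample.branches.append(Branch({"ANALYSIS": "PH", "OPERATOR": "JWD"}))
    group.branches.append(sample)
```

      Because every branch at a level carries the same descriptors, there are exactly as many SAMPLE items as SAMPLETYPE items, which is the restraint the text refers to.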

-------
                       Table I
              Examples of Water Sample
                   ASF Descriptors

Descriptor      Abbreviation    Description of the Data

Level 1
DISTRICT        [illegible]     District office
PROJECT         [illegible]     Study number
STUDY           -               Study description
GRSTATION       GRST            Station number of group
DUE DATE        DUE             Group due date
GP TAKEN        GPTAK           Group date taken
[illegible]     [illegible]     Group date of arrival
NUMSAM          NSAM            Number of samples
NUMPARM         NPARM           Number of determinations to be performed
ACCT                            Study account number
[illegible]     POL             Group pollution index [illegible]
[illegible]     GPDAT           Date group logged out
DIRECTOR        DIR             Director of field office
STRUCTURE       STRUCT          Type of data structure

Level 2
CRLNUM          CRL             CRL sample number
SAMPLE          [illegible]     STORET sample number
SAMLOC          SLOC            Code for location where sample taken
[illegible]     [illegible]     Date sample taken
SAM STATUS      SSTAT           Status of the sample
SAM ID          [illegible]     Sample identification
SAM TYPE        STYP            Type of sample
CUSTODY         CUS             Custody sample
SAM LAT         SLAT            Latitude of sample location
SAM LONG        SLON            Longitude of sample location
COMPOSITE       COM             Is the sample a composite
SAM TIME        STIM            Time sample taken
SAM WT          SWT             Sample weight [unit illegible]

Level 3
ANALYSIS        ANAL            Any parameter to be determined
VALUE           VAL             Concentration or value of a determination; [illegible]
METHOD          METH            Method of analysis
OPERATOR        OPER            Operator performing analysis
ANAL DATE       AN DAT          Date analysis completed
STATUS          STAT            Status of analysis
STORET          STOR            STORET [illegible]

      The use of a structure code is a way of satisfying
 the functional requirement that the user has the option
 to redefine what items he requires and their relationship
 to each other in the file controller. The importance of
 this in a pilot project and in the "tunability" of an
 implementation for diverse laboratories is that difficult
 reprogramming is avoided. The laboratories' SFC needs are
 evolving and will continue to evolve. For example, at
 Central Regional Laboratory, Region V (CRL-V), three
 structures (ASF types) are required for sample files
 because water samples, bubbler samples, and high-volume
 air filter samples have different numbers and kinds of
 descriptors. If biological samples were also included
 under the computer SFC, another type of active status
 file could be added.
     The  water  ASF  in Figure 2 and PDF in Figure 3
 illustrate  the use of  different keys in structured files.
 Samples  are effectively organized by  PROJECT  or
 GROUP and are often kept together in  batches for an
 analysis work session. The quality control data are keyed
 by the time-of-completion  because analysis procedures
 vary in length. Various sample determinations are related
 to a calibration  series (represented by coefficients of a
 least squares fit and the standard deviation in Figure 3).
 Retrieval of the primary data for a group  of samples will
 be fastest if the cross-indexed time-of-completions are
 first found in the CSF under its key: GROUP. This then
 is an example of a trade-off: data are logically organized
 to be easy to input without the overhead of reordering
 data.

 INTERACTIVE FEATURES  OF THE SAMPLE FILE
 CONTROLLER (SFC)

     The user's acceptance of a computer  is very high on
 the list of functional values. After all,  the laboratory
 chemist is not very burdened by choosing where to write
 his data on a standard  form sheet. Program aids to log or
 request data should be functionally equivalent to those
 of the form sheet. This means memory aids (prompts),
 defaults to normal requests, and the freedom  to enter,
 not enter or ditto data entries.
     The LOGIN program module illustrates these kinds
of features. A person logging in a group of samples may
take various paths as he fills in the sample information in
an ASF form sheet. If the SAMPLE TYPE happens to be
the same for all the samples in this group, he merely
types the description and ";ALL". If they differ, LOGIN
will prompt him for the entry for each individual
sample. Many times he can pass over an entry. It may be
a delayed piece of information needed only for a final
report, or one not always required. The LOGIN module
can be used again to fill in previously unspecified data.
In this case, the prompting would begin with the first
blank in the form record and proceed from there.
LOGIN would not be used to change existing items.

     When
-------
                    Table II
   Examples of Command Lines in the Sample File
           Controller Program Modules
Modulo
i i>u






KIK;






WOKk
S( III 1)1 LI


( itminantl

AIM!
1)1 1 1 II
( IIAV.I
CHANCI



SiH'fiHfr ur Output

II SI:MOI>5 KK.KOUI' HII.SAMI'I 1 1 IIIHl III 1?
II SI lie 1 ROM(.K(ni|' IIH.
SAMKX .NAMI'M III IIIKl1 >l> ID ll'-5>»'
SAMICX :SAMI'I.I III I'llRI' 211
SAMPI.I
Hi'

MM A
fl.KOI 1' IdliSAMI'l 1 :( I'.SIOIIY.OI'I RAIOK
I'KIM j




c.Koi'i' nil
SAMI'I 1 HISIOI1Y OPI MATOK
1 SIA-III J»H
1 SIA-ltl JWH
•> DIVIMOMNIHI
SI'MMAKY >
J (iKOliP l"l


1 INI)
PKINT
sinri -
RI-STORl: .


»H|:.<:il<>llMni
»10TIIRU20, 22:11*
added under the descriptor TEST for a specific set of
samples. The delimiter ";" is used to mean the logical
"AND" of the two specifications GROUP:101 and
SAMPLE:1 THRU 10,12. The forms ADD ... TO,
DELETE ... FROM, CHANGE ... TO are expected. If the
preposition is left out, it is assumed the operator wants
to be prompted by sample number, so that he may enter
individual values as in line 4. Note that if a descriptor
(SAMLOC) is left unmodified, it is assumed to mean
ALL items described by SAMLOC.
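
     The specification syntax described above can be illustrated with a small parser. This is a sketch under assumptions drawn only from the text: clauses are joined by ";", a clause is either DESCRIPTOR:items or a bare descriptor meaning ALL, and items are comma-separated values or "a THRU b" ranges. The function names and the returned dictionary are hypothetical and do not represent the SFC's actual internals.

```python
ALL = "ALL"


def parse_items(text):
    """Expand '1 THRU 10,12' into [1, 2, ..., 10, 12]; other items stay strings."""
    items = []
    for part in text.split(","):
        tokens = part.split()
        if len(tokens) == 3 and tokens[1] == "THRU":
            items.extend(range(int(tokens[0]), int(tokens[2]) + 1))
        elif len(tokens) == 1 and tokens[0].isdigit():
            items.append(int(tokens[0]))
        elif tokens:
            items.append(" ".join(tokens))
    return items


def parse_specifier(spec):
    """Split a specifier on ';' (logical AND) into {descriptor: items or ALL}."""
    selection = {}
    for clause in spec.split(";"):
        clause = clause.strip()
        if not clause:
            continue
        descriptor, colon, value = clause.partition(":")
        selection[descriptor.strip()] = parse_items(value) if colon else ALL
    return selection
```

     For instance, parsing "GROUP:101;SAMPLE:1 THRU 10,12" selects group 101 together with samples 1 through 10 and 12, while a bare "SAMLOC" selects all items under that descriptor.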

     The  report generator  commands  are  such  that
anything in the file can be listed. The called-for level 1
descriptors will appear in a row at the top of the report,
and level 2 and 3 descriptors will head columns of items
across the page. The order of appearance will be the
order in the command line. The example shown calls for
all samples and all custody and all operators. The list
would be shortened if, for example, the operator was
specified  OPERATOR:JWD.  Special standard forms
would be called by a more concise SUMMARY  com-
mand. These could be sample summaries or managerial
reports with  column totals, if  desired.  The WORK
SCHEDULE module  commands are more terse because
it is assumed that the only items of interest to specify
are the ANALYSIS and the SAMPLE numbers. Thus the
descriptors are left out, and only the item is stated.

CAPABILITIES FOR OFF-LINE DATA ENTRY

    An SFC is most cost effective if parallel records are
altogether eliminated. All the  test results,  including
those  which come from nonautomated tests (those used
less frequently, presumably), can and should be entered
into the computer files. There are two ways for accom-
plishing this. The EDIT program can  update a sample log
with a keyed-in result (see Figure 1). A better  procedure
would be for the user to write an applications program
in the BASIC language to prompt him for the data in a
systematic fashion and incorporate quality control. This
has the added  advantage that key-in errors can be re-
duced by checking against limits.

SUGGESTIONS  FOR  THE   EFFECTIVE  USE OF
SAMPLE FILE CONTROL

     In the  EPA there has been some thought  given to
making use  of preknowledge about  samples in a passive
sense. For example, Wayne Ott and W. Fairless have con-
sidered limits that  results might be compared against.
One use of such limits would  be as a coarse filter for
key-in errors (e.g., pH of >  14  is impossible for water).

     Geographic keying of lake and river improbably
high (or low) results may prove of value in some regions
if sampling  and seasonal variations  are not serious. The
SFC  can accommodate  these  files,  and the  WORK
SCHEDULE program could make these TEST-related
parameters  available  to the  analyst  for purposes  of
double checking. It is suggested that in the compliance
monitoring   (C-M)  program,  license limits  are  quite
specific and can be keyed by industry or industry-type
and test procedure. In the passive use of these limits, a
 bell could be made to ring if a limit were exceeded. This
 may  be unnecessary, but if one understands that auto-
 matic sample file control and lab automation compress
 the operational time scale for the laboratory, then these
 C-M  limits could be used for  closed loop options. For
 example, the reduction of the number of determinations
 in the laboratory could be achieved if a minimal quality
 control  scheme were employed when the result (plus
 uncertainty) is less than the C-M limit. If a C-M limit
 (either a peak-allowable or maximum daily allowable)
 were reached, a conservative quality  control scheme
 would be  employed. After all, once the STORET base-
 lines have  been   established,  normal  values  are  of
 incidental interest. The enforcement lawyers would have
 to address this question, but it seems that this scheme is
 not a bias of results, but rather like a sliding uncertainty
 criterion.
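
 The closed loop option suggested above amounts to a simple decision rule, sketched below. The function name and the scheme labels are hypothetical, and treating the limit as "reached" whenever the result plus its uncertainty is not below the C-M limit is an assumption about how the rule would be applied.

```python
def choose_qc_scheme(result, uncertainty, cm_limit):
    """Pick a quality control scheme from a result and its C-M limit.

    Returns "minimal" when the result plus its uncertainty is below the
    compliance monitoring limit, otherwise "conservative".
    """
    if uncertainty < 0:
        raise ValueError("uncertainty must be non-negative")
    if result + uncertainty < cm_limit:
        return "minimal"
    return "conservative"
```

 Under this rule the amount of quality control effort slides with how close a result comes to its limit, which is the sense in which the scheme acts as a sliding uncertainty criterion.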

       Other examples of closed loop effects of automation
   on laboratory operations are not entirely related
   to the subject of  this paper, but in response to  queries
   from National Environmental Research Center (NERC),
   Corvallis,  a  random access  auto sampling  wheel  is
   employed. It  makes possible the scheduling of,  and
   access to, samples  to be rerun in the same work session.

       The  laboratories in question employ custody pro-
   cedures. The presence of status conditions  and custody
   identification in the SFC files can be made effective as a
   means of  locating  the  physical  sample and  possessor.
   Similarly, the analyst can be encouraged to conform to a
   strict protocol by the SFC.

       Management   reports  were  not  a  high-priority
   function of the SFC design, but are a bonus which  can
  help operational evaluations.

   DESIGN  VALUES  FOR  A SAMPLE  FILE
  CONTROLLER

       At the stage  of thinking about  the pilot project
  laboratory as a system (that is, in terms of what is done
  and  what  flexibilities exist that should be  preserved
  when automated),  value judgments must be made as to
  what features are critical for the success of the system as
  a whole.

      Some of  these may be evident from the  preceding
  description. The data flow must  correspond to existing
  operations. This includes a query and input capability
  for the engineers, chemists and managers; therefore, the
  accesses to the data base must be interactive and easy to
  use.
     There are staffing constraints in the laboratories. A
 choice is made to allow for the problem-oriented chem-
 ist to engage in the solution to laboratory procedure
 problems. In the automation  of the  laboratory, the
 choice of an easy-to-use BASIC language will allow the
 user to be involved in the growth of the system applica-
 tions.  Sample file control has a lesser, but still evident,
 requirement to be adaptable to evolving needs.  Value is
 placed  on reprogrammability  by  the user.  This  is
 particularly valid in a pilot project in which options are
 to  be  kept  open  until  tested  and  evaluated  for
 usefulness, user acceptance, cost effectiveness, integrity,
 and so forth.
     The adaptability  of the sample file control is an
 important  value  which addresses the  objective  of
 transportability  to other EPA laboratories. A modular
 design  addresses this.  The  laboratory  automation
 programs and  the  sample  file  control  routines  are
 separate modules in the whole system. The options re-
 main open  for  resident  versus hierarchal (CPU-CPU)
 implementations.
     For the present it is believed that the Data General-
supported BASIC language can be used to write the SFC
in a way  that will satisfy the values mentioned so far.
The NOVA-840 has  the  capability in its  foreground/
background disc  operating  system. However,  the
response  time  of such an  implementation is  still  a
question. The job is to find the right trade-offs to
implement the functional values mentioned in this paper.
When this is done, the prime objective will have been
achieved:  to  have a  cost  effective  laboratory-based
system in which the  user is given authority  for his
portion  of the responsibility, that is, to efficiently pro-
duce quality analytical results.

-------
[Figure 1. Block diagram not reproduced. Legible labels: STORET CONVERTER (to national data system); REPORT GENERATOR; ACTIVE STATUS; ARCHIVAL FILES; WORK SCHEDULE; STATION CONTROLLERS; STATION FILES; ON-LINE INSTRUMENTS.]


-------
[Tree diagram not reproduced. Level 1: Desc 1 and Desc 2, each with Item 1. Level 2: Desc 3 and Desc 4, each with Items 1 through 6. Level 3: Desc 5 and Desc 6, each with Items 1 through 4.]

                                                     Figure 2
                                               General Data Structure

-------
[Figure 3. Diagram not reproduced. Legible contents:]

LEVEL 1:  STRUCTURE CODE, DATEDONE, TIME, COEFF1, COEFF2, STDDEV, METHOD, BLANK

LEVEL 2:  PROJECT   SAMPLE   OPR   NUMREP
          100       11       JWB   5
          100       12       CWD   4
          100       13       CWD   3

LEVEL 3:  REPLICATE   DILUTION   REASON-NOT-USED
          >15         1          2
          10.0        2          0
          9.2         2          0
          11.1        2          0
          5.1         2          3

-------
                        A DATA REDUCTION SYSTEM FOR AN AUTOMATIC COLORIMETER

                                    By K.V. Byram,* F.A. Roberts, and L.A. Wilson
   INTRODUCTION

       The Consolidated Laboratory Services Program
   (CLS) at the Pacific Northwest Environmental Research
   Laboratory (PNERL) has analyzed a large variety of
   samples supporting the research programs and, until
   recently, the EPA Region X programs. It has been found
   useful to employ the Technicon Autoanalyzer II for
   performing analyses where the volume of work to be
   done justified the automated system. At PNERL, the
   Technicon Automated System has been averaging about
   3000 determinations per week, running from three to
   five channels simultaneously.

       At this rate, it seemed imperative to automate the
   data capture process for information coming from these
   instruments if their full potential to reduce manpower
   requirements was to be utilized. Although Technicon,
   Inc. produces both a printer (producing output on
   adding machine width tape) and a teletypewriter (pro-
   ducing full width or punched paper tape), these outputs
   are of only minor advantage if a full complement of
   analytical quality control procedures is employed.

   WHAT THE  TECHNICON PRODUCES

       The Technicon autoanalyzer is a continuous flow
   reagent-mixing and  colorimeter  system which selects the
  sample to be analyzed from a series of cups. The liquid
   from  each cup is followed by a wash to discriminate
  samples. The cups are selected  at rates of around 50 per
  hour. After a uniform delay for reagent mixing and color
  development, a voltage  is  produced  from the  color-
  imeter. The  voltage is proportional  to the amount  of
  constituent in  the cup, which  is a peak above  the zero
  constituent in the wash (Figure  1).

      At PNERL, the minimum run consists of a series of
  standards, a series of unknowns (samples whose concen-
  tration is unknown), and a series of standards. If the run
  is extended,  another  unknown/standard series  follows
  the  first. The "standards" consist  of initial  blanks
  (distilled water) to clean out the system, followed by a
  series of at least eight standard solutions spread  over the
  range  of  interest,  followed by  trailing blanks.  The
  "unknowns"  include unknowns, replicates of unknowns,
  and  unknowns to  which  a known has been added
 (spikes). The normal run consists of about 20 standards
 in  two groups of ten,  and  105 samples, which, when
 combined and interspersed with blanks, spikes, and so
 forth, comes  to a  total of  156 peaks. It is this series
 which  is  presented  to the computer  programs  for
 reduction.

 NECESSITY FOR AUTOMATING DATA CAPTURE

     The primary reason for the decision to process the
 autoanalyzer output by computer was the volume of
 data expected. It is a reasonable job to reduce the out-
 put from the instrument run in production for an hour
 or two a day, spending an equal time reporting the data.
 But keeping up with the instrument running in produc-
 tion for 15 hours a day was another matter.

     In convincing the staff that  a more fully automated
 system would be useful,  considerations such as the avail-
 ability of a programming team  with experience in  this
 kind of work and of a predecessor program to perform
 parts  of the work weighed heavily. Not least important
 was that many of the simplifying assumptions on which
 Technicon, Inc.,  markets equipment for water analyses
 were not valid for this particular application. Since the
 colorimeter  output  is produced directly in absorbance
 units, it is  assumed  that the concentration can  be read
 directly from the chart or printout with automatic blank
 correction.  The assumptions were  not reliable  enough
 for routine use at the levels and precision needed.

 PROCESSING THE STANDARDS

    When running colorimetric analyses manually, it is
 normal procedure to run a set of standard solutions and
 plot their concentrations and absorbances on graph
 paper. Using the chemist's best judgment, a straight line
 is then drawn as closely as possible through the points.
 Using the standard curve, the analyst estimates the
 concentration of an unknown from its absorbance. The
 difficulty in automating this process is in duplicating the
 chemist's judgment. If concentration is plotted on the
 x-axis and absorbance on the y-axis, it is common, for
 example, for the curve to flatten out at the high end due
 to deviation from Beer's Law. If most of the unknowns
 are in the high concentration range, the system is recon-
 figured so that it is linear for the region of interest. If
   *Speaker
226

-------
the values for the unknowns fall in the linear region, the
higher standards are ignored in drawing the line. Another
problem occurs when one of the standard solutions is
inaccurate because of aging, error in preparation, or
contamination in handling. Knowing that this possibility
exists, the analyst simply ignores that point when
drawing the line.

     The immediately  apparent  way  to  automate this
process  is  to use a least  squares  curve  fitting routine
(linear regression). This technique, however, tends to
minimize the deviation of the standards from the line
equally at any level. If a typical deviation of a series of
standards such as .01, .02, .05, .1, .2, .5 is .015, the least
squares routine  will place the line through the points so
that the .015 deviation is typical at the .01 level, as well
as at the  .5 level. This results in  a very  accurate curve
near the  high end and a very imprecise one near the low
end.

     The  staff  has  used a modified least squares tech-
nique which seems to produce a "chemist compatible"
curve most of  the time.  It is based on the  chemist's
specification of two  quantities:  the minimum value
detectable  with  the system (herein called "precision")
and  the  percent accuracy  possible with near full-scale
values. If the precision is .01 and the accuracy 5 percent,
for example, the chemist reports his result, as do manu-
facturers of instruments such as voltmeters, as "within
.01 mg/l or 5 percent, whichever is greater." It turns out
that below a mg/l value of the precision divided by the
accuracy, which is .01/.05 or .2 in this case, the pre-
cision will be greater, while above that, the accuracy will
be greater (because 5 percent of .2 is .01).
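
     The "whichever is greater" rule can be written out in a few lines. The following is a minimal sketch in Python, not part of the original programs; the function names are illustrative only, and the default values are the .01 mg/l and 5 percent used in the example above.

```python
def allowed_error(conc, precision=0.01, accuracy=0.05):
    """Tolerance in mg/l: the precision or accuracy * conc, whichever is greater."""
    return max(precision, accuracy * abs(conc))


def crossover(precision=0.01, accuracy=0.05):
    """Concentration at which the two tolerances are equal (precision / accuracy)."""
    return precision / accuracy


if __name__ == "__main__":
    for conc in (0.05, 0.2, 1.0):
        print(conc, allowed_error(conc))
```

     Below the crossover concentration the fixed mg/l precision governs; above it the percentage accuracy governs.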

     The curve-fitting routine is an iterative process
which (I) uses a conventional technique to compute an
equation, c = mx+b, where x is the absorbance and c is
the concentration. Next, (A) the value of m so obtained
is used to compute a new value of b using only the
standards below the precision-accuracy ratio. Then, (B)
that value (essentially, the baseline) is subtracted from
each of the standards above the precision-accuracy ratio
to obtain a new value of m. Steps A and B are then
performed over and over again until the changes in either
b or m are trivial (0.1 percent) from iteration to
iteration.

     Once trial values of m and b are obtained with this
 method, the predicted concentration of each standard,
 using the equation, is compared with its known concen-
 tration, and a percent and mg/l deviation is computed for
 each one. (II) If any of the low standards have a greater mg/l
deviation than the precision, the worst one is deleted
from the set. If any of the high standards have a greater
percentage deviation than the accuracy, the worst one of
these is deleted also. The iterative procedure (IA and IB)
in the previous paragraph is then performed to yield a
new value of b and m. These two steps (I and II) are
repeated until the percentage and mg/l accuracy and
precision are satisfied, or until too many standards are
being removed.
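
     Steps I and II can be sketched as follows. This is an illustration in Python under stated assumptions, not the original program. The text does not say how b and m are recomputed in steps A and B, so the sketch assumes b is the mean residual of the low standards and m is a through-the-origin fit of the baseline-corrected high standards. The names `refine` and `fit_standards`, the `min_keep` limit standing in for "too many standards," and the scale used for the 0.1 percent test on b are likewise assumptions.

```python
def least_squares(points):
    """Ordinary least squares fit of c = m*x + b; points are (absorbance, conc)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_c = sum(c for _, c in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxc = sum((x - mean_x) * (c - mean_c) for x, c in points)
    m = sxc / sxx
    return m, mean_c - m * mean_x


def refine(points, precision, accuracy, tol=0.001, max_iter=100):
    """Step I: conventional fit, then alternate steps A and B until m and b settle."""
    ratio = precision / accuracy
    low = [(x, c) for x, c in points if c < ratio]
    high = [(x, c) for x, c in points if c >= ratio]
    m, b = least_squares(points)
    if not low or not high:
        return m, b
    for _ in range(max_iter):
        # Step A (assumed form): intercept from the low standards only.
        new_b = sum(c - m * x for x, c in low) / len(low)
        # Step B (assumed form): slope from the baseline-corrected high standards.
        new_m = sum(x * (c - new_b) for x, c in high) / sum(x * x for x, _ in high)
        settled = (abs(new_m - m) <= tol * abs(m)
                   and abs(new_b - b) <= tol * max(abs(b), precision))
        m, b = new_m, new_b
        if settled:
            break
    return m, b


def fit_standards(points, precision, accuracy, min_keep=4):
    """Step II: drop the worst low and worst high offender, then refit."""
    ratio = precision / accuracy
    kept = list(points)
    while len(kept) >= min_keep:
        m, b = refine(kept, precision, accuracy)
        low_bad = [(abs(m * x + b - c), (x, c)) for x, c in kept
                   if c < ratio and abs(m * x + b - c) > precision]
        high_bad = [(abs(m * x + b - c) / c, (x, c)) for x, c in kept
                    if c >= ratio and abs(m * x + b - c) / c > accuracy]
        if not low_bad and not high_bad:
            return m, b, kept
        for offenders in (low_bad, high_bad):
            if offenders:
                kept.remove(max(offenders)[1])
    raise ValueError("too many standards rejected; no satisfactory curve")
```

     Called with a list of (absorbance, concentration) pairs in which one standard is badly off, `fit_standards` would be expected to discard that standard and return the slope, intercept, and surviving standards.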

     A set of standards both precedes and follows each
set of unknowns, and the slope and intercept used to
compute a concentration for each unknown are linearly
interpolated between the two sets of standards. An example
output, showing the final computation for each set of
standards and a portion of the computations for the
unknowns, is shown in Figure 2.
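
     The interpolation between the two standards sets might look like the sketch below (Python, illustrative only). The text does not state what the interpolation variable is; the sketch assumes it is the position of the unknown within its batch, and the function names are hypothetical.

```python
def interpolate_calibration(before, after, position, count):
    """Interpolate (m, b) for one unknown.

    before, after: (m, b) pairs from the standards sets preceding and
    following the batch.  position: 1-based index of the unknown among
    the `count` unknowns in the batch (an assumed interpolation variable).
    """
    t = position / (count + 1)
    m = before[0] + t * (after[0] - before[0])
    b = before[1] + t * (after[1] - before[1])
    return m, b


def concentration(absorbance, m, b):
    """Apply c = m*x + b with the interpolated coefficients."""
    return m * absorbance + b
```

     An unknown near the start of the batch is thus computed almost entirely from the leading standards, and one near the end almost entirely from the trailing standards.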

REJECTING ERRONEOUS RESULTS

Inaccurate Standards

     If a satisfactory  curve cannot be fitted to either set
of standards surrounding a sample batch, no computa-
tions for unknowns between those sets are made. They
are resubmitted for analysis at the  same dilution factor
unless a particular unknown peak was off scale, sug-
gesting a greater dilution factor.

Shoulder or Off-Scale Peaks

     We must  turn to the hydraulics  of the system to
understand   another  reason  for  rejects.  The system
pumps the  sample through its inlet tube continuously;
this tube is moved from sample cup to wash to next
sample cup, and so on. The wash is necessary to clean
out  the system so  that the constituent in one cup does
not carry over to the  next and bias its results. In the case
of an  unusually strong unknown  followed by a weak
one,  the wash may  not  be  complete enough.  In  the
system, a peak which is less than 10 percent of the pre-
vious   one  is rejected  and  resubmitted  for  analysis
because of this problem. When an unknown  is strong
enough to   yield an  off-scale  reading, the  next two
following peaks are rejected. The off-scale peak is resub-
 mitted at a high dilution factor.
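
     The two hydraulic rules above can be expressed as a single pass over the peak heights. The following Python sketch is an illustration only; the flag names, the `full_scale` parameter, and the assumption that heights are measured above the baseline are not from the original programs.

```python
def flag_peaks(peaks, full_scale, ratio=0.10):
    """Flag each peak as OK, CARRYOVER, OFFSCALE, or AFTER_OFFSCALE.

    peaks: peak heights in run order (assumed to be heights above baseline).
    full_scale: reading at or above which a peak is treated as off scale.
    """
    flags = ["OK"] * len(peaks)
    for i, height in enumerate(peaks):
        if height >= full_scale:
            flags[i] = "OFFSCALE"
            # The next two peaks are rejected along with the off-scale one.
            for j in range(i + 1, min(i + 3, len(peaks))):
                if flags[j] == "OK":
                    flags[j] = "AFTER_OFFSCALE"
        elif i > 0 and flags[i] == "OK" and height < ratio * peaks[i - 1]:
            # Less than 10 percent of the previous peak: wash may be incomplete.
            flags[i] = "CARRYOVER"
    return flags
```

     Every peak flagged other than OK would be resubmitted, the off-scale peak itself at a greater dilution.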

Negative Results

     If a peak falls below the  lowest b value from the
standards computation, interference  is suspected, espe-
cially  if it falls far below it. In  our case, if the negative
concentration is greater  in absolute value than the pre-
cision, the unknown is resubmitted for analysis.
                                                                                                                227

-------
  Standards Extrapolation

       If the peak of a given unknown is higher than the
  highest standard peak, the unknown must be resub-
  mitted for analysis at a higher dilution factor. This
  procedure is required since the apparent concentration is
  greater than the highest standard, and extrapolation
  from the curve developed for the concentration being
  observed may produce invalid results.
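
     The negative-result and extrapolation checks reduce to two comparisons per unknown. A minimal Python sketch follows; the function name and the returned labels are illustrative assumptions, not the program's actual codes.

```python
def check_result(conc, peak, highest_standard_peak, precision):
    """Decide what to do with one computed unknown.

    conc: computed concentration in mg/l.
    peak: peak height of the unknown.
    highest_standard_peak: largest peak height among the standards.
    precision: minimum detectable value in mg/l.
    """
    if peak > highest_standard_peak:
        return "RERUN_HIGHER_DILUTION"   # would require extrapolation
    if conc < 0 and abs(conc) > precision:
        return "RERUN"                   # negative beyond the precision
    return "REPORT"
```

     With a precision of .01 mg/l, a result of -0.0021 would be reported, while a result of -0.0859 would be resubmitted.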

  Quality Control

       At regular intervals throughout a set of unknown
  samples, the following set of quality control solutions is
  interspersed:  a wash,  a known  solution, a wash,  an
  unknown, a repeat of the unknown, and the unknown
  spiked with the known solution. The spiking is such that
  the concentration measured in the spiked unknown
  should be equal to the concentration of unknown plus
  the concentration of the known which was added to it.
  It is required that the measured concentration of the
  spiked unknown lie within 10 percent (in most cases) of
  the calculated  concentration.  Since there  are two repli-
  cate analyses  of the unknown alone, the spike  must be
  within 10 percent of calculation using either replicate. If
  it is not within 10 percent, the sample is resubmitted for
  analysis.
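
     The spike check can be sketched as below (Python, illustrative only). The phrase "using either replicate" is read here as meaning that agreement with the calculation from at least one replicate is sufficient; that reading, and the function name, are assumptions.

```python
def spike_ok(spiked, replicates, known_added, tolerance=0.10):
    """True if the spiked result is within tolerance of (replicate + known).

    spiked: measured concentration of the spiked unknown.
    replicates: measured concentrations of the unspiked unknown.
    known_added: concentration contributed by the known solution.
    """
    for replicate in replicates:
        expected = replicate + known_added
        if expected != 0 and abs(spiked - expected) <= tolerance * abs(expected):
            return True
    return False
```

     A sample for which `spike_ok` returns False would be resubmitted for analysis.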

  Comments

      Certain one-letter  comments  are enterable by the
  analyst when  the computations are  made. Most  such
  comments refer  to  the  original disposition of the un-
  known solution,  such   as  M  for missing,  or X for
  improper  sample  preservation, etc.  These  comments
  cause the  program not to calculate a value for the sample
  because it was  unavailable  when the analysis  was per-
  formed. Instead, the sample is reported with no result,
  but with its comment. Another class of comment, such
  as D for unreliable, causes the program to resubmit the
  sample for  analysis. This  might   be used  when  the
  chemist notes something irregular in the peaks  which is
  not detectable by the program.

  NITRATE NITRITE ANALYSIS

      The nitrate analysis consists of a reduction of
  nitrate to nitrite with measurement as nitrite. Any ni-
  trite in the sample is also measured, so the final result is
  nitrate plus nitrite, expressed as mg/l of nitrogen.
  Although  many laboratories assume that the nitrate and
  nitrite are additive, this  is not the case in samples with
  high nitrite, such as a nitrifying sewage. The difficulty is
that some of the nitrite ion becomes further reduced to
ammonia  and is not  measured. The  resulting "nitrate
plus nitrite" value  is not a simple sum, but is something
like  "nitrate  plus part of the  nitrite." To compute the
nitrate from this value requires subtracting some fraction
of a  measured nitrite value, a fraction determined by
running nitrite standard solutions through the reduction
process. The analysis dictates, therefore, that the entire
standards  computations be carried out three times:  once
for nitrate alone,  once for nitrite alone,  and once for
nitrite passed through the reduction process to  obtain
values for  nitrate, nitrite, and of course, their sum.
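
     The correction described above amounts to the following arithmetic, sketched in Python. The names are illustrative; `recovery` stands for the fraction of nitrite that survives the reduction step, as determined from the nitrite standards passed through the reduction process.

```python
def nitrate_n(reduced_total, nitrite, recovery):
    """Nitrate (mg/l as N) from the reduced-channel result.

    reduced_total: "nitrate plus part of the nitrite" from the reduction channel.
    nitrite: nitrite measured directly, without reduction.
    recovery: fraction of nitrite still measured after reduction (0 to 1).
    """
    return reduced_total - recovery * nitrite


def nitrate_plus_nitrite(reduced_total, nitrite, recovery):
    """True sum of nitrate and nitrite (mg/l as N)."""
    return nitrate_n(reduced_total, nitrite, recovery) + nitrite
```

     When the recovery is 1.0 the reduced-channel result is already the simple sum, which is the assumption the text says fails for samples high in nitrite.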

EQUIPMENT CONFIGURATION

     Technicon autoanalyzers used are Model AA-II,
consisting of two sets of three channels each, with strip
chart, 4-inch wide paper tape printed output, and tele-
type with  print and paper tape output. The strip chart  is
a continuous  record of the series of peaks which come
from each  sample cup (Figure 1).  The printer  and
teletype outputs (Figure 3)  are digitized  values  of the
peaks, which  are  sampled  at  the same  time  that the
sample cups are changed. Phasing (that is, making sure
the peak is sampled at the top rather than the side) is a
function of the flow-through time for the analysis. Since
this is constant for any particular setup, it can be set by
the operator at the beginning of a run. The  phasing for
each channel is independent, so that the peaks need not
occur simultaneously in all channels. Each time a  peak  is
sampled for printing, a mark is made on the continuous
strip chart record  so  the result can be checked  later  if
necessary.

     In this system, the paper tape produced by each
teletype is hand-carried daily to the Oregon State
University (OSU) Computer Center. This is much less
expensive than feeding the information directly to the
computer by putting the teletype on line as it is printing
from the autoanalyzer. Since the output rate is so low
(about 2000 characters per hour) from the autoanalyzer,
hand-carrying the tape to the high-speed reader at the
computer center is cheaper than reading it in at teletype
speed from the laboratory.

     Once in the computer, the tape is processed inter-
actively with the operator at a terminal having consider-
able flexibility in editing the paper tape. This is
routinely useful if the chemists let the teletype run out
of paper tape or if they are unable to find a sample. It is
also useful in cases of power failure, inadvertent
omission of samples, insertion of extra samples, and so
forth.
228

-------
     In addition to the computational difficulties en-
countered in obtaining some sort of computer
compatible output directly from the autoanalyzer, there
is a sample identification problem. It is necessary, of
course, to assign both identification numbers to samples
and concentrations to standards before the computer
compatible absorbance values have meaning. If this is
done manually, at least some of the advantages of doing
the computations automatically are lost. The system ties
in with a laboratory sample handling system (SHAVES)
which existed before autoanalyzer computer programs
were developed. One of the autoanalyzer Technicon
system programs, LISTGEN, accesses the file that holds
the SHAVES analytical requests and produces a list of
samples to be analyzed, complete with replicates, spikes,
and standard solutions (Figure 4). The analyst then uses
the list to fill the sample cups which are input to the
autoanalyzer. When the results are computed, a copy of
the list which has been kept in the computer is used to
identify the results. The regular repetition of replicates,
spikes, blanks, and the descending values of peak heights
for the standards makes it easy to be sure that the right
identification goes with each peak.

-------
                                       value  of peak
                                       sampled here
                      Figure 1

             Strip Chart Recorder Output

-------
[Computer listing, partly illegible in this copy. A question mark denotes an
illegible entry. First standards set:]

BEGIN BATCH 9120817
ST   625      C/A = .60      1510

          CONC     PK HT     BASE      C/A       DEV       %DEV
 149    5.0000    8.7600    .4439       ?         ?          ?
 150    2.0000    3.1300    .4439    .7446    -0.3850   -19.2490
 151    1.0000    2.1400    .4439    .5896       ?          ?
 152     .8000    1.7300    .4439    .6220       ?          ?
 153     .6000    1.4200    .4439    .6147    -0.0131    -2.1833
 154     .4000    1.1600    .4439    .5585      .0306       ?
 155     .2000     .7800    .4439    .5950       ?          ?
 156     .1000     .6200    .4439    .5677      .0059     5.9082
 157     .0500     .5600    .4439    .4305      .0198    39.6676
 160      0        .3800      ?         ?     -0.0384      0

[A second standards set (ST 625 9/17/74, C/A = .6063) and the results listing
for batch 9120817 (09/18/74) follow. The results listing gives, for each
sample, a reported value, a value in parentheses, the peak height, and the
baseline (.44), with REPORT and RERUN columns and flags such as SHLDR.]

                                  Figure 2
                   Computer Output From Processing an Autoanalyzer Run
                                                                           231

-------
[Teletype listing. Each line carries a sequence number (027 through 085 in
this excerpt) followed by three code-and-reading pairs, for example
71 0.31, 25 0.46, 65 0.61. The readings under code 71 are nearly constant
at 0.31; those under codes 25 and 65 vary from about 0.24 to 9.]

            Figure 3
-------
              Figure 4
Computer Prepared, Analyst Annotated,
      List of Analyses to Be Run
                                                                    233

-------
                                       A CHEMICAL INFORMATION SYSTEM

                                                 By Stephen R. Heller
       This presentation will discuss the ways in which the
  Agency is using the National Institutes of Health (NIH)
  Chemical Information System (CIS) and ways in which
  the CIS can be modified and/or expanded for further
  Agency use (Figure 1).

       Figure 2 is a list of the current components of the
  CIS. At present, the Agency use of the CIS is limited to
  two programs, MSSS and MLAB. Both of these are de-
  tailed by others at this workshop.

      One component of the CIS that is not  now being
  used, but which is  of  enormous  potential use to  the
  Agency,  is  the chemical  substructure  search  (SSS).*
  Substructure search allows the scientist to query the file
  for data of interest to him, including mass spectra, infra-
  red data, pesticide registration data, toxic substance
  data, and so forth. His question is specified by a chemi-
  cal structure, the language of chemistry. The answers are
  the data whose structures contain the specified
  substructure.

      At  present, the  Management  Information  Data
  Systems  Division (MIDSD) is making a  pilot  version of
  the SSS option operational for testing and evaluation on
  mass  spectral  and  pesticides files. In order to  do this,
 MIDSD has  obtained a  contract  with the  Chemical
 Abstracts  Service (CAS) in Columbus, Ohio to provide
 the CAS  registry data for these two files. Included in the
 registry data is a computer readable representation of a
 chemical  structure  called  a connection  table. The SSS
 program searches this connection table to find the struc-
 tures that answer the user's query.
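
      The idea of searching a connection table can be illustrated with a toy example. The Python sketch below is not the CAS registry format and not the algorithm used by the SSS program; it assumes a simplified table (a list of element symbols plus a dictionary of bond orders, hydrogens omitted) and a plain backtracking match, and all names in it are hypothetical.

```python
def contains(structure, query):
    """True if `query` occurs as a substructure of `structure`.

    Each argument is (atoms, bonds): atoms is a list of element symbols,
    bonds maps an (i, j) pair of atom indexes to a bond order.
    """
    s_atoms, s_bonds = structure
    q_atoms, q_bonds = query

    def bond(bonds, i, j):
        return bonds.get((i, j)) or bonds.get((j, i))

    def extend(mapping):
        # mapping[k] is the structure atom assigned to query atom k.
        k = len(mapping)
        if k == len(q_atoms):
            return True
        for cand in range(len(s_atoms)):
            if cand in mapping or s_atoms[cand] != q_atoms[k]:
                continue
            if all(bond(q_bonds, k, prev) is None
                   or bond(q_bonds, k, prev) == bond(s_bonds, cand, mapping[prev])
                   for prev in range(k)):
                if extend(mapping + [cand]):
                    return True
        return False

    return extend([])


ACETONE = (["C", "C", "C", "O"], {(0, 1): 1, (1, 2): 1, (1, 3): 2})
ETHANOL = (["C", "C", "O"], {(0, 1): 1, (1, 2): 1})
CARBONYL = (["C", "O"], {(0, 1): 2})
```

      A query for the carbonyl fragment would retrieve acetone but not ethanol, regardless of the names under which either compound is filed.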

      In addition to the mass spectral and pesticide files,
 MIDSD is also obtaining registry data for the oil and
 hazardous materials file. Other organizations outside
 EPA are also putting the CAS registry data into their
 files. The ninth edition of the Merck Index, the National
 Institute of Occupational Safety and Health (NIOSH)
 list of 25,000 toxic substances, and the National Library
 of Medicine (NLM) Toxline files are examples of data
 bases that have or will have registry data as part of their
 files. As more groups within and outside the Agency get
 registry data, it will be possible to generate and examine
 interdisciplinary data bases from a structural point of
 view. One could ask: "For what compounds in the
 Merck Index are there mass spectra data available?" If
 registration data and other related administrative data
 are available with registry data, one can begin to see if
 certain structures are known pesticides, toxic substances,
 hazardous materials, and so forth. Structures appearing
 in different files can be readily located since the search is
 based on a picture, the chemical structure, not a name.
 Names, especially trade names, often tell little about the
 structure of the material. Indeed, the same material in
 different files can have many different chemical and
 trade names. One enormous source of data with CAS
 registry data is the CAS literature files. At present CAS
 is producing its current five-year subject index with
 registry data, thus enabling one to search for structural
 information when searching the literature. Anyone with
 any experience with chemical nomenclature and the
 CAS habit of changing its nomenclature should warmly
 greet such a system.

      These examples all deal with existing files. A file of
 data which is not part of the CIS is infrared (IR) data. A
 MIDSD-sponsored feasibility study of an IR search
 system, to be published next month, will include the
 recommendation for the inclusion of CAS registry data
 as part of any possible future IR system. An IR search
 system will enable the scientist to have a complementary
 tool for the identification of organic materials. With the
 registry data, one will be able to have a link between the
 IR and mass spectra files, rather than being required to
 perform separate searches.

     The SSS and registry data will allow the Agency to
take specific parts of the CIS relevant to Agency needs,
build up individual data bases, and finally be able to
conveniently and inexpensively link them together.

     As  the pilot  SSS  project  proceeds,  the  author
would  be happy  to  meet  with any  groups interested in
discussing their specific files and needs and in discussing
how MIDSD can  help coordinate this effort to link the
Agency's chemical files.
*See R.J. Feldmann and S.R. Heller, J. Chem. Doc., 12 (1972), 48. Also see R.J. Feldmann, Chapter 3, Computer Representation and
 Manipulation of Chemical Information, ed. W.T. Wipke, S.R. Heller, R. Feldmann and E. Hyde (New York: John Wiley, 1974).
234

-------
                             GRAPHICS
                             TERMINAL
LITERATURE


SUBSTRUCTURE
SEARCH


DATA/
PROPERTY
FILES
                             GRAPHICS
                             TERMINAL/
                             PRINTER
                              Figure 1
                      Chemical Information System
DCRT/CIS CHEMICAL INFORMATION SYSTEM

TO RUN A PROGRAM IN THE CIS. TYPE THE NUMBER NEXT TO THE NAME OF THE PROGRAM

LITERATURE:

CBAC - 1

STRUCTURE:

SUBSTRUCTURE SEARCH - 2      WLN GENERATION - 3
WLN TO STRUCTURE - 4      WLN GENERATION USING THE TABLET - 17
DATA:
MASS SPEC - 5
ANALYSIS:
NMR - 6
XYZ COORDINATE GENERATOR - 7    XRAY COORDINATE REFORMAT - 8
XRAY MODELING SYSTEM - 9      XRAY CRYSTAL BIBLIO SYSTEM - 10
CNDO/INDO - 11      MINDO - 12    ORTEP - 13
GINA NMR ANALYSIS - 14     ESR SPECTRUM SIMULATION - 15
MLAB CURVE FITTING/MODELING - 16

USER CHOICE:
                               Figure 2
                            CIS Components
                                                                                235

-------
                                         MASS SPECTRAL SEARCH SYSTEMS

                                                 By John M. McGuire
   INTRODUCTION

        Setting and enforcing water quality criteria, deter-
   mining the fate and effects of water pollutants, and de-
   veloping  optimum  control  measures  require   the
   capability  for identifying specific organic pollutants.
   Table I dramatically illustrates the need to determine the
   composition of industrial wastes by chemical analysis.
   The  compounds in  the left column are those suspected
   by the discharger,  through his  knowledge of products,
   raw materials, and  processes, to be in his effluent.  The
   right column, based on chemical analysis of the effluent,
   contains over twice  as many compounds.

                       Table I
   Comparisons of Compounds Reported by Discharger and
   Compounds Identified by EPA in an Industrial Discharge
Products or Raw Materials Reported
    [seven entries illegible]
    isopropylbenzene
    [seven entries illegible]
    naphthalene*
    benzyl alcohol
    [one entry illegible]
    [illegible] naphthalene*
    cresol isomer
    acenaphthalene
    fluorene
    1,3-diphenylpropanol

Identified
    [one entry illegible]
    styrene*
    [illegible] styrene*
    [six entries illegible]
    both methylnaphthalene isomers*
    a-methylbenzyl alcohol
    phenol*
    methylethylnaphthalene
    acenaphthene
    methylbiphenyl isomer
    two phthalate diesters

    *[illegible] with a standard.

      The  identification  technique must be highly spe-
  cific since thousands of compounds of varying degrees of
  toxicity must be considered. Because some organic com-
  pounds are toxic to aquatic organisms at concentrations
  below 10 µg/l, the technique must also be sensitive.
      Gas-liquid chromatography (GC) has adequate sen-
 sitivity and reproducibility to provide excellent quanti-
 fication for volatile organics when the identity of the
 chemical is known. However, pollutant identifications
  obtained by comparison of relative retention times are
  subject to interferences and are questionable for the un-
  known mixtures found in natural waters. Alternatively,
  GC may be used as a preliminary separation technique
  and the effluent may be introduced into a different type
  of instrument for qualitative identification.

      During the 1960's, workers at the Southeast
  Environmental Research Laboratory (SERL) showed the
  feasibility of using gas chromatography interfaced with
  low resolution mass spectrometry (MS)  for unknown
  identification. They  used an Hitachi RMU-7  mass spec-
  trometer  adjusted for maximum  sensitivity, together
  with manual chart reading and  data  reduction.  Many
  hours of applied effort are required to gather data, read
  charts, correct background, construct a data presenta-
  tion for interpretation (Figure 1), and interpret the data.
  Because of the large  amount of time involved between
 data collection and interpretation, manual GC/MS is too
 slow for effective identification of water pollutants.

      Most time-limiting factors in manual GC/MS can be
 accomplished by a computer. To evaluate  the feasibility
 of this approach for EPA laboratories, SERL and  the
 Methods Development and  Quality Assurance Research
 Laboratory  in  Cincinnati  (MDQARL)  purchased
 identical  computerized GC/MS systems in 1971.  Since
 that time, more than 20 laboratories have chosen essentially
 the same system. A PDP-8 minicomputer in this
 system controls the operation of a quadrupole mass
 spectrometer and associated output devices.

     The applied time required to gather data and pre-
 pare it for interpretation was decreased  by more than an
 order  of  magnitude. The  time  required  for  inter-
 pretation, however, also had to be shortened.

 COMPUTERIZED INTERPRETATION SYSTEMS

     Computer-controlled  GC/MS  produces  many
 spectra  from   a  single environmental sample.  Inter-
 pretation of these spectra might be done in a variety of
 ways.

 Matrix Inversion

     Matrix inversion can be used to give both qualitative
and quantitative information if the sample and all
impurities are known to be from a small and well-defined
universe. This condition exists in many industrial plants.
It does not exist for environmental samples.
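
     As a minimal sketch of the matrix approach, assume a universe of
two known components whose reference spectra are measured at the same
masses. The mixture spectrum is then a linear combination of the
reference spectra, and the amounts follow from solving the linear
system. The reference intensities and the function name solve_linear
are illustrative assumptions, not data or code from the systems
described here.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations act on a and b together.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("reference spectra are not independent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Hypothetical reference intensities at two masses (rows) for two
# known components (columns); the numbers are illustrative only.
reference = [[100.0, 20.0],
             [10.0, 80.0]]
# Synthetic mixture: 0.5 part of component A plus 2.0 parts of component B.
mixture = [100.0 * 0.5 + 20.0 * 2.0,
           10.0 * 0.5 + 80.0 * 2.0]

amounts = solve_linear(reference, mixture)
```

The approach breaks down as soon as the sample contains a component
outside the assumed universe, which is why it does not carry over to
environmental samples.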

Manual Interpretation

     Manual interpretation involves assignment of prob-
able compositions to  fragments. The assignments  are
then used to predict the  structure of the unfragmented
molecule. As a very simple example, Figure 2 shows the
assignments in an acetone spectrum. The predominant
features in this spectrum are due to mass unit losses of
1 (H) and 15 (CH3). Based on the assignments, the molecular
structure is strongly suggested. This method is reliable,
but it is slow.

Manual Peak Matching

     Manual peak matching involves comparing the in-
tensity  for each spectral peak with those of literature
spectra. Lists such as those of  the Mass Spectral Data
Centre and of Cornu and Massot give the fastest manual
search results; however, use of these lists is nonetheless a
slow process. These lists are abbreviated to include only
the most intense peaks.

Computerized Spectra Matching

     Computerized spectra matching is a practical and
rapid way  to obtain  a probable interpretation. In order
to  decrease  matching  times, the  spectrum is  usually
abbreviated  by one  of several  different methods.
Biemann and his associates pioneered the
2-most-intense-in-every-14 method illustrated in Figure 3, which
shows the Biemann abbreviation for acetone. A compari-
son of Figures 2  and 3 shows that no significant informa-
tion has been lost because of the abbreviation.
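
     The abbreviation can be sketched as follows. The sketch keeps the
two most intense peaks in each 14-mass interval. The starting mass of
the first interval (6 here), the tie-breaking rule, and the peak
intensities are assumptions made for illustration; they are not taken
from Figures 2 and 3.

```python
def biemann_abbreviate(spectrum, width=14, keep=2, start=6):
    """Keep the `keep` most intense peaks in each `width`-mass interval.

    spectrum: dict mapping integer mass -> intensity.
    start: lowest mass of the first interval (an assumption here).
    """
    windows = {}
    for mass, intensity in spectrum.items():
        if mass < start:
            continue
        windows.setdefault((mass - start) // width, []).append((mass, intensity))
    abbreviated = {}
    for peaks in windows.values():
        # Highest intensity first; ties go to the higher mass (assumed rule).
        peaks.sort(key=lambda p: (p[1], p[0]), reverse=True)
        for mass, intensity in peaks[:keep]:
            abbreviated[mass] = intensity
    return dict(sorted(abbreviated.items()))


# Illustrative acetone-like peak list (intensities are made up).
spectrum = {41: 5, 42: 8, 43: 100, 44: 3, 57: 2, 58: 30, 59: 1}
short = biemann_abbreviate(spectrum)
```

With these inputs the interval covering masses 34-47 keeps 43 and 42,
and the interval covering 48-61 keeps 58 and 57; the base peak and the
molecular ion both survive.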

     Computer  matching includes many approaches. All
require a large library of reference spectra for best per-
formance. The following are some of the most useful:

         The five most  intense peaks in the spectrum
          may be compared  with the five most intense
          peaks  in each spectrum of a reference  file. The
          method is very  fast, but is not accurate.

          The Biemann  abbreviation  of the unknown
          and reference spectra may be compared with
          intensity information coded to one bit/peak.
          This program is very rapid on a minicomputer
          with   a disk but  can be  slow  if reference
       spectra are stored on tape. It is satisfactory for
       files of less than 600 spectra, but it is not suf-
       ficiently selective for larger files.

       The  National  Institutes  of Health  (NIH)
       system developed by Heller et al. is an interactive
       FORTRAN program to perform successive
       intensity screening based on
       user-selected peaks. The number of spectra
       that pass the screen is printed after each
       screening. Each screening pass is very rapid on
       the PDP-10. The program includes many user
       options (Table II). At present, development is
       being continued jointly  by EPA, NIH, and
       Food and Drug Administration (FDA).
                    Table II
              NIH System Options
TO SEARCH FOR PEAKS, TYPE PEAK
TO SEARCH FOR LOSSES, TYPE LOSS
TO SEARCH FOR MSDC CODES, TYPE CODE
TO SEARCH FOR MOLECULAR WEIGHT, TYPE MW
TO SEARCH FOR MOLECULAR FORMULA, TYPE MF
TO SEARCH FOR PEAKS AND LOSSES, TYPE PL
TO SEARCH FOR PEAKS WITH MSDC CODES, TYPE PC
TO SEARCH FOR PEAKS WITH MW, TYPE PMW
TO SEARCH FOR PEAKS WITH MF, TYPE PMF
TO SEARCH FOR LOSSES AND MSDC CODES, TYPE LC
TO SEARCH FOR LOSSES WITH MW, TYPE LMW
TO SEARCH FOR LOSSES WITH MF, TYPE LMF
TO SEARCH FOR MSDC CODES WITH MW, TYPE CMW
TO SEARCH FOR MSDC CODES WITH MF, TYPE CMF
TO SEARCH FOR MW WITH MF, TYPE MWMF
TO PERFORM A SIMILARITY COMPARISON, TYPE SIM
TO PRINT OUT PEAKS/INTENSITIES AND SOURCE, TYPE SPEC
TO VIEW MICROFICHE, TYPE FICHE
TO PLOT SPECTRA ON DISPLAY TERMINAL, TYPE PLOT
TO COMMENT/COMPLAIN, TYPE CRAB
TO ENTER NEW SPECTRA, TYPE DATA
TO READ THE NEWS OF THE SYSTEM, TYPE NEWS
TO LIST THE MSDC CODES, TYPE LIST
TO EXIT FROM THE PROGRAM, TYPE OUT
USER RESPONSE
        The EPA/Battelle system, which is essentially
        a faster version of a Massachusetts Institute of
        Technology (MIT) program, consists of both
        PDP-8 programs that automatically abbreviate
        and transmit spectra  and also a  CDC 6400
        spectrum matching program.  This assembly
        language program is very rapid and provides a
        list of the most probable identifications
        ranked by how closely the unknown and reference
        spectra match. It is essentially an automatic
        system and provides only a few options.

            A very rapid FORTRAN program for the
            CDC  6400 has been developed by Clerc  and
            his coworkers in Zurich.  It involves automatic
            screening and ranking of the matches based on
            a  weighted  binary-coded signature. Methods
            for assigning the optimum weighting factors
            are still being improved.
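
     The first approach in the list above, comparison of the five most
intense peaks, can be sketched as follows. Scoring by the number of
shared masses is an assumption chosen for simplicity, and the two
reference entries are toy data rather than library spectra.

```python
def top_peaks(spectrum, n=5):
    """Return the set of masses of the n most intense peaks."""
    ranked = sorted(spectrum.items(), key=lambda p: p[1], reverse=True)
    return {mass for mass, _ in ranked[:n]}


def five_peak_search(unknown, library, n=5):
    """Rank library entries by how many top-n masses they share."""
    target = top_peaks(unknown, n)
    scores = [(len(target & top_peaks(ref, n)), name)
              for name, ref in library.items()]
    scores.sort(key=lambda s: (-s[0], s[1]))
    return scores


# Part of the p-ethyltoluene peak list used in Figure 8.
unknown = {77: 18, 79: 16, 91: 12, 103: 10, 105: 100, 106: 9, 120: 38}
# Toy reference file; intensities are illustrative, not library data.
library = {
    "reference A": {77: 15, 79: 14, 91: 10, 105: 100, 119: 5, 120: 30},
    "reference B": {41: 5, 42: 8, 43: 100, 57: 2, 58: 30},
}
ranking = five_peak_search(unknown, library)
```

Because only masses are compared, isomers with the same five strongest
peaks score identically, which illustrates why the method is fast but
not accurate.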

  Learning and Self-Training

       Learning and self-training computer approaches
  offer  the  best  long-range hope of developing a com-
  puterized  identification program that will not require an
  extremely large  reference library.

      The DENDRAL program of the Stanford group is a
  360/67 program that does not require a  large library. It
  deduces all possible structures for a given elemental com-
  position and  exhaustively  tests each structure by pre-
  dicting a spectrum for it and comparing the  predicted
  spectrum to the  unknown one. It is slow.

      The Self-Training Interpretive and Retrieval System
  (STIRS) program is a PDP-11/45  program,  developed
  under NIH and EPA sponsorship, that compares library
  spectra to the unknown spectrum for various classes of
  spectra-derived data. The best matches for each class are
  ranked. This system can lead  to identifications even if
  the reference  library does not include a  closely related
  spectrum.

  EPA SPONSORED IDENTIFICATION PROGRAMS

      Three of the foregoing are of particular interest be-
  cause of EPA's involvement in their  development. These
  are the NIH system,* the EPA/Battelle system,† and
  STIRS.‡

  NIH System

     The basic NIH search program, PEAK, permits the
  user to select an intensity ranging factor and from 0 to
  .1 mass-intensity  pairs (peaks) in each 14-mass interval.
  After  each  peak  input,  the program  searches the library
 mass file for all intensities lying within a window determined
 by the ranging factor. It then tells the user the
 number of compounds from the reference library that
 have passed the screening and gives him the option of
 obtaining a listing of these compounds. If the number is
 high, the user can input another peak, which may be
 from the same 14-mass interval or from a different one.
 This process is repeated until the user requests a listing.
 An example of this sequential screening (Figure 4) shows
 a typical search, which was terminated after the number
 of references was reduced to less than 20. Five of the seven
 references that passed the screening are isomers of the
 correct compound.
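
     The sequential screening can be sketched as below. The sketch
treats the ranging factor as an absolute window of plus or minus 10
intensity units, which is an assumption; the NIH program's actual
window rule is not given here. The requests are the four inputs of
Figure 4, and the four-entry library is toy data.

```python
def screen(candidates, library, mass, intensity, ranging):
    """Keep candidates whose intensity at `mass` is within +/- ranging."""
    return [name for name in candidates
            if abs(library[name].get(mass, 0) - intensity) <= ranging]


def peak_search(library, requests, ranging=10):
    """Apply each (mass, intensity) request in turn, as in the dialog."""
    candidates = list(library)
    history = []
    for mass, intensity in requests:
        candidates = screen(candidates, library, mass, intensity, ranging)
        history.append(len(candidates))  # the "# REFS" count after each input
    return candidates, history


# Toy reference file (intensities are illustrative assumptions).
library = {
    "ref 1": {79: 15, 91: 12, 105: 100, 120: 35},
    "ref 2": {79: 5, 91: 60, 105: 100, 120: 40},
    "ref 3": {77: 50, 105: 100},
    "ref 4": {43: 100, 58: 30},
}
requests = [(105, 100), (120, 38), (91, 11), (79, 16)]
survivors, history = peak_search(library, requests)
```

Each request can only shrink the candidate list, so the counts printed
after successive inputs never increase, as in Figure 4.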


     An important variation of the  PEAK program is the
 LOSS program. In this program, each input is a mass loss
 and intensity pair. Input losses are calculated by sub-
 tracting the mass value from the molecular weight of the
 compound. Figure 5 illustrates the LOSS search for the
 same spectrum that was  used  to  illustrate the PEAK
 search.  Two of  the ten references are isomers of the
 correct  compound.
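
     The conversion from peaks to losses is simple arithmetic. The
sketch below uses the molecular weight (120) and five peaks of the
p-ethyltoluene spectrum and writes each loss with a leading minus
sign, as typed in Figure 5. The function name is hypothetical.

```python
def to_losses(peaks, molecular_weight):
    """Convert (mass, intensity) pairs into (loss, intensity) pairs.

    The loss is the molecular weight minus the peak mass, written as a
    negative number to match the input convention of Figure 5.
    """
    return [(-(molecular_weight - mass), intensity)
            for mass, intensity in peaks]


peaks = [(105, 100), (91, 12), (79, 16), (77, 18), (65, 7)]
losses = to_losses(peaks, 120)
```

The resulting pairs are the five inputs shown in the LOSS dialog.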


      The NIH system presently contains three other primary
 spectra retrieval modes; these are searches for
 molecular weight, molecular formula, and compound
 type. The molecular weight search is a screening for all
 compounds in the file that have a given molecular
 weight; the molecular formula search can be used to give
 all compounds having the stated formula, all compounds
 having stated elements, or all compounds having a stated
 number of atoms of selected elements. The compound
 type search involves screening for 86 compound classification
 codes. All of the screening and retrieval modes
 can be paired; therefore, it is possible, for example, to
 perform a LOSS search on a file that consists only of
 one compound type or a PEAK search on a file of only
 one molecular weight.
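
     Pairing two retrieval modes amounts to intersecting the sets of
references that pass each screen. In the hedged sketch below the
molecular weights follow Figures 4 and 5, while the peak intensities,
the window of plus or minus 10 units, and the function names are
assumptions.

```python
# Toy reference file. Molecular weights follow Figures 4 and 5; the
# peak intensities are illustrative assumptions, not library values.
library = {
    "isopropylbenzene": {"mw": 120, "peaks": {105: 100, 120: 30}},
    "cumene hydroperoxide": {"mw": 152, "peaks": {105: 100, 120: 35}},
    "cyclohexene": {"mw": 82, "peaks": {67: 100, 82: 40}},
}


def molecular_weight_screen(library, mw):
    """References whose molecular weight equals mw."""
    return {name for name, ref in library.items() if ref["mw"] == mw}


def peak_screen(library, mass, intensity, ranging=10):
    """References whose intensity at `mass` is within +/- ranging."""
    return {name for name, ref in library.items()
            if abs(ref["peaks"].get(mass, 0) - intensity) <= ranging}


# A PEAK search restricted to a file of one molecular weight.
paired = molecular_weight_screen(library, 120) & peak_screen(library, 105, 100)
```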
     When a  search has been narrowed to a reasonably
small number of compounds, the NIH program permits
ranking the fit of each of these to the input spectrum by
means of a similarity comparison. Figure 6 shows dialog
and output of this ranking routine for the references
found  in the  PEAK search of Figure 4. The five isomers
all have low dissimilarities.
 *User's Manual, Mass Spectral Search System, EPA, Washington, D.C. (May 1974).
 †User's Manual, Battelle Mass Spectral Matching System, Battelle Columbus Laboratories, Columbus, Ohio (1974).

 ‡K.S. Kwok, R. Venkataraghavan, and F.W. McLafferty, J. Amer. Chem. Soc., 95, 4185 (1973).

-------
     When the number of the desired spectrum has been
located in the reference file, the program SPEC can be
called to output a listing of the entire spectrum in a
typed format.

     Two options require certain hardware at the remote
terminal. These  are  the  PLOT routine, which permits
retrieval and display of any spectrum in the reference
library, and the FICHE routine, which locates a desired
spectrum on a microfiche copy of the reference library.
The  PLOT routine for  PDP-10  users  will  permit
CALCOMP  or ZETA plotted copy, or will display on
any of six graphics terminals. The FICHE routine either
provides  information for manual location  or drives a
COMCARD viewer to display the desired spectrum.

     Other present NIH options permit the user to enter
data, obtain system  news,  and comment to the system
manager.

EPA/Battelle Matching Program

     The EPA/Battelle matching program, taking advan-
tage  of  the  high  information redundancy of  mass
spectra, is based on the  two most intense peaks in every
14 mass units. There are four main steps in the matching
process:

         Screening based on molecular weight range

         Screening based on the most intense peak of
          the unknown spectrum

         Presearching based on the spectrum family

         Ordering the best matches based  on peak-by-
         peak  comparison of the  unknown spectrum
         with  those reference  spectra   passing  the
         presearch.

     To  reduce  operation  time and eliminate human
errors and prejudices in  selecting, formatting, and trans-
mitting data, PDP-8  utility routines  transfer  input
spectra data directly from the  user's remote PDP-8 to
the central CDC 6400. These programs have been evaluated
and improved during the past year. A match
against the present data base of 9000 spectra
(8400 general organic spectra culled from the Aldermaston
collection and 600 pollutant spectra from SERL
and Battelle) requires approximately 40 seconds of
elapsed time.
     The similarity index (SI) gives the user an imme-
diate indication of the quality of the matches. The "best
hit" will be  the  first identification; the SI will show
whether it is a poor match (<0.2 if the data base does
not  contain  any  closely related  compounds), one of
several fair matches (0.2-0.35 if the correct compound is
not in the data base but related  ones are), or a good
match  (>0.35 if the  SI of the second best hit is signifi-
cantly lower).
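
     The SI ranges quoted above can be written as a small helper. This
is only a sketch of the thresholds; the qualifying conditions given in
parentheses (for example, that the second best hit be significantly
lower) are not encoded, and the function name is hypothetical.

```python
def classify_si(si):
    """Map a similarity index to the quality ranges quoted in the text."""
    if si < 0.2:
        return "poor"
    if si <= 0.35:
        return "fair"
    return "good"


# SI values listed for the best hits in Figure 8.
figure_8_hits = [0.691, 0.660, 0.631, 0.569]
labels = [classify_si(si) for si in figure_8_hits]
```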

     In one study made at SERL, 50 percent of the un-
knowns present in the effluent of a kraft paper mill were
found  correctly as the best  hit, 8 percent as the second
best hit, and 2 percent as  the third best hit (13). The
success of the  system should improve since reference
spectra are added continually.

     The search may miss  the match because of poor
quality data in the input spectrum. If the spectrum looks
poor, or a match  is not found using the standard search
parameters,  the  user has the option of decreasing dif-
ferent  portions of the presearch selectivity.  This is done
by  redefining parameters for ratio and rectangular array
in the search dialog.

     Compared with  magnetic deflection spectrometers,
quadrupole instruments exhibit a bias toward low mass.
This is demonstrated in Figure 7,  which compares both
types of spectra for  the pesticide, parathion. Since the
Aldermaston  data base is comprised primarily of spectra
obtained on magnetic deflection mass spectrometers and
most EPA laboratories have quadrupole instruments, a
major  concern in the  development  of the  matching
system was whether  suitable matches could be obtained
between quadrupole and  magnetic deflection spectra.
Experience with the  system has shown that  the program
provides excellent matches for both  quadrupole  and
magnetic deflection spectra.

     The same magnetic  deflection spectrum  of
p-ethyltoluene that was used to illustrate the NIH pro-
gram (Figure 4 through Figure 6)  can be used as well as
an example of the EPA/Battelle program. Figure 8 shows
the  results  for  this  search.  All  of the best hits are
isomers; however, the first and third  matches are most
closely related to the correct structure.

     Another example  of  the use of the EPA/Battelle
matching program is provided by  a study of the effluent
of an experimental coal gasification plant. Organic components
were extracted with methylene chloride. The
chromatogram of the extract (Figure 9) contained seven
distinct peaks.

       In a computerized matching of the quadrupole
  spectra for those compounds, the best matches were
  with C6, C7, and C8 hydroxyl-containing materials. The
  output indicated high SI's for the first six peaks, but a
  low one for the last GC peak. Subsequent visual inspection
  of the mass spectrum for this GC peak indicated
  that the last peak arose from two compounds with the
  same retention time.

       The identifications  are given  in Table III. When
  different  materials were selected by the  matching pro-
  gram as the best and  second  best matches, relative GC
  retention times favored the best match over the second
  best. In a continuation of the computer dialogue, given
  in Figure 10 for GC peak 3, thirteen cresol spectra were
  matched with SI's greater than 0.645. The first non-cresol
  match was 3-tolyl-N-methyl carbamate with an SI
  of 0.574.

                       Table III
        Compounds Tentatively Identified in Waste
         Effluent of Coal Gasification Pilot Plant
  [Table III body not recoverable from the source scan; its columns
  include the GC peak number, the compound, and the SI.]

      Table IV summarizes results obtained  by Budde
  and Eichelberger* of the MDQARL in an objective
  evaluation of the NIH and EPA/Battelle systems, using a
  mixture of typical environmental pollutants.

  STIRS

       Because of the relatively small size of both the NIH
  and EPA reference libraries, a substantial number of
  environmental samples cannot be matched by either of
  these systems. STIRS deduces substructures based
  on reference spectra and permits the analyst to reconstruct
  the structure of the unknown even if the library
  does not contain a matching spectrum. The program
  permits the user to transmit a complete mass spectrum
  from a remote PDP-8 to the Cornell PDP-11/45 via the
same serial data interface and acoustic coupler used by the
EPA/Battelle system. The input peaks are compared to
spectra in the 25,000-spectrum reference library and used
to calculate the probable molecular weight and a value
(from 0 to 1000) for 14 match factors. These factors are
based partly on known mass fragmentation behavior and
partly on experience gained through the use of this
program. For each of these match factors (Table V), the
10 best matches in the data base are ordered by spectral
similarity and listed for the user. A 15th match factor is
also generated as the overall match factor.
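
     Listing the best matches for one match factor is a sort on that
factor's value. The sketch below uses the MF10 and MF11 values of five
entries as read from Figure 11; the function name and the data layout
are assumptions, not STIRS internals.

```python
def best_matches(scores, factor, k=10):
    """Return the k compounds with the highest value for one match factor."""
    ranked = sorted(scores.items(),
                    key=lambda item: item[1][factor], reverse=True)
    return [name for name, _ in ranked[:k]]


# MF10 and MF11 values as read from Figure 11.
scores = {
    "1-methyl-2-ethylbenzene": {"MF10": 675, "MF11": 678},
    "1-methyl-4-ethylbenzene": {"MF10": 637, "MF11": 644},
    "isopropylbenzene": {"MF10": 635, "MF11": 743},
    "2-phenylhexane": {"MF10": 634, "MF11": 411},
    "alpha-chloroacetophenone": {"MF10": 624, "MF11": 346},
}
top_fingerprint = best_matches(scores, "MF10", k=3)
top_overall = best_matches(scores, "MF11", k=3)
```

The two rankings differ, which is why the output lists the best
matches separately for each match factor.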

     A search using  STIRS  gives  not only the chemical
name, but also Wiswesser line notations (WLN's) for
each of the compounds selected for the particular match
factor. Figure 11  is the STIRS output obtained by trans-
mitting a  spectrum  of  ethyltoluene.  The  confidence
table indicates  a  high confidence that  a  phenyl ring is
present, a lesser confidence that  an ethyl group is pre-
sent,  and a  low confidence that  a methyl group is
present.  All  of  these  conclusions are  correct.  Low
confidence suggestions of unsaturation and  a branched
chain are incorrect.  A set of criteria is being developed
that  will permit predicting  the probabilities that given
groupings  are present in the unknown; however, scan-
ning the WLN's found for each factor shows  that certain
WLN substructures  are  repeated  far more  frequently
than  others.  If the  same substructure is indicated by
several match factors and another substructure by other
match factors, the correct structure probably contains
both  substructures.  This intuitive conclusion has been
demonstrated to be correct in many  cases.
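
     The intuitive rule, that a substructure indicated by several
match factors is probably present, can be sketched as a vote count.
The symbols and match factors below follow the confidence table of
Figure 11 as far as it can be read; counting one vote per match factor
is an assumption for illustration and is not the criterion under
development.

```python
from collections import Counter


def substructure_votes(indications):
    """Count how many match factors indicate each substructure symbol."""
    votes = Counter()
    for symbols in indications.values():
        votes.update(symbols)
    return votes


# Substructure symbols indicated by each match factor (from Figure 11).
indications = {
    "MF1": ["PHN"],
    "MF2": ["PHN"],
    "MF3": ["PHN"],
    "MF5": ["ETH"],
    "MF7": ["CH3", "ETH", "C=C", "Y"],
    "MF8": ["Y"],
    "MF10": ["PHN"],
    "MF11": ["CH3", "ETH", "PHN"],
}
votes = substructure_votes(indications)
```

For this example the phenyl ring collects the most votes, followed by
the ethyl group and then the methyl group, in line with the ordering
of confidences described above.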

     STIRS suffers to a  lesser degree from  the same li-
brary  limitation as do  the  EPA/Battelle and NIH sys-
tems. A spectrum of the exact material or even a closely
related one does not have to be present in the reference
library, but  some compounds having the  appropriate
substructures must be present. A spectrum  for a phos-
phorthioate was  found to give no  useful  information
when it was  first submitted  to STIRS. This led to the
discovery that none of the spectra in the library at that
time included the phosphorthioate substructure.

     Recently,  an  American Society  for  Mass  Spec-
trometry panel submitted identical test spectra to the
Mass  Spectral  Data  Centre,  NIH, EPA/Battelle, and
STIRS  searches. The  results showed STIRS to be
markedly superior to all of the others in its ability to
suggest correct identifications.
 *W.L. Budde and J.W. Eichelberger, personal communication, July 1974.

-------
                                                                   Table IV
                            Summary of Results Obtained with Some Typical Organic Water Pollutants
                                     Using the NIH and EPA Mass Spectrum Matching Programs

                                 NIH (Operator Selected Ions)            EPA (Automatic Abbreviated Spectrum)

                                 No. of Mass/  No. of File  No. of       No. of File  Rank of the
                                 Abundance     Spectra      Correct      Spectra      Correct Hits by
                                 Pairs         Matched      Hits(b)      Matched      Similarity
Typical Water Pollutant          Entered(a)    (Hits)                    (Hits)       Index(b)         SI

1,1,2,2-tetrachloroethane        3             7            K            13           1                .567
phenol                           5             8            4            24           1,2              .878, .691
bis(2-chloroisopropyl)ether      6             1            1            65           1,2,3            .758, .758, .602
p-cresol                         11            10           1c           78           1                .808
α-terpineol                      5             4            >            (1           h
naphthalene                      $             7            3            18           2,3              .771, .7J8
o-nitrotoluene                   1             6            lid          15           1,2              .892, .586
benzothiazole                    4             2            1            25           1,2              .732, .J28
2-methylnaphthalene              10            11           5e           8            2g               .668
1-methylnaphthalene              1             3            2f           9            1,2              .609, .598
hexadecane                       5             8            3            442          3                .766
acenaphthene                     3             8            5            17           1,2              .786, .753
dibenzofuran                     4             8            3            11           1,2              .682, .560

a No more than two mass/abundance pairs were entered for each standard NIH 14 amu interval. The data from the highest mass interval observed was
  entered first followed by the next highest, etc. Within each interval the higher mass number was always entered first. Input was stopped when 10 hits
  or less were reported or the data was exhausted. With this system a different sequence of data entry can give significantly different results.

b Duplicate correct answers indicate there are duplicate spectra in the data base.

c Seven other hits were either m-cresol or o-cresol which have very similar mass spectra.

d The spectrum of o-nitrotoluene was in the data base but was in error.

e The other six hits were 1-methylnaphthalene which has a virtually identical mass spectrum.

f The other hit was 2-methylnaphthalene which has a virtually identical mass spectrum.

g The hit with the highest SI (.743) was 1-methylnaphthalene which has a virtually identical mass spectrum.

h Remote computer aborted three times with three different abbreviated spectra from two different GC/MS runs. Manual input of one of the same
  data sets several days later yielded four hits including two correct with an SI of 1.00.

                                                                    Table V
                                                      Spectral Data Classes and Match
                                                          Factors Utilized by STIRS*

Class of Spectral Data                                                      Match Factor

Ion series                                                                  MF1
Low mass characteristic ions                                                MF2
Overlapping characteristic ions                                             MF12
Medium mass characteristic ions                                             MF3
Overlapping characteristic ions                                             MF13
High mass characteristic ions                                               MF4
Neutral loss series                                                         MF14
Small primary neutral losses                                                MF5
Overlapping primary neutral losses                                          MF15
Large primary neutral losses                                                MF6
Secondary neutral losses from the most abundant odd mass loss               MF7
Secondary neutral losses from the most abundant even mass loss              MF8
Class 9 data of the unknown spectrum matched against Class 8
  data of the reference spectrum                                            MF9
Fingerprint ions                                                            MF10
Overall match factor                                                        MF11
                                              o» iiukli Ijelm vutieiilU iiiulei iA|>eililKiiul etjlualion.

                                      • I.». Hi Ijllein. "I oiii|Miloi-AiJoJ Intriprruiiiui »l Mjss SIHMI.I 111." in («l<
                                               ;iAr/V(tffc Sivftnunclrv in thf Infrttijtalttmol thinun Dtsem. ed.
                                               JiU (.K. Svmet iMonlKjl, lafijda Mcl.ill I nivcrsily-MivntreLiI
                                       (lilklien's llus|iiul. I4)4i.

-------
  FUTURE PLANS

       Work underway and plans for the future  include
  joint efforts by the Office of Research and Development
  (OR&D), Management Information and Data Systems
  Division, NIH, and FDA to:

           Expand  and centralize a large,  refined data
           base

           Translate  the   EPA/Battelle  program  to
           FORTRAN for inclusion with the  NIH pro-
           grams

           Develop  software and hardware  to  permit
           automatic data transmission between all EPA
           GC/MS/computers and the NIH PDP-10

           Make all  matching programs developed avail-
           able  to the public  through cooperation of the
           Mass  Spectral Data Centre
          Expand the capabilities of the EPA/Battelle
          system to maintain geographic frequency dis-
          tribution of all identified organic pollutants

          Develop   FORTRAN  procedures  for  de-
          convolution of overlapping GC/MS peaks

          Expand  STIRS  to  allow the  program  to
          predict the most probable structure.

     All of these search programs are useful to EPA in
determining the identity of pollutants. This utility will
be greatly increased as the above  plans are completed.
STIRS has the greatest potential of any of the EPA pro-
grams to provide rapid identification of any GC/MS un-
known with a library of less than  50,000 spectra. As is
true,  however,  for  all  computer  identifications, such
identifications are suggestions of possible structures and
do  not  remove  analytical  responsibility from  the
chemist. Final identification requires confirmation by
the analyst.

-------
[Figure 1. Acetone spectrum; relative intensity (0-100) plotted against mass (20-70).]

-------
[Figure 2. Acetone mass spectrum with fragment ion assignments; relative intensity (0-100) plotted against mass (20-70).]

-------
[Figure 3. Biemann abbreviation of the acetone spectrum; relative intensity (0-100) plotted against mass (20-70).]

-------
 TYPE PEAK, INT
 CR TO EXIT, 1 FOR ID,MW,MF,  AND NAME

 USER 105,100

    # REFS     M/E PEAKS

      893         105

 NEXT REQUEST 120,38

    # REFS     M/E PEAKS

       96         105 120

 NEXT REQUEST 91,11

    # REFS     M/E PEAKS

       25         105 120  91

 NEXT REQUEST 79,16

    # REFS     M/E PEAKS

        7         105 120  91  79

 NEXT REQUEST  1

   ID#     MW   MF             NAME

  2135    120   C9 H12         ISOPROPYLBENZENE
  4867    152   C9 H12 O2      CUMENE HYDROPEROXIDE
 16799    388   C12 H12 O3 W   PI-1,3,5-TRIMETHYLBENZENE-
                               TRICARBONYL TUNGSTEN
 20472    120   C9 H12         ISOPROPYLBENZENE
 20473    120   C9 H12         ISOPROPYLBENZENE
 20474    120   C9 H12         1-METHYL-2-ETHYLBENZENE
 20475    120   C9 H12         O-ETHYLTOLUENE
 NEXT REQUEST
                            Figure 4
                   NIH PEAK Search for p-Ethyltoluene

-------
TYPE LOSS, INT
1 TO EXIT, 2 FOR ID,MW,MF AND NAME

USER -15,100

  # REFS        LOSSES

   2169        -15

NEXT REQUEST:  -29,12

  # REFS        LOSSES

    504        -15 -29

NEXT REQUEST:  -41,16

  # REFS        LOSSES

     84        -15 -29 -41

NEXT REQUEST:  -43,18

  # REFS        LOSSES

     39        -15 -29 -41 -43

NEXT REQUEST:  -55,7

  # REFS        LOSSES

     10        -15 -29 -41 -43 -55

NEXT REQUEST:  2

   ID#     MW   MF          NAME

   241     70   C5 H10      3-METHYL-1-BUTENE
   260     70   C4 H6 O     3-BUTYN-2-OL
   446     82   C6 H10      1-METHYLCYCLOPENTENE
  5657    160   C12 H16     1-(2,3-DIMETHYLPHENYL)-BUT-2-ENE
 19076     70   C5 H10      2-METHYL-2-BUTENE
 19271     82   C6 H10      METHYLENE CYCLOPENTANE
 19272     82   C6 H10      METHYLENE CYCLOPENTANE
 19274     82   C6 H10      CYCLOHEXENE
 20476    120   C9 H12      M-ETHYLTOLUENE
TYPE C TO CONTINUE LIST,
CR TO EXIT:  C
 20477    120   C9 H12      1-METHYL-3-ETHYLBENZENE
                             Figure 5
                     NIH LOSS Search for p-Ethyltoluene

-------
                      STD  ETHYL
                       ID#       DISSIMILARITY

                       2135           0.14
                       4867          25.01
                      16799           6.83
                      20472           0.12
                      20473           0.11
                      20474           0.10
                      20475           0.16
                      USER RESPONSE
                                     Figure 6
                       Dissimilarity Comparison for PEAK Search of Fig. 4

-------
[Figure 7. Quadrupole and magnetic deflection mass spectra of parathion.]

-------
              I.D.?   ETHYLTOLUENE
              PAPER TAPE?      N
              MANUAL
              STANDARD
              27,4;39,8;41,3;51,8;58,4;63,4;65,7;77,18;
              79,16;91,12;103,10;105,100;106,9;120,38;121,4;
              END
              PARMTRS?          M100-200
               74 HITS
              1-METHYL-2-ETHYLBENZENE  120  C9  H12 API 0312
              FILE KEY=  3136
              ISOPROPYLBENZENE (CUMENE)   120 C9 H12 API 0311
              FILE KEY=  3135
              SI=0.691

              1-METHYL-3-ETHYLBENZENE 120 C9 H12 API 0313
              FILE KEY=  3137
              SI=0.660

              1,2,3-TRIMETHYLBENZENE 120 C9  H12 API 0315
              FILE KEY=  3139
              SI=0.631

              1,2,4-TRIMETHYLBENZENE 120 C9  H12 API 0316
              FILE KEY=  3140
              SI=0.569
                                   Figure 8
                          EPA/Battelle Matching of p-Ethyltoluene

-------

-------
          S, E, OR P?S
          I.D.?  COAL GASIFICATION PLANT EFFLUENT
          PAPER TAPE?Y
          FN--F73  ; S   -1   :
          CHPFI-1 (1ST EXT)   :
          37,3;38,8;39,39;40,4;41,2;43,2;50,12;51,20;52,10;61;:
          53,18;54,6;55,4;61,2;62,4;63,9;64,2;65,4;66,2;56;:
          74,2;77,41;78,9;79,38;80,13;81,2;89,4;90,12;91,6;61;:
          106,3;107,100;108,91;109,6;33;:
          END
          PARMTRS?  M100-500
          111  HITS
          M-CRESOL 108 C7.H8.O  AST 0181
          FILE KEY= 186
          SI=0.857

          1-HYDROXY-3-METHYLBENZENE (3-METHYLPHENOL--M-CRESOL)
          108  C7.H8.O  TRC 0068
          FILE KEY= 6392
          SI=0.845

          1-HYDROXY-2-METHYLBENZENE (2-METHYLPHENOL--O-CRESOL)
          108  C7.H8.O  TRC 0067
          FILE KEY= 6391
          SI=0.834

          1-HYDROXY-4-METHYLBENZENE (4-METHYLPHENOL--P-CRESOL)
          108  C7.H8.O  TRC 0069
          FILE KEY= 6393
          SI=0.815

          M-CRESOL 108 C7.H8.O  AST 0459
          FILE KEY= 462
          SI=0.805
                                   Figure 10
                     Computerized Spectra Matching Program Dialogue for
                       a Component of Coal Gasification Plant Extract

-------
HIGH MASS FINGERPRINT ION MATCH (MF10)
 MF1  MF2  MF3  MF4  MF5  MF6  MF7  MF8  MF9 MF10 MF11
 711  409  622    0  766    0  833  750  250  675  678
# 2139 1-METHYL-2-ETHYLBENZENE
2R B C9 H12 120
 711  409  632    0  692    0  833  750  250  637  644
# 2141 1-METHYL-4-ETHYLBENZENE
2R D C9 H12 120
 711  507  649    0  857    0  833  750  250  635  743
# 2135 ISOPROPYLBENZENE
1YR C9 H12 120
 533  230  805    0  230    0  666    0    0  634  411
# 5848 2-PHENYLHEXANE
4YR C12 H18 162
 666  347  715    0   83    0  500    0    0  624  346
# 5185 ALPHA-CHLOROACETOPHENONE
G1VR C8 H7 O1 CL1 154

OVERALL MATCH (MF11)
 MF1  MF2  MF3  MF4  MF5  MF6  MF7  MF8  MF9 MF10 MF11
 711  507  649    0  857    0  833  750  250  635  743
# 2135 ISOPROPYLBENZENE
1YR C9 H12 120
 711  409  622    0  766    0  833  750  250  675  678
# 2139 1-METHYL-2-ETHYLBENZENE
2R B C9 H12 120
 711  409  617    0  766    0  833    0    0  617  677
# 2140 1-METHYL-3-ETHYLBENZENE
2R C C9 H12 120
 711  579  496    0  764    0 1000 1000  250  589  677
# 2133 6-METHYL-6-ETHYL FULVENE
L5YJ AUY2 C9 H12 120
 711  409  632    0  692    0  833  750  250  637  644
# 2141 1-METHYL-4-ETHYLBENZENE
2R D C9 H12 120

% CONFIDENCE TABLE - BASED ON WLN'S
SEARCH LIMITS 90-100
SYMBOL   MATCH FACTOR CONFIDENCE
CH3      MF 7  99   MF11  94
ETH      MF 5  99   MF 7  99   MF11  99
PHN      MF 1 100   MF 2 100   MF 3 100   MF10 100   MF11 100
C=C      MF 7 100
Y        MF 7  99   MF 8  99

     Figure 11
Typical STIRS Output

-------
                                       BOWNE TIME SHARING (WORD/ONE)

                                                By Catherine Tittle
       Large volume, impossible deadlines, small staffs and
   limited budgets traditionally create a significant
   roadblock in the preparation of research reports,
   manuals, directories and other documents. The Environ-
   mental Protection Agency (EPA), faced with this
   problem since it was organized, found the solution in
   Word/One.

       The EPA has been using Word/One since 1971.
   Beginning in Washington and spreading to the Regions
   and National Environmental Research Centers,
   Word/One has been used successfully by EPA to produce
   many  of the regulations, manuals, reports, directories,
   mailing lists and so forth which comprise a  significant
   portion of the Agency's paperwork production.
       Five  years ago,  Bowne  Time Sharing (BTS) began
  to apply computer technology  to the business of pro-
  ducing words rather than crunching numbers. Word/One,
  BTS' computerized text-processing system, was designed
  to deal with the ever increasing paperwork  explosion.
  (Certain applications  lend  themselves more  readily and
  cost-effectively  to  computerized  word  processing:
  Figure  1.)  Using the  shared computer  approach, Word/
  One permits the simultaneous use  of a  central computer
  by many  remote locations using a typewriter/terminal
  and a telephone line. This service is available wherever
  phone  service exists  and is supported  by Bowne Time
  Sharing's seven offices in Washington, D.C., Boston, New
  York, Chicago, Philadelphia, Atlanta and Los Angeles.
       The BTS computer configuration, which is located
  in New York, is an IBM 370, model 155, with a million
  bytes of core storage. The peripheral devices, which are
  used for on-line storage, are capable of providing imme-
  diate access  to over a billion bytes  of text information.
  The printing volume, which is naturally the highest vol-
  ume of output, is handled by five high-speed printers
  that operate at a speed of over 400 lines per minute each
  and produce 800,000 to 1 million lines of typewriter
  quality,  upper  and lower case, information daily.  Addi-
  tional output can  include  magnetic tape and punched
  cards. Figure 2 shows  the interrelation of the different
  parts of BTS' Word/One System configuration.
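
       The printing figures quoted above can be cross-checked with a
  little arithmetic. The Python sketch below is illustrative only: it
  assumes exactly 400 lines per minute per printer (the text says "over
  400", so this is a floor), and the function name is hypothetical. It
  estimates how many hours of combined printing the quoted daily volume
  represents.

```python
# Rough check of the Word/One printing figures quoted in the text.
PRINTERS = 5             # high-speed printers at the New York center
LINES_PER_MINUTE = 400   # per printer; the text says "over 400", so this is a floor


def hours_to_print(lines, printers=PRINTERS, lpm=LINES_PER_MINUTE):
    """Hours needed to print `lines` lines with all printers running in parallel."""
    return lines / (printers * lpm * 60)


if __name__ == "__main__":
    for daily_lines in (800_000, 1_000_000):
        print(f"{daily_lines:>9,} lines: about {hours_to_print(daily_lines):.1f} hours")
```

  Under that assumption the quoted daily volume corresponds to roughly
  seven to eight hours of printing with all five printers running.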

      Word/One is designed for use by personnel having
  little or no computer  knowledge or  background. All in-
  structions are phonetic.
     The system consists of four basic segments:

          Input

          Storage and Retrieval

          Editing and Manipulation

          Format and Print.

 A discussion of each segment follows.

 Input

     Word/One is designed to: (1) capture keystrokes on
 magnetic disk, thereby reducing typing of future drafts,
 and (2) eliminate mechanical and error-prone typing tasks,
 including insertion of heading information and page
 numbering.

 Storage and Retrieval

     Word/One allows the user to: (1) store, for an
 indefinite time, any  input material, (2) assign a  con-
 venient  name, (3) retrieve, immediately, any of the
 material, (4) optionally share the document with select-
 ed other organizations for joint efforts and (5) maintain
 document records, e.g. the date data is stored, for com-
 plete management control.

 Editing and Manipulation

    Word/One is designed to:  (1) minimize typing when
 revising  a  document, (2) eliminate  redundant proof-
 reading of previously approved material,  and (3) reduce
the  delay  in producing  the  next draft. Selected
Word/One features that assist in this area are:
         Replace: Word/One can change all occurrences
         of a  word or  phrase  to another at all
         preselected locations.

         Elasticity: Word/One will automatically
         expand or contract  each  edited segment of a
         document to accommodate the change.

         Proofmark:  Word/One,  for  working  drafts
         automatically places a special character to the
         right of each edited line so that the reader will
         know where each change occurred.
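
     The Replace and Proofmark features described above can be sketched
in a few lines of Python. This is not Word/One's implementation; the
function name, the "*" marker and the fixed marker column are all
assumptions made for illustration (the text does not say which special
character Word/One used).

```python
PROOFMARK = "*"  # hypothetical marker; the text does not name the character used


def replace_with_proofmarks(lines, old, new, width=60):
    """Replace every occurrence of `old` with `new` and flag each changed line."""
    result = []
    for line in lines:
        edited = line.replace(old, new)
        if edited != line:
            # Pad to a fixed column so the marks line up at the right margin.
            edited = f"{edited:<{width}} {PROOFMARK}"
        result.append(edited)
    return result


if __name__ == "__main__":
    draft = [
        "The Agency shall review each permit.",
        "Comments are due within 30 days.",
        "The Agency may extend this period.",
    ]
    for line in replace_with_proofmarks(draft, "Agency", "Administrator"):
        print(line)
```

Because the replacement text may be longer or shorter than the original,
each edited line simply grows or shrinks, which is the behavior the text
calls elasticity. Only the changed lines carry the marker, so a reader
can skip material that was already approved.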

-------
Format and Print

     Word/One provides  a  number  of text formatting
options that allow:  (1) mechanical  functions, such as
insertion of headings to be specified rather than typed
each time, (2) error-prone operations, such as the main-
tenance of tabular material  on a page, to be handled by
the  system, (3) flexibility  in changing the  document
layout, such as page  width and depth specifications, and
(4) print output options, such as justified left and right
margins.

     In addition, the  Word/One  system provides a
significant level of throughput. The high-speed printers
produce typewriter equivalent copy overnight, regardless
of  document   size.  Most of the low-speed  terminals
currently used  by  EPA  are  compatible with the
Word/One system.
     Using  this  concept  EPA has  produced  many
publications including EXPRO, the  OR&D  Program
Planning and Reporting  Manual, Methods Manual  for
Chemical Analysis of Water Wastes (1974), and develop-
ment documents and regulations in several  program
areas. These publications were  produced meeting tight
deadlines with limited clerical staffs. Productive time of
both  clerical and professional  personnel involved was
maximized because the material was typed only  once,
changes were   made  only where necessary  and  the
high-speed printer was used to speed turnaround.
     Where large documents with significant revision are
handled, Word/One has proven to be a necessary and
beneficial tool.

-------
      (Chart: typewriter, automatic typewriter and computer editing,
       plotted against text manipulation requirements ranging from
       simple to complex.)

                                             Figure 1
                               Equipment Method Cost-Effectiveness Range vs
                                    Text Manipulation Requirements

      (Diagram: typewriter terminal and data flow.)

                                             Figure 2
                            Equipment and Data Flow for Word/One Applications

-------
                                  INTRODUCTION TO THE UNIVAC 1110
                             AT RESEARCH TRIANGLE PARK - CAPABILITIES

                                              By T. L. Rogers
INTRODUCTION

     In June 1973 UNIVAC was awarded the contract
to provide the new computer system for the Research
Triangle Computing Center. The new system as installed
(Figure 1) had to pass a benchmark test with a through-
put ratio of 12 to  1 as compared to the  IBM 360/50.
The system is a UNIVAC 1110 with the components and
general capabilities as described in this paper.

GENERAL DESCRIPTION

     The UNIVAC 1110  System is a  general purpose,
high performance system  incorporating the  latest
advances in  computer  design, system organization, and
programming technology. The various components of
the  UNIVAC 1110  System are  designed as separate
logical units providing  maximum functional modularity.
The multiprocessing capabilities  are an integral part of
the  system; the command/arithmetic units can perform
numerous  tasks simultaneously  under  control  of a
common  executive.  The  flexible  modular structure
enables  a  user to  tailor a system to  his individual
requirements. Principal features of the UNIVAC 1110
System are:

          Common resource systems organization

          Multiple  command/arithmetic units  (CAUs)
          and input/output access units (IOAUs)

          Character manipulating instructions

          Partial-word,  double-word,  and full-word
          addressability

          System partitioning capability

          Redundancy among system components

          Two levels of directly addressable storage

          Large modular plated-wire primary storage

          Large modular core extended storage

          Storage protection
        Program address relocation

        Independent  input/output  access units
        (IOAUs)

        Extensive  software  library  and language
        processors

        Dynamic  adjustment  to a  mix of  batch,
        demand, and real-time modes

        Wide choice of high performance peripheral
        subsystems

        Independent,  simultaneous communications
        processing.
SYSTEM COMPONENTS

    Each  component in the  UNIVAC 1110 System is
functionally  independent and may have the following
properties:

         Two or more access paths

         Access conflicts resolved by priority logic.

     In a multiprocessor configuration the  following
capabilities are standard:

         Continued system operation if any component
         fails

         Any component can be logically removed for
         servicing without disabling the entire system.

     The UNIVAC 1110 System consists of eight types
 of components:

          Command/arithmetic units

          Input/output access units

          System console

          System partitioning unit

          Primary storage

          Extended storage

          Maintenance controller

           Peripheral subsystems.


  Command/Arithmetic Unit (CAU)

      The basic  UNIVAC  1110 System configuration
  includes one command/arithmetic unit (CAU) and one
  input/output  access  unit  (IOAU).  All  control  and
  arithmetic functions are executed by the CAU. The CAU
  is a multitask instruction-stacking device capable of con-
  trolling up to four instructions at various stages of
  execution. A CAU can interface with up to four primary
  storage units by means of both an instruction path and
  an operand  path. Dual data paths connect the CAU with
  extended storage through a maximum of eight UNIVAC
  multiple access interface  (MAI) units. The data paths to
  primary and  extended  storage have overlapping  and
  interleaving  capabilities. In a multiprocessor system, the
  user can specify and can change which  units are to be
  used, thereby permitting  a system to be logically divided
  into two or three independent smaller systems, or  re-
  moving  individual  units  for  maintenance  without
  affecting the total system. Interrupt signals may be sent
  or received on one of the  three interprocessor lines.

      Additional features of each CAU are:

           Capability of executing up to 1.8 million  in-
           structions per second

           300-nanosecond  effective  basic  instruction
           time

           Four-deep instruction stack

           112-word general register stack (GRS)

          Character  manipulation  by  means  of
          byte-oriented instructions.

 The Research Triangle Park (RTP) System has two
 CAUs.

 Input/Output Access Unit (IOAU)

      The  basic  UNIVAC 1110  System  configuration
 includes one  IOAU. The  IOAU  controls all transfers of
data between the peripheral devices and  primary  and
extended storage. Transfers are initiated by a CAU under
program control. The IOAU includes two nonconcurrent
data transfer paths, one for primary storage and one for
extended storage.

     The  IOAU  consists  of two  sections: a control
section  and  a  section  containing  from  8  to
24 input/output channels. Input/output (I/O) data trans-
fers  may  occur simultaneously  with the execution of
programs in the CAU.

     The  control section  includes all logic associated
with the transfer of function,  data, and status words
between primary  or  extended  storage and  the  sub-
systems. It also services I/O requests from either one or
both  of the CAUs (in a multiprocessor  system)  and
routes interrupts to one of the two CAUs. Interrupt
routing may be specified by program.

     Some outstanding  features of the IOAU are:

         Aggregate transfer  rate  of 4 million 36-bit
         words per second (24 million characters)

         Externally specified index (ESI) and internally
         specified index (ISI)  transfer  modes on  any
         channel

         Data chaining

         Interrupt tabling

         Storage-to-storage transfers.

The RTP System has one IOAU.

System Console

     The system console provides the means for  commu-
nication with  the executive  system. The basic console
consists of the following major components:

         The  cathode  ray  tube (CRT)/keyboard
         consists of a UNISCOPE 100 display terminal.
         The  display  format  is  16  lines  with
         64 characters  per  line. The seven-bit ASCII
         character set, consisting of 95 characters plus
         the space, is  used.  The keyboard provides all
         of the  operator controls required  for  gen-
        erating data and initiating transfers.

         The incremental printer operates at
         30 characters per second and provides a hard
         copy of console messages. The cabinet con-
         taining the printer also contains the power
         supplies and control  logic required to select
         the CRT, incremental printers, and  facilities
         for the real-time maintenance communication
         system (RTMCS). This unit also contains the
         interface  between  the  console  and  any ISI
         channel on the IOAU. Up to five additional
         incremental printers may be connected to the
         console.

         The  fault  indicator,   located  on  the
         incremental  printer,   provides the  operator
         with a visual indication of a fault condition in
         a major system component. The actual com-
         ponent and nature of the fault may then be
         determined  from  indicators  on  the main-
         tenance panel.

The RTP System includes one console consisting of one
CRT and one printer.


System Partitioning Unit (SPU)


     The system partitioning unit (SPU), when included
in the UNIVAC 1110 System, permits off-line main-
tenance of units, enables the operator to logically
partition the  system  into  two  or  three independent
systems, and initiates a recovery sequence in the event of
failure. The SPU  performs six functions, five under
operator control and one under software control. With
the SPU, the operator can:

          Partition the  total system into two or three
          smaller systems

          Isolate  units  and  take  them  off-line  for
          maintenance without disrupting the rest of the
          system

          Function as a system monitor by  observing
          the status of the various major components

          Perform initial load  into  the primary system

          Allow automatic  recovery procedures if an
          interrupt is not received.


      Under software control, the SPU presents status in-
 formation to the IOAUs. When all optional features are
 included, the SPU is able to interface with:

         Six command/arithmetic units

         Four IOAUs

         262K words of primary storage

         Eight MAI  units (1048K words of extended
         storage)

         48 multiaccess subsystems.

This unit is not included in the RTP System. Since the
RTP System  has only  one IOAU, the SPU cannot be
utilized.

Primary Storage

     The first level of directly addressable main storage
in the UNIVAC 1110 System is primary storage. Primary
storage consists of high-speed, nondestructive readout
(NDRO) plated-wire storage units with nominal random
read and write cycles of 320 and 520 nanoseconds, re-
spectively. The basic 32K-word storage unit consists of
four 8K modules, and may be expanded in one 32K in-
crement to a 65K unit. The minimum primary storage
for a basic 1x1 configuration (one CAU and one IOAU)
consists of 32K words. A total of four 65K units pro-
vides a maximum primary storage capacity of 262K
words in a system. The basic 32K storage unit accom-
modates eight access paths, servicing four of them
simultaneously; a 65K unit accommodates up to sixteen
access paths, servicing eight of them simultaneously.
Partial  (sixth,  quarter,  third, and half)  as well  as
full-word  operation  is  provided.  The  RTP System
contains 97K words of primary storage.

Extended Storage

     The  second level  of directly addressable  main
 storage in the UNIVAC 1110 System is the extended
 storage system. The minimum extended storage con-
 figuration consists of 131K 36-bit words. Extended
 storage capacity may be expanded in 131K increments
 up to a maximum of 1048K 36-bit words. Each unit has
 a 1.5 microsecond read/write cycle. Extended storage is
 connected to the system by MAI units which provide up
 to ten access paths to  each storage unit.  The RTP
 System supports 262K words of extended storage.

 Maintenance Controller

     The maintenance controller  provides for diagnostic
 checkout by the automatic comparison of maintenance
 panel indicators against known good data on tape for the
 following:

          CAUs

          IOAUs

          Disc controllers (UNIVAC 8440 Disc
          Subsystem)

          Communication/symbiont processor

          Printed circuit cards.

      To complement its diagnostic capability, the main-
  tenance controller  allows  for  the  operation  of the
  operator/maintenance panels by personnel at a remote
  site. This device is included in the RTP System.

  Peripheral Subsystems

      The UNIVAC  1110  System offers a full range of
  peripheral subsystems; this wide range provides the capa-
  bility  to  satisfy  many  requirements.  The standard
  UNIVAC peripheral subsystems include:


          High-Speed Printer Subsystem
          RTP System: 2 High-Speed Printers

          Card Subsystem
          RTP System: 2 Card Readers

          UNIVAC 9000 Series Subsystem
          RTP System: 2 9300 Subsystems, each with
          600 LPM Printer

          UNISERVO 12/16 Magnetic Tape Subsystem
          RTP System: 15 UNISERVO 16 Tape Drives

          UNISERVO 20 Magnetic Tape Subsystem
          RTP System: None

          FH-432/1782 Drum Subsystem
          RTP System: 6 FH-432 Drums, 2 FH-1782
          Drums

          UNIVAC 8414 Disc Subsystem
          RTP System: None

          UNIVAC 8424/8425 Disc Subsystem
          RTP System: 1 8424 Disc Subsystem (8
          Drives)

          UNIVAC 8440 Disc Subsystem
          RTP System: None

          UNIVAC 8460 Disc Subsystem
          RTP System: 4 8460 Discs

          Communications/Symbiont Processor (C/SP)
          RTP System: None

          UNIVAC DCT 500 Data Communications
          Terminal
          RTP System: 25 DCT-500s are included

          UNIVAC DCT 1000 Data Communications
          Terminal
          RTP System: None

          UNISCOPE 100 Display Terminal
          RTP System: 10 UNISCOPE 100s are included

          Communications Terminal Module Controller
          (CTMC)
          RTP System: 2 CTMCs are included.

Destandardized Subsystems

    The  following peripheral subsystems  may  be
included in the UNIVAC 1110 System:

         UNISERVO VIIIC Magnetic Tape Subsystem

         FASTRAND  II  and  III  Mass  Storage
         Subsystems

         Communication Terminal Synchronous (CTS)

         UNIVAC 1004 Subsystem

         UNISCOPE   300  Visual Communications
         Terminal.

CONFIGURATIONS

    The  basic UNIVAC 1110 Processing  System (1x1
configuration)  consists  of  two  functionally  and
physically independent units: one CAU and one IOAU.
The processor organization is intrinsically that of a
multitask processor and is designed for operation in a
multiprogramming and multiprocessing environment.
The basic processor may be expanded by adding CAUs
and/or IOAUs up to a total of four CAUs and four
IOAUs (4x4). The basic 1x1 configuration is shown in
Figure 2; Table I lists all fully supported configurations.

-------
                                                   Table I
                                        Fully Supported Configurations

                                                CONFIGURATION
UNITS                       1x1          2x1          2x2          4x2          4x4
CAU                         1            2            2            4            4
IOAU                        1            1            2            2            4
PRIMARY STORAGE (words)     32K-262K     65K-262K     65K-262K     131K-262K    131K-262K
EXTENDED STORAGE (words)    131K-1048K   262K-1048K   262K-1048K   262K-1048K   262K-1048K
MAI                         1-8          2-8          2-8          2-8          2-8
SYSTEM CONSOLE              1            1            1-2          2            2-4
SYSTEM PARTITIONING UNIT    0-1          0-1          0-1          1            1
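
     The limits in Table I lend themselves to a simple lookup. The
Python sketch below is illustrative rather than any UNIVAC tool: the
names `Limits`, `TABLE_I` and `matching_configuration` are hypothetical,
storage sizes are kept in K words exactly as quoted, and the MAI count
is checked only when supplied because the text does not give it for the
RTP System.

```python
from typing import NamedTuple, Tuple

Range = Tuple[int, int]


class Limits(NamedTuple):
    caus: int
    ioaus: int
    primary_k: Range    # primary storage, in K words as quoted in Table I
    extended_k: Range   # extended storage, in K words as quoted in Table I
    mai: Range
    consoles: Range
    spu: Range


TABLE_I = {
    "1x1": Limits(1, 1, (32, 262), (131, 1048), (1, 8), (1, 1), (0, 1)),
    "2x1": Limits(2, 1, (65, 262), (262, 1048), (2, 8), (1, 1), (0, 1)),
    "2x2": Limits(2, 2, (65, 262), (262, 1048), (2, 8), (1, 2), (0, 1)),
    "4x2": Limits(4, 2, (131, 262), (262, 1048), (2, 8), (2, 2), (1, 1)),
    "4x4": Limits(4, 4, (131, 262), (262, 1048), (2, 8), (2, 4), (1, 1)),
}


def matching_configuration(caus, ioaus, primary_k, extended_k,
                           consoles=1, spu=0, mai=None):
    """Return the Table I column the system fits, or None if it fits none."""
    for name, lim in TABLE_I.items():
        if (caus, ioaus) != (lim.caus, lim.ioaus):
            continue
        checks = [
            (primary_k, lim.primary_k),
            (extended_k, lim.extended_k),
            (consoles, lim.consoles),
            (spu, lim.spu),
        ]
        if mai is not None:  # the text does not give the RTP MAI count
            checks.append((mai, lim.mai))
        if all(low <= value <= high for value, (low, high) in checks):
            return name
    return None


if __name__ == "__main__":
    # RTP figures quoted in the text: 2 CAUs, 1 IOAU, 97K primary storage,
    # 262K extended storage, one console, no SPU.
    print(matching_configuration(2, 1, 97, 262, consoles=1, spu=0))
```

With the RTP figures given in the text, the lookup lands on the 2x1
column.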
Minimum Peripheral Complement

     The following list of peripheral equipment is the
minimum available with the UNIVAC 1110 System. This
minimum has been established to ensure an adequate
complement for customer engineering and software
support.

     1.  Minimum complement: Communications/Symbiont Processor
         (C/SP) with card reader and high-speed printer
         Alternate: UNIVAC 9300 Subsystem with card reader and
         integral printer, or Multisubsystem Adapter with
         high-speed printer subsystem and card subsystem

     2.  Drum Subsystem*
         Minimum complement: FH-432/1782 Drum Subsystem with two
         FH-432 drums
         Alternate: FH-432/1782 Drum Subsystem with one
         FH-1782 drum

     3.  Mass Storage Subsystem
         Minimum complement: UNIVAC 8414 Disc Subsystem with two
         8414 disc drives
         Alternate: UNIVAC 8424/8425 Disc Subsystem with two
         8424/8425 disc drives, or UNIVAC 8440 Disc
         Subsystem with one 8440 disc drive, or UNIVAC
         8460 Disc Subsystem with one disc file unit

     4.  Magnetic Tape Subsystem
         Minimum complement: UNISERVO 12/16 Magnetic Tape
         Subsystem with four magnetic tape units
         Alternate: UNISERVO VIIIC Magnetic Tape
         Subsystem with four magnetic tape units

     *Not required for disc resident systems (1x1, 2x1, and 2x2)

RTP COMMUNICATIONS SUPPORT

     The UNIVAC Communications Terminal Module
Controller (CTMC) Subsystem enables the
UNIVAC 1110 System to receive and transmit data by
way of any common carrier at any of the standard rates
of transmission up to 50,000 bits per second. It can re-
ceive data from or transmit data to low-speed (up to 300
bits per second), medium-speed (up to 1800 bits per
second), or high-speed (2000 to 50,000 bits per second)
lines in any combination. The RTP system contains two
CTMCs giving a total of 64 ports for terminal activity.
Figure 3 shows how these 64 ports are configured on the
RTP system.

-------
      (System diagram; legible labels include storage, extended main
       storage, channel expansion, and UNIVAC 1110 CAU.)

-------
      (Block diagram: extended storage, 131K minimum; multiple access
       interface (MAI); primary storage, 32K minimum; command/arithmetic
       unit (CAU); input/output access unit (IOAU) with 8, 16 or 24 I/O
       channels; peripheral subsystems; system console.)

                                  Figure 2
                             UNIVAC 1110 System
                          Basic Configuration (1x1)

-------
    64 Ports
                 6 - Dedicated High Speed
                     5 - 9200 RJE
                     1 - U-100 Multiplexer
                                        10 - 2780 Type
               20  -  RJE @ 2000 bps
              12 - 1200 bps
                   Demand
              26 Demand
                                        10 - 1004 Type
                                     3  Commercial
                                        18  -  TTY Type @ 300 bps
  2  WATS

  2  Commercial

  1  Dedicated Mux,

  5  FTS
  2 WATS

  1 Commercial


  3 Dedicated Mux,

  4 FTS
                                       8  -  2741  Type @ 134.5 bps
 2 WATS

 2 Commercial

 4 Dedicated Mux.

10 FTS


 2 WATS

 4 Dedicated Mux.

 2 FTS
                                           Figure 3
                                  Data Communications Support
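
     The speed ranges given in the text and the port groups in Figure 3
can be tied together with a short check. The Python sketch below is
illustrative only: the group counts are as read from the figure, which
is damaged in this copy (the count of 6 dedicated high-speed ports and
the teletype label in particular are readings, not certainties), and the
function names are hypothetical. Rates between 1800 and 2000 bits per
second are rejected because the text does not classify them.

```python
# Port groups as read from Figure 3: (description, bits per second, ports).
# A rate of None marks the dedicated high-speed group, whose rate is not given.
PORT_GROUPS = [
    ("dedicated high speed", None, 6),
    ("RJE", 2000, 20),
    ("1200 bps demand", 1200, 12),
    ("teletype-style demand", 300, 18),
    ("2741-type demand", 134.5, 8),
]


def speed_class(bps):
    """Classify a line rate using the ranges quoted in the text."""
    if bps <= 0:
        raise ValueError("line rate must be positive")
    if bps <= 300:
        return "low"
    if bps <= 1800:
        return "medium"
    if 2000 <= bps <= 50_000:
        return "high"
    raise ValueError(f"{bps} bps falls outside the ranges given in the text")


def total_ports(groups=PORT_GROUPS):
    """Sum the port counts over all groups."""
    return sum(count for _, _, count in groups)


if __name__ == "__main__":
    for name, bps, count in PORT_GROUPS:
        label = speed_class(bps) if bps is not None else "high (dedicated)"
        print(f"{count:>2} ports  {name:<24} {label}")
    print("total:", total_ports())
```

The group counts add up to the 64 ports that the two CTMCs provide.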

-------
                                 RTCC SOFTWARE AND ACCESSIBILITY

                                           By Maureen Johnson
    Research Triangle Park offers a full range of data
processing software on the UNIVAC 1110 including
scientific, statistical and  data management packages as
well as many compiler languages. This software falls into
three general categories:

     1.  UNIVAC supplied and supported

    2.  Packages  obtained from  private  vendors  or
other government agencies

    3.  Packages converted from the  IBM/360 system
replaced by the UNIVAC 1110.

1.   UNIVAC SOFTWARE

    The following is a brief description of available
UNIVAC supported software.

1100 Series Assembler

     The 1100 Series Assembler contains the following
features:
         Mnemonic codes  describe hardware function
         of each instruction

         Multiple location counters  provide for  pro-
         gram  segmentation  and  control  address
         generation.

ASCII COBOL
     Based on  American  National  Standard  plus
CODASYL and UNIVAC  extensions, ASCII  COBOL
contains the following features:

         STRING  and  UNSTRING  statements
         providing powerful character manipulation

         Multiple data formats:
          -    ASCII

          -    FIELDATA

          -    EBCDIC: reading and writing IBM files

          -    Single and double precision floating point
               binary

          -    Variable length DISPLAY items

        MONITOR and EXHIBIT (MONITOR is a tre-
        mendous aid in program debugging)

        Cross-reference of  all  data and  paragraph
        names

        Index-sequential file organization on mass
        storage (some differences from IBM)

        Reentrant object code and libraries, i.e., one
        copy in memory accessed by multiple users
        concurrently

        Interprogram  and interlanguage com-
        munications

        Internal sort capability

FORTRAN V

    FORTRAN V, containing all features of American
National Standard FORTRAN V plus extensions, has the
following characteristics:

        Up to seven subscripts on variables

        Extended subscript expressions NAME (1+1)

         Forward and backward DO loops

         FLD intrinsic function: used for extraction
         and insertion of bit fields, i.e., bit manipu-
         lation

         NAMELIST: may be used instead of a LIST
         on an INPUT/OUTPUT (I/O) statement and
         associated  FORMAT statement; provides data
         characteristic information

         DELETE:  provides facility to prevent com-
         pilation of a section of source code

         Free field input FORMAT ( )

         NTRAN: reads and writes blocks of data

         ERTRAN:  means of dynamically executing
         ECL from FORTRAN program.
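
    The kind of field extraction and insertion that the FLD intrinsic
provides can be sketched in Python. This is an emulation under stated
assumptions, not the UNIVAC definition: it assumes a 36-bit word with
bits numbered from 0 at the left, a starting bit followed by a field
length, and the helper names `fld` and `fld_insert` are hypothetical.

```python
WORD_BITS = 36
WORD_MASK = (1 << WORD_BITS) - 1


def _shift_for(start, length):
    """Validate a field and return its distance from the right end of the word."""
    if length < 1 or start < 0 or start + length > WORD_BITS:
        raise ValueError("field must lie inside the 36-bit word")
    return WORD_BITS - start - length


def fld(start, length, word):
    """Extract `length` bits beginning at bit `start` (bit 0 is leftmost)."""
    shift = _shift_for(start, length)
    return (word >> shift) & ((1 << length) - 1)


def fld_insert(start, length, word, value):
    """Return `word` with the given field replaced by the low bits of `value`."""
    shift = _shift_for(start, length)
    mask = ((1 << length) - 1) << shift
    return (word & ~mask & WORD_MASK) | ((value << shift) & mask)


if __name__ == "__main__":
    word = 0o123456701234          # twelve octal digits make one 36-bit word
    print(oct(fld(0, 6, word)))    # leftmost six bits
    print(oct(fld(6, 12, word)))   # the next twelve bits
    print(oct(fld_insert(0, 6, 0, 0o77)))
```

Writing the word in octal makes the fields easy to read, since each octal
digit covers three bits.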

-------
  ASCII FORTRAN

       RTCC is a designated test site for UNIVAC ASCII
  FORTRAN.  This implies  that the compiler is available
  for user testing with the  recognition that it is still in a
  prerelease  stage and compiler problems are likely to be
  encountered. It features:

           ASCII  file formats compatible  with  ASCII
           COBOL

           List-directed I/O

           Enhanced  debugging  capabilities  including
           TRACE & DISPLAY.

  STAT-PACK AND MATH-PACK

      Comprehensive libraries of statistical  and mathe-
  matical subprograms callable from FORTRAN (com-
  parable to IBM Scientific Subroutine Package).

      The STAT-PACK subprograms are grouped into
  13 categories:

           Descriptive statistics

           Elementary population statistics

           Distribution fitting and plotting

           Chi-square tests

           Significance tests

           Confidence intervals

           Analysis of variance

           Regression analysis

          Time series analysis

          Multivariate analysis

          Distribution functions

          Inverse distribution functions

          Miscellaneous.
     The MATH-PACK subprograms are grouped into
 14 categories:

          Interpolation

          Numerical integration

          Solution of equations

          Differentiation

          Polynomial manipulation

          Matrix manipulation: real matrices

          Matrix manipulation: complex matrices

          Matrix  manipulation:  eigenvalues  and
          eigenvectors

          Matrix manipulation: miscellaneous

          Ordinary differential equations

          Systems of Equations

          Curve fitting

          Pseudo-random number generators

          Specific functions.

Conversational Time Sharing (CTS)

     Conversational Time Sharing (CTS)  features the
following:

          Extensive file creation and editing capabilities

          FORTRAN prescan capability

          Desk  calculator  facilities  enabling  user  to
          evaluate  expressions and mathematical func-
          tions without any programming.

Functional Mathematical Programming System

     The Functional Mathematical Programming System
(FMPS) has the following characteristics:

-------
         Procedures commonly used to solve linear pro-
         gramming problems

         Generalized matrix generator

         Procedures for saving the basis, restoring the
         basis, and the procedures for obtaining error
         estimates and sensitivity analysis on the solu-
         tion.

SORT/MERGE

     The standalone, parameter-driven sort/merge pro-
cessor features the following:

         Will process  ASCII COBOL fixed length tape
         records

         Will process FASTRAND-format mass storage
         card files

         Will process FORTRAN V formatted
         (80 character) records and unformatted files.

PERT

     PERT is a generalized  applications program for
project/program planning and control. It contains both
TIME and COST modules.

Language Conversion Programs

     COBOL language conversion programs include:

          IBM PL/1 to UNIVAC ASCII COBOL

          IBM ANSI and COBOL F to UNIVAC ASCII
          COBOL.

Symbolic Stream Generator

     The symbolic stream generator creates a symbolic
stream  of data  and/or  control statements with great
flexibility  and powerful  modification capabilities.
Directions and models for  building stream images are
written  in  SYMSTREAM,  an extensive manipulative
language.

 CPDMPH

     CPDMPH is a utility which can be used to print,
 punch and copy tape or mass storage files.

TAPETRAN

    TAPETRAN  translates tapes written on  other
operating systems (primarily EBCDIC oriented) to tapes
which are UNIVAC 1110 compatible.
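
The kind of translation TAPETRAN performs can be illustrated with a minimal sketch. It assumes fixed-length 80-character records and uses Python's built-in `cp037` codec as a stand-in for "EBCDIC"; the actual TAPETRAN translation tables, record formats, and target character code are not described here, and the function name `translate_records` is hypothetical.

```python
def translate_records(tape_bytes, record_length=80):
    """Split an EBCDIC byte stream into fixed-length records and decode each.

    Assumption: code page 037 stands in for the source system's EBCDIC.
    """
    if len(tape_bytes) % record_length != 0:
        raise ValueError("tape data is not a whole number of records")
    records = []
    for start in range(0, len(tape_bytes), record_length):
        block = tape_bytes[start:start + record_length]
        # Decode EBCDIC bytes to text and drop trailing blank padding.
        records.append(block.decode("cp037").rstrip())
    return records


# Two illustrative 80-character card images, encoded as EBCDIC.
sample = "HELLO".ljust(80).encode("cp037") + "1110".ljust(80).encode("cp037")
print(translate_records(sample))
```

A real conversion would also have to deal with tape labels, blocking factors, and packed or binary fields, none of which a character-by-character translation handles.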

ED

     ED is a processor with powerful capabilities in
creating and modifying data and program files and
elements. It features:

         FIND subcommand for scanning entire file for
         particular character string

         CHANGE  subcommand for modifying one or
         all entries in a file.

 DOC

     DOC  is a processor for creating, maintaining and
 retrieving text-type data files.

 TPD

     TPD provides the facility to do a directory listing
 of a tape or dump it in alpha or octal format. FASTRAND
 files or elements can also be used as input.

 2.  PACKAGES   OBTAINED  FROM  PRIVATE
     VENDORS   OR  OTHER  GOVERNMENT
     AGENCIES

 SPSS

     SPSS is a Statistical Package for the Social Sciences
 and features:

          Flexibility in format of the data

          Routines commonly  required by the social
          scientist including:

               Descriptive statistics
               Frequency distributions
               Simple and partial correlation
               Multiple regression
               Factor analysis
               Guttman Scaling
               Cross tabulation

           Does not require programming experience.

-------
  CALCOMP Graphic/Plotting Routines

       CALCOMP  Graphic/Plotting  Routines  are  basic
  plotter subroutines including:

           PLOT
           SYMBOL
           WHERE
           PLOTS
           NUMBER
           SCALE
           AXIS
           FACTOR
           NEWPEN
           LINE
           THREE-D
           GPCP.

  A Tektronix interface providing interactive graphics
  capability has also been developed.

  SYMAP

      SYMAP is a graphics system designed for making
  presentations   to  nontechnical  people  and  has the
  following characteristics:

           SYMAP uses  10 intervals plus high and low to
           present the information as a shaded map pro-
           duced on a printer

           The symbol printed can be changed and the
           interval can be  controlled  in such a way that
           shading  density does  not   increase  with
           increasing values of the variable

           Legends can be included.

  SYSTEM 2000

      SYSTEM  2000 is a  general  purpose data  base
  management system and features:

          Creating and  modifying data base definitions

           Highly selective and  flexible capabilities for
           retrieving and updating values  in these data
          bases

          Interactive access capabilities

           Report Writer

           Program Language Interface, i.e., user-written
           COBOL and FORTRAN programs can access
           the data base.

 APL

     APL is an interactive computer implementation of
 a language defined by Kenneth Iverson and named "A
 Programming Language." The  version implemented at
 RTCC was obtained  from the  University of Maryland.
 This  implementation is  designed to be as nearly  like
 IBM APL/360 as possible.

 SNOBOL/SPITBOL

     SNOBOL/SPITBOL is   a  string manipulating
 language especially well-suited to the processing of non-
 numeric data. It provides a means for searching through
 arbitrary character strings in order  to find patterns, to
 rearrange the strings and form new strings.

 PL/1 Subset

     PL/1 Subset was developed at the University of
 Maryland and is only available on a testing basis.
 UNIVAC plans announcement of a PL/1 compiler
 during the first quarter of 1975.

 3.   SOFTWARE  CONVERTED FROM  PACKAGES
     AVAILABLE ON THE IBM/360

 Statistical Analysis System

     The Statistical Analysis System (SAS) features:

          Ease and flexibility in defining input data

          Wide range of available statistical routines

          Simple  command language requires no pro-
         gramming knowledge

          Data management capabilities.

Time Sharing Library - Interactive Statistical Routines

    The Time  Sharing Library (TSL) is composed  of
interactive statistical routines and includes the  following
characteristics:

         Allows user to enter data and retrieve results
         through interactive terminals

-------
          Requires no programming experience

          Particularly  well-suited for obtaining quick
          results on relatively small amounts of data.

 Keyword In Context

     Keyword in Context (KWIC) provides powerful
 indexing and retrieval capabilities for text storage
 applications.
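
For readers unfamiliar with the technique, a keyword-in-context index can be sketched in a few lines. This is a generic illustration of KWIC indexing, not the RTCC KWIC package; the stop-word list, the column width, and the function name `kwic_index` are assumptions made for the example.

```python
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def kwic_index(titles, width=30):
    """Return index lines sorted by keyword, each keyword shown in context."""
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            if word.lower() in STOP_WORDS:
                continue  # Non-significant words are not indexed.
            left = " ".join(words[:i])[-width:]
            right = " ".join(words[i + 1:])[:width]
            # Right-justify the left context so keywords align in one column.
            line = f"{left:>{width}}  {word.upper()}  {right}"
            entries.append((word.lower(), line))
    entries.sort(key=lambda entry: entry[0])
    return [line for _, line in entries]


for line in kwic_index(["Effects of Air Pollution on Health"]):
    print(line)
```

Each significant word of a title produces one index line, so a reader scanning the aligned keyword column finds every title in which a term occurs.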

 FAST

     FAST provides the ability to retrieve selected
 portions of information from a data file and print it in a
 usable format and/or make it available for further
 processing. The control cards required are quite simple,
 allowing people who are not data processing oriented to
 use FAST.

 ACCESSING THE RTCC UNIVAC 1110

      Runs submitted for processing on the
 UNIVAC 1110 require a valid account number and
 project code. New users must register via forms provided
 by Users Services (FTS 919-549-2501) or found in the
 RTCC Users Reference Manual. The information needed
 includes user's DIPS Organization Code and Project
 Element Code, user's mailing address, brief project
 description and facilities requirements, and a project code
 made up by the user for each project being registered.

      To obtain SITE-IDs for remote and demand
 terminals, contact Users Services. An information form
 must be completed describing the terminal, its location
 and indicating the person to be contacted in matters
 concerning that terminal. Data Systems will then
 complete the form with a SITE-ID and phone numbers
 to be used, and mail a copy to the user.


 RTCC SCIENTIFIC SOFTWARE AVAILABLE

      RTCC  scientific  software  available includes  the
 following:

          CALCOMP Graphics

          SAS - Statistical Analysis System

          SPSS - Statistical Package for the Social
          Sciences

          STAT PACK - UNIVAC supplied subprogram
          callable from FORTRAN

          MATH PACK - UNIVAC supplied subprogram
          callable from FORTRAN

          TSL - Time Sharing Library of interactive
          statistical routines

          FMPS - Functional Mathematical Programming
          System, procedures for solving linear
          programming problems

          SYMAP - Graphics system

          APL - Interactive computer implementation of
          a language defined by Kenneth Iverson
          requiring special character set or digraphs

          FORTRAN V - Fieldata FORTRAN

          ASCII FORTRAN (RTCC is a designated
          test site for ASCII FORTRAN).

DATA MANAGEMENT SOFTWARE AVAILABLE

    Data management software available includes the
following:

          COBOL (ASCII)

          CTS - Conversational Time Sharing

          PERT - Project/Program Planning Control

          Language Converters

               IBM COBOL to UNIVAC ASCII COBOL

               IBM PL/1 to UNIVAC ASCII COBOL

          SSG - Symbolic Stream Generator

          CPDMPH - Prints, punches and copies files

          TAPETRAN - Translates EBCDIC tapes to
          UNIVAC compatible tapes

          ED - Processor for creating and editing data
          and programs

          DOC - Processor for creating, maintaining, and
          retrieving text-type data

          TDP - Dumps tapes and FASTRAND mass
          storage files and elements

          FLUSH - Flowcharting package

          SYSTEM 2000 - Data Base Management
          System

          SNOBOL/SPITBOL - String Manipulating
          Language

          KWIC - Key Word in Context

          FAST - Generalized data retrieval system.

-------
                         TECHNICAL AND ENVIRONMENTAL INFORMATION SYSTEM

                                             By Donald L. Worley

INTRODUCTION

     The  gross incompatibility of the software system
(CFSS)  that  supported  APTIC  with the  new  En-
vironmental Protection Agency/Research Triangle Park
(EPA/RTP) computer system necessitated a thorough
evaluation of  APTIC's  ADP  needs and  alternative
methods of support.

     Results of this evaluation clearly indicated that the
most cost effective and recommended course of action
was to develop a storage and retrieval system designed to
meet APTIC's specific needs. The Data Systems Division
(DSD) joined with APTIC  in the development and im-
plementation of a system.

     The following sections of this paper will describe
the basic design of the system developed by DSD for
support of APTIC's current requirements.

DESIGN GOALS AND BENEFITS

     The  Technical and  Environmental  Information
System (TENIS) was designed to provide the following
features:

      1.  The support of the activities which APTIC cur-
rently  performs  on a  regular batch processing  basis
including:

          The processing of input data received  from
          Franklin Institute

          The facility  to retrieve selected technical  or
          environmental information through the use of
          descriptor terms and/or  specific  document
          numbers

          The use of a dictionary to control the
          vocabulary of descriptors either in document
          input or in the searching process

          The postback dictionary function currently in
          use by APTIC

          Output may be provided in the form of
          printed reports or formatted magnetic tapes to
          be processed by the Government Printing
          Office

          The maintaining of the current concept and
          form for document storage and line
          construction. This will enable the continued
          use of the programs which use the report tape
          output of CFSS. These programs will require
          conversion to Univac compatible code

          A remote terminal search capability.

      2.  The later expansion of the system to provide
capabilities which are desirable  but not a necessity at
this time. This expansion may include but is not limited
to:

         A file maintenance capability for updating and
         deleting all or selected parts of documents

         An extended search system to provide the ca-
         pability of the Identification File

         Special reports such as Inverted File Statistics,
         Output Terms report, and others which are
         desirable.

     3.   A system which is developed and programmed
using a high-order language (COBOL) that is an industry
standard and is understandable by the user.

     The benefits of this approach to APTIC and DSD
are numerous but include:

         The removal of agency dependency on an out-
         side contractor for diagnostic, maintenance
         and development support

         The  reduced  developmental  costs  of a
         government-produced system

          A system developed specifically for the needs
          of APTIC

          A system developed by the people who are
          responsible for using and maintaining it

          The reduced operational cost of a system
          which meets the specific needs of APTIC

          A modular system which will be expandable as
          needs dictate.

-------
   TENIS SYSTEM DESIGN

        An overview of the initial TENIS is shown in
    Figure 1. The Secondary Search File (previously referred
    to as the Identification File) is shown by dotted lines as
    an extension. With the files and systems (groups of
    programs) reflected on the overview as a reference, a
    general description of the system follows. A more
    detailed systems flow chart of programs is included as
    Attachment A and a sample run of the terminal system
    is included as Attachment B.

        1.  New Document Input.—Input to TENIS will be
   structured in the same  format as that presently received
   from Franklin Institute. In addition, the structure and
   content  of  documents and lines of text will remain
   consistent with the present design.
        2.  Dictionary. -The dictionary  will be organized  as
   a vocabulary control  file, and as an indexer for entering
   the  search  file.  The  postback  capability  currently
   provided  will be available. One listing of  dictionary
   contents  is included as well as the  ability  to  add and
   delete terms. The ability to delete terms will be provided
   when the  file maintenance extension is implemented.
   The  dictionary  will  be supported  on a direct access
   device.
        3.  Table of Contents.-A one bit per APTIC
   Number (1 to 200,000) file to indicate the status of the
   number. This file shall reside on a direct access device.
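
A one-bit-per-number file of this kind can be sketched as a bitmap. The following is an illustration only: the paper does not say what the status bit means or how the file is laid out, so the class name, the "present" interpretation of a set bit, and the bit ordering are all assumptions.

```python
class TableOfContents:
    """One status bit for each APTIC Number from 1 to max_number."""

    def __init__(self, max_number=200_000):
        self.max_number = max_number
        # 200,000 bits fit in 25,000 bytes.
        self.bits = bytearray((max_number + 7) // 8)

    def _locate(self, number):
        if not 1 <= number <= self.max_number:
            raise ValueError(f"APTIC Number out of range: {number}")
        index = number - 1
        return index // 8, 1 << (index % 8)

    def mark(self, number):
        """Set the bit (assumed meaning: the number is in use)."""
        byte, mask = self._locate(number)
        self.bits[byte] |= mask

    def clear(self, number):
        """Reset the bit for this number."""
        byte, mask = self._locate(number)
        self.bits[byte] &= ~mask & 0xFF

    def is_marked(self, number):
        byte, mask = self._locate(number)
        return bool(self.bits[byte] & mask)


toc = TableOfContents()
toc.mark(5178)
print(toc.is_marked(5178), toc.is_marked(5179), len(toc.bits))
```

The appeal of the design is size: the status of every possible document number fits in a file small enough to keep on a direct access device and check with a single read.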

        4.  Text File.-The text file will contain data
   types 01, 035, 036, and 045. One author (06), the
   journal (09), and the year (10) will be included for
   sorting output. The text file is designed for efficient use
   in a batch environment and will be a sequential file stored
   on magnetic tape. Data items are summarized in
   Figure 2.

        5. Search File.-The search file  shall include all the
   descriptors  necessary  for searching. Data types include:

             ID No.     Data Item

               05       Method of support
               10       Year of publication (for internal
                        storage and for searching, it will
                        be prefixed by a P)
               13       Language
               14       Translated (yes only)
               15       Primary category
               17       Secondary category
               18       Document attributes
               19       Indexing descriptors

 All entries to the search file must first pass a vocabulary
 check by  the  dictionary. The search  file will also be
 directly accessible.

     6. Secondary Search File (not implemented).-The
 secondary  search  file  will  include  the additional
 descriptors which do not  require a vocabulary check.
 The file will  include authors (06,07) and journals (09).

     7. Search  Input.-The  search input shall use  an
 80 column card format  that  is  shown  in  Figure 3.
 Columns 1 to 4 are fixed and other columns are free
 form with punctuation.

     8. Document List.-This is a temporary  file  used
 in the text retrieval process. It will consist of APTIC
 identification  numbers of all documents that have met
 the search criteria.

      9. Report Tape.-The report tape consists of all
 selected text material (data types 035, 036, and 045)
 that has been sorted into appropriate reporting
 sequences. The format will be consistent with the present
 report tape and may be used as input to many of the
 current APTIC programs written in COBOL.

      10.   Selected Documents.-Documents that have
 been searched and retrieved according to the presented
 criteria will be formatted and printed according to
 present standards.

     11.   Edit and File Update System.—This subsystem
 consists of six individual COBOL programs and several
 utility sorts.

     12.  Search System.-These two COBOL programs
 will  provide  the Boolean logic necessary  to effect  the
 descriptor search of the data files.
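
The Boolean logic of a descriptor search can be illustrated with set operations over an inverted file. This is a sketch under stated assumptions, not the TENIS COBOL programs: the sample descriptors and document numbers are invented, and the function name `lookup` is hypothetical. The operator symbols in the comments follow Figure 3.

```python
# Hypothetical inverted file: descriptor -> set of APTIC document numbers.
inverted_file = {
    "ACROLEIN": {101, 205, 310},
    "ALDEHYDES": {205, 400},
    "OZONE": {310, 512},
}


def lookup(term):
    """Documents indexed under a term; a trailing '$' requests truncation."""
    if term.endswith("$"):
        prefix = term[:-1]
        hits = set()
        for descriptor, documents in inverted_file.items():
            if descriptor.startswith(prefix):
                hits |= documents
        return hits
    return set(inverted_file.get(term, set()))


both = lookup("ACROLEIN") & lookup("ALDEHYDES")   # '&' (and)
either = lookup("ACROLEIN") | lookup("OZONE")     # '*' (or)
without = lookup("ACROLEIN") - lookup("OZONE")    # and combined with not
truncated = lookup("A$")                          # '$' (truncation)

print(sorted(both), sorted(either), sorted(without), sorted(truncated))
```

Because every descriptor already points at the set of documents carrying it, a search touches only the lists for the terms named in the request rather than the full text file.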

      13.   Text Retrieval System and Report
 Program.-This subsystem consists of two new COBOL
 programs and several of the publishing programs that
 will be converted from the present APTIC system.

-------
[Figure 1 is a block diagram; only its labels survive in this copy: Edit and File Update System, Secondary Search File, Text Retrieval System, Report Program, GPO, Selected Documents.]

                               Figure 1
                         TENIS System Overview

-------

                                   MAXIMUM        RECEIVING DATA FILE
ID   DATA ITEM                     NUMBER     TEXT   SEARCH   SECONDARY SEARCH

01   APTIC Number                               X       X            X
05   Method of Support                                  X
06   First Personal Author                      X                    X
07   Other Authors                   35                              X
09   Title of Publication                       X                    X
10   Year of Publication                        X       X
13   Language of Publication                            X
14   Translated                                         X
16   Primary Category                                   X
17   Secondary Categories                               X
18   Document Attributes              5                 X
19   Indexing Descriptors                               X
35   APTIC Number and Authors                   X
36   Bibliographic Citation                     X
45   Abstract                                   X

                                                 Figure 2
                                         Data Items To Be Accepted
                                                 by TENIS

-------
1.   General Format

     Column 1 to 2 - A two character search identity number which occurs on every card for a search and
     which identifies the output.

     Column 3 to 4 - Card Type

          010 to 015  Title Card               1 to 6    occurrences
          020 to 025  Control Card             0 to 6    occurrences
        * 300 to 599  APTIC Number Search      0 to 300  occurrences
        * 100 to 299  Boolean Search           0 to 200  occurrences
          600         Reserved for Secondary File Search

        * When used as a sequence number, it must be unique and ascending.

     Columns 5 to a column containing a $ - Control Information

     Column containing $ to column 75 - Comments not processed by Search Program

          COL.    1 2    3 4 5    6 to C$    C$ to 76    77-80
          USE     ID     CT       control    comments    Reserved

2.   Title Card

     From 0 to 71 characters of title information, to be used on printed report. Printed in order of
     occurrence for up to 3 lines. The title card may not contain comments (if included, they will be
     printed as part of title).

3.   Control Card

     Fields are identified by key term and concluded by ';'. Preset value is used if not entered.

     a.  Maximum Number of Documents

         Key is M
         Form is M=NNNNNN;
         Preset is M=1000;

     b.  Output Control

         Key is P
         Form is P = A, B, T, N, D, any combination which is meaningful.

            D - Text only (eliminate 35 & 36)
            A - Segment 35 & 36
            B - Segment 35 & 36 & 45
            T - Text Record
            N - Document Number Only
            Preset is P = B

     c.  Sort Key

         Key is S
         Form is S=SN, SN, SN;

            N=D - Document Number
            N=J - Journal Title (Ascending Only)
            N=A - Author Title (Ascending Only)
            N=Y - Year
            S=A - Ascending
            S=D - Descending
            Preset is S = DY, AD

                                             Figure 3
                                     Search Control Input Format

-------
                                                                                                  Figure 3
                                                                                                  Page 2
     d.  Date of Entry

            D = MMDDYY; everything after the date
            D = -MMDDYY; everything up to the date
            D = MMDDYY - MMDDYY; between these two dates but not including them
            Preset is D = 000000 - 123199

     e.  Report Number

         Key is R
         Form is R=NN;
         Preset is R=01;
         Note:  Control cards are continued by a ';'.
                If all presets are acceptable, a control card is not required.

 4.  Document Number Card

     Six digit numbers separated by ';'. Space may occur only between ';' and first digit; and leading zeros
     are not required.
     Ranges may be selected by placing a hyphen between two numbers.
        Example:  5178;06034;                5178-06034;

 5.  Boolean File

     Combinations of operators

        ( - open parenthesis
        ) - close parenthesis
        ? - end of search
        : - end of Boolean search
        & - and
        * - or
        _ - not (underline)
        operands selected from the dictionary
        $ - truncation
        ; - document or year range selection;
              Range selections are made using:
                EQ - Equal
                GT - Greater than
                GE - Greater than or Equal
                LT - Less than
                LE - Less than or Equal

     A document range is selected by 'DOC' and year by 'YR'.
     The format is:
        ;DOC GT 5000 LT 10000;
        ;YR EQ [illegible];

     The second condition is not required.

     Note:  Two range tests may not be combined with themselves.
            Acceptable -       R1 & A & R2
            Not Acceptable -   (R1 & R2) & A
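
The keyed control fields and the document number list in Figure 3 lend themselves to a short parsing sketch. The following is illustrative only and is not the TENIS search program: the function names are hypothetical, the presets are the values as read from Figure 3, the date field (D) is left out for brevity, and treating a hyphenated range as inclusive of both ends is an assumption.

```python
# Preset values as read from Figure 3 (the D date field is omitted here).
PRESETS = {"M": "1000", "P": "B", "S": "DY,AD", "R": "01"}


def parse_control_fields(text):
    """Parse 'KEY=value;' fields; any key not entered keeps its preset."""
    settings = dict(PRESETS)
    for field in text.split(";"):
        field = field.strip()
        if not field:
            continue
        key, separator, value = field.partition("=")
        key = key.strip().upper()
        if not separator or key not in settings:
            raise ValueError(f"unrecognized control field: {field!r}")
        settings[key] = value.replace(" ", "")
    return settings


def parse_document_numbers(text):
    """Expand 'n;n;n-m;' into a list of document numbers."""
    numbers = []
    for item in text.split(";"):
        item = item.strip()
        if not item:
            continue
        first, hyphen, last = item.partition("-")
        if hyphen:
            # Assumption: a range includes both of its end points.
            numbers.extend(range(int(first), int(last) + 1))
        else:
            numbers.append(int(first))  # Leading zeros are not required.
    return numbers


print(parse_control_fields("M=500; R=02;"))
print(parse_document_numbers("5178;06034; 100-102;"))
```

Starting from a copy of the presets mirrors the rule that a control card is needed only when some preset is unacceptable.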

-------
                        TENIS SYSTEM FLOW CHART
                                      Attachment A(1)

MAINTENANCE SUBSYSTEM

[Flow chart; only its labels survive in this copy: Code Translate & Convert: Edit Content & Limits; Dictionary and Table of Contents Edit and Format Output; Inverted File Update & Key Update in the Dictionary File.]

-------
                                           SEARCH AND REPORT
                                               SUBSYSTEM
                                                                                    Attachment A(2)

[Flow chart; only its labels survive in this copy: Search, Search Errors, Sort 1, Text Search, Finds, Sort 2, Report Programs, Report.]

-------
DICTIONARY UPDATE
Attachment A(3)

[Flow chart; only its labels survive in this copy: Update Cards, Term No. Parameter Card, a routine called by the Update Program, Dictionary Listing, Dict. Load, Dict. Mass-Storage.]

-------
                                                                      Attachment B(l)

                                    APTIC Using TENIS

       The current implementation of APTIC includes 2800 terms in the
  dictionary and over 60,000 documents in the master text file. The
  master file is growing at over 700 documents per month.

       Three methods may be used for obtaining results from APTIC:

          1.  Call Mr. [name illegible]'s group as in the past - and wait for
               results via mail.

          2.  Use terminal and review Citation File directly.

          3.  Use terminal - review Citation File and submit search for
               results to be returned by mail.

       A  sample terminal session follows.

06 NOV 74     10:54:0
       SEARCH:
>AA010   INITIAL ENTRY
>AA100  ACROLEIN?
       NUMBER OF DOCUMENTS SELECTED = [illegible]

INSTRUCT?
>Y
       [reply instructions illegible in this copy]
DOCUMENTS [illegible] = [illegible]
APTIC NO = [illegible]
       [citation illegible in this copy]


-------
                                                                  Attachment B(2)
APTIC NO = [illegible]
       [citation illegible in this copy]

APTIC NO = [illegible]
       [authors illegible]
       INHALATION TOXICITY OF THE AIR POLLUTANT PEROXYACETYL NITRATE:
       [remainder of citation illegible]

       [prompt illegible]
>AA010 ACROLEIN WITH LIMITING PARAMETER
>AA100 ACROLEIN & ;YR GE 1973; ?
       NUMBER OF DOCUMENTS SELECTED = 23
>N
>B
       SUBMITTED TO BATCH COLLECTOR
NEW SEARCH?  >@FIN
       [run accounting summary illegible in this copy]

-------
                               THE CONVERSION OF CHESS AND OTHER SYSTEMS

                                By Andrea Kelsey, Gene R. Lowrimore,* and Jane Smith

   THE CHESS PROGRAM

       The CHESS research program is a program of
   epidemiological studies designed to determine the effects
   of elevated levels of air pollutants on human health. This
   program provides  the bulk of  the information upon
   which  current ambient air quality  standards are based.
   The data processing problem presented by CHESS is the
   collection  and analysis  of  data  on  certain  disease
   symptoms,  data  on  demographic  characteristics, and
   data on  air pollutant concentrations,  from  which esti-
   mates  of the effects of  air pollution  on  health  are
   computed.

       Some background on the evolution of  the CHESS
   program will explain  the situation  that prevailed when
   the selection of the Univac 1110 system was announced.
   The CHESS program was conceived  about four years ago
   as a series of replicated studies in several metropolitan
   areas. When originally planned, these studies were to be
   conducted independently. Replication was planned from
   area to area as well as from time to time in the same
   area. One replication of the study in an area is referred
   to as a round. Several indicators of human health were
   measured:

           Chronic Respiratory Disease (CRD)

            Acute Respiratory Disease (ARD)

           Pulmonary Function (PFT)

           Acute Episodes

           Asthma Panel

           Aggravation of Chronic Symptoms (Adult
           Panels)

           Lower Respiratory Disease (LRD).

       By the end of the first year, researchers realized
   that: (1) ARD panelists should only be selected from the
   CRD population, (2) data on children in the PFT study
   should be related to the CRD data for the family from
   which they came, and (3) any other possible linkages
   between studies should be made. In this way
   information could be shared between studies and all available
   information relevant to the analysis could be obtained.

   *Speaker

       A short time later an additional extension was
   made to require us to link together the data of those
   who participated in more than one round. The extension
   was made to enable researchers to block out constant
   person-to-person differences when making comparisons
   over time. These latter changes in the CHESS program
   were made in the summer of 1972. Figure 1 shows the
   relationships between the various CHESS studies for a
   single round. Partial or total overlapping of clouds
   implies that data collected on individuals in both studies
   are linked together. With the exception of CRD, one
   round of all studies is conducted each year. One round
   of CRD is done every three years.

       The data processing system was redesigned to
   handle the CHESS concept as it existed in the summer
   of 1972. Two parallel efforts were thus begun, one to
   implement the redesigned system, and the other to make
   past data conform in format and structure to those of
   the redesigned system.

THE SYSTEMS DESIGN


     For the purposes of this paper,  the CHESS data
processing system  is composed of several  smaller sub-
systems, each of which handles one of the health indi-
cators mentioned  earlier.  A typical subsystem includes
processes to:

         Edit background questionnaires

         Establish Master Files

         Print optical mark reader forms

         Read batches of mark reader forms

         Edit  (detect  errors) data  files  of periodic
         responses

-------
         Purge (correct errors) data files of periodic
         responses

         Combine all periods  of  responses to update
         Master Files.

     All these operations are standard business-type data
processing  applications,  and  they are best  done in
COBOL. After the Master Files have been updated, they
are  used  in  large-scale  statistical  analyses. Standard
packages are  used, such as Statistical Package for the
Social  Sciences (SPSS) and Statistical  Analysis System
(SAS), along  with other  user-written  packages and
routines which are considered less standard. These user-
written   programs  are  almost  always written in
FORTRAN.
     Master Files are also used to update what is called a
Linkage File. The Linkage  File contains all the informa-
tion collected in all rounds on an individual or a family,
as the case may be. This file will also be used for statisti-
cal analyses  of health status changes  over time, but no
statistical  analyses  have actually  been done on the
Linkage File.
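
One way to picture the Linkage File is as a table keyed by family, holding that family's record from each round. The Python sketch below assumes a dictionary layout and invented field names (such as "episodes"); none of this comes from the actual file descriptions.

```python
# Hypothetical sketch: tie together a family's records from several rounds.
# The dict layout and the "episodes" field are illustrative assumptions.


def update_linkage(linkage, round_number, master_file):
    """Add one round's master-file records to the linkage file.

    linkage     -- dict: family_id -> {round_number: record}
    master_file -- dict: family_id -> record for this round
    """
    for family_id, record in master_file.items():
        linkage.setdefault(family_id, {})[round_number] = record
    return linkage


def families_in_multiple_rounds(linkage):
    """Return the sorted family ids that have data from more than one round."""
    return sorted(fid for fid, rounds in linkage.items() if len(rounds) > 1)


linkage = {}
update_linkage(linkage, 1, {101: {"episodes": 2}, 102: {"episodes": 0}})
update_linkage(linkage, 2, {101: {"episodes": 1}, 103: {"episodes": 3}})
repeat_families = families_in_multiple_rounds(linkage)
```

Families that appear in more than one round are the ones for which changes in health status over time could be analyzed.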

     Figure 2 shows an overall system flow indicating
the language  resources used. Figure 3  is a more detailed
description of  the subsystem  for performing the data
processing for the Acute Respiratory Disease indicator.
Manual processes  are necessary to complete the sub-
system but are not shown.

IMPLEMENTATION

     Through a  work  agreement with the  General
Services  Administration (GSA),  the  Human Studies
Laboratory has access to programming and data clerical
services furnished by Data Processing Associates (DPA),
a  GSA  contractor.  To  implement   the  system,  the
Laboratory  gave  DPA a written specification for each
process, and  DPA delivered a tested program to perform
the  process  along with complete program  documenta-
tion. The Laboratory  maintained control by requiring
DPA to obtain approval at several steps in the develop-
ment process.

     The  implementation  of the redesigned system had
 been in progress for about seven months when it became
 necessary to begin  the conversion effort. Laboratory
 project commitments also required the continuation of
 the development of the IBM version of the system.
CONVERSION PLAN

     The  following  conversion  plan  was  announced
concurrently  with the procurement announcement  of
the Univac 1110 computer system:

     Step 1:   The user would submit source code, sys-
tems flow charts and sample run streams for all programs
which the user wanted to have converted.

     Step 2:  The user would submit file descriptions of
all files to be converted.

     Step 3:  The user  would submit test data and
sample   results  for  the  program  as run  on  the
IBM 360-50.

     Step 4:  The contractor would submit complete
programs and successful test  runs to  the  user for
concurrence.

     Step 5:   Data Systems Division would convert  all
user production files. This promise was later withdrawn.

No direct contact between the user and the converting
contractor was planned. This procedure was expected to
provide  the major means for converting user  systems
from the IBM 360 system  to the Univac 1110 system.
For reasons discussed below, this plan  did not work for
the  CHESS system.  Yet, it  seemed to work fairly well
for  other systems  in which the  Laboratory has  an
interest, such as SAS.

 CONVERSION, GOOD AND BAD

     It has been concluded that contracted conversion
 for CHESS did not work  well for two reasons:

          The CHESS system, like many others, is a
          changing thing which refuses to remain con-
          stant long enough for someone to convert it.

          The users all tend to underestimate how much
          a system depends upon  the environment in
          which it was developed.

     The Laboratory staff submitted about 60,000 lines
 of source code to be converted under the conversion
 plan. Between the time these programs were submitted
 and the time they were ready for checkout, changed functional
 requirements, program modifications, and incompatible
 architectural changes made by the converting contractor,

-------
made it possible to test only about 10,000 lines of
source code. The remainder were accepted in the clean-
compile state.

     After  consideration of the above  events,  the staff
became  skeptical  that  "turnkey"  conversion  is even
possible for the majority of systems.  Any application
system has to be thought of as being  embedded in an
operating  environment.  The  scope of  the system  is
determined  in  part  by this  environment. Capabilities
essential to the  successful operation of the system are
provided by the environment in one operating system
but  would have to be part of the application system in
another. Cases in point  are the ability to concatenate
files  through Job Control  Language and the ability  to
specify  blocking factors  at the  execution  time. Both
these features are available in the IBM system and absent
in the Univac system. Suppose that  the files labeled F7
in  Figure 3 had been created  using  several  different
blocking  factors and had  all been converted
 using their original blocking factors.  There  would then be
 no way for the next process to run successfully
 for  all  these valid inputs.  A process also has to be  in-
 serted to do the concatenation. In  IBM COBOL there is
 a utility  assignment which can be  either tape or disk,
 specifiable  at run  time. Univac COBOL does not have
 this  capability. The person  converting  the program is
 forced  to make this decision although it is obviously a
 systems decision and is greatly influenced by the policy
 decision made by central site personnel.
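
The extra process described above, one that concatenates several inputs and brings them to a single blocking factor, might look like the following Python sketch. It models a file as a list of physical blocks, as a tape read would return them; the 8-character record length, the sample data, and all function names are assumptions for illustration only.

```python
# Hypothetical sketch of an inserted "concatenate and re-block" process.
# A file is modeled as a list of physical blocks (strings).

RECORD_LENGTH = 8  # assumed fixed logical record length, in characters


def unblock(blocks, record_length=RECORD_LENGTH):
    """Split physical blocks (any blocking factor) into logical records."""
    for block in blocks:
        if len(block) % record_length != 0:
            raise ValueError("block is not a whole number of records")
        for start in range(0, len(block), record_length):
            yield block[start:start + record_length]


def reblock(records, blocking_factor):
    """Group logical records into blocks of one fixed blocking factor."""
    records = list(records)
    return [
        "".join(records[i:i + blocking_factor])
        for i in range(0, len(records), blocking_factor)
    ]


def concatenate_and_reblock(files, blocking_factor):
    """Merge several input files into one file whose blocking factor
    matches what the next program was built to expect."""
    all_records = (rec for blocks in files for rec in unblock(blocks))
    return reblock(all_records, blocking_factor)


# Two F7-style inputs created with different blocking factors (3 and 2).
f7_a = ["REC00001REC00002REC00003", "REC00004"]
f7_b = ["REC00005REC00006", "REC00007REC00008"]
merged = concatenate_and_reblock([f7_a, f7_b], blocking_factor=4)
```

Where the blocking factor is fixed when the program is built, a step of this kind has to run before any program that previously relied on run-time concatenation.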

      Following the  decision of the Laboratory staff to
 perform  the conversion of CHESS, the system was re-
 designed to account  for the  differences in Univac and
 IBM environments;  these changes were translated into
 modifications for  existing programs  and  specifications
  for new programs. No attempt was made to optimize the
  design of the system with respect to the Univac environ-
  ment.  The staff made all blocking factor and device type
  decisions and began to convert existing files to Univac
  media using these  factors; they also assigned  the con-
  version  of  the   programs   to  the  Data  Processing
  Associates (DPA) and monitored the effort according to
  methods  employed  in any other program development.
  Since  then the conversion of CHESS to the Univac has
  gone smoothly. Of course, there are problems, such as a
  bug in ASCII COBOL, which prevents the conversion of
  the variable blocked files. Comprehensive training was
  made  available to the users by the  Data Systems Division
  and was one of the big factors which enabled the staff to
  confidently assume the conversion effort.

      Some  of the  problems encountered in  the con-
  version of CHESS are evident even for the conversion of
SAS which, except for some delays, has gone well. While
the official goal for the conversion of SAS was that all
features available  in  the  IBM version  would  also be
available in the Univac version, fairly restrictive require-
ments  were, in fact, placed on input by the converting
programmers.  The input  now  must be either  cards or a
catalogued tape or disk file written one record per block
in unformatted FORTRAN. A record  may not exceed
3200 characters.  The IBM version had a record length
restriction of  32000 characters and accepted files with
any blocking  factors. Also, IBM  SAS provided its own
sort routine   for small sorts  and interfaced with the
system sort utility for large sorts. The Univac version has
no interface with a system sort utility. Of course, many
of these restrictions are  made inevitable by  the Univac
operating environment, but had these decisions not been
left to the discretion of the converting  programmer,
 there would almost certainly be  fewer input  restrictions
 in the Univac  version.
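
The input restrictions described above (one record per block, no record longer than 3200 characters) lend themselves to a simple pre-check before a file is handed to the converted package. The Python sketch below is hypothetical: the in-memory representation of blocks and the function name are assumptions, and only the two limits come from the text.

```python
# Hypothetical pre-check for the restrictions stated for the Univac SAS input.

MAX_RECORD_CHARS = 3200  # record length limit stated in the text


def check_sas_input(blocks, max_chars=MAX_RECORD_CHARS):
    """Return a list of problems; an empty list means the file is acceptable.

    blocks -- list of blocks, each block a list of record strings
    """
    problems = []
    for n, block in enumerate(blocks, start=1):
        if len(block) != 1:
            problems.append(f"block {n}: {len(block)} records, expected 1")
        for rec in block:
            if len(rec) > max_chars:
                problems.append(f"block {n}: record of {len(rec)} characters")
    return problems


acceptable = [["a" * 100], ["b" * 3200]]
rejected = [["a", "b"], ["c" * 3201]]
acceptable_report = check_sas_input(acceptable)
rejected_report = check_sas_input(rejected)
```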

 INCREASED CAPABILITIES

      Increased  capabilities resulting  from  the design
 changes are as follows:

          Fortran V is more  powerful than  Fortran IV,
          particularly in the area of string manipulation.
          The extra precision provided by  the longer
          word of Fortran V  in many cases  would obvi-
          ate the use of double precision.

           Run stream construction is somewhat simpler,
          but this may be partially due to the decreased
           flexibility.

           Management of program files is probably the
           single  most  impressive  feature  of  the
           Univac 1110  system.  Source  programs,  run
           streams,  and   executable  programs,  are  all
           easily established  as  program  files. Powerful
           software is available to edit these  files. Source
           decks will probably no longer be  used  except
           as a means to establish the initial  version of a
           file. Up to five updates of a given file can be
           kept and easily referenced.

   DECREASED CAPABILITIES

       Decreased  capabilities  resulting from the design
   changes are  as follows:

           The  Univac  System  is very weak  in utility
           support  such as  copy routines for  data files
           and  stand-alone  sorts.  Particularly  if  one

-------
         writes in COBOL and it is not feasible to use
         internal  sort  capabilities, programs  may  be
         built just  to perform the sort steps. There is a
         file  dump program called CPDMPH but it does
         no  formatting and is useful  only as a last
         resort.

         Many decisions which were previously made at
         run  time, must  now be made at the time the
         program is designed.  Some of these decisions
         concern:  (1) the number of input files for a
         given external name,  (2) the blocking factor,
         and (3) the  storage device.  Fixing blocking
         factors at program design time will probably
         be a blessing once the conversion period has
         passed.

         There is no adequate batch terminal support.
         Univac batch terminals  do not provide any
         indication that a job was successfully entered.
         Likewise there is no way  to determine  through
         the  batch terminal what  the status of a job is.
         The absence of such  feedback  has hampered
         the  process of learning to use the terminals.

         Files  written by  different  processors  are
         incompatible. In the Univac operating system
         concept,  much  of the file management has
         been  delegated to the  language  processors.
         Each language processor has its own way of
         writing files. As a result, users must introduce
         at least one  more process  into the converted
         systems just  to reformat the file for statistical
         programs which follow.

         Input formats of analysis packages are incom-
         patible.  The input  to  SAS  is  FORTRAN
         unformatted  binary.  The  input  to SPSS is
         FORTRAN  formatted.  These  two  packages
         are  not totally interchangeable in function,
         and if both  are needed, parallel files  must be
         provided.

         Management  of data  files  is   inadequate.
         Putting  multiple  files  on  tape,  identifying
         them, and  using them  seems to be an  un-
         reasonably difficult task.
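
The need for parallel files when both analysis packages are used can be illustrated with a small Python sketch that writes the same rows once as packed binary and once as fixed-width text. The layouts here are stand-ins chosen for the example; they are assumptions and do not reproduce the actual Univac FORTRAN unformatted or formatted record structures.

```python
# Hypothetical sketch: produce parallel binary and formatted copies of one file.
import struct

# Assumed layout: family id (integer) and two measurements (floats).
BINARY_LAYOUT = struct.Struct(">iff")   # stands in for unformatted binary output
TEXT_LAYOUT = "{:6d}{:8.2f}{:8.2f}"     # stands in for a FORMAT such as (I6,2F8.2)


def to_binary(rows):
    """Pack each row into a fixed 12-byte binary record."""
    return [BINARY_LAYOUT.pack(fid, a, b) for fid, a, b in rows]


def to_formatted(rows):
    """Render each row as a 22-character fixed-width text record."""
    return [TEXT_LAYOUT.format(fid, a, b) for fid, a, b in rows]


rows = [(101, 1.5, 2.25), (102, 0.0, 10.0)]
binary_file = to_binary(rows)        # copy intended for a binary-input package
formatted_file = to_formatted(rows)  # copy intended for a formatted-input package
```

Both copies have to be regenerated whenever the underlying data changes, which is the extra reformatting step the list above describes.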

MIXED BLESSING

     Using a  large fixed  mass storage device is a new
concept  for  the  Laboratory staff.  Instead of having
many small  files on tape, they may  now be stored on
mass storage as catalogued files. If the system runs out
of disk space, it rolls  some of the less frequently refer-
enced files out to tapes, a process which is almost trans-
parent to the user. This concept greatly improves the
ability to perform production data processing. But in the
process of rolling files in and  out, files have been lost.
Sometimes the  latest  good system  backup is one week
old.  The whole concept of file backup has to be recon-
sidered to keep from permanently losing data.
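
The roll-out behavior described above can be pictured with the Python sketch below, which picks the least frequently referenced files until enough space is freed. The selection rule, file names, sizes, and reference counts are all assumptions for illustration; the actual policy of the Univac system is not documented here.

```python
# Hypothetical sketch of choosing files to roll out from mass storage to tape.


def choose_rollout(files, space_needed):
    """Pick the least frequently referenced files until enough space is freed.

    files -- dict: name -> (size, reference_count)
    """
    chosen, freed = [], 0
    # Visit files in order of increasing reference count.
    for name, (size, refs) in sorted(files.items(), key=lambda kv: kv[1][1]):
        if freed >= space_needed:
            break
        chosen.append(name)
        freed += size
    return chosen


catalogue = {
    "ARD-MASTER": (500, 40),  # large, heavily used
    "OLD-F6": (300, 1),
    "OLD-F7": (200, 3),
}
to_tape = choose_rollout(catalogue, 400)
```

Because any rolled-out file depends on the tape copy being good, a user-level backup of files selected this way would guard against the losses mentioned above.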

CHANGES TO CHESS SYSTEM

     One  of the biggest  factors in  the development of
future systems is the availability of a powerful Data Base
Management System, System 2000, on the Univac 1110
System. Establishment of the Linkage File is being con-
sidered as well as the Master File shown in Figure 2 as
System 2000 Data Bases. Such a move  would offer in-
creased  flexibility in  terms of responding to uncertain
requirements for  statistical analyses. However, many
additional people, mostly analysis  programmers, would
have to be  trained in the use  of System 2000 to allow
the change to be of any real benefit.

     Several other design changes are being considered,
such as the on-line input and correction to data.

RECOMMENDATIONS

     For similar conversions in the future, the following
approaches are recommended:

     1.   Those systems which  are reasonably stable and
have a  clearly-defined  function, such as SAS, can be
turned over to  a contractor to convert as a system. Even
for these systems, the primary communication channel
must be between the  user and the contractor converting
the  system. The user must approve all design changes
before they are implemented.

     2.  For other systems, the user must redesign the
system to be compatible with  the new operating system.
A level-of-effort contract can  be helpful to convert and
modify programs and to convert files. But the important
 point is that this contract is an extension of the user and
 not an alternative route for the conversion.

      3.  Put first priority on making the batch terminals
 intelligent.  This in itself would have a surprisingly posi-
 tive effect  on getting off the  old machine and onto the
 new.

-------
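[Figure 1. Relationships between the CHESS studies for a single round.
Overlapping clouds in the original are labeled: Chronic Respiratory
Disease / Lower Respiratory (CRD-LRD); Acute Respiratory Disease (ARD);
Acute Episodes; Asthma Panels; Adult Panels; Pulmonary Function (PFT).]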

-------
[Figure 2. Overall system flow indicating the language resources used.
Labels in the original: Inputs; COBOL; Master File; Linkage File; SPSS;
SAS; FORTRAN; Analyses.]

-------
 1.  This first step  in  the  ARD  system depends  upon  the CRD
    Master  File  having  been completed.  The forms  printed
    are readable on  a Sentry 70 Optical  Mark Reader.
    Interviewers visit  the  families  selected and attempt
    to enroll them in the study.   Any listing  errors  in the
    information  about the family are  corrected  at that time.

2.  The Master File  has two functions.  First  it is
    used to print all interview forms and  secondly  it
    will contain all background and  response information
    for a family.  Direct access  is  required because
    interview forms  are randomized and variable  length
    records are  used because  of the  distribution of
    family sizes.

3.  All identifying  information is printed  on  the form and
    also slugged  in machine  readable format.   Each
    family is called every  two  weeks for 32 weeks.
4.  The forms are read on a Sentry 70 Optical Mark Reader
    to 9 track 1600 bpi EBCDIC tape.
5.  Logical errors are spotted and recorded in file F6
    along with each record somewhat reformatted.  There
    is one record per form.

6.  The output F7 is the corrected file and is optional.
    It is only created after the deck of corrections has
    been cleaned up and there are no remaining errors
    which could be corrected.

7.  Again the output is optional.  If there are any
    discrepancies one more cycle back through Step 6
    is done.  All periods are batched together for this
    process.  When the data is as clean as is
    reasonable, the output file is updated.

8.  The records are reformatted to be consistent with
    standard packages.  It is also necessary to compute
    certain intermediate values.

9.  Canned packages are used wherever possible.  Some
    user written analysis programs are usually required.

10. The data for the current round is added to the data
    for previous rounds.  Information for families in the
    study for more than one year is tied together before
    storing.

                                    Figure 3
                  Description of the Acute Respiratory Disease Subsystem

-------
                                  PRESENT STATE OF EDP POLICY AND
                                         DEVELOPMENT IN EPA*

                                           By Theodore R. Harris
    This article  presents  some items currently under-
taken  by  the  Management  Information  and  Data
Systems Division  (MIDSD) of the Environmental Protec-
tion Agency (EPA).

AGENCY-WIDE EDP PLAN

    The  House  of  Representatives  in  Report
No. 93-1120 has  given  EPA a  directive to conduct a
detailed Agency-wide study of its ADP requirements and
to report its findings to the House and Senate Appropri-
ation committees. No additional ADP equipment is to be
purchased until the study  is completed. The milestones
for the plan are shown below. The completed plan is to
be submitted January 2, 1975.

        Agency - Wide EDP Plan Development

  8/  7/74 Statement of work for assistance and con-
          sultation in development of plan

  9/  1/74 Award of contract

12/11/74  Review draft of EDP Plan from contractor
12/28/74  Delivery of final  EDP Plan from contractor

  1/  2/75 Submittal of EDP Plan to the Whitten
          Committee

WASHINGTON COMPUTER CENTER PROJECT

    This project was established  to determine the ADP
requirements  of  the  Agency.  The  milestones for  the
project are shown below.

         Washington Computer Center Project

††  9/ 2/74  Feasibility Study, first draft

†† 10/ 1/74  Feasibility Study incorporating Tele-
             communications, second draft (mail out
             to members of Task Force)

†† 10/16/74  Fourth meeting of Task Force (to approve
             study)
†† 11/20/74  EPA review and approval of study (Messner
             and Alm)

 I/ 2/75  Whitten approval of Feasibility Study

 4/ 1/75  Development of RFP

 5/ 1/75  EPA review and approval of RFP

 6/ 1/75  GSA review and approval of RFP

 7/ 1/75  Issue RFP to public

 10/ 1/75  Question and answer period, receipt of
          proposals

 5/ 1/76  Review and evaluation of proposals

 8/ 1/76  Cost evaluation and audit of proposals

 10/ 1/76  Selection and award

 4/ 1/77  Delivery and acceptance of equipment/
          service

 7/  1/77  Test and acceptance

 11/ 1/77  Conversion of work load (vendor dependent
          from 6 to 18 months)


 INTERIM EDP RESOURCES PROJECT

     The purpose of this project is to procure an interim
 computer resource to fill  the gap between  Optimum
 Systems Inc.  (OSI)  and  the Washington  Computer
 Center. The milestones for this project are shown below.

            Interim EDP Resources Project

   8/ 5/74   Decision to establish project

 12/ 1/74   Development of RFP
* Dates have been updated as of December 5, 1974.
† Project has been inactivated. The Interim EDP Resources Project has taken precedence to provide computer resources by July 1975.
  The Washington Computer Center Project will be activated after conversion to Interim Computer Resource.
†† Milestones were not met.

-------
    1/ 2/75   EPA review and approval of RFP

    2/ 1/75   GSA review and approval of RFP

    2/15/75   Issue RFP to public

    4/ 1/75   Question and answer period, receipt of
             proposals

    5/ 1/75   Review and evaluation of proposals

    6/ 1/75   Cost evaluation and audit of proposals

    7/ 1/75   Selection and award

   11/ 1/75   Delivery and acceptance of equipment/service

   12/ 1/75   Test and acceptance

    4/ 1/76   Conversion of work load
  STANDARD  EDP COMMUNICATING TERMINAL
  PROJECT

       The goal of this project is to standardize types of
  terminals and  procurement of the terminals for the
  Agency. Preliminary specifications were sent out in June
  with responses  coming back in late July. The milestones
  are  shown below. As of October 1, issue of the RFP
  is imminent.
     Standard EDP Communicating Terminal Project

    8/15/74   EPA/MIDSD review and approval of RFP

    9/24/74   GSA review and approval of RFP

   12/18/74   Issue RFP to public

    2/ 1/75   Question and answer period, receipt of
            proposals

   3/ 1/75   Review and evaluation of proposals for all
             Categories I, II, III and IV

   4/ 1/75   Selection and award Categories I and II

   5/ 1/75   Benchmark period for Categories III and IV

   6/15/75   Selection and award Categories III and IV
    SUMMARY OF TERMINAL DESCRIPTION

    Category             Description

        I.A       Low-speed typewriter style
                  Quality impact print
                  Off-line text editing

        I.B       Low-speed typewriter style
                  General purpose portable

        I.C       Low-speed typewriter style
                  General purpose nonportable

       II.A       Display terminals
                  General purpose A/N
                  Display

       II.B       Display terminals
                  Graphics

      III.A       Medium-speed remote job
                  Entry terminal

      III.B       High-speed remote job
                  Entry terminal

       IV.A       Remote job entry with
                     concurrent processing
                     capabilities
                  Data entry oriented

       IV.B       Remote job entry with
                     concurrent processing
                     capabilities
                  Scientific oriented
EDP DATA COMMUNICATIONS NETWORK

    A feasibility study contract was awarded to deter-
mine  EPA  network  feasibility. The study has been
completed.  A common network with a design linking all
users  to  the two  major computer resources of  the
Agency has  been recommended.

    Figure  1  shows the EPA network WATS telephone
service.  Figure 2  shows  the  multiplexor  placement.
Figure 3  shows a  typical  multiplexor configuration.
Figure 4  shows a  functional  representation  of  the
network. The milestones of the project are shown below.

-------
        EDP Data Communications Network

 6/10/74  Date of contract award for "Feasibility of
          an EPA Data Communications Network"

 9/18/74  Delivery of Data Communications Feasi-
          bility Study Report

12/ 1/74  Initiation of Phase II, produce RFP specs

12/15/74  Delivery of RFP from contractor

 1/ 2/75  EPA review and approval of RFP

 I/ 2/75  Whitten approval of feasibility study

 2/ 1/75  GSA review and approval of RFP
 2/15/75  Issue RFP to public

 4/  1/75  Question and answer period, receipt of
          proposals

 5/  1/75  Evaluate proposals

 6/  1/75  Cost evaluation and audit of proposals

 7/  1/75  Selection and award

11/  1/75  Delivery and acceptance of equipment/
          service

12/  1/75  Test and acceptance

 4/  1/76  Conversion of work load

-------

-------

-------
[Figure: network diagram. Labels in the original: High Speed Channel
Modem; Medium Speed Modem; Network Communications Controller
(Front-end); Texas; Washington, D.C.]

-------
[Figure: Remote EPA Multiplexed Users. Labels in the original: 9600 Baud
Channels (14); 7200 Baud Channel (1); Medium Speed Channels (21); Remote;
Network Communications Controller; Local Telephone Service;
Washington, D.C.
Cities serviced, as extracted (grouping by channel type is uncertain):
  Seattle (2), Chicago (2), Atlanta (1), Corvallis, Ore. (1),
  Las Vegas (1), Athens, Ga. (1), New York City (1), Boston (1),
  Dallas (1), Philadelphia (1);
  Atlanta (1);
  Atlanta (1), RTP (1), Cincinnati (1), Philadelphia (1);
  Seattle, Denver (2), Chicago (1), San Francisco (1),
  Portland, Ore. (1), Corvallis, Ore. (1), Las Vegas (1),
  Athens, Ga. (1), Grosse Ile, Mich. (1), New York City (1),
  Boston (1), Narragansett, R.I. (1), Rochester, N.Y. (1),
  Roseville, Minn. (1), Dallas (1), Madison, Wis. (1),
  Jackson, Miss. (1), Kansas City, Mo. (1).]

-------
                       STATUS OF THE WASHINGTON COMPUTER CENTER PROCUREMENT

                                                 By Denise Swink
       The Washington  Computer Center (WCC)  Task
  Force was established by  the Office of Planning and
  Management to evaluate the need for and the resources
  required  to  implement and maintain an Agency-wide
  consolidated ADP facility.  The membership of the task
  force is  composed of a representative from each of the
  Assistant Administrators' and  Regional Administrators'
  offices.

       The first meeting of the WCC Task Force was held
  February 4, 1974, to review the General Electric (GE)
  Study for content and acceptability. The GE Study de-
  fined the Agency's  data  processing  requirements and
  recommended alternative courses of  action to support
  the  requirements. Members were to produce a response
  to the  GE Study by the next  meeting  in  terms  of
  selecting  an  alternative,  planning necessary  personnel
  support  for the alternative, and planning for  additional
  ADP funding to support the alternative.

       The second meeting of the WCC Task Force was
  held  March 15, 1974.  At  that time, the consensus  of
  opinion  was that  the GE Study should be rejected and
  that a new feasibility study should be developed by the
  Task Force. Reasons  for eliminating the GE Study from
  consideration  and  evaluation were based on  the facts
  that  the  study was  out-of-date (two years old at the
  time) and that the workload information presented  in
  the  study was inaccurate. Consequently, a Validation
  Committee was established  to  perform an Agency-wide
  workload survey.

      The  third meeting of the WCC Task Force was held
  April 29,  1974. The Validation Committee presented the
  findings of the workload survey noting that at the time
  they  had only a 50 percent response. A  revalidation
  cycle was suggested by the committee to complete the
  survey and correct any erroneous data. Michael Springer,
  Chairman of the WCC Task  Force, then established five
  working groups with functions as follows:

      1.  General coordination  of other groups' activities
  and higher-management interaction.

      2.   Workload,   Expansion,   and  Feasibility
  Study-production  of a formal document (WCC Feasi-
  bility Study) which defines the  Agency's data processing
 workload requirements now and in the future, and rec-
 ommends  a  course  of  action  to  support the
 requirements.

     3.   Hardware, Software, and Telecommunications-
 production of RFP specifications  to implement  action
 recommended in the WCC  Feasibility Study in the areas
 of hardware, software, and  telecommunications.

     4.   Facilities  Management  and Security-
 production of RFP specifications  to implement  action
 recommended in the WCC  Feasibility Study in the areas
 of facilities management and security.

     5.   Conversion  and  Benchmark-production  of
 RFP specifications to  convert and benchmark the EPA
 workload on  the  action  recommended  in  the WCC
 Feasibility Study.

     The fourth meeting of the WCC Task Force hinges
 on the completion of the  WCC Feasibility Study since
 three of the working groups cannot produce documents
 until the  results of the  feasibility study are available.

     The Workload,  Expansion and Feasibility  Study
 Group began meeting  May 16,  1974. After completing
 the Validation Survey, the group  designed, developed,
 and wrote the Feasibility Study. The group met through
July at three-week intervals, five days in August, and
 four  days in September. The final draft of the Feasibility
Study is  now in the review process by the group. The
study includes the following:

         Definition   of   present services  including
         workload  data and reasons for  using  a par-
         ticular service

         Definition of problems  with present  service
         arrangements

         Definition of requirements

         Definition of feasible alternatives

         Analysis of alternatives (technical, managerial,
         and cost)

         Recommended action.
296

-------
     It should be noted that the Washington Computer
Center was considered a misnomer by the group since
there is no requirement that the facility for the im-
plementation of possible alternatives be physically
located in the Washington, D.C. area.

     The WCC Feasibility Study along with the Tele-
communications Feasibility Study (directed by Manage-
ment Information and Data Systems Division), and the
EDP Plan required by Congress will be used by higher
management to procure, beginning in fiscal year 1978,
the ADP services required to support EPA programs for
a life cycle of eight to ten years.
                                                                                                              297

-------
                                        APPENDIX A

                                    LIST OF ATTENDEES
 Allen, Ralph G.
     Senior Programmer
     D. P. Associates

     Grosse Ile Laboratory
     National Environmental Research Center
     9311 Groh Road
     Grosse Ile, Michigan 48138

 Barrow, David R.
     Systems Programmer
     Surveillance and Analysis Division

     Southeastern Environmental Research Laboratory
     Environmental Protection Agency
     College Station Road
     Athens, Georgia 30601

 Bliss, James D.
     Environmentalist
     Monitoring Operations Laboratory

     National Environmental  Research Center
     Environmental Protection Agency
     P.O. Box 15027
     Las Vegas, Nevada 89114

Borthwick, Patrick W.
     Biologist/ADP Coordinator
     Gulf Breeze Environmental Research Laboratory

     Gulf Breeze Environmental Research Laboratory
     National Environmental Research Center
     Environmental Protection Agency
     Sabine Island
     Gulf Breeze, Florida 32561

Broadway, Jon A.
     Supervisor, Computer Services Section
     Oil and Special Materials Division, Office of Radiation Programs

     Environmental Protection Agency
     P.O. Box 3009
     Montgomery, Alabama 36109
                                                                                        A-1

-------
Brooks, Dorothy
     Computer Programmer
     D. P. Associates

     Grosse Ile Laboratory
     National Environmental Research Center
     9311 Groh Road
     Grosse Ile, Michigan 48138

Bryan, Sam D.
     Mathematician
     Clinical Studies Branch

     Human Studies Laboratory
     National Environmental Research Center
     Environmental Protection Agency
     Research Triangle Park, North Carolina 27711

Budde, Bill
     Chief, Advanced Instrumentation Section
     Advanced Instrumentation Section

     Methods Development and Quality Assurance Research Laboratory
     National Environmental Research Center
     Environmental Protection Agency
     Cincinnati, Ohio 45268

Burton, Judy K.
     Computer Programmer
     Laboratory Services Branch (PNERL)

     National Environmental Research Center
     Environmental Protection Agency
     200 S.W. 35th Street
     Corvallis, Oregon 97330

Byram, Kenneth V.
     Computer Specialist
     Laboratory Services Branch (PNERL)

     National Environmental Research Center
     Environmental Protection Agency
     200 S.W. 35th Street
     Corvallis, Oregon 97330
A-2

-------
 Bystroff, Roman I.
     Chemist
     Lawrence Livermore Laboratory

     Lawrence Livermore Laboratory
     P.O. Box 808, L-404, Chemistry Department
     Livermore, California 94550

 Cline, David M.
     Computer Systems Analyst
     Southeast Environmental Research Laboratory

     Southeast Environmental Research Laboratory
     National Environmental Research Center
     Environmental Protection Agency
     College Station Road
     Athens, Georgia 30601

 Conger, Charles S.
     Chief, Information Access and User Assistance Branch
     Monitoring and Data Support Division

     Office of Water and Hazardous Materials (AW-453)
     Environmental Protection Agency
     Washington, D.C. 20460

 Dell, Robert
     Computer Engineer
     Central Regional Laboratory

     Region V-CRL
     Environmental Protection Agency
     1819 Pershing Road
     Chicago, Illinois 60609

Dipert, Merlin H.
    Chief,  Data Systems Branch, Management Division
     Data Systems Branch, Management Division

     Region V
    Environmental Protection Agency
     1 North Wacker Drive
    Chicago, Illinois 60606
                                                                                        A-3

-------
Fairless, William
    Deputy Director and Chief, Chemistry Branch
    Central Regional Laboratory

    Region V-CRL
    Environmental Protection Agency
    1819 W. Pershing Road
    Chicago, Illinois 60609

Feldmann, Richard
    Computer Specialist
    Computer Center Branch

    National Institutes of Health
    Division of Computer Research and Technology
    Bethesda, Maryland 20014

Florence, Cecil E.
    Chief, Data Processing Branch
    Data Processing Branch, Management Division

    Region VII
    Environmental  Protection Agency
     1735 Baltimore
    Kansas City, Missouri 64108

Friedland, Michael J.
    Systems Analyst/Programmer
    Data Services Branch, (TSL)

    National Environmental Research Center
    Environmental  Protection Agency
    P.O. Box 15027
     Las Vegas, Nevada 89114

Gangler, James
    Programmer/Analyst
    Vitro Laboratories

    Vitro Laboratories
     14000 Georgia  Avenue
    Silver Spring, Maryland 20910
A-4

-------
 Goldberg, Neal
     Computer Programmer
     Ecological Research Branch

     National Marine Water Quality Laboratory
     National Environmental Research Center
     Environmental Protection Agency
     P.O. Box 277
     West Kingston, Rhode Island 02982

 Greaves, John O. B.
     Assistant Professor
     Southeastern Massachusetts University

     Department of Electrical Engineering
     Southeastern Massachusetts University
     North Dartmouth, Massachusetts 02747

 Harris, Theodore R.
     Computer Specialist
     Management Information & Data Systems Division

     Office of Planning and Management (PM-218)
     Environmental Protection Agency
     Washington, D.C. 20460

 Hart, John J.
     Chief, Systems Analysis and Programming Branch
     Computer Services and System (OA)

     National Environmental  Research Center
     Environmental Protection Agency
     5555 Ridge Avenue
     Cincinnati, Ohio 45268

Heller, Stephen R.
     Computer Specialist
     Management Information and Data Systems Division

     Office of Planning and Management (PM-218)
     Environmental Protection Agency
     Washington, D.C. 20460

Hertz, Marvin
     Chief, Systems Engineering Section
     Bio-Environmental Measurement Branch

     Human Studies Laboratory
     National Environmental Research Center
     Environmental Protection Agency
     Research Triangle Park, North Carolina 27711
                                                                                        A-5

-------
Holsomback, Will F.
     Computer Specialist
     Surveillance and Analysis Division

     Southeastern Research Laboratory
     Environmental Protection Agency
     College Station Road
     Athens, Georgia 30601

Johnson, Maureen M.
     Computer Specialist
     Data Systems Division (OA)

     National Environmental Research Center
     Environmental Protection Agency
     Research Triangle Park, North Carolina 27711

Johnson, Richard
     Technical Information Specialist
     Air Pollution Technology Information Center

     Air Pollution Technology Information Center
     Room 255
     Chemstrand Building
     Research Triangle Park, North Carolina 27711

Jurgens, Robert B.
     Physicist
     Regional Air Pollution Studies Branch

     Meteorology Laboratory
     National Environmental Research Center
     Environmental Protection Agency
     Research Triangle Park, North Carolina 27711

Kinnison, Robert  R.
     Mathematical Statistician
     Monitoring Systems Analysis Staff (MSRDL)

     National Environmental Research Center
     Environmental Protection Agency
     P.O. Box 15027
     Las Vegas, Nevada 89114

Knight, John E.
     Technical Information Specialist
     Special Studies Staff

     National Environmental Research Center
     Environmental Protection Agency
     Research Triangle Park, North Carolina 27711
A-6

-------
 Kyle, Kirby D.
     Physicist
     Bio-Environmental Measurement Branch

     Human Studies Laboratory
     National Environmental Research Center
     Environmental Protection Agency
     Research Triangle Park, North Carolina 27711

 Lackey, Curtis S.
     Chief, Data Systems Branch
     Data Systems Branch, Management Division

     Region IV
     Environmental Protection Agency
     1421 Peachtree Street
     Atlanta, Georgia 30309

 Laurie, Vernon J.
     Physical Science Administrator
     Office of Monitoring Systems

     Office of Research and Development (RD-687)
     Environmental Protection Agency
     Washington, D.C. 20460

 Lowrimore, Gene R.
     Chief, Data Processing Section
     Biometry Branch

     Human Studies Laboratory
     National Environmental Research Center
     Environmental Protection Agency
     Research Triangle Park, North Carolina 27711

Madsen, Mark J.
     Systems Analyst
     Data Services Branch (TSL)

     National Environmental Research Center
     Environmental Protection Agency
     P.O. Box 15027
     Las Vegas, Nevada 89114
                                                                                        A-7

-------
Male, Larry M.
    Supervisory Operations Research Analyst
    National Ecological Research Laboratory

    National Environmental Research Center
    Environmental Protection Agency
    200 S.W. 35th Street
    Corvallis, Oregon 97330

McCarthy, William N., Jr.
    Chemical Engineer
    Office of Program Management

    Office of Research and Development (RD-674)
    Environmental Protection Agency
    Washington, D.C. 20460

McGuire, John M.
    Chief, Chromatography and Mass Spectrometry Section
    Analytical Chemistry Staff

    Southeast Environmental Research Laboratory
    National Environmental Research Center
    Environmental Protection Agency
    College Station Road
    Athens, Georgia 30601

Myers, Melvin L.
    Center Staff Officer
    Program Coordination Staff

    National Environmental Research Center
    Environmental Protection Agency
    Research Triangle Park, North Carolina 27711

Nime, Edward J.
    Director, Computer Services and Systems
    Computer Services and Systems (OA)

    National Environmental Research Center
    Environmental Protection Agency
    5555 Ridge  Avenue
    Cincinnati, Ohio 45268
A-8

-------
 Ott, Wayne R.
     Systems Analyst
     Office of Monitoring Systems

     Office of Research and Development (RD-687)
     Environmental Protection Agency
     Washington, D.C. 20460

 Poole, Elijah L.
     Computer Specialist
     Management Information and Data Systems Division

     Office of Planning and Management (PM-218)
     Environmental Protection Agency
     Washington, D.C. 20460

 Richardson, William L.
     Environmentalist
     Large Lakes Branch

     Grosse He Laboratory
     National Environmental Research Center
     Environmental Protection Agency

 Rockwell, David C.
     Acting Chief, Data Management Section
     Surveillance and  Analysis Division

     Region V
     Environmental Protection Agency
     1  North Wacker Drive
     Chicago, Illinois 60606

 Rogers, Tommie L.
     Chief, Hardware and Communications Management Section
     Data Systems Division (OA)

     National Environmental Research Center
     Environmental Protection Agency
     Research Triangle Park, North Carolina 27711

Schuk, Walter W.
     Electronics Technician
     Technology Development Support Branch

     EPA-DC Pilot Plant
     5000 Overlook Avenue S.W.
     Washington,  D.C.  20032
                                                                                       A-9

-------
Scotton, John W.
    Technical Information Specialist
    Office of Monitoring Systems

    Office of Research and Development (RD-689)
    Environmental Protection Agency
    Washington, D.C. 20460

Shew, D. Craig
    Research Chemist
    Subsurface Environmental Branch

    Robert S. Kerr Environmental
    Research Laboratory
    National Environmental Research Center
    Environmental Protection Agency
    P.O. Box 1198
    Ada,  Oklahoma 74820

Snelling, Robert N.
    Chief, Data Services Branch
    Data Services Branch (TSL)

    National Environmental Research Center
    Environmental Protection Agency
    P.O. Box 15027
    Las Vegas, Nevada 89114

Swink, Denise
    ADP Coordinator
    Office of Program Management

    Office of Research and Development (RD-674)
     Environmental Protection Agency
    Washington, D.C. 20460

Teuschler, Jack
     Electrical Engineer
     Instrumentation Development Branch

     Methods Development and Quality
     Assurance  Research Laboratory
     National Environmental Research Center
     Environmental Protection Agency
     Cincinnati, Ohio 45268
 A-10

-------
 Tiffany, William C.
     Computer Programmer
     Eutrophication Survey Branch (PNERL)

     National Environmental Research Center
     Environmental Protection Agency
     200 S.W. 35th Street
     Corvallis, Oregon 97330

 Tittle, Catherine
     Manager, Customer Services
     Bowne Timesharing, Inc.

     Bowne Timesharing, Inc.
     1025 Connecticut Avenue N.W.
     Washington, D.C. 20036

 Uchrin, Christopher
     Sanitary Engineer
     Surveillance and  Analysis Division

     Region II
     Environmental Protection Agency
     21 Stonehenge Drive
     Lincroft, New Jersey 07738

Webb, Ronald H.
     Senior Account Representative
     Bowne Timesharing, Inc.

     Bowne Timesharing, Inc.
     1025 Connecticut Avenue, N.W.
     Washington, D.C. 20036

Williams, Edward R.
     Program Manager: Comprehensive Analyses
     Forecasting and Analysis Branch

     Washington Environmental Research Center (RD-691)
     Environmental Protection Agency
     Washington, D.C. 20460
                                                                                      A-11

-------
Williams, Robert T.
     Chief, Waste Identification and Analysis
     Waste Identification and Analysis Section

     Advanced Waste Treatment Research Laboratory
     National Environmental Research Center
     Environmental Protection Agency
     Cincinnati, Ohio 45268

Worley, Donald L.
     Chief, Analysis and System Design Section
     Data Systems Division (OA)

     National Environmental Research Center
     Environmental Protection Agency
     Research Triangle Park, North Carolina 27711
A-12

-------
                                                APPENDIX B

                                           AREAS OF EXPERTISE
                Air Monitoring Systems
Hertz, Marvin
Jurgens, Robert B.
Kyle, Kirby D.
           Analysis, Design, and Programming
Allen, Ralph G.
Barrow, David R.
Brooks, Dorothy
Bryan, Sam D.
Burton, Judy K.
Byram, Kenneth V.
Cline, David M.
Dipert, Merlin H.
Friedland, Michael J.
Gangler, James
Goldberg, Neal
Knight, John E.
Lackey, Curtis S.
Lowrimore, Gene R.
Madsen, Mark J.
Male, Larry M.
McCarthy, William N., Jr.
Ott, Wayne R.
Tiffany, William C.
            Analytical Methods Development
Budde, Bill
             Automated Chromatography
McGuire, John M.
Uchrin, Christopher
                    Basin Planning
                   CHAMP/CHESS
Hertz, Marvin
Kyle, Kirby D.
Lowrimore, Gene R.
                                          Chemical Structure Searching
                             Feldmann, Richard
                             Heller, Stephen R.
                                                                          CLEVER/CLEANS
                                                         Lowrimore, Gene R.
                 Coordination of EPA's Use of Word/One
         Tittle, Catherine
         Webb, Ronald H.
                         Data Base Management
                            Byram, Kenneth V.
                            Bystroff, Roman I.
                            Dell, Robert
                            Florence, Cecil E.
                            Gangler, James
                            Jurgens, Robert B.
                            Knight, John E.
                                     Lowrimore, Gene R.
                                     Madsen, Mark J.
                                     Rockwell, David C.
                                     Scotton, John W.
                                     Snelling, Robert N.
                                     Worley, Donald L.
                                           Data Storage and Retrieval
                            Bliss, James D.
                            Friedland, Michael J.
                            Holsomback, Will F.
                            Knight, John E.
                            Richardson, William L.
                            Worley, Donald L.
                                   Development of Library Reference Spectra
                            Heller, Stephen R.
                            McGuire, John M.
                                                                                                         B-1

-------
       Direction and Coordination of ADP Needs
                                            Laboratory Automation
Byram, Kenneth V.
Harris, Theodore R.
Johnson, Maureen M.
Knight, John E.
Nime, Edward J.
Snelling, Robert N.
Swink, Denise
       Effects of Pollutants on Organism Behavior
Borthwick, Patrick W.
Greaves, John O.B.
                     Engineering
Myers, Melvin L.
Scotton, John W.
                  ENVIR/EDMPAS
Budde, Bill
Byram, Kenneth V.
Bystroff, Roman I.
Cline, David M.
Dell, Robert
Fairless, William
Goldberg, Neal
Heller, Stephen R.
Laurie, Vernon J.
Nime, Edward J.
Ott, Wayne R.
Teuschler, Jack
                                                   Modeling
                             Dipert, Merlin H.
                             Hart, John J.
                             Ott, Wayne R.
                             Richardson, William L.
                             Williams, Edward R.
                                               Ocean Monitoring
                             Uchrin, Christopher
                Equipment Utilization

McCarthy, William N., Jr.


                 Facilities Management

McCarthy, William N., Jr.


         Forecasts and Costs of Pollution Control

Williams, Edward R.
                                          Pollution Monitoring Systems
Rockwell, David C.
 Cline, David M.
 Feldmann, Richard
 Goldberg, Neal
 Jurgens, Robert B.
 Lackey, Curtis S.
 Male, Larry M.
                     Geophysics
                       Graphics
                             Broadway, Jon A.
                             Hertz, Marvin
                             Jurgens, Robert B.
                             Kinnison, Robert R.
                              Bliss, James D.
                              Johnson, Richard
                              Lackey, Curtis S.
                              Lowrimore, Gene R.
                                                          Williams, Edward R.
                              Budde, Bill
                              Heller, Stephen R.
                              McGuire, John M.
                              Shew, D. Craig
                             Kyle, Kirby D.
                             Scotton, John W.
                             Uchrin, Christopher
                             Williams, Robert T.
                                                      SAS
                                                                                  SEAS
                                                                              Spectrometry
 B-2

-------
                  Statistical Analysis

Borthwick, Patrick W.          Lowrimore, Gene R.
Byram, Kenneth V.              Male, Larry M.
Dell, Robert                   Ott, Wayne R.
Kinnison, Robert R.            Poole, Elijah L.
                      STORET
Bliss, James D.
Conger, Charles S.
Friedland, Michael J.
Holsomback, Will F.
           Telecommunications Development
Harris, Theodore R.
McGuire, John M.
Rogers, Tommie L.
                UNIVAC 1110 System
Johnson, Maureen
Rogers, Tommie L.
Worley, Donald L.
         Waste Treatment Research and Analysis
Schuk, Walter W.
Williams, Robert T.
                Water Quality Analysis
Dipert, Merlin H.
Hart, John J.
Richardson, William L.
                                                                                                             B-3

-------
                                               APPENDIX C

                                           AREAS OF INTEREST
             ADP Utilization/Management
Byram, Kenneth V.
Dipert, Merlin H.
Florence, Cecil E.
Johnson, Maureen M.
Knight, John E.
Lackey, Curtis S.
Broadway, Jon A.
Jurgens, Robert B.
Ott, Wayne R.
Myers, Melvin L.
Nime, Edward J.
Poole, Elijah L.
Snelling, Robert N.
Swink, Denise
                    Air Modeling
Johnson, Richard
Jurgens, Robert B.
Knight, John E.
Lackey, Curtis S.
                                                         Harris, Theodore R.
Scotton, John W.
Snelling, Robert N.
Worley, Donald L.
                    EDP Hardware
                                                                Effective Use of Word/One System by OR&D
                             Tittle, Catherine
                             Webb, Ronald H.
          Air/Water Surveillance and Monitoring
Barrow, David R.
Bliss, James D.
Lackey, Curtis S.
Ott, Wayne R.
Rockwell, David C.
Uchrin, Christopher
         Chemical/Organic Analysis of Pollutants
 Budde, Bill
 Fairless, William
 Heller, Stephen R.
 Shew, D. Craig
 Williams, Robert T.
                 Data Base Management
 Barrow, David R.
 Byram, Kenneth V.
 Conger, Charles S.
 Gangler, James
 Hertz, Marvin
 Laurie, Vernon J.
 Lowrimore, Gene R.
 Madsen, Mark J.
 Ott, Wayne R.
 Rockwell, David C.
                                 Environmental Residuals and Land Use Modeling

                             Williams, Edward R.
                                                   Graphics
                             Borthwick, Patrick W.
                             Cline, David M.
                             Conger, Charles S.
                             Feldmann, Richard
                             Florence, Cecil E.
                             Goldberg, Neal
                             Harris, Theodore R.
                             Jurgens, Robert B.
                             Lackey, Curtis S.
                             Madsen, Mark J.
                             McCarthy, William N., Jr.
                             Snelling, Robert N.
                             Swink, Denise
                 Hardware Dependability

 Rogers, Tommie L.


        Information Storage and Retrieval Systems

 Johnson, Richard
 Knight, John E.
 Nime, Edward J.
 Scotton, John W.
                                                                                                            C-1

-------
                 Laboratory Automation
                                             Programming Techniques
 Borthwick, Patrick W.
 Broadway, Jon A.
 Budde, Bill
 Byram, Kenneth V.
 Bystroff, Roman I.
 Cline, David M.
 Fairless, William
 Hart, John J.
 Heller, Stephen R.
 Laurie, Vernon J.
 McCarthy, William N., Jr.
 Ott, Wayne R.
 Richardson, William L.
 Scotton, John W.
 Snelling, Robert N.
 Swink, Denise
 Teuschler, Jack
 Williams, Robert T.
      Mathematical Analysis, Simulation, and Modeling
 Byram, Kenneth V.
 Cline, David M.
 Dipert, Merlin H.
 Johnson, Richard
 Kinnison, Robert R.
 Lowrimore, Gene R.
 McCarthy, William N., Jr.
Nime, Edward J.
Ott, Wayne R.
Poole, Elijah L.
Richardson, William L.
Schuk, Walter W.
Uchrin, Christopher
Williams, Edward R.
                     Minicomputers
 Allen, Ralph G.
 Bryan, Sam D.
 Bystroff, Roman I.
 Cline, David M.
 Dell, Robert
 Feldmann, Richard
Greaves, John O.B.
Hertz, Marvin
Knight, John E.
Kyle, Kirby D.
Shew, D. Craig
Swink, Denise
             Operating Systems and Analysis
 Bryan, Sam D.
 Cline, David M.
 Dell, Robert
 Greaves, John O.B.
 Kyle, Kirby D.
 Uchrin, Christopher
                  Operations Support
Rogers, Tommie L.
 Byram, Kenneth V.
 Gangler, James
 Madsen, Mark J.
             Quality Control in Laboratories
 Borthwick, Patrick W.
 Fairless, William
 Florence, Cecil E.
 Hertz, Marvin
 Ott, Wayne R.
 Schuk, Walter W.
        Real-Time Data Acquisition and Handling
Brooks, Dorothy
Cline, David M.
Dell, Robert
Feldmann, Richard
Goldberg, Neal
Swink, Denise
      Scientific/Statistical Applications and Packages
Borthwick, Patrick W.
Brooks, Dorothy
Bryan, Sam D.
Burton, Judy K.
Byram, Kenneth V.
Friedland, Michael J.
Johnson, Richard
Jurgens, Robert B.
Kinnison, Robert R.
Lowrimore, Gene R.
Male, Larry M.
Nime, Edward J.
Poole, Elijah L.
Richardson, William L.
Swink, Denise
                                             Software Development
                             Harris, Theodore R.
                             Johnson, Maureen M.
                             Scotton, John W.
                                                 Spectrometry
                                                           Broadway, Jon A.
                                                           Budde, Bill
                                                           Heller, Stephen R.
                                                           McGuire, John M.
                                                           Shew, D. Craig
C-2

-------
                 STORET

Allen, Ralph G.            Friedland, Michael J.
Barrow, David R.           Holsomback, Will F.
Bliss, James D.             Richardson, William L.
Brooks, Dorothy           Rockwell, David C.
Byram, Kenneth V.          Tiffany, William C.
Florence, Cecil E.
              Telecommunications
Bryan, Sam D.
Conger, Charles S.
Dell, Robert
Harris, Theodore R.
Rogers, Tommie L.

-------