-------
[Figure residue: RAPS diagram. Legible box labels: RAPS; Data Management; Model Development and Evaluation; Aerometric Measurements; Emissions Inventories; System Validation; RAPS Data Bank; Data System Facilities; Chemical Transformation Models; Meteorological Models; Air Quality Models; RAMS; Field Expeditions; Airborne; Point Sources; Line Sources; Area Sources; Airports.]
-------
[Diagram residue. Legible box labels: User; Computing Equipment; Command Program; Computing System Library Routines; Command Program Routines; Graphics; Models; Reporting; Analysis; Data Management Routine; Data Base.]
Figure 2
RAPS Data Base and Graphics
System Interfaces
-------
CAPABILITIES OF TEKTRONIX SOFTWARE
By David M. Cline
INTRODUCTION
Graphical display computer terminals have been in
use for a number of years. Until recently, the display
units were used essentially in a refresh mode and were,
by necessity, connected directly to the computer that
utilized them. With the advent of a low-cost graphic
display storage unit, the use of graphics in a time sharing
mode became very cost effective. The number of graphical
display computer terminals that the Environmental
Protection Agency (EPA) has acquired over the last two
years indicates a commitment from the user community
that graphical support is required.
As early as 1972, EPA personnel were utilizing
Tektronix 4010 graphical display computer terminals at
the Boeing Computer Services (BCS) IBM 360/67 com-
puter facility at Wichita, Kansas, and at the National
Institutes of Health (NIH) PDP-10 computer facility at
Bethesda, Maryland.* Subsequently, the EPA contract
with BCS was terminated in the spring of 1973, and it
was not until late July 1973 that graphical support was
available at the Optimum Systems Corporation (OSI)
facility. The OSI graphic facilities were chosen because
they are available to all EPA computer users. The
PDP-8/E graphic capabilities are discussed as they are
available to a number of EPA installations.
GRAPHICS AT OSI
Graphical support at OSI is available on the
IBM/360 model 155 and may be accessed via IBM's
Time Sharing Option (TSO). The OSI systems group
made modifications to the Telecommunications Access
Method (TCAM) that were necessary for graphical support.
Southeast Environmental Research Laboratory
(SERL) personnel installed the Terminal Control System,
the Advanced Graphing II, and the Calcomp Preview
packages. The subroutine TINPUT, which allows
graphical input, was in error, as were several of the
Advanced Graphing II subroutines; however, all were
corrected before being made available as public files.
TERMINAL CONTROL SYSTEM
The Terminal Control System (TCS) package is the
most fundamental of the packages and is used by the
other graphics packages. It is written completely in
FORTRAN IV in the form of 59 subroutines and is
described in detail in the Tektronix document,
062-1474-00, entitled "Terminal Control System - 4010,
User's Manual." The subroutines are saved in
a cataloged dataset, CNA324.DMC.TCS.
It is assumed that the user has a knowledge of TSO
to the extent that a FORTRAN dataset may be created
and compiled. During the LINK or LOAD step, the
library of subroutines may be accessed by the following
type of command:
LINK your-dataset-name LIB('CNA324.DMC.TCS') FORTLIB
To utilize the subroutines, the FORTRAN
files FT05F001 and FT06F001 must be allocated to
your terminal. This is accomplished by issuing the following
commands to TSO:
ALLOCATE DA(*) F(FT05F001)
ALLOCATE DA(*) F(FT06F001)
or by executing the following TSO CLIST:
EXEC 'CNF324.DMC.FT5.CLIST' LIST (See Appendix.)
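The file names above follow the IBM FORTRAN IV convention in which unit nn is reached through the data definition name FTnnF001; units 5 and 6 are the conventional input and output units. A minimal sketch in Python of that naming rule follows. The function names are hypothetical and serve only as illustration; they are not part of TSO or of the Tektronix packages.

```python
def fortran_ddname(unit: int, sequence: int = 1) -> str:
    """Build the ddname for a FORTRAN unit number (illustrative helper)."""
    if not 0 <= unit <= 99:
        raise ValueError("unit must fit in two decimal digits")
    if not 1 <= sequence <= 999:
        raise ValueError("sequence must fit in three decimal digits")
    return f"FT{unit:02d}F{sequence:03d}"


def allocate_commands(units=(5, 6)):
    """Return the TSO ALLOCATE commands that attach each unit to the terminal."""
    return [f"ALLOCATE DA(*) F({fortran_ddname(u)})" for u in units]


if __name__ == "__main__":
    for command in allocate_commands():
        print(command)
```

Run as a script, the sketch prints the two ALLOCATE commands quoted in the text.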
A TSO CLIST dataset is available to perform the
process of compiling, linking, and executing a
FORTRAN dataset containing calls to the TCS package.
It may be copied to your account by issuing the following
TSO command:
COPY 'CNF324.DMC.TCS.CLIST' TCS.CLIST
(See Appendix.)
The CLIST is used as follows:
EXEC TCS 'dataset name' LIST
where the dataset name is the name of the FORTRAN
dataset to be compiled without the FORT extension. If
the dataset to be compiled and executed is DRAW.FORT,
issue the following command:
EX TCS 'DRAW' LIST (See Appendix.)
*Mention of commercial products does not necessarily constitute endorsement by EPA.
-------
ADVANCED GRAPHING II
The Advanced Graphing II (AGII) package is a
[illegible] collection of FORTRAN subroutines that
allow graphing and labeling of data with little
[illegible] of the TCS package. The subroutines and
examples for using them are described in the Tektronix
document, 062-1530-00, entitled "Advanced Graphing
II User's Manual." The subroutines are saved in a cataloged
dataset CNA324.DMC.AGII.
To use the AGII package, one must allocate the
FORTRAN I/O files FT05F001 and FT06F001 as described
above. A TSO CLIST is available to support the
sequence of commands required to compile, link, and
execute a FORTRAN dataset containing subroutine calls
[text missing in source] LIST (See Appendix.)
CALCOMP PREVIEW PACKAGE
The Calcomp Preview Package consists of a series of
FORTRAN subroutines that are used in lieu of the standard
Calcomp plotter subroutines to give the user the
ability to preview the output of a Calcomp plotter on
the Cathode Ray Tube (CRT). Thus, the user may view
his proposed graph immediately with the interactive
Tektronix software. This capability should prove to be
of tremendous benefit to Calcomp plotter users in terms
of decreased program development time. The subroutines,
with examples for using them, are described in the
Tektronix document, 062-1526-00, entitled "Preview
Routines for Calcomp Plotters, User's Manual."
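The idea of replacing plotter routines with preview routines of the same name can be sketched in Python as two interchangeable devices that accept the same call. The sketch assumes the usual Calcomp PLOT convention, in which a pen code of 2 moves with the pen down and 3 moves with the pen up. The class names, the scale factor of 100 screen units per inch, and the coordinate handling are hypothetical and are not taken from the Tektronix preview routines.

```python
PEN_DOWN = 2  # Calcomp PLOT convention: move with the pen lowered
PEN_UP = 3    # Calcomp PLOT convention: move with the pen raised


class PenPlotter:
    """Stand-in for the hardware plotter: records each pen command."""

    def __init__(self):
        self.commands = []

    def plot(self, x, y, ipen):
        self.commands.append((x, y, ipen))


class ScreenPreview:
    """Same call interface, but collects vectors in screen units."""

    def __init__(self, units_per_inch=100.0):  # hypothetical scale factor
        self.units_per_inch = units_per_inch
        self.position = (0, 0)
        self.vectors = []

    def plot(self, x, y, ipen):
        target = (round(x * self.units_per_inch),
                  round(y * self.units_per_inch))
        if ipen == PEN_DOWN:
            # A pen-down move becomes a visible vector on the screen.
            self.vectors.append((self.position, target))
        self.position = target


def draw_square(device, side=2.0):
    """User program: identical calls regardless of the device linked in."""
    device.plot(0.0, 0.0, PEN_UP)
    for x, y in [(side, 0.0), (side, side), (0.0, side), (0.0, 0.0)]:
        device.plot(x, y, PEN_DOWN)


if __name__ == "__main__":
    preview = ScreenPreview()
    draw_square(preview)
    print(preview.vectors)
```

The user program is unchanged between the two devices, which mirrors the effect of linking a different library under the same subroutine names.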
As with the other Tektronix packages, the
FORTRAN I/O files must be allocated before using the
Calcomp Preview routines. A TSO CLIST is available
which contains the sequence of commands required to
compile, link, and execute a FORTRAN program containing
subroutine calls to the Calcomp replacement
routines.
It may be copied by issuing the following TSO
command:
COPY 'CNF324.DMC.CAL.CLIST' CAL.CLIST
(See Appendix.)
Line 40 of this dataset must be modified to reflect the
name of the dataset containing the Calcomp subroutines
normally used for plotting. Line 40 of the dataset
CAL.CLIST is listed below. The name of the dataset that
must be replaced by the name of the user's dataset is
underscored:
00040 LINK &NAME. LIB('CNF324.DMC.CALPREV.LOAD','CNF324.DMC.CALMIC.LOAD','CNA324.DMC.TCS') FORTLIB
Then, to execute a FORTRAN program that the
user may wish to preview, issue the following command:
EXEC CAL 'dataset name' LIST (See Appendix.)
GRAPHICS ON A PDP-8/E
The PDP-8/E minicomputer system that supports
the Aquatic Ecosystem Simulator (AEcoS) at the Southeast
Environmental Research Laboratory (SERL) is
equipped with a Tektronix 4010 computer display term-
inal, the manufacturer's OS/8 software system, and a
FORTRAN IV software system. A Calcomp plotter for
the computer system has been ordered to complement
the graphics effort at SERL. Compiling the TCS subrou-
tines and writing two machine dependent subroutines,
TINPUT and TOUTPUT, were all that was necessary to
implement the TCS package on the PDP-8/E computer.
Upon arrival of the Calcomp plotter and associated soft-
ware, it will be possible to preview Calcomp plots on the
4010 by simply linking to the Calcomp software replace-
ment routines in much the same manner as on the OSI
facility.
The TCS and Calcomp Preview routines that execute
on the PDP-8/E can be easily exported to other
PDP-8/E's; but the OS/8 and FORTRAN software
cannot, since they were purchased on a licensing agree-
ment on a per machine basis. As the work load at SERL
permits, an attempt will be made to install the AGII
routines on the PDP-8/E.
-------
APPENDIX: LISTING OF TSO GRAPHICS CLISTS
Listing of FT5.CLIST
00010 ALLOCATE DA(*) F(FT05F001)
00020 ALLOCATE DA(*) F(FT06F001)
00030 END
Invoking FT5.CLIST
EXEC FT5 LIST
ALLOCATE DA(*) F(FT05F001)
ALLOCATE DA(*) F(FT06F001)
END
Listing of TCS.CLIST
00010 PROC 1 NAME
00020 FORT &NAME. NOPRINT
00030 WHEN SYSRC(GE 8) END
00040 SCRATCH &NAME..LOAD
00050 LINK &NAME. LIB('CNA324.DMC.TCS') FORTLIB
00060 WHEN SYSRC(GE 8) END
00070 SCRATCH &NAME..OBJ
00080 CALL &NAME.(TEMPNAME)
00090 END
Invoking TCS.CLIST
EXEC TCS 'DRAW' LIST
FORT DRAW NOPRINT
G1 COMPILER ENTERED
SOURCE ANALYZED
PROGRAM NAME = MAIN
* NO DIAGNOSTICS GENERATED
WHEN SYSRC(GE 8) END
SCRATCH DRAW.LOAD
LINK DRAW LIB('CNA324.DMC.TCS') FORTLIB
WHEN SYSRC(GE 8) END
SCRATCH DRAW.OBJ
CALL DRAW(TEMPNAME)
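The relationship between the TCS CLIST listing and the commands echoed when it is invoked can be sketched in Python as a simple text substitution. This is an illustration only, assuming that the symbolic parameter is written as &NAME followed by a period that ends the variable name, so that a doubled period leaves one literal period in the result. The real CLIST processor does considerably more, and the function name here is hypothetical.

```python
# Statements of the TCS CLIST after the PROC statement, without line numbers.
TCS_CLIST = [
    "FORT &NAME. NOPRINT",
    "WHEN SYSRC(GE 8) END",
    "SCRATCH &NAME..LOAD",
    "LINK &NAME. LIB('CNA324.DMC.TCS') FORTLIB",
    "WHEN SYSRC(GE 8) END",
    "SCRATCH &NAME..OBJ",
    "CALL &NAME.(TEMPNAME)",
    "END",
]


def expand(statements, name):
    """Substitute the positional parameter NAME into each statement."""
    # "&NAME." is replaced as a unit: the period only marks the end of the
    # variable name, so "&NAME..LOAD" becomes "<name>.LOAD".
    return [line.replace("&NAME.", name) for line in statements]


if __name__ == "__main__":
    for command in expand(TCS_CLIST, "DRAW"):
        print(command)
```

With the name DRAW, the expanded statements match the commands echoed in the invocation trace.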
Listing of AGII.CLIST
00010 PROC 1 NAME
00020 FORT &NAME. NOPRINT
00030 WHEN SYSRC(GE 8) END
00040 SCRATCH &NAME..LOAD
00050 LINK &NAME. LIB('CNA324.DMC.TCS','CNA324.DMC.AGII') FORTLIB
00060 WHEN SYSRC(GE 8) END
00070 SCRATCH &NAME..OBJ
00080 CALL &NAME.(TEMPNAME)
00090 END
Invoking AGII.CLIST
EXEC AGII 'WILL' LIST
FORT WILL NOPRINT
G1 COMPILER ENTERED
SOURCE ANALYZED
PROGRAM NAME = MAIN
NO DIAGNOSTICS GENERATED
WHEN SYSRC(GE 8) END
SCRATCH WILL.LOAD
DATA SET WILL.LOAD NOT IN CATALOG
LINK WILL LIB('CNA324.DMC.TCS','CNA324.DMC.AGII') FORTLIB
WHEN SYSRC(GE 8) END
SCRATCH WILL.OBJ
CALL WILL(TEMPNAME)
END
Listing of CAL.CLIST
00010 PROC 1 NAME
00020 FORT &NAME. NOPRINT
00030 WHEN SYSRC(GE 8) END
00040 SCRATCH &NAME..LOAD
00050 LINK &NAME. LIB('CNF324.DMC.CALPREV.LOAD','CNF324.DMC.CALMIC.LOAD','CNA324.DMC.TCS') FORTLIB
00060 WHEN SYSRC(GE 8) END
00070 SCRATCH &NAME..OBJ
00080 CALL &NAME..LOAD(TEMPNAME)
00090 END
Invoking CAL.CLIST
EXEC CAL 'DEMO' LIST
FORT DEMO NOPRINT
G1 COMPILER ENTERED
SOURCE ANALYZED
PROGRAM NAME = MAIN
001 DIAGNOSTICS GENERATED. HIGHEST SEVERITY CODE IS 4
WHEN SYSRC(GE 8) END
SCRATCH DEMO.LOAD
LINK DEMO LIB('CNF324.DMC.CALPREV.LOAD','CNF324.DMC.CALMIC.LOAD','CNA324.DMC.TCS') FORTLIB
WHEN SYSRC(GE 8) END
SCRATCH DEMO.OBJ
CALL DEMO.LOAD(TEMPNAME)
-------
AUTOMATED LABORATORY MANAGEMENT SYSTEM IN REGION V
By David Rockwell
INTRODUCTION
The Laboratory Data Management System (LDMS)
was developed to process data resulting from compliance
monitoring requirements of EPA legislation. Our objective
was to complete a compliance monitoring report on
each survey within 30-35 days after the completion of
the survey field study.
On the average, each compliance monitoring study
takes three days. Data are collected from a specific
station for a 24-hour period. The types of samples are
"grab samples"; twelve grab samples are integrated over
a 24-hour period.
In Region V these compliance monitoring studies
are performed by four district offices (DO) located in
different states. Each DO collects, analyzes, and transmits
water samples to the Central Regional Laboratory
(CRL) in Chicago.
CHARACTERISTICS OF LDMS
This project provides a beginning stage in the quality
control of data going into the Storage and Retrieval
(STORET) system. One advantage of LDMS is rapid
exchange of analytical data between the CRL and DO.
This capability is very useful in emergency situations. It
also allows ready reproduction of data values for reports
prior to storage of the data in STORET as well as a
simpler means for correcting data values. All data from a
survey are organized and available in one place at all
times during the study period. This data availability
assists laboratory managers, chemists, field engineers,
and ADP personnel in performing their jobs more effectively.
Quality control of data values is provided by
means of the SIDES and QUALITY CONTROL
programs. Better quality assurance of data handling by
knowledgeable DO and CRL personnel is possible, as
contrasted with the keypunch approach used previously.
With the LDMS, CRL has experienced fewer reanalysis
problems (30 to 40 percent fewer samples must be analyzed)
in studies because sample descriptions are now
always complete. The CRL can better schedule and plan
its incoming work load when studies are identified in
LDMS.
The disadvantages of the system are the heavy de-
pendence on a central computing facility, the need to
train an uninformed user community in a new skill, the
use of computer terminals and computers for data entry,
and the strain on the organization to run two data handling
systems with the same personnel during development
and implementation of the LDMS.
Common to all automated systems is the problem
of inputting the data. This problem has not been satisfactorily
solved in spite of the use of LOAD and GO
programs designed to reduce the need for advanced ADP
skills.
DATA FLOW
In carrying out a survey, the DO engineer plans a study
to monitor a discharger. He selects station sites,
STORET water quality parameters, and CRL log
numbers prior to the survey. In LDMS, this information is
given to the DO ADP staff and is input into the LABORATORY
program.* Figure 1 shows the programs and
procedures which comprise LDMS; Figure 2 shows the
overall steps in the process of entering data.
LABORATORY is run and produces LABEL. The
DO ADP staff enters the LABEL name in the DO LABORATORY
DATA CONTROL SYSTEM† (LDCS) data
set (see Figure 3). The important columns comprising
Figure 3 are numbered, and the columns can be identified
as follows:
(1) LABEL data set name.
(2) Disk volume number on Optimum Systems
Incorporated where LABEL data set name is found
(3) Compliance monitoring study area description
*Programs LABORATORY, INTERFACE, STORET, and the file LABEL were created by Mr. Jon Abraytis, Senior Programmer, Region V,
EPA Data Systems Branch, Management Division.
†This LDCS was created by Richard Shekell, Indiana District Office, Region V, EPA.
-------
(4) Date when DO completes their data entry
(5) Date when CRL completes their data entry
(6) Date when data are processed through SIDES
(7) Date when data are stored in STORET water
quality file
(8) Date when storage of data in STORET is
verified, after which LABEL and related data sets are
scratched from OSI disks.
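The eight LDCS columns amount to a status record for each LABEL data set. A minimal sketch in Python of such a record follows, with a helper that reports the next outstanding step. The field names, the date strings, and the example values are hypothetical; the actual LDCS is a listing maintained on disk, not a program structure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LdcsEntry:
    """One LDCS line; columns (1) to (8) in the order given in the text."""
    label_name: str                         # (1) LABEL data set name
    disk_volume: str                        # (2) disk volume holding LABEL
    study_area: str                         # (3) study area description
    do_entry_done: Optional[str] = None     # (4) date DO completes data entry
    crl_entry_done: Optional[str] = None    # (5) date CRL completes data entry
    sides_processed: Optional[str] = None   # (6) date processed through SIDES
    storet_stored: Optional[str] = None     # (7) date stored in STORET
    storet_verified: Optional[str] = None   # (8) date storage is verified

    def next_step(self) -> str:
        """Return the first step that has no completion date yet."""
        steps = [
            (self.do_entry_done, "DO data entry"),
            (self.crl_entry_done, "CRL data entry"),
            (self.sides_processed, "SIDES processing"),
            (self.storet_stored, "storage in STORET"),
            (self.storet_verified, "verification of STORET storage"),
        ]
        for done, description in steps:
            if done is None:
                return description
        return "scratch LABEL and related data sets"


if __name__ == "__main__":
    entry = LdcsEntry("EXAMPLE.LABEL", "VOL001", "example study area")
    entry.do_entry_done = "1974-09-04"  # hypothetical date
    print(entry.next_step())
```

A record of this kind lets the CRL see at a glance which studies are still awaiting its own data entry.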
This LDCS is used by the CRL to retrieve the various
LABEL data sets indicated on Figure 3 in order to plan
work schedules for anticipated, incoming field studies.
Each field study is run by the DO field engineer.
The DO laboratory analyses of time-dependent parameters
(e.g., BOD and phenols), together with field
parameters (e.g., pH and temperature), are entered into
LABEL by the DO ADP staff. Figure 4 shows the
LABEL data set after the parameter values have been
entered. The various parts of this data set are identified
as follows:
(1) Number of parameters in LABEL
(2) Number of CRL log numbers assigned to study
(3) STORET agency code of stations used
(4) STORET station types code description
(5) Expected study date
(6) Expected sample arrival date at CRL
(7) Expected analysis due date
(8) Compliance monitoring study area description
(9) CRL log numbers assigned to study
(10) STORET station numbers
(11) STORET sample time and type descriptions in
STORET code
(12) Sample site descriptions
(13) STORET parameters grouped in sets of ten
(14) Data values from study analyses.
LABEL can handle a maximum of 60 water quality
parameters and 100 CRL log numbers. This data set uses
a maximum of eight tracks of 3300 disk space. Region V
LABEL data sets can usually be stored on two tracks.
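The stated limits, and the grouping of parameters in sets of ten, can be sketched in Python as follows. This is an illustration under the limits quoted in the text (60 parameters, 100 log numbers, groups of ten); the function names and the example parameter codes are hypothetical, and the sketch does not reproduce the actual layout of a LABEL data set.

```python
MAX_PARAMETERS = 60    # maximum water quality parameters per LABEL
MAX_LOG_NUMBERS = 100  # maximum CRL log numbers per LABEL
GROUP_SIZE = 10        # parameters are listed in sets of ten


def group_parameters(codes):
    """Split parameter codes into consecutive sets of ten."""
    if len(codes) > MAX_PARAMETERS:
        raise ValueError(f"LABEL holds at most {MAX_PARAMETERS} parameters")
    return [codes[i:i + GROUP_SIZE] for i in range(0, len(codes), GROUP_SIZE)]


def check_log_numbers(log_numbers):
    """Reject a study that assigns more log numbers than LABEL can hold."""
    if len(log_numbers) > MAX_LOG_NUMBERS:
        raise ValueError(f"LABEL holds at most {MAX_LOG_NUMBERS} log numbers")
    return len(log_numbers)


if __name__ == "__main__":
    codes = [f"P{n:02d}" for n in range(25)]  # hypothetical parameter codes
    for group in group_parameters(codes):
        print(len(group), group[0], "...", group[-1])
```

A study with 25 parameters therefore occupies three parameter groups, the last one partly filled.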
After the DO ADP staff has entered a date for completion
of its work in the LDCS, the major responsibility
for inputting data into LABEL is passed to the CRL
ADP staff. The CRL analyzes the samples for metals
(e.g., mercury and lead), organics (e.g., pesticides), and
inorganics (e.g., nutrients). The results of these analyses
can be entered on blank copies of the LABEL and given
to CRL ADP staff for entry into LABEL. After entry is
complete, the CRL ADP staff enters a date in the LDCS
under the CRL data completion column.
Data in a complete LABEL data set are then ready
for the editing and quality assurance programs SIDES
and QUALITY CONTROL (QC)-SCREEN, which are executed
consecutively. SIDES flags any format errors so that
sample identification codes and formats can be corrected
(see Figure 2). The QC program rejects physically impossible
data, such as a dissolved value exceeding the total value
for any parameter, and prevents their being stored in
STORET. It also flags improbable data for the CRL quality
assurance officer and DO field engineer to investigate.
Positive action must be taken to prevent storage of
"improbable" data into STORET.
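The two kinds of QC check described above, rejection of physically impossible results and flagging of improbable ones, can be sketched in Python. This is a minimal illustration, not the QC-SCREEN program: the function name, the parameter names, and the example range for pH are all hypothetical.

```python
def qc_screen(results, dissolved_total_pairs, probable_ranges):
    """Return (rejected, flagged) for one sample.

    results: parameter name -> measured value
    dissolved_total_pairs: (dissolved name, total name) tuples to compare
    probable_ranges: parameter name -> (low, high) limits of probable values
    """
    rejected = []
    for dissolved, total in dissolved_total_pairs:
        if dissolved in results and total in results:
            # A dissolved fraction cannot exceed the total for a parameter.
            if results[dissolved] > results[total]:
                rejected.append((dissolved, total))

    flagged = []
    for name, (low, high) in probable_ranges.items():
        # Values outside the range are improbable, not impossible, so they
        # are only flagged for the QA officer and field engineer to review.
        if name in results and not low <= results[name] <= high:
            flagged.append(name)
    return rejected, flagged


if __name__ == "__main__":
    sample = {"Cd dissolved": 12.0, "Cd total": 10.0, "pH": 11.2}
    pairs = [("Cd dissolved", "Cd total")]
    ranges = {"pH": (5.0, 10.0)}  # hypothetical limits of probable values
    print(qc_screen(sample, pairs, ranges))
```

Keeping the two outcomes separate reflects the text: impossible values never reach STORET, while improbable values reach it unless someone takes positive action.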
The LDMS system is replacing a manual system for
filling out the Region V data form. Page one of this form
(Figure 5) contains study and sample identification
information. Page two (Figure 6) is the first of 17 forms
for data value entry.
ACKNOWLEDGMENTS
This report is based on the interim results of a development
team effort. The active persons involved
are too numerous to mention; however, the
following contributed significantly to the present
system: Mr. Jon Abraytis, Mr. Dave Barrow, Dr. Billy
Fairless, Mr. Jim Gangler, Dr. Wayne Ott, and Mr.
Richard Shekell.
-------
[Diagram residue. Legible box labels, in the order shown: START; PGM LABORATORY; LABEL FILE; PGM INTERFACE; TEST A; TEST B; PGM SIDES; QUALITY CONTROL; STORET. Output labels: EDIT LISTING; ERROR LISTING.]
[Legible annotations, in the order shown:]
User community conceives study
Creates LABEL file for input of laboratory data
Data values entered by user community
Reformats LABEL after entry of all data for use by SIDES
Input files for SIDES
Performs edit check on data and reformats for input into STORET water quality file
Input files for QUALITY CONTROL
Performs data checks for ranges of acceptable data, flagging improbable data
Stores data into STORET
Data in STORET water quality file
[Programming credits; footnote markers illegible:]
Programmed by Dave Barrow of Athens, Region IV, EPA
Programmed by Jim Gangler, Vitro Labs, Inc., Washington, D.C.
Programmed by Jon Abraytis, Chicago, Region V, EPA
Figure 1
Laboratory Data Management System
Structure Diagram
-------
[Flow chart residue, titled LABORATORY DATA MANAGEMENT SYSTEM. Legible box labels: START; Fill LABEL disk file with data in prescribed format; Submit edit run; Check edit listing; Errors? (yes/no); Correct errors; Resubmit edit run; Submit QC run; Check QC listing; Errors? (yes/no); Correct errors; Resubmit QC run; Submit S.S. run; Check S.S. run; Correct errors; Resubmit S.S. run; STORET water quality file.]
-------
[Figure residue: computer listing of a data set with numbered columns, largely illegible in the source. Legible fragments include column headings for TEST A and TEST B volumes, DATE STORED, and DATE VERIFY; study descriptions mentioning S.T.P. effluents, S.T.P. influents, industrial intakes, and water quality; and status entries such as DONE, SCRATCHED, and N.A.]
-------
[Figure residue: computer listing of a data set containing station entries, parameter headings, and data values, largely illegible in the source. Legible fragments include sample site descriptions such as COSHOCTON STP INFLUENT and COSHOCTON STP EFFLUENT, and parameter headings such as SULFATE, FLUORIDE (F,DISS, MG/L), ALUMINUM (AL,TOT), IRON (FE,TOT, UG/L), LEAD (01051, UG/L), and MERCURY (71900, HG,TOTAL, UG/L).]
-------
[Form residue: FY 1975 District Office data form, page one; form dated 8/1/74. Legible header fields: Sampling Date; Lab. Arrival Date; Analysis Due Date; Account No.; Study; Sample Description (followed by a list of coded sample descriptions, partly illegible). Legible instruction: mark (A) for samples with concentrations above the maximum limits, mark ( ) for concentrations below maximum, mark (D) for District Office analysis. Legible column headings: CRL Sample Log Number (preprinted 75-); Agency No.; Station No.; Grab Sample Collection Date or Beginning Composite Time; Depth; Type of Composite (a = average, c = continuous, h = maximum, g = grab, l = minimum, blank = entire composite).]
-------
[Form residue: FY 1975 District Office data form, page two. Legible header fields: Sampling Date; Lab. Arrival Date; Account No.; Analysis Due Date; Study. Legible instruction: mark (A) for samples with concentrations above the maximum limits, mark ( ) for concentrations below maximum, mark (D) for District Office analysis. Legible row headings: CRL Sample Log Number (preprinted 75); Sample Description; Maximum Concentration Limits; Units; Totals. Legible parameter columns:]
STORET code    Parameter            Units
00010          Water Temp           C
00020          Air Temp             C
00300          Dissolved Oxygen     mg/l
(illegible)    Field pH             pH units
50060          Residual Chlorine    mg/l
50064          Free Chlorine        mg/l
00058          Flow                 gal/min
-------
AUTOMATIC DATA PROCESSING AND REGIONAL LABORATORY QUALITY ASSURANCE
By Bill Fairless, Ph.D.
The Central Regional Laboratory (CRL), located in
Chicago, Illinois, and providing Federal laboratory support
for the states of Ohio, Indiana, Illinois, Michigan,
Wisconsin, and Minnesota, was formed in February of
1973 with eleven positions. During the past 20 months,
the staff has been expanded to 37 permanent, temporary,
and Intergovernmental Personnel Act employees.
Figure 1 shows the current organizational chart. The first
branch established in our expansion was the Quality
Assurance Branch, and the first position filled was Chief
of that Branch. A CRL quality assurance program was,
therefore, implemented approximately one year ago and
has evolved continually since that time. It has resulted in
a substantial improvement in the accuracy of the analyti-
cal results reported.
It is believed that the Chicago CRL organizational
structure formally separating the quality assurance func-
tion from all other analytical functions is a major
strength. The CRL has a Quality Assurance Branch
which reports directly to the director of the laboratory
and therefore, provides an independent evaluation of all
sampling, analytical, and data reduction and reporting
procedures. The branch reviews all data to be reported,
has access to all bench sheets, instrument log books,
quality assurance raw data and data summaries, and
approves all analytical methods and laboratory
techniques.
Cooperative efforts between the Quality Assurance,
Biology and Chemistry Branches have resulted in devel-
opment of the quality assurance data summary which
would interest an ADP group as a problem needing an
automatic data processing solution. Page 1 of the
summary is shown in Figure 2.
The first item specifies a parameter and its
STORET number. It was thought that an analytical
method could be associated with each STORET number;
but since that is not the case, item 2 is included as a
reference only. Details of each procedure are given later
in the summary. Item 3 is included to differentiate
between older, and presumably more reliable, analytical
methods and the recently implemented procedures.
Item 4 identifies the dates enclosing the time period of
data collection for this summary, and item 5 lists all
instruments used to collect the data with the location of
the instrument log book and the individual responsible
for assuring the proper operation of that instrument.
Item 6 describes the concentration range which would
be used to report data resulting from the procedure
referenced in item 2 and other results included in this
summary on a routine basis. Data points outside the
working concentration range are either not reported,
reported with a qualification (do not place in
STORET - explain why), or the sample is reanalyzed
after appropriate experimental modifications. Item 7
gives a numerical detection limit which is defined on
page 3 for each summary. Item 8 describes the control
limits used to flag all suspect data, and item 9 presents
the precision. Item 10 is included only after analysis of
reference samples that have been prepared by someone
outside the administrative group completing the sum-
mary. Most often these samples are obtained from
Methods Development and Quality Assurance Research
Laboratory (MDQARL) or Research Triangle Park
(RTP). Items 11 and 12 are self-explanatory.
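The treatment of results relative to the working concentration range (item 6) and the detection limit (item 7) can be sketched in Python. The limits used in the example are those of the cadmium summary in Figure 5 (detection limit 5 ug/l, working range 10 to 500 ug/l). The function name, the category wording, and the separate category for values below the detection limit are assumptions made for illustration, not CRL procedure.

```python
def classify_result(value, detection_limit, working_low, working_high):
    """Classify one result against the limits on the QA data summary."""
    if value < detection_limit:
        # Assumed category: the text defines a detection limit but does not
        # say how values below it are worded in a report.
        return "below detection limit"
    if working_low <= value <= working_high:
        return "report"
    # Outside the working range the text lists three options.
    return "outside working range: withhold, qualify, or reanalyze"


if __name__ == "__main__":
    for value in (3.0, 7.0, 100.0, 750.0):
        print(value, classify_result(value, 5.0, 10.0, 500.0))
```

A value of 7 ug/l lies above the detection limit but below the working range, so it falls in the third category rather than being reported routinely.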
Page 2 of the summary (Figure 3) is a tabular
summary of the data used to arrive at answers for the
items described previously and listed on page 1 of the
form. The usual statistical parameters are included on
the left and the chemist is required to fill in the top
(starred) of each column. A slightly different form
(Figure 4) is used for duplicates.
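The statistical parameters named on the summary forms can be sketched in Python with the standard library. This is an illustration under stated assumptions: the variance is the sample variance with n - 1 in the denominator, the 95 percent range is approximated as the mean plus or minus two standard deviations, and the standard deviation from duplicates uses the common formula sqrt(sum(d^2) / 2n). The forms themselves do not state which formulas the CRL used, and the data values are invented for the example.

```python
import math
import statistics


def summarize(values, true_value):
    """Compute the main entries of the statistical summary for one column."""
    mean = statistics.mean(values)
    variance = statistics.variance(values)  # sample variance (n - 1)
    std_dev = math.sqrt(variance)
    return {
        "n": len(values),
        "mean": mean,
        "bias": mean - true_value,
        "variance": variance,
        "std_dev": std_dev,
        # Assumed approximation of the 95 percent range: mean +/- 2 s.
        "range_95": (mean - 2 * std_dev, mean + 2 * std_dev),
        "relative_std_dev_pct": 100 * std_dev / mean,
    }


def std_dev_from_duplicates(pairs):
    """Estimate the standard deviation of single values from duplicate pairs."""
    differences = [first - second for first, second in pairs]
    return math.sqrt(sum(d * d for d in differences) / (2 * len(differences)))


if __name__ == "__main__":
    control = [98.0, 102.0, 100.0, 104.0, 96.0]  # invented control results
    print(summarize(control, true_value=100.0))
    print(std_dev_from_duplicates([(10.0, 12.0), (20.0, 18.0)]))
```

The duplicate formula needs no true value, which is why a separate form is practical for duplicate determinations on ordinary samples.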
A quality assurance data summary is prepared for
each unique analytical measurement being made at the
CRL for which sufficient data are available. Figure 5 is a
typical example describing the analysis of cadmium by
flame atomic absorption spectroscopy. Figure 6 shows
the supporting data for cadmium, and Figure 7 is an
example of the terms that should be defined in order to
use the numerical data presented in Figures 6 and 7.
Figures 8, 9 and 10 show data for a "high level"
nitrate-nitrite analysis, while Figures 11, 12, and 13 pertain
to a "low level" procedure. The CRL staff has also
designed equipment and procedures to analyze for even
lower concentrations of nitrate in the samples collected
from the Great Lakes.
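One of the control limits listed on the cadmium summary (Figure 5) is a spike recovery of 91 to 109 percent. A minimal sketch in Python of that check follows, assuming the usual definition of recovery as the increase in the measured result divided by the amount added. The function names and the sample values are hypothetical.

```python
def spike_recovery_percent(unspiked, spiked, amount_added):
    """Percent of the added material that the analysis recovered."""
    return 100.0 * (spiked - unspiked) / amount_added


def within_control_limits(recovery, low=91.0, high=109.0):
    """Compare a recovery with the 91-109 percent limits quoted for cadmium."""
    return low <= recovery <= high


if __name__ == "__main__":
    # Hypothetical sample: 40 ug/l before spiking, 138 ug/l after adding 100 ug/l.
    recovery = spike_recovery_percent(40.0, 138.0, 100.0)
    print(recovery, within_control_limits(recovery))
```

Results whose recovery falls outside the limits would be the suspect data that item 8 of the summary is meant to flag.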
The staff performs a considerable number of arith-
metical calculations on finished data; still more calcula-
tions are required for raw data. There has not been
sufficient time to summarize such data as absorbance
-------
readings, extinction coefficients, instrument gain
settings, and so forth even though these variables are
frequently used by CRL analysts to evaluate quality
assurance data.
Hopefully these examples clearly illustrate the
kinds of internal quality assurance data that are now
collected and evaluated by hand. The data management
file of the CRL minicomputer system will include pro-
grams and data files to accomplish all of the above auto-
matically. The system is described in greater detail by
Roman Bystroff in another paper of these Proceedings.
It is expected that once these programs are operational,
better estimates of data quality will be provided. Such
analyses will be made available quickly to other regional
laboratories.
These kinds of efforts by the regional laboratories
place increasing responsibilities on the managers of the
national data bases to ensure the quality of all the data
being placed in each data base. It is questionable that a
continued effort to produce more accurate measure-
ments is valuable, if those measurements are only to be
lost in a data base containing values of all qualities. In
fact, it is predictable that many users will turn away
from the existing national data storage bases unless the
quality or integrity of the data is assured.
-------
[Diagram residue. Legible box labels: Director; Deputy; Quality Assurance; Biology; Chemistry; Organic; Metals; Inorganic.]
Figure 1
CRL Organizational Chart
-------
REGION V                                   Page 1
U.S. ENVIRONMENTAL PROTECTION AGENCY
CENTRAL REGIONAL LABORATORY
QUALITY ASSURANCE DATA SUMMARY
1. Parameter            STORET No.
2. Procedure used
Attachment Included /_/ Yes /_/ No
3. Procedure has been in use since
4. Data included in this summary was collected from      to
5. Instruments used to collect data / Location of instrument log book & individual responsible for instrument maintenance
a.
b.
c.
6. Concentration range reported      to      units
7. Detection Limit      units
8. Control Limits / Percent of data outside control limits
a.
b.
c.
d.
e.
f.
9. The precision is      at a concentration of
using      standard deviations as an estimate of precision.
10. Bias
11. Significant figures reported including correct STORET units
12. Signature      Date
Immediate Supervisor      Date
Figure 2
Quality Assurance Data Summary
-------
Pacje ."
STATISTICAL Stir.!iARY OF
QUALMV AS:,(J!:A.';CE nATA
1. No. Samples
2. Mo. Dt'Vonnn/Sple.
3. True Values
4. Mean Value D;-:tcnr.n.
5. Bias
6. Vtir iiiin c
7. Average of differences
0. Std. Deviation of
differences
9. Est. Std. Deviation
of Veil ues
10. 95''. Confidence Range
11. Relative Std. Dev.
12. WL Rel. Std. Dev.
Range
13. Data Collected from
to
•*
*
*
*
(a) *Blank - All reagents except sample
(b) Reference standard - An unknown supplied by an external audit agency
such as MDQARL
(c) Control standard - Sample of known concentration prepared by CRL
staff.
(d) Calibration standard - Sample used to adjust instruments.
(e) Spike - A measured amount of material added to a sample prior to
the analysis.
(f) Other
Figure 3
Statistical Summary of
Quality Assurance Data
-------
Page 2A
STATISTICAL SUMMARY OF
QUALITY ASSURANCE DATA
FOR DUPLICATE DETERMINATIONS
1. No. Samples
2. No. Determn/Sple.
3. True Values
4. Mean Value Determn.
5. Bias
6. Variance
7. Average of differences
8. Std. Deviation of
differences
9. Est. Std. Deviation
of Values
10. 95% Confidence Range
11. Relative Std. Dev.
12. 95% Rel. Std. Dev.
Range
13. Data Collected from
to
*Blank - All reagents except sample
Standard (a) reference - An unknown supplied by an external audit agency
such as MDQARL
(b) control - Sample of known concentration prepared by CRL
staff.
(c) calibration - Sample used to adjust instruments.
Spike - Amount of material added to a sample.
Other
Figure 4
Statistical Summary of Quality Assurance
Data for Duplicate Determination
-------
REGION V Page 1
U.S. ENVIRONMENTAL PROTECTION AGENCY
CENTRAL REGIONAL LABORATORY
QUALITY ASSURANCE DATA SUMMARY
1. Parameter tot. Cd STORET No. 01027
2. Procedure used Acid digestion and flame atomic absorption.
3. Procedure has been in use since February 1973
4. Data included in this summary was collected from 7/73 to 5/74
5. All instruments used to collect data    Location of instrument, log book & responsible individual
a. PE 306                                  R. Whitworth/Metals
b. IL 453                                  E. King/Metals
c.
d.
e.
6. Working concentration range 10 ug/l to 500 ug/l
7. Detection Limit 5 ug/l
8. Control Limits                          Percent of data outside control limits
a.
b. 91-109% Spike recovery
c. <10% Sample conc. in blank or
d. <10 ppb Cd
9. The precision is 9 ug/l at a concentration of 100 ug/l
using 2 standard deviations as the measure.
10. Bias
11. Units including correct No. Significant Figures 2 or 3 ug/l
12. Signature Date
Immediate Supervisor Ed Huff Date 9/20/74
Figure 5
Quality Assurance Data Summary-Analysis of
Cadmium by Flame Atomic Absorption Spectroscopy
-------
Page 2
STATISTICAL SUMMARY OF
QUALITY ASSURANCE DATA
FOR UNIVARIABLES
Cd
                                   * Spike       * Reference   * Reference   *
1. No. Samples                     111           1             1
2. No. Det/Sample                  1             1             1
3. True Value                      100%          16 ug/l       73 ug/l
4. Mean Value of Determinations    99%           13 ug/l       68 ug/l
5. Bias                            -1%           -3 ug/l       -5 ug/l
6. Variance                        19.4
7. Standard Deviation              4.4
8. 95% Confidence Range            91-109
9. Relative Std. Deviation         4.4
10. 95% Rel. Confidence Range      91-109
11. Data Collected from - to       7/73 - 5/74
*Blank - All reagents except sample
*Standard (a) reference - An unknown supplied by an external audit agency
such as MDQARL.
(b) control - Sample of known concentration prepared by CRL
staff.
(c) calibration - Sample used to adjust instruments.
*Spike - Amount of material added to a sample.
*Other
Figure 6
Supporting Data for Cadmium
-------
DEFINITIONS AND DETAILED DESCRIPTIONS
OF TERMS USED ON PREVIOUS PAGES
Include the following:
Test procedures, working concentration range, detection limit, control
limits, precision, accuracy or bias, actions usually taken when data is
outside control limits.
1. The test procedure is the standard EPA approved procedure given in "Methods
for Chemical Analysis of Water and Wastes," U.S. EPA, 1971, pages 83-120.
2. The working standards used for instrument calibration are obtained from
commercial sources and diluted to the desired concentration with distilled-
deionized water.
3. The working concentration range is established by the calibration standards.
It begins at a concentration above the detection limit and covers the
remaining linear or near linear portion of the absorbance vs concentration
curve.
4. The detection limit is defined as that concentration of metal that gives a
signal peak twice as large as the baseline noise. It is evaluated from
standards that are approximately ten times the detection limit using the
equation:
$$\text{Detection Limit} = 2 \times \left(\text{standard deviation of baseline noise}\right) \times \left(\frac{\text{concentration}}{\text{instrument readout}}\right)$$
5. Accuracy is estimated from the long term percent recoveries of spikes added
to different kinds of samples as follows:
$$\%\ \text{Recovery} = \frac{[\text{Sample} + \text{Spike}] - [\text{Sample}]}{[\text{Spike}]} \times 100$$
If the spike concentration is less than one half the sample concentration,
the results are not used for control purposes.
6. The apparent or real concentrations of reagent blanks are measured and sub-
tracted from the respective samples. If a blank is contaminated so that it con-
tains more than 10% of the sample concentration, the analyses are considered
to be out of control.
7. Significant figures are determined by the bench chemist and are based on
the day to day performance of the laboratory equipment.
8. All analyses that do not fall within the control limits are investigated
on an individual basis. Usually the entire series of samples is reanalyzed.
In all cases the action taken is recorded in the bench log book. All
reported data are supported by Quality Assurance results within the defined
control limits or which have been approved by the CRL Quality Assurance
Branch.
Figure 7
Definitions and Detailed
Description of Terms
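The detection limit, spike recovery, and blank rules in items 4 to 6 of Figure 7 can be checked with a few lines of code. The following is a minimal Python sketch, not part of the CRL procedure; the function names and the example readings are illustrative assumptions.

```python
def detection_limit(noise_sd, std_concentration, std_readout):
    """Item 4: twice the baseline-noise standard deviation, converted to
    concentration units using the response of a low standard."""
    return 2.0 * noise_sd * (std_concentration / std_readout)


def percent_recovery(spiked_result, sample_result, spike_added):
    """Item 5: ([Sample + Spike] - [Sample]) / [Spike] x 100."""
    return (spiked_result - sample_result) / spike_added * 100.0


def spike_usable(sample_result, spike_added):
    """Item 5: a spike smaller than half the sample concentration is not
    used for control purposes."""
    return spike_added >= 0.5 * sample_result


def blank_in_control(blank_result, sample_result):
    """Item 6: a blank above 10% of the sample concentration is out of control."""
    return blank_result <= 0.10 * sample_result


# Hypothetical readings (concentrations in ug/l, readout in instrument units).
dl = detection_limit(noise_sd=0.5, std_concentration=50.0, std_readout=10.0)
rec = percent_recovery(spiked_result=148.0, sample_result=50.0, spike_added=100.0)
print(dl, rec)
```

A recovery computed this way would then be compared with the spike-recovery control limits reported on the data summary form.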
-------
REGION V Page 1
U.S. ENVIRONMENTAL PROTECTION AGENCY
CENTRAL REGIONAL LABORATORY
QUALITY ASSURANCE DATA SUMMARY
(high level)
1. Parameter (NO2+NO3)-N STORET No. 00630
2. Procedure used Technicon Industrial Method 158-71
modified to account for H2SO4 preservation
3. Procedure has been in use since 7/1/73
4. Data included in this summary was collected from 7/1/73 to 8/12/74
5. All instruments used
to collect data
a. Technicon AA II
Location of instrument
Log book & responsible individual
Nutrient Lab. - A. Jirka
d.
e.
Fisher Diluter
Balance #104331
" " " "
Nutrient Lab. - M. Carter
6. Working concentration range 0.03 to 5.00
7. Detection Limit 0.03
Percent of data outside
control limits
3%
8. Control Limits 3 std. dev.
a. Blk: 0.00 - 0.04
b. 1.00: 0.93 - 1.11
c. 3.00: 2.85 - 3.09
d. 5.00: 4.84 - 5.14
recovery 90-108%
9. The precision is ±0.12 at a concentration of 3.00 mg/l
12%
9%
-7T
using
standard deviations as the measure.
10. Bias -0.01 average
11. Units including correct No. Significant Figures X.XX MG/L
12. Signature Andrea Jirka Date 9/4/74
Immediate Supervisor Mark J. Carter Date 9/18/74
Figure 8
Data Summary, High-Level Nitrate-Nitrite Analysis
-------
STATISTICAL SUMMARY OF
QUALITY ASSURANCE DATA
FOR UNIVARIABLES
Page 2
                                   * Control     * C. Std.     * C. Std.     * C. Std.
                                   Std. Blank    1.00 mg/l     3.00 mg/l     5.00 mg/l
1. No. Samples                     58            75            40            57
2. No. Det/Sample                  1             1             1             1
3. True Value                      0.00          1.00          3.00          5.00
4. Mean Value of Determinations    0.01          1.02          2.97          4.98
5. Bias                            +0.01         +0.02         -0.03         -0.02
6. Variance                        0.00015       0.00096       0.0037        0.0051
7. Standard Deviation              0.012         0.031         0.061         0.071
8. 95% Confidence Range            0-0.003       0.95-1.08     2.85-3.09     4.84-5.12
9. Relative Std. Deviation                       3.1%          3.0%          1.4%
10. 95% Rel. Confidence Range                    95 - 108%     95 - 103%     97 - 102%
11. Data Collected from            7/1/73        7/1/73        (illegible)   7/1/74
    to                             8/13/74       8/13/74       8/13/74       8/13/74
*Blank - All reagents except sample
Standard (a) reference - An unknown supplied by an external audit agency
such as MDQARL.
(b) control - Sample of known concentration prepared by CRL
staff.
(c) calibration - Sample used to adjust Instruments.
* Spike - Amount of material added to a sample.
* Other
Figure 9
Data Summary, High-Level
Nitrate-Nitrite Analysis
-------
STATISTICAL SUMMARY OF
QUALITY ASSURANCE DATA
FOR UNIVARIABLES
Page 2
                                   * C. Std.      * Spikes      * Std. Cal.   * Duplicates
                                   2.5
1. No. Samples                     25             47            26            22
2. No. Det/Sample                  1              1             1             2
3. True Value                      2.5            100%          3.40
4. Mean Value of Determinations    2.48           99%
5. Bias                            -0.02          -1%
6. Variance                        0.0014         9.9%
7. Standard Deviation              0.038          3.1%
8. 95% Confidence Range            2.40 to 2.56   93 - 105%
9. Relative Std. Deviation         1.5%
10. 95% Rel. Confidence Range      96 to 102%
11. Data Collected from            7/1/73         7/1/73
    to                             1/11/74        8/13/74
Std. Cal.: remaining entries illegible (3.40 ± ...).
Duplicates: Ave. of difference is 0.01 & Std. dev. is 0.016
*Blank - All reagents except sample
Standard (a) reference - An unknown supplied by an external audit agency
such as MDQARL.
(b) control - Sample of known concentration prepared by CRL
staff.
(c) calibration - Sample used to adjust instruments.
* Spike - Amount of material added to a sample.
* Other
Figure 10
Data Summary, High-Level
Nitrate-Nitrite Analysis
-------
REGION V Page 1
U.S. ENVIRONMENTAL PROTECTION AGENCY
CENTRAL REGIONAL LABORATORY
QUALITY ASSURANCE DATA SUMMARY
(low level)
1. Parameter (NO2 + NO3)-N STORET No. 00630
2. Procedure used Technicon Industrial Method No. 158-71W with or without
modification to account for H2SO4 preservation.
3. Procedure has been in use since 7/1/74
4. Data Included in this summary was collected from 7/1/74 to 8/13/74
5. All instruments used
to collect data
a. Technicon AA-II
c.
d.
e.
Location of instrument
Log book & responsible individual
Nutrient Lab. - A. Jirka
Balance #104331
Nutrient Lab. - M. Carter
6. Working concentration range 0.002 to 1.000
7. Detection Limit 0.002
8. Control Limits 3 std. dev.
a. Blk: 0.000 - 0.003
b. 0.2: 0.195 - 0.205
c. 1.0: 0.979 - 1.013
d. % Rec: 88% -
Percent of data outside
control limits
0
10%
9. The precision is ±0.003 at a concentration of 0.200
using 2 standard deviations as the measure.
10. Bias -0.002 average
11. Units including correct No. Significant Figures 0.XXX mg/l
12. Signature Andrea Jirka Date 9/5/74
Immediate Supervisor Mark J. Carter Date 9/18/74
Figure 11
Data Summary, Low-Level
Nitrate-Nitrite Analysis
-------
STATISTICAL SUMMARY OF
QUALITY ASSURANCE DATA
FOR UNIVARIABLES
Page 2
                                   * C           * C           * C           * C
                                   Blk           0.200         0.500         0.600
1. No. Samples                     18            13            6             8
2. No. Det/Sample                  1             1             1             1
3. True Value                      0.000         0.200         0.500         0.600
4. Mean Value of Determinations    0.001         0.200         0.500         0.593
5. Bias                            0.001         0.000         0.000         -0.007
6. Variance                        0.00000047    0.0000026     0.0000002     0.000038
7. Standard Deviation              0.00068       0.0016        0.00048       0.0062
8. 95% Confidence Range            0.000-0.002   0.197-0.203   0.499-0.500   0.581-0.605
9. Relative Std. Deviation                       0.8%          0.1%          1%
10. 95% Rel. Confidence Range                    98.5-101.5%   99.8-100.2%   97 - 101%
11. Data Collected from            7/1/74        7/1/74        7/1/74        7/1/74
    to                             9/5/74        9/5/74        9/5/74        9/5/74
*Blank - All reagents except sample
Standard (a) reference - An unknown supplied by an external audit agency
such as MDQARL.
(b) control - Sample of known concentration prepared by CRL
staff.
(c) calibration - Sample used to adjust instruments.
*Spike - Amount of material added to a sample.
*Other
Figure 12
Data Summary, Low-Level
Nitrate-Nitrite Analysis
-------
STATISTICAL SUMMARY OF
QUALITY ASSURANCE DATA
FOR DUPLICATE DETERMINATIONS
Page 2A
                                   C. Std.       Std. Cal.     % Rec.        Duplicates
                                   1.000
1. No. Samples                     10            2             8             31
2. No. Determn/Sple.               1             1             1             2
3. True Values                     1.000                       100%
4. Mean Value Determn.             0.996         3.20          100%
5. Bias                            -0.004                      0
6. Variance                        0.000031                    15
7. Average of differences                                                    0.002
8. Std. Deviation of differences                                             0.0026
9. Est. Std. Deviation of Values   0.0056                      3.9%
10. 95% Confidence Range           0.985-1.007                 92-108%
11. Relative Std. Dev.             0.6%
12. 95% Rel. Std. Dev. Range       98-101%
13. Data Collected from            7/1/74                      7/1/74
    to                             9/5/74                      9/5/74
*Blank - All reagents except sample
Standard (a) reference - An unknown supplied by an external audit agency
such as MDQARL
(b) control - Sample of known concentration prepared by CRL
staff.
(c) calibration - Sample used to adjust instruments.
Spike - Amount of material added to a sample.
Other
Figure 13
Data Summary, Low-Level
Nitrate-Nitrite Analysis
-------
SAMPLE TRACKING DATA MANAGEMENT SYSTEM
By G. C. Allison, M. J. Madsen, and R. N. Snelling*
INTRODUCTION
A Sample Tracking Data Management System
(STDMS) is under development at the National Environ-
mental Research Center in Las Vegas, Nevada (NERC-
LV) to provide a capability for laboratory control
(sample tracking) as well as the management of
environmental surveillance data.
The NERC-LV is involved in a variety of envi-
ronmental surveillance projects. In support of these
efforts, Laboratory Operations receives on the order of
3,000 samples per month for specific stable chemical
and/or radiochemical analysis. Sample types include air
filters, gas samples, milk, water, edible and nonedible
vegetation, animal tissue, and soil. The specific analyses
performed vary with sample type and project objectives.
Table 1 depicts the major analysis "families" along with
typical monthly sample loads and expected processing
times. Anywhere from 1 to 27 specific analyses may be
required for a given sample. Expected processing times
range from 1 to 16 weeks. The indicated sample load
coupled with the wide range of analyses and processing
times results in approximately 7,500 sample-analysis
pairs in the system at any point in time.
SAMPLE AND DATA FLOW
The primary flow of data and samples within the
laboratory is depicted in Figure 1. This schematic is typ-
ical of an organization not having ADP capability. Four
functional activities are identified:
Project Management: the planning, direction,
and coordination of project activities and the
ultimate analysis and interpretation of data.
Field Operations: the collection of physical
samples and in situ data.
Sample Control: the receiving, recording, and
routing of incoming samples and data.
Laboratory Operations: the stable chemical
and radiochemical analysis of samples and the
subsequent generation of parametric data.
*Speaker
The progression of these activities is as follows. Project
Management establishes the protocol for Field Oper-
ations and Laboratory Operations sample analysis. Field
Operations collects samples and data and delivers them
to Sample Control. Sample Control serves essentially as
the central receiving and accounting point for all
incoming data and samples. The samples are routed to
Laboratory Operations for analysis, with analysis results
being reported back to Project Management for review
and interpretation.
AUTOMATIC DATA PROCESSING FUNCTION
The automatic data processing function is utilized
to integrate all these elements. A system which has been
operational at the NERC-LV for several years is shown
conceptually in Figure 2. The system provides basically
for the storage and retrieval of data from a direct access
master file. On the input side, sample identification and
collection data are entered through Sample Control
through the use of an "H" card (sample header infor-
mation) and an "L" card (location description
information). A unique six-digit sample number is
assigned to each physical sample and is used to identify
all data records for that sample.
A unique part of the system is that the laboratory
analytical programs are interfaced directly with the
master file for the purpose of retrieving needed sample
collection information (such as collection date for use in
radioactive decay calculations). Analytical result cards
("R" cards) are generated for direct reentry to the data
management system after review by Laboratory Opera-
tions.
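To illustrate why the analytical programs need the collection date from the master file, the sketch below corrects a counted activity back to the time of collection with the standard relation A0 = A * exp(lambda * t). This is a hypothetical Python example, not the NERC-LV code; the function name, the dates, and the round 8-day half-life are assumptions chosen for illustration.

```python
import math
from datetime import datetime


def decay_correct(activity, collected, counted, half_life_days):
    """Return the activity at collection time, given the activity measured
    at counting time and the nuclide half-life in days."""
    elapsed_days = (counted - collected).total_seconds() / 86400.0
    decay_constant = math.log(2.0) / half_life_days
    return activity * math.exp(decay_constant * elapsed_days)


# Hypothetical sample: collected 6/30/71 at 1200, counted eight days later.
collected = datetime(1971, 6, 30, 12, 0)
counted = datetime(1971, 7, 8, 12, 0)
corrected = decay_correct(100.0, collected, counted, half_life_days=8.0)
print(round(corrected, 1))
```

With one half-life elapsed between collection and counting, the corrected activity is twice the measured value.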
On the output side of the system, data summaries
may be generated directly from the master file. A typical
report is shown in Figure 3. In addition, working files
may be created for the purpose of utilizing special
purpose application programs such as dose models, plot-
ting, contouring, and statistical analysis. The weakness
of the system as shown in Figure 2 is that there is no
mechanism for ascertaining the analysis status of a given
sample other than its presence or absence in the master
file. With an average of 7,500 analyses in various stages
of completion at any point in time, this does not provide
-------
Table 1
NERC-LV Analysis Distribution, 1973
Analysis Family
Radiochemistry
Gamma (NaI)
Gamma (GeLi)
Gross alpha, beta
Strontium
Tritium
Tritium (enriched)
Carbon
Radium
Iron
Plutonium
Uranium
Thorium
Polonium
Americium
Noble gases
Radon
Iodine
Lead
Stable Chemistry
Ash
Major constituent
- Water
Nutrient Analysis
- Water
Selenium
Cadmium
Arsenic
Molybdenum
Vanadium
Mercury
Lead
Zinc
Thorium
Specific Analysis
All present
All present
-
89Sr, 90Sr
3H
3H
14C
226Ra, 228Ra
55Fe
238Pu, 239Pu
234U, 235U, 238U
228Th, 230Th, 232Th
210Po
241Am
Kr, Xe, Ar
222Rn
129I
210Pb
Ash Weight
Ca, Sr, Li, Na, K, Mn,
Mg, Al, Fe, Si, B, F,
Cl, SO4
Alk, P-diss, P-tot,
NH3-N, NO2 + NO3-N, TKN
Se
Cd
As
Mo
V
Hg
Pb
Zn
Th
Sample Load
(Number per
Month)
350
10
120
50
165
20
1
23
1
50
32
5
4
1
50
K
1
1
2
6
820
1
Expected Pro-
cessing Time
(Days)
7
14
21
45
21
60
90
60
60
90
90
90
90
90
30
7
30
120
45
21
21
21
21
21
21
21
21
21
21
21
-------
the degree of control or depth of information required
for the management of Laboratory Operations.
SAMPLE TRACKING
The sample tracking module is now being imple-
mented to provide a tool to management for evaluating
and controlling the Laboratory Operations function by
providing sample status information. This system is
shown in Figure 4. A status file has been added which
contains status statistics. Upon sample log-in, analysis to
be performed is defined either by default or through the
use of an "M" card. As completed analyses are received
("R" cards), the flags are turned off. A variety of status
reports may be obtained by an exception reporting tech-
nique. These reports are utilized by both laboratory and
project management for assessing sample status.
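The flag mechanism described above can be pictured with a small in-memory model. The following Python sketch is only an illustration under stated assumptions: the class and function names are hypothetical, and a set of family codes stands in for the flags kept on the status file.

```python
from dataclasses import dataclass, field


@dataclass
class StatusRecord:
    sample_number: str
    # Analysis families whose flags are still "on" (results not yet received).
    pending: set = field(default_factory=set)


def log_in(sample_number, families):
    """Sample log-in: flag every requested analysis family."""
    return StatusRecord(sample_number, set(families))


def post_result(record, family):
    """An "R" card arrives: turn off the flag for that family."""
    record.pending.discard(family)


def exception_report(records):
    """List only the samples that still have outstanding analyses."""
    return [(r.sample_number, sorted(r.pending)) for r in records if r.pending]


# Hypothetical sample numbers and family codes.
a = log_in("130557", ["GELI", "14C", "PO"])
b = log_in("130558", ["MG"])
post_result(a, "14C")
post_result(b, "MG")
report = exception_report([a, b])
print(report)
```

Sample 130558 drops out of the report once its only requested analysis is complete, which is the exception-reporting behavior the text describes.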
STDMS SYSTEM DESCRIPTION
Figure 5 shows a more detailed schematic of the
STDMS. The system is organized around two files, the
master file, containing all parametric and collection data
for each sample number, and the status file, containing
status related data for each sample number. A third file,
the location description file, is used to retrieve standard
narrative location descriptions and coordinates. Working
files may also be created for special applications, such as
statistical analysis, exposure modeling, and interfacing
with external data bases (STORET).
The system enters and updates data using four dif-
ferent card types (Figure 6). The "H" card is used to
initiate a sample number and generate a data record on
the master file and status file with each new sample.
Analyses to be performed and expected analysis times
are established by default according to project code and
sample type code. The analyses have been combined into
related groups called families.
The default family parameters may be overridden
through the use of the "M" card. The "M" card is used
to update the families requested and modify the report
due date for a given sample number on the status file.
For samples which were collected at locations not
on the location description file, an "L" card is used to
provide narrative location description information along
with the station coordinates. Card type "R" is used to
add results from a specific analysis family to the master
file and to update the status file by turning off the flag
for the completed analysis family.
The master file and status files are index sequential
files designed to permit direct access through the use of
multiple keys. The keys, defined at file creation time,
allow retrieval to be made according to user reporting
needs. The key definition generates levels of indices
which contain pointers to the appropriate data records.
Sample number, project code, and sample type are the
primary data elements utilized to form a key.
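The idea of several keys pointing at the same data records can be sketched with ordinary dictionaries. This Python example is an assumption-laden illustration, not the STDMS file design: list positions stand in for record addresses, and the field names and sample values are hypothetical.

```python
from collections import defaultdict

# Hypothetical records; the position in the list stands in for a disk address.
records = [
    {"sample_number": "130557", "project_code": "51", "sample_type": "3"},
    {"sample_number": "130558", "project_code": "51", "sample_type": "3"},
    {"sample_number": "129636", "project_code": "56", "sample_type": "1"},
]

# Primary index: sample number -> record position.
by_sample = {rec["sample_number"]: pos for pos, rec in enumerate(records)}

# Secondary index: (project code, sample type) -> positions of matching records.
by_project_and_type = defaultdict(list)
for pos, rec in enumerate(records):
    by_project_and_type[(rec["project_code"], rec["sample_type"])].append(pos)


def fetch_by_sample(sample_number):
    """Direct access to one record through the primary key."""
    return records[by_sample[sample_number]]


def fetch_by_project_and_type(project_code, sample_type):
    """Retrieve every record filed under a project and sample type."""
    positions = by_project_and_type.get((project_code, sample_type), [])
    return [records[pos] for pos in positions]


print(fetch_by_sample("129636")["project_code"])
print(len(fetch_by_project_and_type("51", "3")))
```

Each index holds only pointers, so a report organized by project can be produced without scanning every sample record.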
An array of constants is maintained defining the
expected analysis time for each family of analyses. Ex-
pected analysis time is defined as the amount of time in
days required to complete the longest analysis within a
family. When a status report is generated, the expected
analysis time is added to the log-in date and the resulting
date is then compared to the current date to determine
an overdue analysis. The calculated analysis due date
may be overridden with the report due date specified on
the "M" card. Thus the report due date can be set prior
to the calculated analysis due date.
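The due-date rule just described amounts to a date addition and a comparison. The Python sketch below is a hedged illustration: the family codes, the day counts in the table of constants, and the choice to treat a sample as overdue only after its due date has passed are all assumptions, not details confirmed by the STDMS description.

```python
from datetime import date, timedelta

# Hypothetical array of constants: expected analysis time in days per family.
EXPECTED_DAYS = {"GAMMA_NAI": 7, "STRONTIUM": 45}


def analysis_due(login, family):
    """Calculated analysis due date: log-in date plus expected analysis time."""
    return login + timedelta(days=EXPECTED_DAYS[family])


def effective_due(login, family, report_due=None):
    """A report due date from an "M" card overrides the calculated date."""
    return report_due if report_due is not None else analysis_due(login, family)


def is_overdue(login, family, today, report_due=None):
    """Assumed rule: overdue once the current date is past the due date."""
    return today > effective_due(login, family, report_due)


login = date(1974, 5, 20)
today = date(1974, 6, 1)
print(analysis_due(login, "GAMMA_NAI"))
print(is_overdue(login, "GAMMA_NAI", today))
print(is_overdue(login, "STRONTIUM", today))
print(is_overdue(login, "STRONTIUM", today, report_due=date(1974, 5, 25)))
```

The last call shows the override: a report due date earlier than the calculated date makes the sample overdue sooner.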
STATUS REPORTS
The analysis in process report (Figure 7) lists only
those samples with outstanding analyses to be com-
pleted. Samples are listed in order by sample number
within project, within analysis family. Samples which
have exceeded their expected completion date are indi-
cated with an asterisk. This format is intended for use by
the laboratory analysis personnel. The analysis in process
by program report lists samples which have analyses to
be performed within a project (Figure 8). Past due
samples are flagged. This report is designed for middle
management use. The status summary by project report
(Figure 9) is intended for project management. It sum-
marizes samples received, analyses requested, and status.
These reports allow the STDMS to provide NERC-LV
management the capability of tracking all samples
through their respective analyses to completion.
-------
[Figure 1. Flow diagram. Boxes: PROJECT MANAGEMENT; FIELD OPERATIONS; SAMPLE CONTROL; LAB OPERATIONS. Flow labels: direction, planning; samples, sample collection data; direction; samples.]
-------
[Figure 2. Flow diagram. Boxes: PROJECT MANAGEMENT; FIELD OPERATIONS; SAMPLE CONTROL; LAB OPERATIONS; Data analysis & presentation programs. Flow labels: direction, planning; samples, sample collection data; direction; samples.]
H = Header record (sample collection info.)
L = Location description record
R = Result record
S = Comment record
-------
SAMPLE REPORT (HYPOTHETICAL DATA)
PAGE 001
WASHINGTON REPORTED 09/10/71
CHEHALIS WASH - CONSOLIDATED DAIRY
000001 11 0005 DATE- 06 30 71 1200
SIZE- 3.50 L
CHEHALIS WASH - CONSOLIDATED DAIRY
000002 11 0005 DATE- 06 30 71 1200
SIZE- .400 L
CHEHALIS WASH - CONSOLIDATED DAIRY
000003 11 0005 DATE- 06 30 71 1200
SIZE- 3.50 L
CHEHALIS WASH - CONSOLIDATED DAIRY
000004 11 0005 DATE- 06 30 71 1200
SIZE- .400 L
CHEHALIS WASH - CONSOLIDATED DAIRY
000005 11 0005 DATE- 06 30 71 1200
SIZE- .400 L
CHEHALIS WASH - CONSOLIDATED DAIRY
000006 11 0005 DATE- 06 30 71 1200
SIZE- .400 L
90SR 131I
PCI/L PCI/L
NA 2.2E02
(4.4E01)
4.2E02 LT3E01
(1.5E01)
LT3E01 7.6E02
(3.9E01)
LT3E01 1.0E01
(1.0E01)
LT4E01 NO
NA NO
132TE 137CS
PCI/L PCI/L
7.5E02 6.1F01
(1.4E01) (l.OEOI
3.9E01 4.H! fn
(2.7E01) (l.ero?,
4.7E01 8.6^01
U.9E01) fl.6f.01J
1.1E01 7.9fO?
(1.0E01) (4.3E01)
NO NO
NO NO
Figure 3
Sample Report (Hypothetical Data)
-------
[Figure 4. Flow diagram. Boxes: PROJECT MANAGEMENT; FIELD OPERATIONS; SAMPLE CONTROL; LAB OPERATIONS; Data analysis & presentation programs. Flow labels: direction, planning; samples, collection data; direction.]
H = Header record (sample collection info.)
L = Location description record
R = Result record
S = Comment record
M = Analysis to be performed record
-------
[Figure 5. STDMS schematic. Labels: LOG IN FORM; CARD TYPES: H, M, L; RESULT CARDS R, S; ANALYSIS PROGRAMS; ANALYSIS DATA; LOCATION DESCRIPTION; MASTER AND STATUS FILE UPDATE PROGRAM; MASTER FILE; STATUS FILE; DATA RETRIEVAL PROGRAM; RETRIEVAL INPUT DATA; STATUS REPORT PROGRAM; STATUS REPORTS; WORKING FILE; DATA ANALYSIS & REPORT PROGRAMS; DATA REPORTS; STORET FORMATTER; STORET DATA CARDS.]
-------
[Sample control form; the card-column layout is not reproduced.
FRONT: SAMPLE IDENTIFICATION; ANALYSIS TO BE PERFORMED - REPEAT COL. 1-10; LOCATION DESCRIPTION - REPEAT COL. 1-10.
BACK: result entries.]
NOTE: ENTER RESULT IDENTIFIER LEFT JUSTIFIED
ENTER 'LT' IN COL 21-22 FOR LESS THAN VALUES
ANY OTHER NOTATION IN COL 21-22 WILL INDICATE
THAT RESULT FIELD IS TO BE READ AS A COMMENT
Figure 6
Sample Control Form
-------
[Figure 7. ANALYSIS IN PROCESS REPORT, PO ANALYSIS, PAGE 1. Columns: SAMPLE NO.; PROGRAM NO.; SAMPLE TYPE; LOGIN DATE; ANALYSIS DUE DATE; REPORT DUE DATE; NO. OF DAYS SINCE LOG IN. Tabular listing not reproduced.]
* INDICATES ANALYSIS IS OVERDUE
-------
[Figure 8. ANALYSIS IN PROCESS FOR PROGRAM 51, CURRENT DATE 6/1/74. Columns: SAMPLE NUMBER; ANALYSIS; SAMPLE TYPE; NO. OF DAYS SINCE LOG IN; ANALYSIS DUE DATE; REPORT DUE DATE. Tabular listing not reproduced.]
* INDICATES ANALYSIS IS OVERDUE
-------
[Figure 9. ANALYSIS STATUS REPORT FOR PROGRAM 51. Columns: SAMPLE NUMBER; SAMPLE TYPE; LOGIN DATE; NO. OF DAYS SINCE LOG IN; REPORT DUE DATE; one status column per analysis family. Status matrix not reproduced.]
I - INCOMPLETE ANALYSIS
C - COMPLETED ANALYSIS
O - OVERDUE ANALYSIS
. - NOT REQUESTED
-------
SIDES
By David Barrow
The STORET Input Data Editing System (SIDES)
was designed and built by several members of the
Region IV staff of the Southeast Environmental Re-
search Laboratory to overcome some difficulties in using
STORET. These difficulties are outlined in Figure 1.
Figure 2 gives a list of features that SIDES provides to
alleviate these difficulties.
Throughout the past two years, SIDES has proven
to be a useful vehicle for preparing data for entry into
STORET and for obtaining data printouts for technical
publications much sooner than they could have been
obtained after entry of the data into STORET.
Figure 3 gives a list of users and uses of the SIDES
system. In addition to its primary use for data entry, the
edit program of SIDES is being used as a data editing/
interface module in the transfer of Ohio River Valley
Sanitation Commission's (ORSANCO) water users' data
from ORSANCO's data files into STORET. The SIDES
edit program is also being used in the Office of Research
and Development's Interim Laboratory Data Manage-
ment System as the edit module and as an interface to
STORET.
Figure 4 shows the system block diagram for the
SIDES system. It also shows a program Loadings and
Statistics (LDSTAT) and its interface to the SIDES data
files. While program LDSTAT is not officially a part of
the SIDES system, it provides a sophisticated printout
capability for data prior to entry into STORET. Program
LDSTAT also may be used to print out data already
stored in STORET.
The SIDES system block diagram begins by
showing two card decks: the editing information deck
(EID) and the data decks. Figure 5 gives an example of
these two decks. The EID consists of a STORET agency
card, a YEAR card, a parameter (PAR) card for each
parameter that may have data, and a STATN card for
each station that may have data. The function of the
EID is simply to provide an easy reference for the edit
program. The edit program checks much of the contents
of the data decks against what was specified in the EID.
The second card deck, the data decks, consists of
one or more data decks. Each data deck in turn consists
of one parameter (PAR) card and any number of sample
identification cards, type B (SIDB) followed by
corresponding data cards (DTA). The PAR card specifies
the order and parameter number of the data punched
into the DTA cards that follow it.
The SIDB card contains the sample identification
information for each sample to be processed by SIDES.
The following information may be specified on the SIDB
card: STORET station number; the date and time the
sample was taken; the ending date-time and the various
composite description information for composite
samples; the depth at which the sample was taken; and
the percent from the right bank of the location at which
the sample was taken.
The DTA card gives the actual values measured for
each parameter specified on the PAR card. The DTA
card is linked to its SIDB card by the sample number
which must appear on both. The DTA card may be
continued on up to nine additional cards. At six param-
eter values/card, this gives a total of 60 parameters that
may be processed through the SIDES system at one
time.
If the user has data for more than 60 parameters,
he must separate the data into two or more groups of 60
or fewer parameters each and process the groups in
separate runs of the system. On the DTA card, a missing
parametric value is indicated by a blank field, and the
deletion of a value already in STORET is indicated by a
solitary D in the first position of the proper field.
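The 60-parameter limit (six values per card across one DTA card and nine continuations) and the blank and D field conventions can be expressed in a few lines. This Python sketch is illustrative only; the function names, the eight-character field width used in the examples, and the parameter numbering are assumptions rather than the SIDES card specification.

```python
VALUES_PER_CARD = 6
CARDS_PER_SAMPLE = 10  # one DTA card plus up to nine continuation cards
MAX_PARAMETERS_PER_RUN = VALUES_PER_CARD * CARDS_PER_SAMPLE


def split_into_runs(parameter_numbers, limit=MAX_PARAMETERS_PER_RUN):
    """Break a long parameter list into groups small enough for one run."""
    return [parameter_numbers[i:i + limit]
            for i in range(0, len(parameter_numbers), limit)]


def interpret_field(field):
    """Classify one DTA value field as missing, a deletion, or a value."""
    if not field.strip():
        return ("missing", None)
    if field[:1] == "D" and not field[1:].strip():
        return ("delete", None)
    return ("value", field.strip())


# Hypothetical list of 130 parameter numbers.
runs = split_into_runs(list(range(1, 131)))
print([len(run) for run in runs])
print(interpret_field("        "), interpret_field("D       "), interpret_field("  12.5  "))
```

Values are kept as text here because the source does not describe how SIDES converts them.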
The EID and Data Decks are processed through the
merge/edit (M/E) phase of the system during which
several functions are being performed. First the SIDB
and DTA cards are merged. Second, extensive editing is
performed upon the cards of the data decks. All cards in
both the EID and the data decks are printed on the M/E
printout. If any errors are detected during the editing
operation, appropriate error messages are printed out
with the guilty card on the M/E printout. Examples of
M/E printouts appear in Figures 6, 7, and 8.
In the third step of the operation, the data in the
data decks are reformatted and entered into the SIDES
disk files. All samples are entered into the file that goes
to the SIDES printout programs, but all samples having
errors are flagged. Only samples that passed all edit tests
-------
are entered into the file that goes to the STORET
storage program.
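The routing performed by the merge/edit phase can be summarized as follows: every sample reaches the printout file, and only error-free samples reach the storage file. The Python sketch below is a simplified illustration; the dictionary layout and the two edit checks shown (a missing DTA card and a station absent from the EID) are assumptions, since the source does not list the individual edit tests.

```python
def merge_and_route(sidb_cards, dta_cards, known_stations):
    """Merge SIDB and DTA cards on sample number, edit them, and route them."""
    dta_by_sample = {card["sample"]: card for card in dta_cards}
    printout_file, storage_file = [], []
    for sidb in sidb_cards:
        errors = []
        dta = dta_by_sample.get(sidb["sample"])
        if dta is None:
            errors.append("no DTA card for sample")
        if sidb["station"] not in known_stations:
            errors.append("station not in EID")
        merged = {"sidb": sidb, "dta": dta, "errors": errors}
        printout_file.append(merged)      # flagged samples are still printed
        if not errors:
            storage_file.append(merged)   # only clean samples go to storage
    return printout_file, storage_file


# Hypothetical cards.
sidb_cards = [{"sample": "001", "station": "A1"},
              {"sample": "002", "station": "ZZ"}]
dta_cards = [{"sample": "001", "values": ["7.2"]},
             {"sample": "002", "values": ["6.9"]}]
printout, storage = merge_and_route(sidb_cards, dta_cards, known_stations={"A1"})
print(len(printout), len(storage))
```

Keeping rejected samples in the printout file is what lets the original order and station order printouts show the data exactly as punched.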
Once the M/E phase has been completed, the user
has many options. If errors were detected during the
M/E phase, he may decide to correct the errors and
rerun the M/E phase. He may decide to store the good
data, correct the erroneous data and rerun it. He may
decide to obtain an original order or a station order
printout of the data to further assist him in validating
the data; or he may decide to get a LDSTAT printout to
obtain statistics, loadings, or a printout to go into a
technical report.
An example of the original order printout appears
in Figures 9 and 10. An example of the station order
printout appears in Figures 11 and 12. In the original
order and station order printouts, the parameter values
are printed exactly as they were punched onto the DTA
cards. This is to facilitate checking the data. Note that
even the samples that were rejected in the M/E printouts
are included. The original order printout greatly facil-
itates proofreading the data from the original input
documents. The station order printout enables chemistry
personnel to easily discover bad values that may have
occurred during transcription.
An example of the LDSTAT printout is given in
Figures 13 and 14. In the LDSTAT printout, only
samples that have passed all edit tests are printed. The
parameter values are printed with decimal points aligned
as dictated by the STORET decimal point specification
for each parameter. The statistics are evaluated for one
station at a time. Note that the remarks K and L are
included in the minimum and maximum values. Sta-
tistics are not printed for parameters having only one
value. Some parameters are not summarized at all, e.g.,
SAMPLE NUMBER.
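The statistics named on the LDSTAT printout (NUMBER, MAXIMUM, MINIMUM, SUM, SUM SQ., MEAN, VARIANCE, STD.DEV., STD.ERR., COEF VAR, LOG MEAN) can be sketched as below. This is not the LDSTAT code itself: the n - 1 divisor for the variance and the reading of LOG MEAN as a geometric mean are assumptions, and the handling of K and L remarks is left out.

```python
import math


def ldstat_summary(values):
    """Summary statistics for one parameter at one station (a sketch).

    Returns None when fewer than two values are available, mirroring the
    rule that statistics are not printed for parameters having one value.
    """
    n = len(values)
    if n < 2:
        return None
    total = sum(values)
    mean = total / n
    # Assumption: sample variance with an n - 1 divisor.
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    std_dev = math.sqrt(variance)
    # Assumption: LOG MEAN is the geometric mean (defined for positive data).
    log_mean = (math.exp(sum(math.log(v) for v in values) / n)
                if min(values) > 0 else None)
    return {
        "number": n,
        "maximum": max(values),
        "minimum": min(values),
        "sum": total,
        "sum_sq": sum(v * v for v in values),
        "mean": mean,
        "variance": variance,
        "std_dev": std_dev,
        "std_err": std_dev / math.sqrt(n),
        "coef_var": 100.0 * std_dev / mean if mean else None,
        "log_mean": log_mean,
    }
```

With hardness values of 13, 15, and 15 this gives a sum of 43, a sum of squares of 619, and a coefficient of variation of about 8 percent.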
-------
Why SIDES was written:
STORET input formats (Fixed Format and DIP) were difficult to code and to keypunch,
thereby causing errors and wasted time.
There were no easy-to-read printouts to facilitate validation of data prior to entry into
STORET.
The length of time required to enter data and corrections into STORET was unsatisfactory.
It was very difficult to correct data once it had been entered into STORET.
Figure 1
SIDES
SIDES offers the following features:
An easy-to-use card format for entering data, particularly from the standpoint of
keypunching.
A straightforward method of entering sample identification information.
A comprehensive pre-STORET editing system to allow the user to catch keypunch and
transcription errors prior to entering the data into STORET.
Several data reports may be run from SIDES intermediate disk files to aid in data
validation and to provide data printouts with statistics for inclusion in technical
reports prior to entry into STORET.
A simplified method of correcting data already entered into STORET through SIDES.
Figure 2
Features of SIDES
SIDES users and uses:
State of North Carolina for data entry
State of South Carolina for data entry
EPA, Region IV, S&A division, for:
Data entry
Technical report generation
ORSANCO water users' data transfer
EPA, Region IV, Escambia Bay Recovery Study for data entry
EPA, ORD, Quality Assurance Division, for:
Part of the Interim Laboratory Data Management System.
Figure 3
SIDES Uses
-------
[Block diagram. Box labels recovered from the figure; the connections between boxes are inferred from the text:
EDITING INFORMATION DECK (STORET agency card, YEAR card, PARAM cards, STATN cards)
and DATA DECKS (PAR card, SIDB cards, DTA cards)
  -> MERGE/EDIT (SIDB and DTA cards merged; extensive editing of data card decks;
     generation of STORET input cards) -> MERGE/EDIT PRINTOUT
  -> DATA IN DISK FILES
       -> STORET STORAGE PROGRAM -> STORET DATA FILES
       -> SIDES PRINTOUT PROGRAMS -> printout of data in original order;
          printout of data in station order
       -> INTERFACE -> PSEUDO STORET TYPE IV RETRIEVAL FILE -> PROGRAM LDSTAT]
Figure 4
SIDES System Block Diagram
-------
[Card deck listing not legible in this copy. The listing shows an editing information deck (agency, YEAR, STATN, and PARAM cards) followed by one data deck (PAR, SIDB, and DTA cards).]
Figure 5
Example of Two Card Decks
-------
[Merge/edit printout page, apparently Figure 6: listing of the editing information file (agency card, YEAR card for 74, STATN cards relating field stations WC-1 and S-1 to STORET stations 130000 and 130005, and PARAM cards for parameters 31501, 31616, 00400, 00300, 00940, 00900, 00410, 50060, 00340, and 00680). The card-by-card listing is not legible in this copy.]
-------
[Merge/edit printout page, apparently Figure 7: listing of the field study data deck (PAR, SIDB, and DTA cards), with each sample marked by its composite type (GRAB) or as REJECTED. The card-by-card listing is not legible in this copy. Error messages printed beside the rejected cards include:
ILL. FIELD STA. NR.
BAD SIDB CARD.
ST. DA. OUT OF RANGE.
ST. TIME OUT OF RANGE.
FLD 2 ILL. CHAR. BET. FLDS.
FLD 1 ILL. VALUE.
FLD 2 ILL. BLANK.
FLD [number not legible] NOT LEFT JUSTIFIED.]
-------
REPORT RFST01A PAGE
EOF ON CFST01.
PROGRAM PFST01 RUN SUMMARY:
40 CARDS READ FROM CFST01.
30 CARDS WRITTEN ONTO OFST03 FOR STORAGE IN STORET.
[count not legible] RECORDS WRITTEN ONTO DFSTO*.
STORET INPUT HAS BEEN GENERATED FOR
12 GRAB SAMPLES,
0 COMPOSITE SAMPLES W/O TIME OF DAY,
1 COMPOSITE SAMPLES WITH TIME OF DAY.
0 SAMPLE DELETIONS.
AND DATA FOR 5 SAMPLES HAS BEEN REJECTED.
EXECUTION COMPLETED.
-------
[Printout page, apparently Figure 9 (original order printout, first page). The listing is not legible in this copy. It shows the 18 samples in the order in which they were punched, with these columns: FIELD STA. NR.; STORET STA. NR.; FIELD SAMPLE IDENTIFICATION (START DATE, ST. TIME, COMP. TYPE, ENDING DATE, END. TIME, SAMPLE NUMBER); 00008 SEWL SAMPLE NUMBER; 00003 DEPTH; 00002 % FROM RIGHT BANK; 31501 TOT COLI MFIMENDO /100ML; 31616 FEC COLI MFM-FCBR /100ML; 00400 PH SU; 00300 DO MG/L.]
-------
[Printout page, apparently Figure 10 (original order printout, second page). The listing is not legible in this copy. It repeats the station and sample identification columns and adds these parameter columns: 00940 CHLORIDE CL MG/L; 00900 TOT HARD CACO3 MG/L; 00410 T ALK CACO3 MG/L; 50060 CHLORINE TOT RESD MG/L; 00340 COD HI LEVEL MG/L; 00680 T ORG C C MG/L.]
-------
[Printout page, apparently Figure 11 (station order printout, first page). The listing is not legible in this copy. It shows the same 18 samples grouped by station, with these columns: FIELD STA. NR.; STORET STA. NR.; FIELD SAMPLE IDENTIFICATION (START DATE, ST. TIME, COMP. TYPE, ENDING DATE, END. TIME, SAMPLE NUMBER); 00008 SEWL SAMPLE NUMBER; 00003 DEPTH; 00002 % FROM RIGHT BANK; 31501 TOT COLI MFIMENDO /100ML; 31616 FEC COLI MFM-FCBR /100ML; 00400 PH SU; 00300 DO MG/L.]
-------
[Printout page, apparently Figure 12 (station order printout, second page). The listing is not legible in this copy. It repeats the station and sample identification columns and adds these parameter columns: 00940 CHLORIDE CL MG/L; 00900 TOT HARD CACO3 MG/L; 00410 T ALK CACO3 MG/L; 50060 CHLORINE TOT RESD MG/L; 00340 COD HI LEVEL MG/L; 00680 T ORG C C MG/L.]
-------
[LDSTAT printout page, apparently Figure 13. Values recovered from the listing; the alignment of individual values with sample rows is not legible in this copy.]
STATION - 130000
Samples listed: 740521 0900 (ending 740521 1100), sample 1; 740521 0905, sample 2; 740521 0945, sample 3; 740521 1030, sample 4.

            00940     00900     00410     50060     31501     31616     00400   00300
            CHLORIDE  TOT HARD  T ALK     CHLORINE  TOT COLI  FEC COLI  PH      DO
            CL        CACO3     CACO3     TOT RESD  MFIMENDO  MFM-FCBR
            MG/L      MG/L      MG/L      MG/L      /100ML    /100ML    SU      MG/L
VALUES      1.9       13        6         0.10K     17000     790       6.2     8.5
(in listed  2.0       15        7         0.10K                         6.3     8.3
order)      2.0       15        7         0.10K                         6.3     8.1
NUMBER      3         3         3         3         1         1         3       3
MAXIMUM     2.0       15        7         0.10K                         6.3     8.5
MINIMUM     1.9       13        6         0.10K                         6.2     8.1
SUM         5.9       43        20        0.30                          18.8    24.9
SUM SQ.     11.6      619       134       0.03                          117.8   206.7
MEAN        2.0       14        7         0.10                          6.3     8.3
VARIANCE    0.0       1         0         0.00                          0.0     0.0
STD.DEV.    0.1       1         1         0.00                          0.1     0.2
STD.ERR.    0.0       1         0         0.00                          0.0     0.1
COEF VAR    2.9       8         9         0.00                          0.9     2.4
LOG MEAN    2.0       14        7         0.10                          6.3     8.3
-------
[LDSTAT printout page, apparently Figure 14. Values recovered from the listing.]
STATION - 130005

                00008         00340       00680
                SEWL SAMPLE   COD         T ORG C
                NUMBER        HI LEVEL    C
DATE    TIME                  MG/L        MG/L
740515  1000    5             144         100.0
740515  1006    11            154
740515  1007    12            148
740515  1008    13            154
740515  1009    14            152
740515  1011    15            154
740515  1012    16            136
740515  1013    17            152
NUMBER                        8           1
MAXIMUM                       154
MINIMUM                       136
SUM                           1194
SUM SQ.                       [not legible]
MEAN                          149
VARIANCE                      41
STD.DEV.                      6
STD.ERR.                      2
COEF VAR                      4
LOG MEAN                      149
-------
SELECTION OF PROBABILITY MODELS FOR
DETERMINING QUALITY CONTROL DATA SCREENING RANGE LIMITS
By Wayne R. Ott, Ph.D.
INTRODUCTION
Environmental monitoring laboratories performing
routine chemical analyses are using computers increas-
ingly to process and store the data. If these laboratories
process large quantities of data, errors may be intro-
duced in the data handling phase, and there is some risk
that those errors may go undetected. Such errors, which
frequently result simply from keypunch mistakes, may
be stored as "valid" data and may create serious prob-
lems when sample means or other statistics are calcu-
lated. The computer can be used, however, to automat-
ically screen laboratory data as they are entered to
determine if any value of a given parameter* is "un-
usual" or lies outside an "acceptable range." Such
unusual values can be "flagged" by the computer, there-
by bringing them to the attention of laboratory person-
nel so that errors can be corrected and new analyses
performed if necessary.
STATEMENT OF PROBLEM
Typical environmental data processing and handling
errors result from three sources: keypunch mistakes,
failure to consider minimum detectable limits, and cleri-
cal mistakes or omissions. For example, when storing
water quality data, a valid parameter value of pH = 5.5
may be erroneously keypunched as pH = 55.0 simply
because the decimal is entered in the wrong column.
Without provision for "data screening," the computer
will simply accept this value and store it as valid data. To
screen the data, the value of each parameter is compared
with preset limits which are also stored in the computer;
values outside these limits are flagged. Values which are
improbable can therefore be brought to the chemist's
attention while impossible values can be automatically
rejected.
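The comparison against preset limits can be sketched as follows. The class, the function, and the numeric pH limits are illustrative assumptions only; they are not the limits used in any EPA system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningLimits:
    """Preset limits for one parameter (hypothetical structure)."""
    possible_low: float    # below this, the value is impossible
    possible_high: float   # above this, the value is impossible
    probable_low: float    # below this, the value is improbable
    probable_high: float   # above this, the value is improbable


def screen(value, limits):
    """Return "rejected", "flagged", or "accepted" for one entered value."""
    if value < limits.possible_low or value > limits.possible_high:
        return "rejected"   # impossible values are rejected automatically
    if value < limits.probable_low or value > limits.probable_high:
        return "flagged"    # improbable values go to the chemist
    return "accepted"


# Illustrative numbers only, chosen for the pH keypunch example.
PH_LIMITS = ScreeningLimits(0.0, 14.0, 5.0, 9.0)
```

Under these assumed limits, the correctly punched pH of 5.5 is accepted, while the mispunched 55.0 is rejected rather than stored.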
To implement a data screening system, an improb-
able range of values for each parameter must be deter-
mined; historical data can be used for this purpose. At
first glance, two or three times the standard deviation†
might appear to be a suitable choice for the quality
control data screening "range limits." However, an
examination of water quality parameters suggested that
most parameters are not normally distributed, and the
underlying distribution of each parameter may differ.
Thus, use of 2σ or 3σ would result in range limits that
have a different probability of occurrence for each para-
meter. It would be preferable for the range limits to
always represent the same probability, regardless of the
underlying distribution. Furthermore, it would be desir-
able to have a systematic procedure for arriving at these
data screening range limits.
METHODOLOGY
The Environmental Protection Agency (EPA) is
now undertaking a data analysis effort in which prob-
ability models are fit to historical water quality data and
range limits calculated. These limits then are being built
into a quality control data screening computer program
which serves as an integral part of a computerized labor-
atory data management system. This data management
system is designed to store, manipulate, and retrieve
environmental data, ultimately transferring it to national
environmental data archives.
*In this paper, the term "parameter" is used to mean a variable which is used as a measure of the state of the environment. For example,
both carbon monoxide measured by air monitoring stations and pH measured in rivers are environmental parameters.
†The data estimate $\bar{x}$ of the arithmetic mean $\mu_1$ is calculated as follows: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.
The data estimate $s$ of the standard deviation $\sigma$ is calculated as follows: [formula not legible in this copy]
-------
Probability Models
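Most environmental parameters, such as pollutant
concentration, never are less than zero and have essen-
tially unbounded upper limits. For environmental data,
eight probability models are of particular interest:*
Exponential
Normal
Log-normal
Weibull
Gamma
Beta
Rayleigh
Extreme Value.
Equations (1) and (2) give the probability density func-
tion (PDF) and the cumulative distribution function
(CDF) for the Weibull probability model, with shape
parameter $\alpha$ and scale parameter $\xi$:
PDF
$$f_X(x) = \frac{\alpha}{\xi}\left(\frac{x}{\xi}\right)^{\alpha-1} \exp\left[-\left(\frac{x}{\xi}\right)^{\alpha}\right], \qquad x \ge 0 \tag{1}$$
CDF
$$F_X(x) = 1 - \exp\left[-\left(\frac{x}{\xi}\right)^{\alpha}\right], \qquad x \ge 0 \tag{2}$$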
This probability model has two parameters: a "shape
parameter," $\alpha$, and a "scale parameter," $\xi$. For different
values of $\alpha$, it takes on a variety of different shapes
(Figure 1). Because of this flexibility, the model is a
good candidate for fitting environmental data. When
$\alpha = 1$, the Weibull probability model becomes equivalent
to the exponential probability model:
$$f_X(x) = \frac{1}{\xi}\, e^{-x/\xi}, \qquad x \ge 0 \tag{3}$$
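The relationship between the two models can be checked numerically. The sketch below writes the Weibull PDF and CDF in their standard two-parameter form; the function and argument names (`shape`, `scale`) are chosen here for illustration.

```python
import math


def weibull_pdf(x, shape, scale):
    """Weibull probability density for x >= 0 (zero for negative x)."""
    if x < 0:
        return 0.0
    if x == 0:
        # Limiting value at the origin depends on the shape parameter.
        if shape == 1:
            return 1.0 / scale
        return 0.0 if shape > 1 else math.inf
    r = x / scale
    return (shape / scale) * r ** (shape - 1) * math.exp(-(r ** shape))


def weibull_cdf(x, shape, scale):
    """Weibull cumulative distribution function."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


def exponential_pdf(x, mean):
    """Exponential probability density with the given mean."""
    return math.exp(-x / mean) / mean if x >= 0 else 0.0
```

Setting `shape` to 1 makes `weibull_pdf` agree with `exponential_pdf` for every `x`, which is the special case noted in the text.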
The other probability models differ in shape from each
other. Some, like the normal distribution, are symmetri-
cal. The characteristic shape is constant, and the para-
meters determine the "location" of the curve and its
"spread" or dispersion.
Overall Approach
A practical, convenient approach for fitting prob-
ability models to environmental data is by the "method
of moments." In this approach, the estimates of the
moments calculated from the data are set equal to the
actual moments of the probability model. The overall
methodology for selecting and fitting models to histori-
cal data uses this approach and is depicted in Figure 2.
At the left side of the figure, the raw historical data are
analyzed, by computer, and the histogram and estimates
of the moments are developed. Next, a probability
model is selected for trial. The relative magnitude of the
moments gives some clue as to the most appropriate
model. The cumulative distribution function of the
model is then compared with the cumulative frequencies
of the data. If the model does not have a reasonably
good fit, another model is selected. Once a "best-fit"
model is selected, it is used to calculate, within the range
of the data, the values of the parameter which corre-
spond to various extreme probabilities (P = .01, .001;
P = .99, .999). These values are used as the data screen-
ing range limits.
Selection of Probability Models
The most important distinction between different
probability models is their shape. The third and fourth
moments provide measures of the shape of each model.
The third moment, $\mu_3$, is a measure of the distribution's
asymmetry or skewness, the extent to which it exhibits a
long tail to the left or to the right. The fourth moment is
a measure of the distribution's "peakedness" or kurtosis,
the relationship of the distribution's "height" to its
"width." For a given probability model, the second,
third, and fourth moments are obtained from the prob-
ability density function $f_X(x)$ as follows, where $\mu_1$ is
the mean:
$$\mu_2 = \int_{-\infty}^{\infty} (x - \mu_1)^2 f_X(x)\, dx \tag{4}$$
$$\mu_3 = \int_{-\infty}^{\infty} (x - \mu_1)^3 f_X(x)\, dx \tag{5}$$
$$\mu_4 = \int_{-\infty}^{\infty} (x - \mu_1)^4 f_X(x)\, dx \tag{6}$$
The second moment is also known as the variance and is
the square of the standard deviation, $\mu_2 = \sigma^2$. Two
important quantities are the coefficient of skewness $\sqrt{\beta_1}$
and the coefficient of kurtosis $\beta_2$:
$$\sqrt{\beta_1} = \frac{\mu_3}{\mu_2^{3/2}} \tag{7}$$
$$\beta_2 = \frac{\mu_4}{\mu_2^{2}} \tag{8}$$
*Gerald J. Hahn and Samuel S. Shapiro, Statistical Models in Engineering (New York: John Wiley & Sons, Inc., 1967). See pages 120-
134 for summary of probability models for continuous random variables.
Figure 3 shows the relative values of $\beta_1$ and $\beta_2$ calcu-
lated by the author for a variety of probability models.*
The uniform, normal, Rayleigh, and exponential models
all have constant shapes; therefore they are represented
on this figure as points. For the normal distribution,
$\beta_1 = 0$ (indicating that the model is symmetrical) and
$\beta_2 = 3$. The Gamma model is represented as a straight
line, since a linear relationship exists between $\beta_1$ and $\beta_2$:
$$\beta_2 = 1.5\,\beta_1 + 3 \tag{9}$$
On this figure, both the lines for the Gamma and Weibull
probability models intersect the point representing the
exponential probability model because the exponential
model is one special case of both of these distributions.
For the log-normal and Weibull probability models, the
lines, which were calculated by computer, exhibit slight
curvature. The beta distribution (not shown) would be
represented as a region on this figure rather than a line
since it takes on a multitude of shapes.
For a given set of data, whose underlying distribu-
tion may be unknown, it is possible to obtain estimates
of the first four moments. Estimates of the first and
second moments are calculated as the mean $\bar{x}$ and vari-
ance $s^2$ in the manner given previously. Estimates of the
third and fourth moments about the mean are obtained
as follows:
$$m_3 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^3 \tag{10}$$
$$m_4 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^4 \tag{11}$$
Using these quantities, estimates of the coefficients of
skewness and kurtosis may be obtained by substituting
$m_2$ for $\mu_2$, $m_3$ for $\mu_3$, and $m_4$ for $\mu_4$ into equations (7)
and (8):
$$\sqrt{b_1} = \frac{m_3}{m_2^{3/2}} \tag{12}$$
$$b_2 = \frac{m_4}{m_2^{2}} \tag{13}$$
These quantities, which are easily calculated by com-
puter from the raw data, can assist in selecting the most
appropriate probability model. Once calculated, they
may be compared to the curves and points in the ($\beta_1$, $\beta_2$)
plane of Figure 3 to suggest which probability model is a
good choice. By establishing a "region of acceptance"
around each line and point in the ($\beta_1$, $\beta_2$) plane, this
comparison can be done automatically by computer.
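The automatic comparison can be sketched as below. The code estimates the squared skewness coefficient and the kurtosis coefficient from raw data and tests them against a few reference points and the Gamma line. The 1/n divisor for all sample moments, the square "region of acceptance," and its tolerance are assumptions made for this illustration; the Rayleigh point and the computed log-normal and Weibull curves are omitted.

```python
def central_moment(data, k):
    """k-th sample moment about the mean, using a 1/n divisor (assumption)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** k for x in data) / n


def shape_coefficients(data):
    """Return (b1, b2): squared skewness coefficient and kurtosis coefficient."""
    m2 = central_moment(data, 2)
    m3 = central_moment(data, 3)
    m4 = central_moment(data, 4)
    return m3 ** 2 / m2 ** 3, m4 / m2 ** 2


# Models with constant shape plot as single points in the (beta1, beta2) plane.
REFERENCE_POINTS = {
    "normal": (0.0, 3.0),
    "uniform": (0.0, 1.8),
    "exponential": (4.0, 9.0),
}


def gamma_line_beta2(beta1):
    """beta2 on the Gamma line: beta2 = 1.5 * beta1 + 3."""
    return 1.5 * beta1 + 3.0


def candidate_models(b1, b2, tolerance=0.5):
    """Names of models whose point or line lies within `tolerance` of (b1, b2)."""
    names = [name for name, (p1, p2) in REFERENCE_POINTS.items()
             if abs(b1 - p1) <= tolerance and abs(b2 - p2) <= tolerance]
    if abs(b2 - gamma_line_beta2(b1)) <= tolerance:
        names.append("gamma")
    return names
```

Because the exponential model is a special case of the Gamma model, the point (4, 9) is reported as a candidate for both.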
APPLICATION OF METHODOLOGY
For the laboratory data management system, three
classes of suspect data were selected:
Improbable:
$P(X < X_L \text{ or } X > X_H) = .01$
Highly improbable:
$P(X < X_L \text{ or } X > X_H) = .001$
Impossible:
Cannot physically occur.
The probability model is used to calculate $X_L$ and $X_H$ for
the first two classes. With the exception of pH, most of
the water quality parameters had impossible "lower"
ranges of 0.0 and no upper impossible range. Another
important criterion for impossible values was based on
interparameter comparisons, the physical and chemical
relationships among one or more environmental para-
meters; but these factors will not be discussed here.
To illustrate the method, atmospheric con-
centrations of lead in cities throughout the United States
in 1968 will be used as an example. The computer-
generated histogram of these 4411 values is shown in
Figure 4. This histogram is greatly skewed to the right
with an arithmetic mean of 0.862 micrograms per cubic
meter (µg/m³). For these data $b_1 = 13.9$ and $b_2 = 25.8$.
In this case, the skewness of the raw data and the ex-
ponential model differs, as does the kurtosis, but the
exponential model offers advantages due to its sim-
plicity, and we shall apply it in this instance for purposes
of illustration.
*This figure is patterned after the ($\beta_1$, $\beta_2$) plane in E.S. Pearson, University College, London, cited by Hahn and Shapiro, op. cit., p. 197; all
lines have been recalculated.
The exponential probability model has just one
parameter, $\xi$, as seen in equation (3), which is also its
mean. We use the method of moments to set $\xi = 0.862$
µg/m³, the mean calculated from the data. The cumula-
tive distribution function $F_X(x)$ for this model then
becomes:
$$F_X(x) = 1 - e^{-x/0.862} \tag{14}$$
The plot of this probability model and the histogram of
the data are both shown in Figure 5. Using equa-
tion (14), the range limits for atmospheric lead concen-
trations are then calculated as the values for which
$F_X(x)$ = 0.001, 0.01, 0.99, and 0.999. A sample calcula-
tion for $F_X(x) = 0.99$ is as follows:
$$\exp(-x/0.862) = 0.01$$
$$-x/0.862 = \ln(0.01)$$
$$x = (-0.862)(-4.605) = 3.97$$
The resulting range limits are obtained in this same
manner:
Class               F_X(x_L)   x_L (µg/m³)   F_X(x_H)   x_H (µg/m³)
Improbable          0.01       0.0087        0.99       3.97
Highly improbable   0.001      0.00086       0.999      5.95
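The four limits follow from inverting the exponential CDF, x = -(mean) ln(1 - F). A minimal sketch of that calculation is given below; the function and variable names are chosen here for illustration.

```python
import math


def exponential_quantile(probability, mean):
    """Value x at which the exponential CDF with the given mean equals probability."""
    return -mean * math.log(1.0 - probability)


MEAN_LEAD = 0.862  # micrograms per cubic meter, the mean of the 1968 data

# Range limits for the "highly improbable" and "improbable" classes.
limits = {p: exponential_quantile(p, MEAN_LEAD)
          for p in (0.001, 0.01, 0.99, 0.999)}

for p, x in limits.items():
    print(f"F = {p}: x = {x:.5f}")
```

Rounded as in the text, the limits are 0.00086, 0.0087, 3.97, and 5.95 micrograms per cubic meter.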
These limits, which are based on national data, then
can be programmed into the computer as part of the
laboratory's data management system and can be used to
flag unusual values of lead concentrations. Note that this
probability model does not necessarily fit the data in the
traditional statistical sense. That is, no effort has been
made to examine the question, "Is the original dis-
tribution from which the data arose truly exponential?"
Rather, this model has been selected for the purpose of
decision-making and concentrating on the question, "Is
this particular piece of data sufficiently unusual to be
flagged?"
DISCUSSION
A principal advantage of this methodology is that
the entire process can be carried out automatically by
computer, producing high- and low-range limits as its
output. Currently, the first phase of a general computer
program to perform these steps has been completed, and
a contract has been written to apply this program to
selected water quality data files in Region V's Central
Regional Laboratory in Chicago. The data screening
range limits which result from this effort will be in-
corporated into the Region V Laboratory Data Manage-
ment System (LDMS) first as part of the time shared,
remote terminal system and later as part of the auto-
mated, minicomputer system to be installed at this
laboratory.
-------
[Figure 1: Weibull probability model for different values of the shape parameter. The plot is not legible in this copy.]
-------
[Figure 2: flow chart of the overall methodology. The chart is not legible in this copy. Recovered box labels, in order:
Data
Computer Processing
Calculate Statistics; Generate Histograms
Select Probability Model (Gamma, Weibull, Normal, Uniform, etc.)
Compare CDF of Model With Cumulative Frequencies of Data
Select Another Model, or Accept
Compute Values Corresponding to Probabilities From Models (0.1, 0.01, 0.001)
Quality Control Ranges]
-------
Figure 3
Relative Magnitude of $\beta_1$ and $\beta_2$
for Different Probability Models
-------
TOTAL NO. OF VALUES = 4411
NO. OF VALUES LESS THAN 0.0 = [not legible]
DISTRIBUTION OF DATA (INTERVAL WIDTH = 0.2000)
MEAN = 0.862002   VARIANCE = 1.09539986   STD. DEV. = 1.046614

INTERVAL           NUMBER   PERCENT    CUM. FREQ.
0.0000-0.2000      1012     22.943%    22.943%
0.2000-0.4000      659      14.940%    37.883%
0.4000-0.6000      665      15.076%    52.958%
0.6000-0.8000      452      10.247%    63.206%
0.8000-1.0000      361      8.184%     71.390%
1.0000-1.2000      309      7.005%     78.395%
1.2000-1.4000      213      4.829%     83.224%
1.4000-1.6000      139      3.151%     86.375%
1.6000-1.8000      100      2.267%     88.642%
1.8000-2.0000      52       1.179%     89.821%
2.0000-2.2000      106      2.403%     92.224%
2.2000-2.4000      56       1.270%     93.493%
2.4000-2.6000      39       0.884%     94.378%
2.6000-2.8000      28       0.635%     95.012%
2.8000-3.0000      16       0.363%     95.375%
3.0000-3.2000      39       0.884%     96.259%
3.2000-3.4000      24       0.544%     96.803%
3.4000-3.6000      22       0.499%     97.302%
3.6000-3.8000      14       0.317%     97.619%
3.8000-4.0000      5        0.113%     97.733%
4.0000-4.2000      16       0.363%     98.095%
4.2000-4.4000      11       0.249%     98.345%
4.4000-4.6000      6        0.136%     98.481%
4.6000-4.8000      4        0.091%     98.572%
4.8000-5.0000      3        0.068%     98.640%
5.0000-5.2000      12       0.272%     98.912%
5.2000-5.4000      6        0.136%     99.048%
5.4000-5.6000      5        0.113%     99.161%
5.6000-5.8000      2        0.045%     99.206%
5.8000-6.0000      0        0.0%       99.206%
6.0000-6.2000      2        0.045%     99.252%
6.2000-6.4000      5        0.113%     99.365%
6.4000-6.6000      7        0.159%     99.524%
6.6000-6.8000      2        0.045%     99.569%
6.8000-7.0000      1        0.023%     99.592%
7.0000-7.2000      5        0.113%     99.705%
7.2000-7.4000      1        0.023%     99.728%
7.4000-7.6000      4        0.091%     99.818%
7.6000-7.8000      2        0.045%     99.864%
7.8000-8.0000      0        0.0%       99.864%
8.0000-8.2000      3        0.068%     99.932%
8.2000-8.4000      1        0.023%     99.954%
8.4000-8.6000      0        0.0%       99.954%
8.6000-8.8000      0        0.0%       99.954%
8.8000-9.0000      1        0.023%     99.977%
9.0000-9.2000      0        0.0%       99.977%
9.2000-9.4000      0        0.0%       99.977%
9.4000-9.6000      1        0.023%     100.000%
9.6000-9.8000      0        0.0%       100.000%
9.8000-10.0000     0        0.0%       100.000%

[The HISTOGRAM column of asterisk bars is not reproduced.]
NO. OF VALUES EQUAL TO OR GREATER THAN 10.0000 = 0 (0.0%)
Figure 4
Computer-Generated Histogram of 4411 Atmospheric
24-Hour Lead Concentrations Measured in U.S. Cities in 1968
-------
[Plot not legible in this copy. Horizontal axis: LEAD CONCENTRATION X, 0.0 to 5.0.]
Figure 5
Distribution of 24-Hour Average Lead Concentrations,
National Air Sampling Network, 1968, Along With
the PDF and CDF of the Exponential Probability Model
-------
A SYSTEMATIC APPROACH TO
WATER QUALITY TREND ANALYSIS
By Merlin H. Dipert* and Jon A. Abnytis
INTRODUCTION
A systematic approach using empirical models is
proposed to investigate water quality data trends. A
trend analysis procedure, with supporting computer
programs, is presented.
For ease and economy of operations, three pro-
grams are used. One retrieves the data from STORET;
another fits the data; and a third plots the theoretical
curves, the points, and their confidence limits. In order
to simplify the mathematics, the graph is considered as a
quality control chart.
The function is assumed to be nonlinear in time;
however, since the program takes its own derivative, the
model can be changed by changing one card in the
fitting program and one card in the plot program.
DISCUSSION
In investigating water quality trends, errors of
measurement, which differ from random errors in the
process, should be considered. In this paper, however, it
is assumed that the points (individual points, replicated
means, periodic means, or some similar values with their
calculated errors) have been obtained by some standard
method. This discussion is limited to the errors of fitting
these points.
Consider the general n-dimensional mathematical
model,
y = L(t) + F(t) + G(x) + S,
where L(t) is the linear trend line, F(t) is the nonlinear
component of time, G(x) is a function of other para-
meters (e.g., latitude, ambient temperature, or depth of
measurement), and S is the seasonal parameter in the
same sense as used by Box-Jenkins.†
For preliminary investigations, the project staff
selected a station in the Ohio River near Cairo, Illinois,
and, using the specific model,
y = A + Bt + C sin (D + Et),
and setting the period equal to one year, a 98 percent
reduction of variance was obtained for temperature, a
28 percent reduction for dissolved oxygen, and a
30 percent reduction for chlorides.
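One way to fit this model when the period is fixed is sketched below. It is not the authors' fitting program, which is described as working from derivatives of a changeable model. Instead, the sketch uses the identity C sin(D + Et) = (C cos D) sin(Et) + (C sin D) cos(Et), which makes the problem linear and solvable by ordinary least squares. The period of 365.25 days and all function names are assumptions.

```python
import math

PERIOD_DAYS = 365.25  # assumption: "one year" expressed in days


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_seasonal_trend(t, y, period=PERIOD_DAYS):
    """Least-squares fit of y = A + B t + C sin(D + E t) with E = 2 pi / period."""
    e = 2.0 * math.pi / period
    rows = [[1.0, ti, math.sin(e * ti), math.cos(e * ti)] for ti in t]
    k = 4
    # Normal equations (X'X) beta = X'y for the four linear coefficients.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    a, b, p, q = solve_linear(xtx, xty)
    c = math.hypot(p, q)   # amplitude
    d = math.atan2(q, p)   # phase angle in radians
    fitted = [a + b * ti + c * math.sin(d + e * ti) for ti in t]
    mean_y = sum(y) / len(y)
    total_ss = sum((yi - mean_y) ** 2 for yi in y)
    resid_ss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return {"A": a, "B": b, "C": c, "D": d, "E": e,
            "reduction_in_variance": 1.0 - resid_ss / total_ss}
```

The returned `reduction_in_variance` is the fraction of the variation about the mean that the fitted curve removes, which is the sense in which the percentages above are read here.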
Since the results are preliminary and there are no
plots of dissolved oxygen and chlorides, and since the
model should be changed, values are not given. The
values obtained for centigrade temperatures, with their
standard errors, are given in Table I.
Table I
Sample of Values Obtained From the Model
Parameters               Values
Intercept                16.1 ± .31
Slope                    -.00077 ± .00031
Amplitude                12 ± .23
Phase angle              -2.05 ± .02
Phase angle days         119 ± 1.1
Degrees of freedom       28
Variance                 1.4
Reduction in Variance    98%
The theoretical curves, with the means, the
95 percent confidence limits lines for the monthly
means, and the individual plots are presented in
Figure 1.
CONCLUSION
An approach which can be used to compare water
quality parameters, determine linear trends over time,
and calculate any of the parameters of the equations
required by the physical scientist has been presented in
this paper. When the mathematical models are the same,
this approach will give the necessary transformation to
make the curves coincide. Perhaps its most important
use is to determine environmental quality trends
(changes in parameters over time) to discover if enforce-
ment policies have had a real effect on environmental
quality rather than merely a "perceived" effect due to
seasonal or other cyclical phenomena.
*Speaker
†G.E.P. Box and G.M. Jenkins, Time Series Analysis: Forecasting and Control (San Francisco: Holden-Day, 1970).
-------
[Figure 1: plot not legible in this copy. Recovered labels: PARAM NO 00010; STATNO 160035; horizontal axis in days (DAYS x 10^1).]
-------
SAMPLE FILE CONTROL FOR THE AUTOMATED LABORATORY
By Roman I. Bystroff
INTRODUCTION
During this year, a team at Lawrence Livermore
Laboratory has engaged in feasibility studies for a
number of water quality related laboratories in the En-
vironmental Protection Agency (EPA). All of these
laboratories are participating in the Cincinnati Pilot Pro-
ject, an effort directed at developing minicomputer-
based laboratory automation, which will be
transportable to other EPA laboratories.
The team has definite ideas about laboratory auto-
mation, and it seemed sensible and proper from the
inception that once sample data were in the computer,
some form of sample file control would be used. Ideas
about sample file control have since evolved, particularly
as the regional laboratories emphasized its importance
and benefits.
Today functional specifications for a sample file
controller are nearing completion; this should fit the
needs of laboratories with quite diverse operations,
namely Region V and National Field Investigation
Center (NFIC), Cincinnati. This paper will first describe
the functional characteristics that are proposed to be
implemented and also the value judgments inherent in
the implementation.
FUNCTIONS OF THE SAMPLE FILE CONTROLLER
The central file in Figure 1 is referred to as the
active status file (ASF) since it will contain all in-
formation about samples on work in progress. A
laboratory supervisor uses this file as a log-book. On
receiving information about a sample or related samples
by groups, he enters this information and also indicates
possible procedures, e.g., the test procedures. The pro-
gram modules LOGIN and EDIT create these entries and
aid by prompting him for the required information.
Blanks are left for late information and filled in later.
Once samples are logged-in, an operator at some
work station queries the file and selects those samples
which he can handle and which are intended for him.
The WORK SCHEDULE program module aids him in
this function by passing all information required about
the samples into his station file. The station controller is
an applications program for the performance of an
automated analysis procedure, including the essential
quality control. The output of the station is one result
for each sample test procedure, which is passed back to
update the active status file. In addition, the station
passes information about quality control statistics on
each sample to another file, the primary data file (PDF).
The file is archival; this information is required only
when legal cases are called or for quality control au-
diting. Another archival file is the completed sample file
(CSF), which can be accessed for laboratory statistics,
cross-references to the primary data, and so forth. The
function of the LOGOUT program module is to aid in
the transfer of sample data from an "active" to a
"completed" archival file.
A report generator module aids the operator by
producing a listing of any specified data items or by
producing standard formatted reports from any logically
structured files, that is, the ASF, CSF and PDF.
The final functional module is a STORET converter
which prepares the sample data stream in a format
suitable for transmission to a national data system.
THE LOGICAL STRUCTURE OF FILES
In Figure 2, a tree structure with up to three levels
of branches defines the organization of any file in the
sample file controller (which does not include station
files). The file is structured to reduce redundancy. On
the branches are data items associated with descriptors.
The descriptors and their meanings, as employed in the
water samples active status file at Region V, are given in
Table I. Note in Figure 2 that there are as many data
items associated with one descriptor as there are
branches at any one level. For example, at level 2, there
might be six sample numbers which have the common
descriptor (DESC 3) SAMPLE, and exactly six sample
type items with the descriptor SAMPLETYPE (DESC 4).
This somewhat minimal restraint on the allowed struc-
tures of the tree simplifies a structure specification. The
presence of structure code permits a rather straight-
forward means of restructuring the files. A PDF with a
different structure than the water ASF is shown in
Figure 3.
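The three-level organization described above can be sketched in Python as follows. This is only an illustration: the class name `Level`, the helper functions, and the sample values are hypothetical and are not taken from the sample file controller. It shows the one constraint the text states, namely that each descriptor at a level carries exactly one item per branch at that level.

```python
from dataclasses import dataclass, field


@dataclass
class Level:
    """One branch of the tree: each descriptor maps to a single item."""
    items: dict
    children: list = field(default_factory=list)


def column(branches, descriptor):
    """Collect the items stored under one descriptor across sibling branches."""
    return [branch.items[descriptor] for branch in branches]


def is_well_formed(branches):
    """Sibling branches must all carry the same set of descriptors."""
    if not branches:
        return True
    names = set(branches[0].items)
    return all(set(b.items) == names and is_well_formed(b.children)
               for b in branches)


# Hypothetical group of six water samples, each with one pending analysis.
group = Level(
    items={"GROUP": 101, "STUDY": "example study"},
    children=[
        Level({"SAMPLE": n, "SAMPLETYPE": "WATER"},
              [Level({"ANALYSIS": "pH", "STATUS": "PENDING"})])
        for n in range(1, 7)
    ],
)
```

With six level 2 branches, `column(group.children, "SAMPLE")` and `column(group.children, "SAMPLETYPE")` each return exactly six items, which mirrors the example in the text.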
-------
Table I
Examples of Water Sample
ASF Descriptors

Descriptor     Abbreviation   Description of the data item
(complete)

Level 1
[illegible]    [illegible]    District office
[illegible]    [illegible]    Study number
STUDY          -              Study description
[illegible]    [illegible]    Station number of group
DUEDATE        DUE            Group due date
GPTAKEN        GPTAK          Group date taken
[illegible]    [illegible]    Group date of arrival
NUMSAM         NSAM           Number of samples
NUMPARM        NPARM          Number of determinations to be performed
ACCT           [illegible]    Study account number
[illegible]    POL            Group pollution index [illegible]
[illegible]    GPDAT          Date group logged out
DIRECTOR       DIR            Director of field office
STRUCTURE      STRUCT         Type of data structure

Level 2
CRLNUM         CRL            CRL sample number
SAMPLE         [illegible]    STORET sample number
SAMLOC         SLOC           Code for location where sample taken
[illegible]    [illegible]    Date sample taken
SAM STATUS     SSTAT          Status of the sample
SAM ID         [illegible]    Sample identification
SAM TYPE       STYP           Type of sample
CUSTODY        CUS            Custody sample
SAM LAT        SLAT           Latitude of sample location
SAM LONG       SLON           Longitude of sample location
COMPOSITE      COM            Is the sample a composite
SAM TIME       STIM           Time sample taken
SAM WT         SWT            Sample weight [unit illegible]

Level 3
ANALYSIS       ANAL           Any parameter to be determined
VALID          VAL            Concentration or value of a determination
[illegible]    [illegible]    [illegible]
METHOD         METH           Method of analysis
OPERATOR       OPER           Operator performing analysis
ANALDATE       ANDAT          Date analysis completed
STATUS         STAT           Status of analysis
STORET         STOR           STORET [illegible] system
The use of a structure code is a way of satisfying
the functional requirement that the user has the option
to redefine what items he requires and their relationship
to each other in the file controller. The importance of
this in a pilot project and in the "tunability" of an
implementation for diverse laboratories is that difficult
reprogramming is avoided. The laboratories' SFC needs are
evolving and will continue to evolve. For example, at
Central Research Laboratory, Region V (CRL-V), three
structures (ASF types) are required for sample files
because water samples, bubbler samples and high-volume
air filter samples have different numbers and kinds of
descriptors. If biological samples were also included
under the computer SFC, another type of active status
file could be added.
The water ASF in Figure 2 and PDF in Figure 3
illustrate the use of different keys in structured files.
Samples are effectively organized by PROJECT or
GROUP and are often kept together in batches for an
analysis work session. The quality control data are keyed
by the time-of-completion because analysis procedures
vary in length. Various sample determinations are related
to a calibration series (represented by coefficients of a
least squares fit and the standard deviation in Figure 3).
Retrieval of the primary data for a group of samples will
be fastest if the cross-indexed time-of-completions are
first found in the CSF under its key: GROUP. This then
is an example of a trade-off: data are logically organized
to be easy to input without the overhead of reordering
data.
INTERACTIVE FEATURES OF THE SAMPLE FILE
CONTROLLER (SFC)
The user's acceptance of a computer is very high on
the list of functional values. After all, the laboratory
chemist is not very burdened by choosing where to write
his data on a standard form sheet. Program aids to log or
request data should be functionally equivalent to those
of the form sheet. This means memory aids (prompts),
defaults to normal requests, and the freedom to enter,
not enter or ditto data entries.
The LOGIN program module illustrates these kinds
of features. A person logging in a group of samples may
take various paths as he fills in the sample information in
an ASF form sheet. If the SAMPLE TYPE happens to be
the same for all the samples in this group, he merely
types the description and ";ALL". If they differ, LOGIN
will prompt him for the entry for each individual
sample. Many times he can pass over an entry. It may be
a delayed piece of information needed only for a final
report, or one not always required. The LOGIN module
can be used again to fill in previously unspecified data.
In this case, the prompting would begin with the first
blank in the form record and proceed from there.
LOGIN would not be used to change existing items.
When
-------
Table II
Examples of Command Lines in the Sample File
Controller Program Modules

Module            Command    Specifier or Output

EDIT              ADD        TEST:[illegible] TO GROUP:101;SAMPLE:1 THRU 10,12
                  DELETE     TEST:[illegible] FROM GROUP:101
                  CHANGE     SAMLOC;SAMPLE:[illegible] TO [illegible]
                  CHANGE     SAMLOC;SAMPLE:[illegible]
                             SAMPLE [prompted entries illegible]

REPORT            PRINT      GROUP:101;SAMPLE;CUSTODY;OPERATOR
GENERATOR                    GROUP 101
                             SAMPLE  CUSTODY  OPERATOR
                             1       STA-101  [illegible]
                             [remaining rows illegible]
                  SUMMARY    GROUP 101
                  END

WORK              PRINT      [illegible]
SCHEDULE          STORE      [illegible]
                  RESTORE    10 THRU 20, 22:[illegible]
added under the descriptor TEST for a specific set of
samples. The delimiter ";" is used to mean the logical
"AND" of the two specifications GROUP:101 and
SAMPLE:1 THRU 10,12. The forms ADD ... TO,
DELETE ... FROM, CHANGE ... TO are expected. If the
preposition is left out, it is assumed the operator wants
to be prompted by sample number, so that he may enter
individual values as in line 4. Note that if a descriptor
(SAMLOC) is left unmodified, it is assumed to mean
ALL items described by SAMLOC.
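The specification syntax described above can be sketched as a small parser. The following Python is a minimal illustration under assumptions: the function names `parse_items` and `parse_spec` are hypothetical, and the grammar is inferred only from the examples in the text (";" as logical AND, ":" between descriptor and items, "THRU" for ranges, and a bare descriptor meaning ALL items).

```python
import re


def parse_items(text):
    """Expand '1 THRU 10,12' into [1, ..., 10, 12]; keep other items as strings."""
    items = []
    for part in text.split(","):
        part = part.strip()
        match = re.fullmatch(r"(\d+)\s+THRU\s+(\d+)", part)
        if match:
            first, last = int(match.group(1)), int(match.group(2))
            items.extend(range(first, last + 1))
        elif part.isdigit():
            items.append(int(part))
        else:
            items.append(part)
    return items


def parse_spec(spec):
    """Split 'GROUP:101;SAMPLE:1 THRU 10,12' into {descriptor: items or 'ALL'}."""
    result = {}
    for clause in spec.split(";"):
        name, colon, value = clause.strip().partition(":")
        # A descriptor with no ':' part is taken to mean ALL of its items.
        result[name] = parse_items(value) if colon else "ALL"
    return result
```

For example, `parse_spec("GROUP:101;SAMPLE:1 THRU 10,12")` yields group 101 and samples 1 through 10 plus 12, while `parse_spec("SAMLOC")` maps SAMLOC to "ALL".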
The report generator commands are such that
anything in the file can be listed. The called-for level 1
descriptors will appear in a row at the top of the report,
and level 2 and 3 descriptors will head columns of items
across the page. The order of appearance will be the
order in the command line. The example shown calls for
all samples and all custody and all operators. The list
would be shortened if, for example, the operator was
specified OPERATOR:JWD. Special standard forms
would be called by a more concise SUMMARY com-
mand. These could be sample summaries or managerial
reports with column totals, if desired. The WORK
SCHEDULE module commands are more terse because
it is assumed that the only items of interest to specify
are the ANALYSIS and the SAMPLE numbers. Thus the
descriptors are left out, and only the item is stated.
CAPABILITIES FOR OFF-LINE DATA ENTRY
An SFC is most cost effective if parallel records are
altogether eliminated. All the test results, including
those which come from nonautomated tests (those used
less frequently, presumably), can and should be entered
into the computer files. There are two ways for accom-
plishing this. The EDIT program can update a sample log
with a keyed-in result (see Figure 1). A better procedure
would be for the user to write an applications program
in the BASIC language to prompt him for the data in a
systematic fashion and incorporate quality control. This
has the added advantage that key-in errors can be re-
duced by checking against limits.
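A limit check of the kind mentioned above might look like the following sketch. It is written in Python for illustration rather than in the BASIC the project proposes, and the table `LIMITS`, the function name `check_entry`, and the bounds shown are assumptions, not values from the pilot project.

```python
# Hypothetical table of plausible ranges per test, used as a coarse filter.
LIMITS = {
    "pH": (0.0, 14.0),
}


def check_entry(test, value, limits=LIMITS):
    """Return True when a keyed-in value lies inside the range for its test."""
    low, high = limits[test]
    return low <= value <= high
```

An entry program would prompt again when `check_entry` returns False, so that a slipped decimal point such as 72.0 for a pH of 7.2 is caught at the keyboard.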
SUGGESTIONS FOR THE EFFECTIVE USE OF
SAMPLE FILE CONTROL
In the EPA there has been some thought given to
making use of preknowledge about samples in a passive
sense. For example, Wayne Ott and W. Fairless have con-
sidered limits that results might be compared against.
One use of such limits would be as a coarse filter for
key-in errors (e.g., pH of > 14 is impossible for water).
Geographic keying of improbably high (or low) lake
and river results may prove of value in some regions
if sampling and seasonal variations are not serious. The
SFC can accommodate these files, and the WORK
SCHEDULE program could make these TEST-related
parameters available to the analyst for purposes of
double checking. It is suggested that in the compliance
monitoring (C-M) program, license limits are quite
specific and can be keyed by industry or industry-type
and test procedure. In the passive use of these limits, a
bell could be made to ring if a limit were exceeded. This
may be unnecessary, but if one understands that auto-
matic sample file control and lab automation compress
the operational time scale for the laboratory, then these
C-M limits could be used for closed loop options. For
example, the reduction of the number of determinations
in the laboratory could be achieved if a minimal quality
control scheme were employed when the result (plus
uncertainty) is less than the C-M limit. If a C-M limit
(either a peak-allowable or maximum daily allowable)
were reached, a conservative quality control scheme
would be employed. After all, once the STORET base-
lines have been established, normal values are of
incidental interest. The enforcement lawyers would have
-------
to address this question, but it seems that this scheme is
not a bias of results, but rather like a sliding uncertainty
criterion.
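The closed loop option suggested above can be sketched as a single decision. The function name `choose_qc_scheme` and the two scheme labels are hypothetical; the rule itself follows the text: a minimal quality control scheme when the result plus its uncertainty is below the C-M limit, and a conservative scheme otherwise.

```python
def choose_qc_scheme(result, uncertainty, cm_limit):
    """Pick the quality control effort from the result's distance to the limit."""
    if result + uncertainty < cm_limit:
        return "minimal"
    return "conservative"
```

A result of 2.0 with an uncertainty of 0.3 against a limit of 5.0 would receive the minimal scheme, while 4.9 with the same uncertainty would trigger the conservative one.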
Other examples of closed loop effects of automation
on laboratory operations are not entirely related
to the subject of this paper, but in response to queries
from National Environmental Research Center (NERC),
Corvallis, a random access auto sampling wheel is
employed. It makes possible the scheduling of, and
access to, samples to be rerun in the same work session.
The laboratories in question employ custody pro-
cedures. The presence of status conditions and custody
identification in the SFC files can be made effective as a
means of locating the physical sample and possessor.
Similarly, the analyst can be encouraged to conform to a
strict protocol by the SFC.
Management reports were not a high-priority
function of the SFC design, but are a bonus which can
help operational evaluations.
DESIGN VALUES FOR A SAMPLE FILE
CONTROLLER
At the stage of thinking about the pilot project
laboratory as a system (that is, in terms of what is done
and what flexibilities exist that should be preserved
when automated), value judgments must be made as to
what features are critical for the success of the system as
a whole.
Some of these may be evident from the preceding
description. The data flow must correspond to existing
operations. This includes a query and input capability
for the engineers, chemists and managers; therefore, the
accesses to the data base must be interactive and easy to
use.
There are staffing constraints in the laboratories. A
choice is made to allow for the problem-oriented chem-
ist to engage in the solution to laboratory procedure
problems. In the automation of the laboratory, the
choice of an easy-to-use BASIC language will allow the
user to be involved in the growth of the system applica-
tions. Sample file control has a lesser, but still evident,
requirement to be adaptable to evolving needs. Value is
placed on reprogrammability by the user. This is
particularly valid in a pilot project in which options are
to be kept open until tested and evaluated for
usefulness, user acceptance, cost effectiveness, integrity,
and so forth.
The adaptability of the sample file control is an
important value which addresses the objective of
transportability to other EPA laboratories. A modular
design addresses this. The laboratory automation
programs and the sample file control routines are
separate modules in the whole system. The options remain
open for resident versus hierarchical (CPU-CPU)
implementations.
For the present it is believed that the Data General-
supported BASIC language can be used to write the SFC
in a way that will satisfy the values mentioned so far.
The NOVA-840 has the capability in its foreground/
background disc operating system. However, the
response time of such an implementation is still a
question. The job is to find the right trade-offs to
implement the functional values mentioned in this paper.
When this is done, the prime objective will have been
achieved: to have a cost effective laboratory-based
system in which the user is given authority for his
portion of the responsibility, that is, to efficiently pro-
duce quality analytical results.
-------
[Figure 1: block diagram of the sample file controller. Legible labels: STORET converter, national data system, report generator, active status file, archival files, work schedule, station controllers, station files, on-line instruments.]
-------
[Figure 2: tree diagram. Level 1 carries descriptors Desc 1 and Desc 2 with one item each; level 2 carries Desc 3 and Desc 4 with items 1 through 6 each; level 3 carries Desc 5 and Desc 6 with items 1 through 4 each.]
Figure 2
General Data Structure
-------
[Figure 3: primary data file structure; caption not legible. Level 1: structure code, DATEDONE, TIME, COEFF1, COEFF2, STDDEV, METHOD, BLANK. Level 2: PROJECT, SAMPLE, OPR, NUMREP, with rows 100 11 JWB 5; 100 12 CWD 4; 100 13 CWD 3. Level 3: REPLICATE, DILUTION, REASON-NOT-USED, with rows >15 1 2; 10.0 2 0; 9.2 2 0; 11.1 2 0; 5.1 2 3.]
-------
A DATA REDUCTION SYSTEM FOR AN AUTOMATIC COLORIMETER
By K.V. Byram,* F.A. Roberts, and L.A. Wilson
INTRODUCTION
The Consolidated Laboratory Services Program
(CLS) at the Pacific Northwest Environmental Research
Laboratory (PNERL) has analyzed a large variety of
samples supporting the research programs and, until
recently, the EPA Region X programs. It has been found
useful to employ the Technicon Autoanalyzer II for
performing analyses, where the volume of work to be
done justified the automated system. At PNERL, the
Technicon Automated System has been averaging about
3000 determinations per week, running from three to
five channels simultaneously.
At this rate, it seemed imperative to automate the
data capture process for information coming from these
instruments if their full potential to reduce manpower
requirements was to be utilized. Although Technicon,
Inc. produces both a printer (producing output on
adding machine width tape) and a teletypewriter
(producing full width or punched paper tape), these outputs
are of only minor advantage if a full complement of
analytical quality control procedures is employed.
WHAT THE TECHNICON PRODUCES
The Technicon autoanalyzer is a continuous flow
reagent-mixing and colorimeter system which selects the
sample to be analyzed from a series of cups. The liquid
from each cup is followed by a wash to discriminate
samples. The cups are selected at rates of around 50 per
hour. After a uniform delay for reagent mixing and color
development, a voltage is produced from the color-
imeter. The voltage is proportional to the amount of
constituent in the cup, which is a peak above the zero
constituent in the wash (Figure 1).
At PNERL, the minimum run consists of a series of
standards, a series of unknowns (samples whose
concentration is unknown), and a series of standards. If the run
is extended, another unknown/standard series follows
the first. The "standards" consist of initial blanks
(distilled water) to clean out the system, followed by a
series of at least eight standard solutions spread over the
range of interest, followed by trailing blanks. The
"unknowns" include unknowns, replicates of unknowns,
and unknowns to which a known has been added
(spikes). The normal run consists of about 20 standards
in two groups of ten, and 105 samples, which, when
combined and interspersed with blanks, spikes, and so
forth, comes to a total of 156 peaks. It is this series
which is presented to the computer programs for
reduction.
NECESSITY FOR AUTOMATING DATA CAPTURE
The primary reason for the decision to process the
autoanalyzer output by computer was the volume of
data expected. It is a reasonable job to reduce the
output from the instrument run in production for an hour
or two a day, spending an equal time reporting the data.
But keeping up with the instrument running in
production for 15 hours a day was another matter.
In convincing the staff that a more fully automated
system would be useful, considerations such as the avail-
ability of a programming team with experience in this
kind of work and of a predecessor program to perform
parts of the work weighed heavily. Not least important
was that many of the simplifying assumptions on which
Technicon, Inc., markets equipment for water analyses
were not valid for this particular application. Since the
colorimeter output is produced directly in absorbance
units, it is assumed that the concentration can be read
directly from the chart or printout with automatic blank
correction. The assumptions were not reliable enough
for routine use at the levels and precision needed.
PROCESSING THE STANDARDS
When running colorimetric analyses manually, it is
normal procedure to run a set of standard solutions and
plot their concentrations and absorbances on graph
paper. Using the chemist's best judgment, a straight line
is then drawn as closely as possible through the points.
Using the standard curve, the analyst estimates the
concentration of an unknown from its absorbance. The
difficulty in automating this process is in duplicating the
chemist's judgment. If concentration is plotted on the
x-axis and absorbance on the y-axis, it is common, for
example, for the curve to flatten out at the high end due
to deviation from Beer's Law. If most of the unknowns
are in the high concentration range, the system is
reconfigured so that it is linear for the region of interest. If
*Speaker
-------
the values for the unknowns fall in the linear region, the
higher standards are ignored in drawing the line. Another
problem occurs when one of the standard solutions is
inaccurate because of aging, error in preparation, or
contamination in handling. Knowing that this possibility
exists, the analyst simply ignores that point when
drawing the line.
The immediately apparent way to automate this
process is to use a least squares curve fitting routine
(linear regression). This technique, however, tends to
minimize the deviation of the standards from the line
equally at any level. If a typical deviation of a series of
standards such as .01, .02, .05, .1, .2, .5 is .015, the least
squares routine will place the line through the points so
that the .015 deviation is typical at the .01 level, as well
as at the .5 level. This results in a very accurate curve
near the high end and a very imprecise one near the low
end.
The staff has used a modified least squares tech-
nique which seems to produce a "chemist compatible"
curve most of the time. It is based on the chemist's
specification of two quantities: the minimum value
detectable with the system (herein called "precision")
and the percent accuracy possible with near full-scale
values. If the precision is .01 and the accuracy 5 percent,
for example, the chemist reports his result, as do
manufacturers of instruments such as voltmeters, as "within
.01 mg/l or 5 percent, whichever is greater." It turns out
that below a mg/l value of the precision divided by the
accuracy, which is .01/.05 or .2 in this case, the
precision will be greater, while above that, the accuracy will
be greater (because 5 percent of .2 is .01).
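The "whichever is greater" rule and its crossover point can be written out directly. The function names below are hypothetical; the default values are the example figures from the text (precision .01 mg/l, accuracy 5 percent).

```python
def tolerance(concentration, precision=0.01, accuracy=0.05):
    """Allowed error in mg/l: the precision or the percent accuracy, whichever is greater."""
    return max(precision, accuracy * concentration)


def crossover(precision=0.01, accuracy=0.05):
    """Concentration at which the percent accuracy overtakes the fixed precision."""
    return precision / accuracy
```

Below the crossover of .2 mg/l the fixed .01 mg/l governs; above it the 5 percent term governs, so a 1.0 mg/l result is reported to within .05 mg/l.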
The curve-fitting routine is an iterative process
which (I) uses a conventional technique to compute an
equation, c = mx + b, where x is the absorbance and c is
the concentration. Next, (A) the value of m so obtained
is used to compute a new value of b using only the
standards below the precision-accuracy ratio. Then, (B)
that value (essentially, the baseline) is subtracted from
each of the standards above the precision-accuracy ratio
to obtain a new value of m. Steps A and B are then
performed over and over again until the changes in either
b or m are trivial (0.1 percent) from iteration to
iteration.
Once trial values of m and b are obtained with this
method, the predicted concentration of each standard,
using the equation, is compared with its known
concentration, computing a percent and mg/l deviation for each
one. (II) If any of the low standards have a greater mg/l
deviation than the precision, the worst one is deleted
from the set. If any of the high standards have a greater
percentage deviation than the accuracy, the worst one of
these is deleted also. The iterative procedure (IA and IB)
in the previous paragraph is then performed to yield a
new value of b and m. These two steps (I and II) are
repeated until the percentage and mg/l accuracy and
precision are satisfied, or until too many standards are
being removed.
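The procedure in the two preceding paragraphs can be sketched in Python as follows. This is one reading of the description, not the PNERL program. Three points are assumptions: step IB is taken to be a least squares fit through the origin after the baseline is subtracted, the iteration stops when both m and b change by less than 0.1 percent, and "too many standards" is expressed by a hypothetical `max_removed` parameter. Standards are given as (absorbance, concentration) pairs.

```python
def least_squares(points):
    """Step I: ordinary least squares for c = m*x + b over (x, c) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_c = sum(c for _, c in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxc = sum((x - mean_x) * (c - mean_c) for x, c in points)
    m = sxc / sxx
    return m, mean_c - m * mean_x


def refine(points, ratio, m, b, tol=1e-3, max_iter=100):
    """Steps IA and IB, repeated until m and b both change by less than tol."""
    low = [(x, c) for x, c in points if c < ratio]
    high = [(x, c) for x, c in points if c >= ratio]
    for _ in range(max_iter):
        # IA: baseline b from the low standards only, holding m fixed.
        new_b = sum(c - m * x for x, c in low) / len(low) if low else b
        # IB: slope m from the high standards after removing the baseline
        # (assumed here: least squares through the origin).
        new_m = (sum(x * (c - new_b) for x, c in high)
                 / sum(x * x for x, _ in high)) if high else m
        settled = (abs(new_m - m) <= tol * abs(new_m)
                   and abs(new_b - b) <= tol * max(abs(new_b), 1e-12))
        m, b = new_m, new_b
        if settled:
            break
    return m, b


def fit_standards(standards, precision, accuracy, max_removed=3):
    """Step II wrapped around step I; returns (m, b, removed standards)."""
    ratio = precision / accuracy
    kept, removed = list(standards), []
    m, b = least_squares(kept)
    while True:
        m, b = refine(kept, ratio, m, b)
        low_bad = [(abs(m * x + b - c), (x, c)) for x, c in kept
                   if c < ratio and abs(m * x + b - c) > precision]
        high_bad = [(abs(m * x + b - c) / c, (x, c)) for x, c in kept
                    if c >= ratio and abs(m * x + b - c) / c > accuracy]
        if not low_bad and not high_bad:
            return m, b, removed
        # Delete the worst offender in each class, then iterate again.
        for bad in (low_bad, high_bad):
            if bad:
                worst = max(bad)[1]
                kept.remove(worst)
                removed.append(worst)
        if len(removed) > max_removed:
            raise ValueError("too many standards rejected; no satisfactory curve")
```

Given a set of standards that lie on a straight line except for one mis-prepared solution, `fit_standards(standards, 0.01, 0.05)` would be expected to reject that one point and fit the rest, much as the analyst does by eye.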
A set of standards both precedes and follows each
set of unknowns, and the slope and intercept to
compute a concentration for each unknown is linearly
interpolated from both sets of standards. An example
output, showing the final computation for each set of
standards and a portion of the computations for the
unknown is shown in Figure 2.
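The interpolation between the two bracketing sets of standards can be sketched as below. The text does not say which variable the interpolation runs over; this sketch assumes it is the position of the peak in the run. The function names and the tuple layout (position, slope, intercept) are hypothetical.

```python
def interpolate_calibration(position, before, after):
    """Linearly interpolate slope and intercept between two standards sets.

    before and after are (position, slope, intercept) for the standards
    that precede and follow the unknowns.
    """
    p0, m0, b0 = before
    p1, m1, b1 = after
    t = (position - p0) / (p1 - p0)
    return m0 + t * (m1 - m0), b0 + t * (b1 - b0)


def concentration(absorbance, position, before, after):
    """Concentration of an unknown using the interpolated c = m*x + b."""
    m, b = interpolate_calibration(position, before, after)
    return m * absorbance + b
```

An unknown halfway between the two sets is computed with the average of the two slopes and the average of the two intercepts, which compensates for a steady drift over the run.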
REJECTING ERRONEOUS RESULTS
Inaccurate Standards
If a satisfactory curve cannot be fitted to either set
of standards surrounding a sample batch, no computa-
tions for unknowns between those sets are made. They
are resubmitted for analysis at the same dilution factor
unless a particular unknown peak was off scale,
suggesting a greater dilution factor.
Shoulder or Off-Scale Peaks
We must turn to the hydraulics of the system to
understand another reason for rejects. The system
pumps the sample through its inlet tube continuously;
this tube is moved from sample cup to wash to next
sample cup, and so on. The wash is necessary to clean
out the system so that the constituent in one cup does
not carry over to the next and bias its results. In the case
of an unusually strong unknown followed by a weak
one, the wash may not be complete enough. In the
system, a peak which is less than 10 percent of the pre-
vious one is rejected and resubmitted for analysis
because of this problem. When an unknown is strong
enough to yield an off-scale reading, the next two
following peaks are rejected. The off-scale peak is resub-
mitted at a high dilution factor.
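The two rejection rules in this subsection can be sketched as a single pass over the peak heights. The function name, the `full_scale` parameter, and the flag names OFFSCALE and AFTER_OFFSCALE are hypothetical; SHLDR follows the flag that appears in the program output of Figure 2.

```python
def flag_peaks(heights, full_scale, carryover_fraction=0.10):
    """Flag off-scale peaks, the two peaks after them, and shoulder peaks."""
    flags = ["OK"] * len(heights)
    for i, height in enumerate(heights):
        if height >= full_scale:
            # Off scale: resubmit at a higher dilution, reject the next two.
            flags[i] = "OFFSCALE"
            for j in range(i + 1, min(i + 3, len(heights))):
                if flags[j] == "OK":
                    flags[j] = "AFTER_OFFSCALE"
        elif (i > 0 and flags[i] == "OK"
              and height < carryover_fraction * heights[i - 1]):
            # Less than 10 percent of the previous peak: possible carryover.
            flags[i] = "SHLDR"
    return flags
```

Every peak whose flag is not "OK" would be resubmitted for analysis.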
Negative Results
If a peak falls below the lowest b value from the
standards computation, interference is suspected, espe-
cially if it falls far below it. In our case, if the negative
concentration is greater in absolute value than the pre-
cision, the unknown is resubmitted for analysis.
-------
Standards Extrapolation
If the peak of a given unknown is higher than the
highest standard peak, the unknown must be
resubmitted for analysis at a higher dilution factor. This
procedure is required since the apparent concentration is
greater than the highest standard, and extrapolation
from the curve developed for the concentration being
observed may produce invalid results.
Quality Control
At regular intervals throughout a set of unknown
samples, the following set of quality control solutions is
interspersed: a wash, a known solution, a wash, an
unknown, a repeat of the unknown, and the unknown
spiked with the known solution. The spiking is such that
the concentration measured in the spiked unknown
should be equal to the concentration of unknown plus
the concentration of the known which was added to it.
It is required that the measured concentration of the
spiked unknown lie within 10 percent (in most cases) of
the calculated concentration. Since there are two repli-
cate analyses of the unknown alone, the spike must be
within 10 percent of calculation using either replicate. If
it is not within 10 percent, the sample is resubmitted for
analysis.
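The spike recovery test can be sketched as follows. The function name `spike_ok` is hypothetical, and the sketch reads "using either replicate" as requiring agreement with both replicates; replacing `all` with `any` gives the looser reading.

```python
def spike_ok(spiked, replicates, known_added, tolerance=0.10):
    """Check the spiked result against unknown + known for each replicate."""
    return all(
        abs(spiked - (replicate + known_added))
        <= tolerance * (replicate + known_added)
        for replicate in replicates
    )
```

With replicates of 1.00 and 1.04 mg/l and a spike of 0.50 mg/l, a measured 1.52 mg/l passes, while 1.80 mg/l fails and the sample is resubmitted.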
Comments
Certain one-letter comments are enterable by the
analyst when the computations are made. Most such
comments refer to the original disposition of the un-
known solution, such as M for missing, or X for
improper sample preservation, etc. These comments
cause the program not to calculate a value for the sample
because it was unavailable when the analysis was per-
formed. Instead, the sample is reported with no result,
but with its comment. Another class of comment, such
as D for unreliable, causes the program to resubmit the
sample for analysis. This might be used when the
chemist notes something irregular in the peaks which is
not detectable by the program.
NITRATE NITRITE ANALYSIS
The nitrate analysis consists of a reduction of
nitrate to nitrite with measurement as nitrite. Any
nitrite in the sample is also measured, so the final result is
nitrate plus nitrite, expressed as mg/l of nitrogen.
Although many laboratories assume that the nitrate and
nitrite are additive, this is not the case in samples with
high nitrite, such as a nitrifying sewage. The difficulty is
that some of the nitrite ion becomes further reduced to
ammonia and is not measured. The resulting "nitrate
plus nitrite" value is not a simple sum, but is something
like "nitrate plus part of the nitrite." To compute the
nitrate from this value requires subtracting some fraction
of a measured nitrite value, a fraction determined by
running nitrite standard solutions through the reduction
process. The analysis dictates, therefore, that the entire
standards computations be carried out three times: once
for nitrate alone, once for nitrite alone, and once for
nitrite passed through the reduction process to obtain
values for nitrate, nitrite, and of course, their sum.
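The correction described above amounts to subtracting a fraction of the measured nitrite from the reduced-channel result. The sketch below uses hypothetical names; `recovery_fraction` stands for the fraction of nitrite that survives the reduction step, as determined from the nitrite standards passed through it.

```python
def nitrate_from_reduced(reduced_result, nitrite, recovery_fraction):
    """Nitrate (mg/l N) from the 'nitrate plus part of the nitrite' value."""
    return reduced_result - recovery_fraction * nitrite


def nitrate_plus_nitrite(reduced_result, nitrite, recovery_fraction):
    """True sum of nitrate and nitrite after the correction."""
    return nitrate_from_reduced(reduced_result, nitrite, recovery_fraction) + nitrite
```

If 80 percent of the nitrite survives reduction, a reduced-channel result of 2.0 mg/l with 1.0 mg/l of nitrite corresponds to 1.2 mg/l of nitrate and a true sum of 2.2 mg/l.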
EQUIPMENT CONFIGURATION
Technicon autoanalyzers used are Model AA-II,
consisting of two sets of three channels each, with strip
chart, 4-inch wide paper tape printed output, and tele-
type with print and paper tape output. The strip chart is
a continuous record of the series of peaks which come
from each sample cup (Figure 1). The printer and
teletype outputs (Figure 3) are digitized values of the
peaks, which are sampled at the same time that the
sample cups are changed. Phasing (that is, making sure
the peak is sampled at the top rather than the side) is a
function of the flow-through time for the analysis. Since
this is constant for any particular setup, it can be set by
the operator at the beginning of a run. The phasing for
each channel is independent, so that the peaks need not
occur simultaneously in all channels. Each time a peak is
sampled for printing, a mark is made on the continuous
strip chart record so the result can be checked later if
necessary.
In this system, the paper tape produced by each
teletype is hand-carried daily to the Oregon State
University (OSU) Computer Center. This is much less
expensive than feeding the information directly to the
computer by putting the teletype on line as it is printing
from the autoanalyzer. Since the output rate is so low
(about 2000 characters per hour) from the autoanalyzer,
hand-carrying the tape to the high-speed reader at the
computer center is cheaper than reading it in at teletype
speed from the laboratory.
Once in the computer, the tape is processed
interactively with the operator at a terminal having
considerable flexibility in editing the paper tape. This is
routinely useful if the chemists let the teletype run out
of paper tape or if they are unable to find a sample. It is
also useful in cases of power failure, inadvertent
omission of samples, insertion of extra samples, and so
forth.
-------
In addition to the computational difficulties en-
countered in obtaining some sort of computer
compatible output directly from the autoanalyzer, there
is a sample identification problem. It is necessary, of
course, to assign both identification numbers to samples
and concentrations to standards before the computer
compatible absorbance values have meaning. If this is
done manually, at least some of the advantages of doing
the computations automatically are lost. The system ties
in with a laboratory sample handling system (SHAVES)
which existed before autoanalyzer computer programs
were developed. One of the autoanalyzer Technicon
system programs, LISTGEN, accesses the file that holds
the SHAVES analytical requests and produces a list of
samples to be analyzed, complete with replicates, spikes,
and standard solutions (Figure 4). The analyst then uses
the list to fill the sample cups which are input to the
autoanalyzer. When the results are computed, a copy of
the list which has been kept in the computer is used to
identify the results. The regular repetition of replicates,
spikes, blanks, and the descending values of peak heights
for the standards makes it easy to be sure that the right
identification goes with each peak.
-------
[Figure 1: strip chart trace showing a series of peaks, with the annotation "value of peak sampled here".]
Figure 1
Strip Chart Recorder Output
-------
[Figure 2: line-printer listing from processing batch 9120817. Recoverable column headings: CONC, PK HT, BASE, C/A, DEV, %DEV for the two sets of standards (concentrations from 5.0000 down to 0), followed by the computed results for the unknowns; one peak carries the flag SHLDR. The individual values are not reliably legible.]
Figure 2
Computer Output From Processing an Autoanalyzer Run
-------
Figure 3
(Numbered listing of readings; the caption is truncated in the source.)
-------
Figure 4
Computer Prepared, Analyst Annotated,
List of Analyses to Be Run
-------
A CHEMICAL INFORMATION SYSTEM
By Stephen R. Heller
This presentation will discuss the ways in which the
Agency is using the National Institutes of Health (NIH)
Chemical Information System (CIS) and ways in which
the CIS can be modified and/or expanded for further
Agency use (Figure 1).
Figure 2 is a list of the current components of the
CIS. At present, the Agency use of the CIS is limited to
two programs, MSSS and MLAB. Both of these are detailed
by others at this workshop.
One component of the CIS that is not now being
used, but which is of enormous potential use to the
Agency, is the chemical substructure search (SSS).*
Substructure search allows the scientist to query the file
for data of interest to him, including mass spectra, infrared
data, pesticide registration data, toxic substance
data, and so forth. His question is specified by a chemical
structure, the language of chemistry. The answers are
the data whose structures contain the specified
substructure.
At present, the Management Information Data
Systems Division (MIDSD) is making a pilot version of
the SSS option operational for testing and evaluation on
mass spectral and pesticides files. In order to do this,
MIDSD has obtained a contract with the Chemical
Abstracts Service (CAS) in Columbus, Ohio to provide
the CAS registry data for these two files. Included in the
registry data is a computer readable representation of a
chemical structure called a connection table. The SSS
program searches this connection table to find the struc-
tures that answer the user's query.
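A substructure search over connection tables can be sketched as a small backtracking match. This is a hedged illustration only, not the SSS program: the table layout (a dict of atom number to element plus a list of bonded pairs), the function name, and the simplifications (hydrogens omitted, bond orders ignored) are all assumptions made for the example.

```python
def is_substructure(query, target):
    """Backtracking search: can every query atom be mapped onto a distinct
    target atom of the same element so that every query bond is also a
    target bond?  Each table is (atoms, bonds)."""
    q_atoms, q_bonds = query
    t_atoms, t_bonds = target
    t_adj = {frozenset(b) for b in t_bonds}
    order = list(q_atoms)

    def extend(mapping):
        if len(mapping) == len(order):
            return True
        q = order[len(mapping)]
        for t, element in t_atoms.items():
            if element != q_atoms[q] or t in mapping.values():
                continue
            # Every bond from q to an already mapped query atom must exist
            # between the corresponding target atoms.
            ok = all(
                frozenset((t, mapping[other])) in t_adj
                for a, b in q_bonds
                for other in ((b,) if a == q else (a,) if b == q else ())
                if other in mapping
            )
            if ok:
                mapping[q] = t
                if extend(mapping):
                    return True
                del mapping[q]
        return False

    return extend({})


# Hypothetical connection tables (heavy atoms only).
acetone = ({1: "C", 2: "C", 3: "O", 4: "C"}, [(1, 2), (2, 3), (2, 4)])
ethanol = ({1: "C", 2: "C", 3: "O"}, [(1, 2), (2, 3)])
three_carbon_chain = ({1: "C", 2: "C", 3: "C"}, [(1, 2), (2, 3)])
```

Here the three-carbon chain is found in the acetone table but not in the ethanol table. A production search would also need bond orders, ring information, and a screening step to avoid testing every structure in the file.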
In addition to the mass spectral and pesticide files,
MIDSD is also obtaining registry data for the oil and
hazardous materials file. Other organizations outside
EPA are also putting the CAS registry data into their
files. The ninth edition of the Merck Index, the National
Institute of Occupational Safety and Health (NIOSH)
list of 25,000 toxic substances, and the National Library
of Medicine (NLM) Toxline files are examples of data
bases that have or will have registry data as part of their
files. As more groups within and outside the Agency get
registry data, it will be possible to generate and examine
interdisciplinary data bases from a structural point of
view. One could ask: "For what compounds in the
Merck Index are there mass spectral data available?" If
registration data and other related administrative data
are available with registry data, one can begin to see if
certain structures are known pesticides, toxic substances,
hazardous materials, and so forth. Structures appearing
in different files can be readily located since the search is
based on a picture, the chemical structure, not a name.
Names, especially trade names, often tell little about the
structure of the material. Indeed, the same material in
different files can have many different chemical and
trade names. One enormous source of data with CAS
registry data is the CAS literature files. At present CAS
is producing their current five-year subject index with
registry data, thus enabling one to search for structural
information when searching the literature. Anyone with
any experience with chemical nomenclature and the
CAS habit of changing its nomenclature should warmly
greet such a system.
These examples all deal with existing files. A file of
data which is not part of the CIS is infrared (IR) data. A
MIDSD sponsored feasibility study of an IR search
system, to be published next month, will include the
recommendation for the inclusion of CAS registry data
as part of any possible future IR system. An IR search
system will enable the scientist to have a complementary
tool for the identification of organic materials. With the
registry data, one will be able to have a link between the
IR and mass spectra files, rather than being required to
perform separate searches.
The SSS and registry data will allow the Agency to
take specific parts of the CIS relevant to Agency needs,
build up individual data bases, and finally be able to
conveniently and inexpensively link them together.
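Linking files through a shared registry number amounts to a keyed intersection. The sketch below assumes each file is held as a dict from registry number to the name used in that file; the function name and the file contents are made up for illustration and do not reproduce any actual Agency file.

```python
def link_by_registry(*files):
    """Return the registry numbers present in every file, each with the
    name it carries in each file (in the order the files were given)."""
    common = set(files[0])
    for f in files[1:]:
        common &= set(f)
    return {rn: [f[rn] for f in files] for rn in sorted(common)}


# Hypothetical extracts: the same substance appears under different names.
merck_index = {
    "67-64-1": "Acetone",
    "108-95-2": "Phenol",
    "56-38-2": "Parathion",
}
mass_spectra = {
    "67-64-1": "2-Propanone",
    "91-20-3": "Naphthalene",
    "56-38-2": "Phosphorothioic acid, O,O-diethyl O-(4-nitrophenyl) ester",
}
linked = link_by_registry(merck_index, mass_spectra)
```

The result answers the question "which Merck Index compounds have mass spectra on file?" without any comparison of names.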
As the pilot SSS project proceeds, the author
would be happy to meet with any groups interested in
discussing their specific files and needs and in discussing
how MIDSD can help coordinate this effort to link the
Agency's chemical files.
*See R.J. Feldmann and S.R. Heller, J. Chem. Doc., 12 (1972), 48. Also see R.J. Feldmann, Chapter 3, Computer Representation and
Manipulation of Chemical Information, ed. W.T. Wipke, S.R. Heller, R.J. Feldmann and E. Hyde (New York: John Wiley, 1974).
-------
Figure 1
Chemical Information System
(Diagram components: graphics terminal; literature; substructure search; data/property files; graphics terminal/printer.)
DCRT/CIS CHEMICAL INFORMATION SYSTEM
TO RUN A PROGRAM IN THE CIS, TYPE THE NUMBER NEXT TO THE NAME OF THE PROGRAM
LITERATURE:
CBAC - 1
STRUCTURE:
SUBSTRUCTURE SEARCH - 2 WLN GENERATION - 3
WLN TO STRUCTURE - 4 WLN GENERATION USING THE TABLET - 17
DATA:
MASS SPEC - 5
ANALYSIS:
NMR - 6
XYZ COORDINATE GENERATOR - 7 XRAY COORDINATE REFORMAT - 8
XRAY MODELING SYSTEM - 9 XRAY CRYSTAL BIBLIO SYSTEM - 10
CNDO/INDO - 11 MINDO - 12 ORTEP - 13
GINA NMR ANALYSIS - 14 ESR SPECTRUM SIMULATION - 15
MLAB CURVE FITTING/MODELING - 16
USER CHOICE:
Figure 2
CIS Components
-------
MASS SPECTRAL SEARCH SYSTEMS
By John M. McGuire
INTRODUCTION
Setting and enforcing water quality criteria, deter-
mining the fate and effects of water pollutants, and de-
veloping optimum control measures require the
capability for identifying specific organic pollutants.
Table I dramatically illustrates the need to determine the
composition of industrial wastes by chemical analysis.
The compounds in the left column are those suspected
by the discharger, through his knowledge of products,
raw materials, and processes, to be in his effluent. The
right column, based on chemical analysis of the effluent,
contains over twice as many compounds.
Table I
Comparisons of Compounds Reported by Discharger and
Compounds Identified by EPA in an Industrial Discharge
Products or Raw Materials Reported
[illegible]
[illegible]
[illegible]
[illegible]
[illegible]
[illegible]
[illegible]
isopropylbenzene
[illegible]
[illegible]
[illegible]
[illegible]
[illegible]
[illegible]
[illegible]
naphthalene*
benzyl alcohol
[illegible]
2,[illegible]-dimethylnaphthalene*
cresol isomer
acenaphthalene
fluorene
1,3-diphenylpropanol
Identified
[illegible]
styrene*
[illegible]-methylstyrene*
[illegible]
[illegible]
[illegible] isomer
[illegible]-methyl indene*
[illegible]
α-terpineol
both methyl naphthalene isomers*
α-methylbenzyl alcohol
phenol*
methylethylnaphthalene
acenaphthene
methylbiphenyl isomer
two phthalate diesters
*[illegible] with a standard.
The identification technique must be highly spe-
cific since thousands of compounds of varying degrees of
toxicity must be considered. Because some organic com-
pounds are toxic to aquatic organisms at concentrations
below 10 µg/l, the technique must also be sensitive.
Gas-liquid chromatography (GC) has adequate sensitivity
and reproducibility to provide excellent quanti-
fication for volatile organics when the identity of the
chemical is known. However, pollutant identifications
obtained by comparison of relative retention times are
subject to interferences and are questionable for the un-
known mixtures found in natural waters. Alternatively,
GC may be used as a preliminary separation technique
and the effluent may be introduced into a different type
of instrument for qualitative identification.
During the 1960's, workers at the Southeast
Environmental Research Laboratory (SERL) showed the
feasibility of using gas chromatography interfaced with
low resolution mass spectrometry (MS) for unknown
identification. They used an Hitachi RMU-7 mass spec-
trometer adjusted for maximum sensitivity, together
with manual chart reading and data reduction. Many
hours of applied effort are required to gather data, read
charts, correct background, construct a data presenta-
tion for interpretation (Figure 1), and interpret the data.
Because of the large amount of time involved between
data collection and interpretation, manual GC/MS is too
slow for effective identification of water pollutants.
Most time-limiting factors in manual GC/MS can be
accomplished by a computer. To evaluate the feasibility
of this approach for EPA laboratories, SERL and the
Methods Development and Quality Assurance Research
Laboratory in Cincinnati (MDQARL) purchased
identical computerized GC/MS systems in 1971. Since
that time, more than 20 laboratories have chosen essentially
the same system. A PDP-8 minicomputer in this
system controls the operation of a quadrupole mass spectrometer
and associated output devices.
The applied time required to gather data and pre-
pare it for interpretation was decreased by more than an
order of magnitude. The time required for inter-
pretation, however, also had to be shortened.
COMPUTERIZED INTERPRETATION SYSTEMS
Computer-controlled GC/MS produces many
spectra from a single environmental sample. Inter-
pretation of these spectra might be done in a variety of
ways.
Matrix Inversion
Matrix inversion can be used to give both qualitative
and quantitative information if the sample and all
-------
impurities are known to be from a small and well-
defined universe. This condition exists in many
industrial plants. It does not exist for environmental
samples.
Manual Interpretation
Manual interpretation involves assignment of prob-
able compositions to fragments. The assignments are
then used to predict the structure of the unfragmented
molecule. As a very simple example, Figure 2 shows the
assignments in an acetone spectrum. The predominant
features in this spectrum are due to mass unit losses of
1 (H) and 15 (CH3). Based on the assignments, the molecular
structure is strongly suggested. This method is reliable,
but it is slow.
Manual Peak Matching
Manual peak matching involves comparing the in-
tensity for each spectral peak with those of literature
spectra. Lists such as those of the Mass Spectral Data
Centre and of Cornu and Massot give the fastest manual
search results; however, use of these lists is nonetheless a
slow process. These lists are abbreviated to include only
the most intense peaks.
Computerized Spectra Matching
Computerized spectra matching is a practical and
rapid way to obtain a probable interpretation. In order
to decrease matching times, the spectrum is usually
abbreviated by one of several different methods.
Biemann and his associates pioneered the 2-most-intense-in-every-14
method illustrated in Figure 3, which
shows the Biemann abbreviation for acetone. A compari-
son of Figures 2 and 3 shows that no significant informa-
tion has been lost because of the abbreviation.
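The abbreviation step can be sketched in a few lines. This is an illustrative Python version, not the original program; where the first 14-mass interval begins (the `origin` parameter) and the intensities in the example spectrum are assumptions.

```python
from collections import defaultdict


def abbreviate(spectrum, window=14, keep=2, origin=0):
    """Keep the `keep` most intense peaks in each `window`-mass interval.

    spectrum: dict mapping integer mass to intensity.
    origin:   mass at which the first interval starts (assumed here).
    """
    buckets = defaultdict(list)
    for mass, intensity in spectrum.items():
        buckets[(mass - origin) // window].append((intensity, mass))
    kept = {}
    for peaks in buckets.values():
        # Sort by intensity, highest first; ties go to the higher mass.
        for intensity, mass in sorted(peaks, reverse=True)[:keep]:
            kept[mass] = intensity
    return dict(sorted(kept.items()))


# Made-up intensities for an acetone-like spectrum.
full = {15: 20, 26: 5, 27: 8, 29: 4, 39: 4, 41: 2, 42: 7, 43: 100, 58: 25}
short = abbreviate(full)
```

With these numbers the peaks at masses 26 and 41 are dropped and the other seven survive, so the intense peaks that carry most of the identifying information are retained.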
Computer matching includes many approaches. All
require a large library of reference spectra for best per-
formance. The following are some of the most useful:
The five most intense peaks in the spectrum
may be compared with the five most intense
peaks in each spectrum of a reference file. The
method is very fast, but is not accurate.
The Biemann abbreviation of the unknown
and reference spectra may be compared with
intensity information coded to one bit/peak.
This program is very rapid on a minicomputer
with a disk but can be slow if reference
spectra are stored on tape. It is satisfactory for
files of less than 600 spectra, but it is not suf-
ficiently selective for larger files.
The National Institutes of Health (NIH)
system developed by Heller et al. is an interactive
FORTRAN program to perform successive
intensity screening based on
user-selected peaks. The number of spectra
that pass the screen is printed after each
screening. Each screening pass is very rapid on
the PDP-10. The program includes many user
options (Table II). At present, development is
being continued jointly by EPA, NIH, and
Food and Drug Administration (FDA).
Table II
NIH System Options
TO SEARCH FOR PEAKS, TYPE PEAK
TO SEARCH FOR LOSSES, TYPE LOSS
TO SEARCH FOR MSDC CODES, TYPE CODE
TO SEARCH FOR MOLECULAR WEIGHT, TYPE MW
TO SEARCH FOR MOLECULAR FORMULA, TYPE MF
TO SEARCH FOR PEAKS AND LOSSES, TYPE PL
TO SEARCH FOR PEAKS WITH MSDC CODES, TYPE PC
TO SEARCH FOR PEAKS WITH MW, TYPE PMW
TO SEARCH FOR PEAKS WITH MF, TYPE PMF
TO SEARCH FOR LOSSES AND MSDC CODES, TYPE LC
TO SEARCH FOR LOSSES WITH MW, TYPE LMW
TO SEARCH FOR LOSSES WITH MF, TYPE LMF
TO SEARCH FOR MSDC CODES WITH MW, TYPE CMW
TO SEARCH FOR MSDC CODES WITH MF, TYPE CMF
TO SEARCH FOR MW WITH MF, TYPE MWMF
TO PERFORM A SIMILARITY COMPARISON, TYPE SIM
TO PRINT OUT PEAKS/INTENSITIES AND SOURCE, TYPE SPEC
TO VIEW MICROFICHE, TYPE FICHE
TO PLOT SPECTRA ON DISPLAY TERMINAL, TYPE PLOT
TO COMMENT/COMPLAIN, TYPE CRAB
TO ENTER NEW SPECTRA, TYPE DATA
TO READ THE NEWS OF THE SYSTEM, TYPE NEWS
TO LIST THE MSDC CODES, TYPE LIST
TO EXIT FROM THE PROGRAM, TYPE OUT
USER RESPONSE
The EPA/Battelle system, which is essentially
a faster version of a Massachusetts Institute of
Technology (MIT) program, consists of both
PDP-8 programs that automatically abbreviate
and transmit spectra and also a CDC 6400
spectrum matching program. This assembly
language program is very rapid and provides a
list of the most probable identifications
-------
ranked by how closely the unknown and refer-
ence spectra match. It is essentially an automatic
system and provides only a few options.
A very rapid FORTRAN program for the
CDC 6400 has been developed by Clerc and
his coworkers in Zurich. It involves automatic
screening and ranking of the matches based on
a weighted binary-coded signature. Methods
for assigning the optimum weighting factors
are still being improved.
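The first approach in the list above, comparison of the five most intense peaks, is simple enough to sketch. The version below is only an illustration of that idea in Python; the function names, the tie-breaking rule, and the two reference spectra are hypothetical.

```python
def top_peaks(spectrum, n=5):
    """Masses of the n most intense peaks (ties broken by higher mass)."""
    ranked = sorted(spectrum.items(), key=lambda p: (p[1], p[0]), reverse=True)
    return {mass for mass, _ in ranked[:n]}


def rank_by_shared_peaks(unknown, library, n=5):
    """Order reference names by how many top-n masses they share with the unknown."""
    target = top_peaks(unknown, n)
    scores = {name: len(target & top_peaks(ref, n)) for name, ref in library.items()}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical spectra (mass -> relative intensity).
unknown = {105: 100, 120: 38, 91: 11, 79: 16, 77: 14, 51: 6}
library = {
    "ref A": {105: 100, 120: 30, 91: 12, 79: 10, 77: 15},
    "ref B": {91: 100, 92: 60, 120: 25, 65: 10, 39: 8},
}
ranking = rank_by_shared_peaks(unknown, library)
```

Because only peak positions are compared, two quite different compounds can share most of their intense masses, which is consistent with the remark that the method is fast but not accurate.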
Learning and Self-Training
Learning and self-training computer approaches
offer the best long-range hope of developing a com-
puterized identification program that will not require an
extremely large reference library.
The DENDRAL program of the Stanford group is a
360/67 program that does not require a large library. It
deduces all possible structures for a given elemental com-
position and exhaustively tests each structure by pre-
dicting a spectrum for it and comparing the predicted
spectrum to the unknown one. It is slow.
The Self-Training Interpretive and Retrieval System
(STIRS) program is a PDP-11/45 program, developed
under NIH and EPA sponsorship, that compares library
spectra to the unknown spectrum for various classes of
spectra-derived data. The best matches for each class are
ranked. This system can lead to identifications even if
the reference library does not include a closely related
spectrum.
EPA SPONSORED IDENTIFICATION PROGRAMS
Three of the foregoing are of particular interest be-
cause of EPA's involvement in their development. These
are the NIH system,* the EPA/Battelle system,† and
STIRS.‡
NIH System
The basic NIH search program, PEAK, permits the
user to select an intensity ranging factor and from 0 to
.1 mass-intensity pairs (peaks) in each 14-mass interval.
After each peak input, the program searches the library
mass file for all intensities lying within a window deter-
mined by the ranging factor. It then tells the user the
number of compounds from the reference library that
have passed the screening and gives him the option of
obtaining a listing of these compounds. If the number is
high, the user can input another peak, which may be
from the same 14-mass interval or from a different one.
This process is repeated until the user requests a listing.
An example of this sequential screening (Figure 4) shows
a typical search, which was terminated after the number
of references was reduced to less than 20. Five of the six
references that passed the screening are isomers of the
correct compound.
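The sequential screening can be sketched as repeated filtering of a candidate list. This is a hedged Python illustration, not the NIH FORTRAN code: in particular, treating the ranging factor as a multiplicative window around the entered intensity is an assumption, as are the function names and the small library.

```python
def screen(candidates, library, mass, intensity, ranging=3.0):
    """Keep candidates whose intensity at `mass` lies inside the assumed
    window [intensity / ranging, intensity * ranging]."""
    low, high = intensity / ranging, intensity * ranging
    return [name for name in candidates
            if low <= library[name].get(mass, 0) <= high]


def peak_search(library, peaks, ranging=3.0, stop_at=20):
    """Apply one (mass, intensity) pair at a time, recording how many
    references survive, and stop once fewer than `stop_at` remain."""
    candidates = list(library)
    history = []
    for mass, intensity in peaks:
        candidates = screen(candidates, library, mass, intensity, ranging)
        history.append((mass, len(candidates)))
        if len(candidates) < stop_at:
            break
    return candidates, history


# Hypothetical four-spectrum library (mass -> relative intensity).
library = {
    "A": {105: 100, 120: 30, 91: 12},
    "B": {105: 90, 120: 5},
    "C": {105: 100, 120: 35, 91: 80},
    "D": {91: 100},
}
survivors, history = peak_search(
    library, [(105, 100), (120, 38), (91, 11)], ranging=2.0, stop_at=2)
```

The `history` list plays the role of the "# REFS" count reported to the user after each request.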
An important variation of the PEAK program is the
LOSS program. In this program, each input is a mass loss
and intensity pair. Input losses are calculated by sub-
tracting the mass value from the molecular weight of the
compound. Figure 5 illustrates the LOSS search for the
same spectrum that was used to illustrate the PEAK
search. Two of the ten references are isomers of the
correct compound.
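The conversion from peaks to losses is a single subtraction per peak. A minimal sketch follows; the function name is hypothetical, and the example uses a molecular weight of 120 with two peaks of the kind entered in the PEAK search.

```python
def to_losses(molecular_weight, peaks):
    """Convert (mass, intensity) pairs into (mass loss, intensity) pairs."""
    return [(molecular_weight - mass, intensity) for mass, intensity in peaks]


losses = to_losses(120, [(105, 100), (91, 11)])
```

Here the peak at mass 105 becomes a loss of 15 and the peak at mass 91 a loss of 29; the loss pairs can then be screened in the same sequential way as peaks.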
The NIH system presently contains three other primary
spectra retrieval modes; these are searches for
molecular weight, molecular formula, and compound
type. The molecular weight search is a screening for all
compounds in the file that have a given molecular
weight; the molecular formula search can be used to give
all compounds having the stated formula, all compounds
having stated elements, or all compounds having a stated
number of atoms of selected elements. The compound
type search involves screening for 86 compound classification
codes. All of the screening and retrieval modes
can be paired; therefore, it is possible, for example, to
perform a LOSS search on a file that consists only of
one compound type or a PEAK search on a file of only
one molecular weight.
When a search has been narrowed to a reasonably
small number of compounds, the NIH program permits
ranking the fit of each of these to the input spectrum by
means of a similarity comparison. Figure 6 shows dialog
and output of this ranking routine for the references
found in the PEAK search of Figure 4. The five isomers
all have low dissimilarities.
*User's Manual, Mass Spectral Search System, EPA, Washington, D.C. (May 1974).
†User's Manual, Battelle Mass Spectral Matching System, Battelle Columbus Laboratories, Columbus, Ohio (1974).
‡K.S. Kwok, R. Venkataraghavan, and F.W. McLafferty, J. Amer. Chem. Soc., 95, 4185 (1973).
-------
When the number of the desired spectrum has been
located in the reference file, the program SPEC can be
called to output a listing of the entire spectrum in a
typed format.
Two options require certain hardware at the remote
terminal. These are the PLOT routine, which permits
retrieval and display of any spectrum in the reference
library, and the FICHE routine, which locates a desired
spectrum on a microfiche copy of the reference library.
The PLOT routine for PDP-10 users will permit
CALCOMP or ZETA plotted copy, or will display on
any of six graphics terminals. The FICHE routine either
provides information for manual location or drives a
COMCARD viewer to display the desired spectrum.
Other present NIH options permit the user to enter
data, obtain system news, and comment to the system
manager.
EPA/Battelle Matching Program
The EPA/Battelle matching program, taking advan-
tage of the high information redundancy of mass
spectra, is based on the two most intense peaks in every
14 mass units. There are four main steps in the matching
process:
Screening based on molecular weight range
Screening based on the most intense peak of
the unknown spectrum
Presearching based on the spectrum family
Ordering the best matches based on peak-by-
peak comparison of the unknown spectrum
with those reference spectra passing the
presearch.
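The four steps above form a filter-then-rank pipeline, which can be sketched as below. This is not the EPA/Battelle program: the source does not define the "spectrum family" presearch or the peak-by-peak score, so step 3 (a minimum count of shared masses) and the score in step 4 are stand-ins chosen only to make the sketch runnable. All names and data are hypothetical.

```python
def staged_match(unknown, library, mw_range, min_shared=2):
    """unknown: dict mass -> intensity (already abbreviated).
    library: dict name -> (molecular_weight, spectrum)."""
    base_peak = max(unknown, key=unknown.get)
    lo, hi = mw_range
    # Step 1: screen on molecular weight range.
    stage1 = {n: s for n, (mw, s) in library.items() if lo <= mw <= hi}
    # Step 2: the unknown's most intense peak must appear in the reference.
    stage2 = {n: s for n, s in stage1.items() if base_peak in s}
    # Step 3 (stand-in presearch): require a minimum number of shared masses.
    stage3 = {n: s for n, s in stage2.items()
              if len(set(s) & set(unknown)) >= min_shared}

    # Step 4 (stand-in score): 1 minus the normalized intensity difference.
    def score(ref):
        masses = set(ref) | set(unknown)
        diff = sum(abs(ref.get(m, 0) - unknown.get(m, 0)) for m in masses)
        total = sum(ref.get(m, 0) + unknown.get(m, 0) for m in masses)
        return 1 - diff / total

    ranked = [(n, round(score(s), 3)) for n, s in stage3.items()]
    return sorted(ranked, key=lambda item: -item[1])


unknown = {105: 100, 120: 38, 91: 11}
library = {
    "A": (120, {105: 100, 120: 30, 91: 12}),
    "B": (134, {105: 100, 119: 40}),
    "C": (120, {91: 100, 120: 40, 65: 10}),
    "D": (300, {105: 100}),
}
hits = staged_match(unknown, library, mw_range=(100, 150))
```

The cheap screens run first so that the more expensive peak-by-peak comparison is applied only to the few references that survive.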
To reduce operation time and eliminate human
errors and prejudices in selecting, formatting, and trans-
mitting data, PDP-8 utility routines transfer input
spectra data directly from the user's remote PDP-8 to
the central CDC 6400. These programs have been eval-
uated and improved during the past year. A match
against the present data base of 9000 spectra
(8400 general organic spectra culled from the Aldermaston
collection and 600 pollutant spectra from SERL
and Battelle) requires approximately 40 seconds of
elapsed time.
The similarity index (SI) gives the user an imme-
diate indication of the quality of the matches. The "best
hit" will be the first identification; the SI will show
whether it is a poor match (<0.2 if the data base does
not contain any closely related compounds), one of
several fair matches (0.2-0.35 if the correct compound is
not in the data base but related ones are), or a good
match (>0.35 if the SI of the second best hit is signifi-
cantly lower).
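The interpretation of the SI can be written as a small decision rule. The thresholds of 0.2 and 0.35 come from the text; the `margin` used to decide whether the second-best hit is "significantly lower" is a hypothetical value chosen for illustration, as is the function name.

```python
def judge_best_hit(best_si, second_si=None, margin=0.1):
    """Classify the best hit from its similarity index (SI)."""
    if best_si < 0.2:
        return "poor"
    if best_si <= 0.35:
        return "fair"
    # Above 0.35 the match is good if it stands clear of the runner-up.
    if second_si is None or best_si - second_si >= margin:
        return "good"
    return "good, but not clearly separated from the second hit"
```

For example, `judge_best_hit(0.8, 0.5)` returns "good", while `judge_best_hit(0.3)` returns "fair".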
In one study made at SERL, 50 percent of the un-
knowns present in the effluent of a kraft paper mill were
found correctly as the best hit, 8 percent as the second
best hit, and 2 percent as the third best hit (13). The
success of the system should improve since reference
spectra are added continually.
The search may miss the match because of poor
quality data in the input spectrum. If the spectrum looks
poor, or a match is not found using the standard search
parameters, the user has the option of decreasing dif-
ferent portions of the presearch selectivity. This is done
by redefining parameters for ratio and rectangular array
in the search dialog.
Compared with magnetic deflection spectrometers,
quadrupole instruments exhibit a bias toward low mass.
This is demonstrated in Figure 7, which compares both
types of spectra for the pesticide, parathion. Since the
Aldermaston data base is comprised primarily of spectra
obtained on magnetic deflection mass spectrometers and
most EPA laboratories have quadrupole instruments, a
major concern in the development of the matching
system was whether suitable matches could be obtained
between quadrupole and magnetic deflection spectra.
Experience with the system has shown that the program
provides excellent matches for both quadrupole and
magnetic deflection spectra.
The same magnetic deflection spectrum of
p-ethyltoluene that was used to illustrate the NIH pro-
gram (Figure 4 through Figure 6) can be used as well as
an example of the EPA/Battelle program. Figure 8 shows
the results for this search. All of the best hits are
isomers; however, the first and third matches are most
closely related to the correct structure.
Another example of the use of the EPA/Battelle
matching program is provided by a study of the effluent
of an experimental coal gasification plant. Organic com-
ponents were extracted with methylene chloride. The
-------
chromatogram of the extract (Figure 9) contained seven
distinct peaks.
In a computerized matching of the quadrupole
spectra for those compounds, the best matches were
with C6, C7, and C8 hydroxyl-containing materials. The
output indicated high SI's for the first six peaks, but a
low one for the last GC peak. Subsequent visual inspection
of the mass spectrum for this GC peak indicated
that the last peak arose from two compounds with the
same retention time.
The identifications are given in Table III. When
different materials were selected by the matching pro-
gram as the best and second best matches, relative GC
retention times favored the best match over the second
best. In a continuation of the computer dialogue, given
in Figure 10 for GC peak 3, thirteen cresol spectra were
matched with SI's greater than 0.645. The first non-
cresol match was 3-tolyl-N-methyl carbamate with an SI
of 0.574.
Table III
Compounds Tentatively Identified in Waste
Effluent of Coal Gasification Pilot Plant
(Columns: GC Peak; Best Match; SI. The table entries are illegible in the source.)
Table IV summarizes results obtained by Budde
and Eichelberger* of the MDQARL in an objective
evaluation of the NIH and EPA/Battelle systems, using a
mixture of typical environmental pollutants.
STIRS
Because of the relatively small size of both the NIH
and EPA reference libraries, a substantial number of
environmental samples cannot be matched by either of
these systems. STIRS deduces substructures based
on reference spectra and permits the analyst to recon-
struct the structure of the unknown even if the library
does not contain a matching spectrum. The program
permits the user to transmit a complete mass spectrum
from a remote PDP-8 to the Cornell PDP-11/45 via the
same serial data interface and acoustic coupler used by the
EPA/Battelle system. The input peaks are compared to
spectra in the 25,000 spectra reference library and used
to calculate the probable molecular weight and value
(from 0 to 1000) for 14 match factors. These factors are
based partly on known mass fragmentation behavior and
partly on experience gained through the use of this
program. For each of these match factors (Table V), the
10 best matches in the data base are ordered by spectral
similarity and listed for the user. A 15th match factor is
also generated as the overall match factor.
A search using STIRS gives not only the chemical
name, but also Wiswesser line notations (WLN's) for
each of the compounds selected for the particular match
factor. Figure 11 is the STIRS output obtained by trans-
mitting a spectrum of ethyltoluene. The confidence
table indicates a high confidence that a phenyl ring is
present, a lesser confidence that an ethyl group is pre-
sent, and a low confidence that a methyl group is
present. All of these conclusions are correct. Low
confidence suggestions of unsaturation and a branched
chain are incorrect. A set of criteria is being developed
that will permit predicting the probabilities that given
groupings are present in the unknown; however, scan-
ning the WLN's found for each factor shows that certain
WLN substructures are repeated far more frequently
than others. If the same substructure is indicated by
several match factors and another substructure by other
match factors, the correct structure probably contains
both substructures. This intuitive conclusion has been
demonstrated to be correct in many cases.
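The intuitive rule just described, trusting substructures that recur across match factors, can be sketched as a tally. The sketch assumes the substructure labels have already been extracted from the WLN's of the listed compounds, which is the hard part and is not shown; the function name and the example labels are hypothetical.

```python
from collections import Counter


def tally_substructures(matches_by_factor):
    """Count, for each substructure label, how many match factors suggest it."""
    counts = Counter()
    for labels in matches_by_factor.values():
        counts.update(set(labels))  # each factor counts once per substructure
    return counts.most_common()


# Hypothetical labels drawn from the best matches of three match factors.
suggested = tally_substructures({
    "MF1": ["phenyl", "phenyl", "ethyl"],
    "MF2": ["phenyl", "methyl"],
    "MF5": ["ethyl", "phenyl"],
})
```

In this made-up case phenyl is indicated by all three factors and ethyl by two, so a structure containing both would be the first candidate to examine.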
STIRS suffers to a lesser degree from the same li-
brary limitation as do the EPA/Battelle and NIH sys-
tems. A spectrum of the exact material or even a closely
related one does not have to be present in the reference
library, but some compounds having the appropriate
substructures must be present. A spectrum for a phosphorothioate
was found to give no useful information
when it was first submitted to STIRS. This led to the
discovery that none of the spectra in the library at that
time included the phosphorothioate substructure.
Recently, an American Society for Mass Spec-
trometry panel submitted identical test spectra to the
Mass Spectral Data Centre, NIH, EPA/Battelle, and
STIRS searches. The results showed STIRS to be
markedly superior to all of the others in its ability to
suggest correct identifications.
*W.L. Budde and J.W. Eichelberger, personal communication, July 1974.
-------
Table IV
Summary of Results Obtained with Some Typical Organic Water Pollutants
Using the NIH and EPA Mass Spectrum Matching Programs

Columns: (1) Typical Water Pollutant. NIH (Operator Selected Ions):
(2) No. of Mass/Abundance Pairs Entered(a); (3) No. of File Spectra
Matched (Hits); (4) No. of Correct Hits(b). EPA (Automatic Abbreviated
Spectrum): (5) No. of File Spectra Matched (Hits); (6) Rank of the
Correct Hits by Similarity Index(b); (7) SI.
Entries marked [?] are illegible.

(1)                            (2)   (3)   (4)      (5)   (6)       (7)
1,1,2,2-tetrachloroethane      [?]   7     [?]      [?]   1         .567
phenol                         5     [?]   4        24    1, 2      .878, .691
bis(2-chloroisopropyl) ether   [?]   1     1        65    1, 2, 3   .7[?]8, .7[?]8, .602
p-cresol                       [?]   10    1(c)     [?]   1         .808
α-terpineol                    [?]   4     [?]      [?]   (h)
naphthalene                    [?]   7     3        [?]   2, 3      .771, .7[?]8
o-nitrotoluene                 1     [?]   [?](d)   15    1, 2      .892, .[?]86
benzothiazole                  4     2     1        25    1, 2      .732, .[?]28
2-methylnaphthalene            [?]   [?]   5(e)     [?]   2(g)      .668
1-methylnaphthalene            1     3     2(f)     9     1, 2      .609, .598
hexadecane                     5     8     3        442   3         .766
acenaphthene                   3     8     5        17    1, 2      .786, .7[?]3
dibenzofuran                   4     8     3        [?]   1, 2      .682, .560

(a) No more than two mass/abundance pairs were entered for each standard NIH 14 amu interval. The data from the highest mass interval observed was
entered first, followed by the next highest, etc. Within each interval the higher mass number was always entered first. Input was stopped when 10 hits
or less were reported or the data was exhausted. With this system a different sequence of data entry can give significantly different results.
(b) Duplicate correct answers indicate there are duplicate spectra in the data base.
(c) Seven other hits were either m-cresol or o-cresol, which have very similar mass spectra.
(d) The spectrum of o-nitrotoluene was in the data base but was in error.
(e) The other six hits were 1-methylnaphthalene, which has a virtually identical mass spectrum.
(f) The other hit was 2-methylnaphthalene, which has a virtually identical mass spectrum.
(g) The hit with the highest SI (.74[?]) was 1-methylnaphthalene, which has a virtually identical mass spectrum.
(h) Remote computer aborted three times with three different abbreviated spectra from two different GC/MS runs. Manual input of one of the same
data sets several days later yielded four hits including two correct with an SI of 1.00.
Table V
Spectral Data Classes and Match
Factors Utilized by STIRS*
Class of Spectral Data (listed in source order; the class numbers, the mass ranges marked [?], and the Match Factor column are illegible)
Ion series
Low mass characteristic ions (m/e [?])
Overlapping characteristic ions (m/e [?])
Medium mass characteristic ions (m/e [?])
Overlapping characteristic ions (m/e [?])
High mass characteristic ions (m/e [?])
Neutral loss series
Small primary neutral losses
Overlapping primary neutral losses
Large primary neutral losses
Secondary neutral losses from the most abundant odd mass loss
Secondary neutral losses from the most abundant even mass loss
Class 9 data of the unknown spectrum matched against Class [?] data of the reference spectrum
Fingerprint ions
Overall match factor
[?] Match factor currently under experimental evaluation.
*F.W. McLafferty, "Computer-Aided Interpretation of Mass Spectra III," in [illegible]
Spectrometry in the Investigation of Human Disease, ed.
[illegible] (Montreal, Canada: McGill University-Montreal
Children's Hospital, 1974).
-------
FUTURE PLANS
Work underway and plans for the future include
joint efforts by the Office of Research and Development
(OR&D), Management Information and Data Systems
Division, NIH, and FDA to:
Expand and centralize a large, refined data
base
Translate the EPA/Battelle program to
FORTRAN for inclusion with the NIH pro-
grams
Develop software and hardware to permit
automatic data transmission between all EPA
GC/MS/computers and the NIH PDP-10
Make all matching programs developed avail-
able to the public through cooperation of the
Mass Spectral Data Centre
Expand the capabilities of the EPA/Battelle
system to maintain geographic frequency dis-
tribution of all identified organic pollutants
Develop FORTRAN procedures for de-
convolution of overlapping GC/MS peaks
Expand STIRS to allow the program to
predict the most probable structure.
All of these search programs are useful to EPA in
determining the identity of pollutants. This utility will
be greatly increased as the above plans are completed.
STIRS has the greatest potential of any of the EPA pro-
grams to provide rapid identification of any GC/MS un-
known with a library of less than 50,000 spectra. As is
true, however, for all computer identifications, such
identifications are suggestions of possible structures and
do not remove analytical responsibility from the
chemist. Final identification requires confirmation by
the analyst.
-------
[Figure: acetone spectrum plot, vertical axis 0 to 100 and horizontal axis 20 to 70; original label: ACETONE SPECTRUM NUMBER]
-------
[Figure: LS ACETONE spectrum plot, vertical axis 0 to 100 and horizontal axis 20 to 70, annotated with fragment ion structures including CH3+]
-------
[Figure: LS ACETONE spectrum plot, vertical axis 0 to 100 and horizontal axis 20 to 70, annotated with fragment ion structures]
-------
TYPE PEAK, INT
CR TO EXIT, 1 FOR ID,MW,MF, AND NAME
USER 105,100
# REFS   M/E PEAKS
893      105
NEXT REQUEST 120,38
# REFS   M/E PEAKS
96       105 120
NEXT REQUEST 91,11
# REFS   M/E PEAKS
25       105 120 91
NEXT REQUEST 79,16
# REFS   M/E PEAKS
7        105 120 91 79
NEXT REQUEST 1
ID#      MW    MF               NAME
2135     120   C9 H12           ISOPROPYLBENZENE
4867     152   C9 H12 O2        CUMENE HYDROPEROXIDE
16799    388   C12 H12 O3 W     PI-1,3,5-TRIMETHYLBENZENE-TRICARBONYL TUNGSTEN
20472    120   C9 H12           ISOPROPYLBENZENE
20473    120   C9 H12           ISOPROPYLBENZENE
20474    120   C9 H12           1-METHYL-2-ETHYLBENZENE
20475    120   C9 H12           O-ETHYLTOLUENE
NEXT REQUEST
Figure 4
NIH PEAK Search for p-Ethyltoluene
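The narrowing shown in the Figure 4 dialogue, where each added (m/e, intensity) pair shrinks the number of matching references, can be sketched as follows. This is only an illustration: the miniature library, the function name peak_search, and the intensity tolerance are assumptions, since the text does not state how the NIH program compares intensities.

```python
from typing import Dict, List, Tuple

# Hypothetical mini-library: reference ID -> {m/e: relative intensity}.
# IDs and spectra are invented for illustration only.
LIBRARY: Dict[int, Dict[int, int]] = {
    1: {105: 100, 120: 38, 91: 11, 79: 16},
    2: {105: 100, 120: 30, 91: 60},
    3: {105: 95, 77: 40},
    4: {43: 100, 58: 30},
}


def peak_search(
    library: Dict[int, Dict[int, int]],
    requests: List[Tuple[int, int]],
    tolerance: int = 15,  # assumed matching window, not from the source
) -> Tuple[List[int], List[int]]:
    """Keep only references that contain every requested peak.

    Returns the surviving IDs and the candidate count after each
    request, which plays the role of the '# REFS' column.
    """
    candidates = set(library)
    counts = []
    for mz, intensity in requests:
        candidates = {
            ref for ref in candidates
            if mz in library[ref]
            and abs(library[ref][mz] - intensity) <= tolerance
        }
        counts.append(len(candidates))
    return sorted(candidates), counts


hits, counts = peak_search(
    LIBRARY, [(105, 100), (120, 38), (91, 11), (79, 16)]
)
print(counts, hits)
```

Each request can only keep or reduce the candidate set, which is why the reference count in the dialogue falls monotonically from 893 to 7.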
-------
TYPE LOSS, INT
1 TO EXIT, 2 FOR ID,MW,MF AND NAME
USER -15,100
# REFS   LOSSES
2169     -15
NEXT REQUEST: -29,12
# REFS   LOSSES
504      -15 -29
NEXT REQUEST: -41,16
# REFS   LOSSES
84       -15 -29 -41
NEXT REQUEST: -43,18
# REFS   LOSSES
39       -15 -29 -41 -43
NEXT REQUEST: -55,7
# REFS   LOSSES
10       -15 -29 -41 -43 -55
NEXT REQUEST: 2
ID#      MW    MF          NAME
241      70    C5 H10      3-METHYL-1-BUTENE
260      70    C4 H6 O     3-BUTYN-2-OL
446      82    C6 H10      1-METHYLCYCLOPENTENE
5657     160   C12 H16     1-(2,3-DIMETHYLPHENYL)-BUT-2-ENE
19076    70    C5 H10      2-METHYL-2-BUTENE
19271    82    C6 H10      METHYLENE CYCLOPENTANE
19272    82    C6 H10      METHYLENE CYCLOPENTANE
19274    82    C6 H10      CYCLOHEXENE
20476    120   C9 H12      M-ETHYLTOLUENE
TYPE C TO CONTINUE LIST,
CR TO EXIT: C
20477    120   C9 H12      1-METHYL-3-ETHYLBENZENE
Figure 5
NIH Loss Search for p-Ethyltoluene
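The loss values entered in the Figure 5 dialogue are consistent with subtracting the molecular weight (120 for C9 H12) from the fragment peaks of the ethyltoluene spectrum listed in Figure 8. The short sketch below reproduces that arithmetic. Measuring losses from the molecular ion is an assumption inferred from the numbers; the function name is hypothetical.

```python
from typing import List, Tuple

MOLECULAR_WEIGHT = 120  # C9 H12, as listed in the figures

# (m/e, relative intensity) pairs taken from the Figure 8 peak list
PEAKS: List[Tuple[int, int]] = [
    (105, 100), (91, 12), (79, 16), (77, 18), (65, 7),
]


def neutral_losses(
    peaks: List[Tuple[int, int]], molecular_weight: int
) -> List[Tuple[int, int]]:
    """Express each fragment peak as a (negative) loss from the molecular ion."""
    return [
        (mz - molecular_weight, intensity)
        for mz, intensity in peaks
        if mz < molecular_weight
    ]


losses = neutral_losses(PEAKS, MOLECULAR_WEIGHT)
print(losses)
```

The resulting pairs are the same five requests typed by the user in the dialogue: -15,100; -29,12; -41,16; -43,18; and -55,7.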
-------
STD ETHYL
USER RESPONSE
ID#      DISSIMILARITY
2135     0.14
4867     25.01
16799    6.83
20472    0.12
20473    0.11
20474    0.10
20475    0.16
Figure 6
Dissimilarity Comparison for PEAK Search of Fig. 4
-------
-------
I.D.? ETHYLTOLUENE
PAPER TAPE? N
MANUAL
STANDARD
27,4;39,8;41,3;51,8;58,4;63,4;65,7;77,18;*
79,16;91,12;103,10;105,100;106,9;120,38;121,4;
END
PARMTRS? M100-200
74 HITS
1-METHYL-2-ETHYLBENZENE 120 C9 H12 API 0312
FILE KEY= 3136
ISOPROPYLBENZENE (CUMENE) 120 C9 H12 API 0311
FILE KEY= 3135
SI=0.691
1-METHYL-3-ETHYLBENZENE 120 C9 H12 API 0313
FILE KEY= 3137
SI=0.660
1,2,3-TRIMETHYLBENZENE 120 C9 H12 API 0315
FILE KEY= 3139
SI=0.631
1,2,4-TRIMETHYLBENZENE 120 C9 H12 API 0316
FILE KEY= 3140
SI=0.569
Figure 8
EPA/Battelle Matching of p-Ethyltoluene
-------
-------
S, E, OR P? S
I.D.? COAL GASIFICATION PLANT EFFLUENT
PAPER TAPE? Y
FN--F73 ; S -1 :
CHPFI-1 (1ST EXT) :
37,3;38,8;39,39;40,4;41,2;43,2;50,12;51,20;52,10;61;:
53,18;54,6;55,4;61,2;62,4;63,9;64,2;65,4;66,2;56;:
74,2;77,41;78,9;79,38;80,13;81,2;89,4;90,12;91,6;61;:
106,3;107,100;108,91;109,6;33;:
END
PARMTRS? M100-500
111 HITS
M-CRESOL 108 C7.H8.O AST 0181
FILE KEY= 186
SI=0.857
1-HYDROXY-3-METHYLBENZENE (3-METHYLPHENOL--M-CRESOL)
108 C7.H8.O TRC 0068
FILE KEY= 6392
SI=0.845
1-HYDROXY-2-METHYLBENZENE (2-METHYLPHENOL--O-CRESOL)
108 C7.H8.O TRC 0067
FILE KEY= 6391
SI=0.834
1-HYDROXY-4-METHYLBENZENE (4-METHYLPHENOL--P-CRESOL)
108 C7.H8.O TRC 0069
FILE KEY= 6393
SI=0.815
M-CRESOL 108 C7.H8.O AST 0459
FILE KEY= 462
SI=0.805
Figure 10
Computerized Spectra Matching Program Dialogue for
a Component of Coal Gasification Plant Extract
-------
HIGH MASS FINGERPRINT ION MATCH (MF10)
MF1  MF2  MF3  MF4  MF5  MF6  MF7   MF8   MF9  MF10  MF11
711  409  622    0  766    0  833   750   250   675   678
# 2139 1-METHYL-2-ETHYLBENZENE
2R B C9 H12 120
711  409  632    0  692    0  833   750   250   637   644
# 2141 1-METHYL-4-ETHYLBENZENE
2R D C9 H12 120
711  507  649    0  857    0  833   750   250   635   743
# 2135 ISOPROPYLBENZENE
1YR C9 H12 120
533  230  805    0  230    0  666     0     0   634   411
# 5848 2-PHENYLHEXANE
4YR C12 H18 162
666  347  715    0   83    0  500     0     0   624   346
# 5185 ALPHA-CHLOROACETOPHENONE
G1VR C8 H7 O1 CL1 154
OVERALL MATCH (MF11)
MF1  MF2  MF3  MF4  MF5  MF6  MF7   MF8   MF9  MF10  MF11
711  507  649    0  857    0  833   750   250   635   743
# 2135 ISOPROPYLBENZENE
1YR C9 H12 120
711  409  622    0  766    0  833   750   250   675   678
# 2139 1-METHYL-2-ETHYLBENZENE
2R B C9 H12 120
711  409  617    0  766    0  833     0     0   617   677
# 2140 1-METHYL-3-ETHYLBENZENE
2R C C9 H12 120
711  579  496    0  764    0  1999  1999  250   589   677
# 2133 6-METHYL-6-ETHYL FULVENE
L5YJ AUY2 C9 H12 120
711  409  632    0  692    0  833   750   250   637   644
# 2141 1-METHYL-4-ETHYLBENZENE
2R D C9 H12 120
% CONFIDENCE TABLE - BASED ON WLN'S
SEARCH LIMITS 90-100
SYMBOL   MATCH FACTOR CONFIDENCE
CH3      MF 7 99   MF11 94
ETH      MF 5 99   MF 7 99   MF11 99
PHN      MF 1 100  MF 2 100  MF 3 100  MF10 100  MF11 100
C=C      MF 7 100
Y        MF 7 99   MF 8 99
Figure 11
Typical STIRS Output
-------
BOWNE TIME SHARING (WORD/ONE)
By Catherine Tittle
Large volume, impossible deadlines, small staffs and
limited budgets traditionally create a significant
roadblock in the preparation of research reports,
manuals, directories and other documents. The Environmental
Protection Agency (EPA), faced with this
problem since it was organized, found the solution in
Word/One.
EPA has been using Word/One since 1971.
Beginning in Washington and spreading to the Regions
and National Environmental Research Centers,
Word/One has been used successfully by EPA to produce
many of the regulations, manuals, reports, directories,
mailing lists and so forth which comprise a significant
portion of the Agency's paperwork production.
Five years ago, Bowne Time Sharing (BTS) began
to apply computer technology to the business of producing
words rather than crunching numbers. Word/One,
BTS' computerized text-processing system, was designed
to deal with the ever increasing paperwork explosion.
(Certain applications lend themselves more readily and
cost-effectively to computerized word processing; see
Figure 1.) Using the shared computer approach, Word/One
permits the simultaneous use of a central computer
by many remote locations using a typewriter/terminal
and a telephone line. This service is available wherever
phone service exists and is supported by Bowne Time
Sharing's seven offices in Washington, D.C., Boston, New
York, Chicago, Philadelphia, Atlanta and Los Angeles.
The BTS computer configuration, which is located
in New York, is an IBM 370, model 155, with a million
bytes of core storage. The peripheral devices, which are
used for on-line storage, are capable of providing immediate
access to over a billion bytes of text information.
The printing volume, which is naturally the highest volume
of output, is handled by five high-speed printers
that operate at a speed of over 400 lines per minute each
and produce 800,000 to 1 million lines of typewriter
quality, upper and lower case, information daily. Additional
output can include magnetic tape and punched
cards. Figure 2 shows the interrelation of the different
parts of BTS' Word/One System configuration.
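The printing figures quoted above can be cross-checked with a little arithmetic. The sketch below assumes five printers running at exactly 400 lines per minute (the text says "over 400", so this is a lower bound on speed and an upper bound on time); the function name is illustrative only.

```python
PRINTERS = 5             # number of high-speed printers (assumed reading)
LINES_PER_MINUTE = 400   # stated lower bound per printer


def hours_to_print(lines: int,
                   printers: int = PRINTERS,
                   lines_per_minute: int = LINES_PER_MINUTE) -> float:
    """Hours of continuous printing needed for a given daily line count."""
    lines_per_hour = printers * lines_per_minute * 60
    return lines / lines_per_hour


for daily_lines in (800_000, 1_000_000):
    print(daily_lines, round(hours_to_print(daily_lines), 2))
```

Under these assumptions the stated daily volume corresponds to roughly seven to eight and a half hours of printing across all five printers, which fits the overnight turnaround described later in the paper.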
Word/One is designed for use by personnel having
little or no computer knowledge or background. All in-
structions are phonetic.
The system consists of four basic segments:
Input
Storage and Retrieval
Editing and Manipulation
Format and Print.
A discussion of each segment follows.
Input
Word/One is designed to: (1) capture keystrokes on
magnetic disk, thereby reducing typing of future drafts,
and (2) eliminate mechanical and error-prone
typing tasks, including insertion of heading information
and page numbering.
Storage and Retrieval
Word/One allows the user to: (1) store, for an
indefinite time, any input material, (2) assign a con-
venient name, (3) retrieve, immediately, any of the
material, (4) optionally share the document with select-
ed other organizations for joint efforts and (5) maintain
document records, e.g. the date data is stored, for com-
plete management control.
Editing and Manipulation
Word/One is designed to: (1) minimize typing when
revising a document, (2) eliminate redundant proof-
reading of previously approved material, and (3) reduce
the delay in producing the next draft. Selected
Word/One features that assist in this area are:
Replace: Word/One can change all occurrences
of a word or phrase to another at all
preselected locations.
Elasticity: Word/One will automatically
expand or contract each edited segment of a
document to accommodate the change.
Proofmark: For working drafts, Word/One
automatically places a special character to the
right of each edited line so that the reader will
know where each change occurred.
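The Replace and Proofmark features described above can be illustrated with a small sketch. This is not Word/One's implementation: the function name, the choice of "*" as the special character, and the fixed column width are all assumptions made for the example.

```python
from typing import Iterable, List, Optional


def replace_with_proofmarks(
    lines: List[str],
    old: str,
    new: str,
    only_lines: Optional[Iterable[int]] = None,  # preselected line indexes
    mark: str = "*",                             # assumed proofmark character
    width: int = 60,                             # assumed text column width
) -> List[str]:
    """Replace a phrase and flag each edited line at the right margin."""
    selected = None if only_lines is None else set(only_lines)
    result = []
    for index, line in enumerate(lines):
        eligible = selected is None or index in selected
        if eligible and old in line:
            changed = line.replace(old, new)
            # Pad to the margin so marks line up for the proofreader.
            result.append(changed.ljust(width) + mark)
        else:
            result.append(line)
    return result


draft = ["The OPA report is due.", "No change here.", "OPA staff will review."]
for line in replace_with_proofmarks(draft, "OPA", "EPA"):
    print(line)
```

Only the lines that actually changed carry the mark, so a reader can skip previously approved material, which is the stated purpose of the feature.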
-------
Format and Print
Word/One provides a number of text formatting
options that allow: (1) mechanical functions, such as
insertion of headings to be specified rather than typed
each time, (2) error-prone operations, such as the main-
tenance of tabular material on a page, to be handled by
the system, (3) flexibility in changing the document
layout, such as page width and depth specifications, and
(4) print output options, such as justified left and right
margins.
In addition, the Word/One system provides a
significant level of throughput. The high-speed printers
produce typewriter equivalent copy overnight, regardless
of document size. Most of the low-speed terminals
currently used by EPA are compatible with the
Word/One system.
Using this concept EPA has produced many
publications including EXPRO, the OR&D Program
Planning and Reporting Manual, Methods Manual for
Chemical Analysis of Water Wastes (1974), and develop-
ment documents and regulations in several program
areas. These publications were produced meeting tight
deadlines with limited clerical staffs. Productive time of
both clerical and professional personnel involved was
maximized because the material was typed only once,
changes were made only where necessary and the
high-speed printer was used to speed turnaround.
Where large documents with significant revision are
handled, Word/One has proven to be a necessary and
beneficial tool.
-------
[Figure: cost-effectiveness ranges of the typewriter, automatic typewriter, and computer editing as text manipulation requirements go from simple to complex]
Figure 1
Equipment Method Cost-Effectiveness Range vs
Text Manipulation Requirements
Figure 2
Equipment and Data Flow for Word/One Applications
-------
INTRODUCTION TO THE UNIVAC 1110
AT RESEARCH TRIANGLE PARK - CAPABILITIES
By T. L. Rogers
INTRODUCTION
In June 1973 UNIVAC was awarded the contract
to provide the new computer system for the Research
Triangle Computing Center. The new system as installed
(Figure 1) had to pass a benchmark test with a through-
put ratio of 12 to 1 as compared to the IBM 360/50.
The system is a UNIVAC 1110 with the components and
general capabilities as described in this paper.
GENERAL DESCRIPTION
The UNIVAC 1110 System is a general purpose,
high performance system incorporating the latest
advances in computer design, system organization, and
programming technology. The various components of
the UNIVAC 1110 System are designed as separate
logical units providing maximum functional modularity.
The multiprocessing capabilities are an integral part of
the system; the command/arithmetic units can perform
numerous tasks simultaneously under control of a
common executive. The flexible modular structure
enables a user to tailor a system to his individual
requirements. Principal features of the UNIVAC 1110
System are:
Common resource systems organization
Multiple command/arithmetic units (CAUs)
and input/output access units (IOAUs)
Character manipulating instructions
Partial-word, double-word, and full-word
addressability
System partitioning capability
Redundancy among system components
Two levels of directly addressable storage
Large modular plated-wire primary storage
Large modular core extended storage
Storage protection
Program address relocation
Independent input/output access units
(IOAUs)
Extensive software library and language
processors
Dynamic adjustment to a mix of batch,
demand, and real-time modes
Wide choice of high performance peripheral
subsystems
Independent, simultaneous communications
processing.
SYSTEM COMPONENTS
Each component in the UNIVAC 1110 System is
functionally independent and may have the following
properties:
Two or more access paths
Access conflicts resolved by priority logic.
In a multiprocessor configuration the following
capabilities are standard:
Continued system operation if any component
fails
Any component can be logically removed for
servicing without disabling the entire system.
The UNIVAC 1110 System consists of eight types
of components:
Command/arithmetic units
Input/output access units
System console
System partitioning unit
-------
Primary storage
Extended storage
Maintenance controller
Peripheral subsystems.
Command/Arithmetic Unit (CAU)
The basic UNIVAC 1110 System configuration
includes one command/arithmetic unit (CAU) and one
input/output access unit (IOAU). All control and
arithmetic functions are executed by the CAU. The CAU
is a multitask instruction-stacking device capable of
controlling up to four instructions at various stages of
execution. A CAU can interface with up to four primary
storage units by means of both an instruction path and
an operand path. Dual data paths connect the CAU with
extended storage through a maximum of eight UNIVAC
multiple access interface (MAI) units. The data paths to
primary and extended storage have overlapping and
interleaving capabilities. In a multiprocessor system, the
user can specify and can change which units are to be
used, thereby permitting a system to be logically divided
into two or three independent smaller systems, or re-
moving individual units for maintenance without
affecting the total system. Interrupt signals may be sent
or received on one of the three interprocessor lines.
Additional features of each CAU are:
Capability of executing up to 1.8 million in-
structions per second
300-nanosecond effective basic instruction
time
Four-deep instruction stack
112-word general register stack (GRS)
Character manipulation by means of
byte-oriented instructions.
The Research Triangle Park (RTP) System has two
CAUs.
Input/Output Access Unit (IOAU)
The basic UNIVAC 1110 System configuration
includes one IOAU. The IOAU controls all transfers of
data between the peripheral devices and primary and
extended storage. Transfers are initiated by a CAU under
program control. The IOAU includes two nonconcurrent
data transfer paths, one for primary storage and one for
extended storage.
The IOAU consists of two sections: a control
section and a section containing from 8 to
24 input/output channels. Input/output (I/O) data trans-
fers may occur simultaneously with the execution of
programs in the CAU.
The control section includes all logic associated
with the transfer of function, data, and status words
between primary or extended storage and the sub-
systems. It also services I/O requests from either one or
both of the CAUs (in a multiprocessor system) and
routes interrupts to one of the two CAUs. Interrupt
routing may be specified by program.
Some outstanding features of the IOAU are:
Aggregate transfer rate of 4 million 36-bit
words per second (24 million characters)
Externally specified index (ESI) and internally
specified index (ISI) transfer modes on any
channel
Data chaining
Interrupt tabling
Storage-to-storage transfers.
The RTP System has one IOAU.
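The two aggregate transfer figures quoted for the IOAU agree if each 36-bit word holds six 6-bit characters. The sketch below works through that arithmetic; the 6-bit character size is an inference from the ratio of the two figures, not something the text states.

```python
WORD_BITS = 36
CHAR_BITS = 6                 # assumed: inferred from 24 million / 4 million
WORDS_PER_SECOND = 4_000_000  # aggregate IOAU rate from the text

chars_per_word = WORD_BITS // CHAR_BITS
chars_per_second = WORDS_PER_SECOND * chars_per_word
bits_per_second = WORDS_PER_SECOND * WORD_BITS

print(chars_per_word)     # characters packed into one word
print(chars_per_second)   # aggregate character rate
print(bits_per_second)    # aggregate bit rate
```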
System Console
The system console provides the means for commu-
nication with the executive system. The basic console
consists of the following major components:
The cathode ray tube (CRT)/keyboard
consists of a UNISCOPE 100 display terminal.
The display format is 16 lines with
64 characters per line. The seven-bit ASCII
character set, consisting of 95 characters plus
the space, is used. The keyboard provides all
of the operator controls required for gen-
erating data and initiating transfers.
The incremental printer operates at
30 characters per second and provides a hard
copy of console messages. The cabinet
containing the printer also contains the power
supplies and control logic required to select
the CRT, incremental printers, and facilities
for the real-time maintenance communication
system (RTMCS). This unit also contains the
interface between the console and any ISI
channel on the IOAU. Up to five additional
incremental printers may be connected to the
console.
The fault indicator, located on the
incremental printer, provides the operator
with a visual indication of a fault condition in
a major system component. The actual com-
ponent and nature of the fault may then be
determined from indicators on the main-
tenance panel.
The RTP System includes one console consisting of one
CRT and one printer.
System Partitioning Unit (SPU)
The system partitioning unit (SPU), when included
in the UNIVAC 1110 System, permits off-line main-
tenance of units, enables the operator to logically
partition the system into two or three independent
systems, and initiates a recovery sequence in the event of
failure. The SPU performs six functions, five under
operator control and one under software control. With
the SPU, the operator can:
Partition the total system into two or three
smaller systems
Isolate units and take them off-line for
maintenance without disrupting the rest of the
system
Function as a system monitor by observing
the status of the various major components
Perform initial load into the primary system
Allow automatic recovery procedures if an
interrupt is not received.
Under software control, the SPU presents status
information to the IOAUs. When all optional features are
included, the SPU is able to interface with:
Six command/arithmetic units
Four IOAUs
262K words of primary storage
Eight MAI units (1048K words of extended
storage)
48 multiaccess subsystems.
This unit is not included in the RTP System. Since the
RTP System has only one IOAU, the SPU cannot be
utilized.
Primary Storage
The first level of directly addressable main storage
in the UNIVAC 1110 System is primary storage. Primary
storage consists of high-speed, nondestructive readout
(NDRO) plated-wire storage units with nominal random
read and write cycles of 320 and 520 nanoseconds,
respectively. The basic 32K-word storage unit consists of
four 8K modules, and may be expanded in one 32K
increment to a 65K unit. The minimum primary storage
for a basic 1x1 configuration (one CAU and one IOAU)
consists of 32K words. A total of four 65K units provides
a maximum primary storage capacity of 262K
words in a system. The basic 32K storage unit accommodates
eight access paths, servicing four of them
simultaneously; a 65K unit accommodates up to sixteen
access paths, servicing eight of them simultaneously.
Partial (sixth, quarter, third, and half) as well as
full-word operation is provided. The RTP System
contains 97K words of primary storage.
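The partial-word operation mentioned above divides the 36-bit word into fields of 6, 9, 12, or 18 bits. The following sketch shows what extracting such a field amounts to. Numbering fields from the left end of the word is an assumption made for the example, and the function names are hypothetical.

```python
WORD_BITS = 36
DIVISORS = {"sixth": 6, "quarter": 4, "third": 3, "half": 2}


def field_width(part: str) -> int:
    """Width in bits of a sixth, quarter, third, or half word."""
    return WORD_BITS // DIVISORS[part]


def extract(word: int, part: str, index: int) -> int:
    """Return field number `index` (0 = leftmost, by assumption) of a word."""
    width = field_width(part)
    if not 0 <= index < DIVISORS[part]:
        raise ValueError("field index out of range")
    shift = WORD_BITS - width * (index + 1)
    return (word >> shift) & ((1 << width) - 1)


word = 0o123456701234  # twelve octal digits = 36 bits
print([field_width(p) for p in DIVISORS])
print(oct(extract(word, "half", 0)), oct(extract(word, "half", 1)))
```

Octal notation is convenient here because each octal digit covers three bits, so a sixth word is exactly two digits and a half word is six.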
Extended Storage
The second level of directly addressable main
storage in the UNIVAC 1110 System is the extended
storage system. The minimum extended storage
configuration consists of 131K 36-bit words. Extended
storage capacity may be expanded in 131K increments
up to a maximum of 1048K 36-bit words. Each unit has
a 1.5 microsecond read/write cycle. Extended storage is
connected to the system by MAI units which provide up
to ten access paths to each storage unit. The RTP
System supports 262K words of extended storage.
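The storage sizes quoted for both levels (32K, 65K, 262K, 131K, 1048K) are consistent with powers of two written in truncated thousands of words. The sketch below checks that reading; treating "K" this way is an assumption, and the constant names are illustrative.

```python
BASIC_PRIMARY_UNIT = 32 * 1024                    # "32K" words
EXPANDED_PRIMARY_UNIT = 2 * BASIC_PRIMARY_UNIT    # "65K" words
MAX_PRIMARY = 4 * EXPANDED_PRIMARY_UNIT           # four 65K units

EXTENDED_INCREMENT = 128 * 1024                   # "131K" words
MAX_EXTENDED = 8 * EXTENDED_INCREMENT             # eight increments


def label(words: int) -> str:
    """Render a word count the way the paper does (truncated thousands)."""
    return f"{words // 1000}K"


for words in (BASIC_PRIMARY_UNIT, EXPANDED_PRIMARY_UNIT, MAX_PRIMARY,
              EXTENDED_INCREMENT, MAX_EXTENDED):
    print(words, label(words))
```

On this reading the maximum extended storage is exactly eight 131K increments, which matches the maximum of eight MAI units given earlier.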
Maintenance Controller
The maintenance controller provides for diagnostic
checkout by the automatic comparison of maintenance
panel indicators against known good data on tape for the
following:
CAUs
IOAUs
Disc controllers (UNIVAC 8440 Disc
Subsystem)
Communication/symbiont processor
Printed circuit cards.
To complement its diagnostic capability, the main-
tenance controller allows for the operation of the
operator/ maintenance panels by personnel at a remote
site. This device is included in the RTP System.
Peripheral Subsystems
The UNIVAC 1110 System offers a full range of
peripheral subsystems; this wide range provides the capa-
bility to satisfy many requirements. The standard
UNIVAC peripheral subsystems include:
High-Speed Printer Subsystem
RTP System: 2 High-Speed Printers
Card Subsystem
RTP System: 2 Card Readers
UNIVAC 9000 Series Subsystem
RTP System: 2 9300 Subsystems, each with
600 LPM Printer
UNISERVO 12/16 Magnetic Tape Subsystem
RTP System: 15 Uniservo 16 Tape Drives
UNISERVO 20 Magnetic Tape Subsystem
RTP System: None
FH-432/1782 Drum Subsystem
RTP System: 6 FH432 Drums, 2 FH1782
Drums
UNIVAC 8414 Disc Subsystem
RTP System: None
UNIVAC 8424/8425 Disc Subsystem
RTP System: 1 8424 Disc Subsystem (8
Drives)
UNIVAC 8440 Disc Subsystem
RTP System: None
UNIVAC 8460 Disc Subsystem
RTP System: 4 8460 Discs
Communications/Symbiont Processor (C/SP)
RTP System: None
UNIVAC DCT 500 Data Communications
Terminal
RTP System: 25 DCT-500s are included
UNIVAC DCT 1000 Data Communications
Terminal
RTP System: None
UNISCOPE 100 Display Terminal
RTP System: 10 Uniscope 100s are included
Communications Terminal Module Controller
(CTMC)
RTP System: 2 CTMCs are included.
Destandardized Subsystems
The following peripheral subsystems may be
included in the UNIVAC 1110 System:
UNISERVO VIIIC Magnetic Tape Subsystem
FASTRAND II and III Mass Storage
Subsystems
Communication Terminal Synchronous (CTS)
UNIVAC 1004 Subsystem
UNISCOPE 300 Visual Communications
Terminal.
CONFIGURATIONS
The basic UNIVAC 1110 Processing System (1x1
configuration) consists of two functionally and
physically independent units: one CAU and one IOAU.
The processor organisation is intrinsically that of a
multitask processor and is designed for operation in a
multiprogramming and multiprocessing environment.
The basic processor may be expanded by adding CAUs
and/or IOAUs up to a total of four CAUs and four
IOAUs (4x4). The basic 1x1 configuration is shown in
Figure 2; Table I lists all fully supported configurations.
-------
Table I
Fully Supported Configurations
                                        CONFIGURATION
UNITS                       1x1         2x1         2x2         4x2         4x4
CAU                         1           2           2           4           4
IOAU                        1           1           2           2           4
PRIMARY STORAGE (words)     32K-262K    65K-262K    65K-262K    131K-262K   131K-262K
EXTENDED STORAGE (words)    131K-1048K  262K-1048K  262K-1048K  262K-1048K  262K-1048K
MAI                         1-8         2-8         2-8         2-8         2-8
SYSTEM CONSOLE              1           1           1-2         2           2-4
SYSTEM PARTITIONING UNIT    0-1         0-1         0-1         1           1
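Table I can be read as a set of constraints that a given installation either satisfies or not. The sketch below encodes the table and checks the RTP figures given in this paper (two CAUs, one IOAU, 97K primary, 262K extended, one console, no SPU). The data layout and function name are illustrative assumptions; MAI counts are left out because the RTP count is not stated.

```python
from typing import Dict, List

# Ranges are (minimum, maximum); storage is in K words as printed in Table I.
CONFIGS: Dict[str, dict] = {
    "1x1": {"cau": 1, "ioau": 1, "primary": (32, 262),
            "extended": (131, 1048), "console": (1, 1), "spu": (0, 1)},
    "2x1": {"cau": 2, "ioau": 1, "primary": (65, 262),
            "extended": (262, 1048), "console": (1, 1), "spu": (0, 1)},
    "2x2": {"cau": 2, "ioau": 2, "primary": (65, 262),
            "extended": (262, 1048), "console": (1, 2), "spu": (0, 1)},
    "4x2": {"cau": 4, "ioau": 2, "primary": (131, 262),
            "extended": (262, 1048), "console": (2, 2), "spu": (1, 1)},
    "4x4": {"cau": 4, "ioau": 4, "primary": (131, 262),
            "extended": (262, 1048), "console": (2, 4), "spu": (1, 1)},
}


def supported(cau: int, ioau: int, primary_k: int, extended_k: int,
              consoles: int, spus: int) -> List[str]:
    """Return the Table I configurations that a system description fits."""
    matches = []
    for name, cfg in CONFIGS.items():
        checks = [
            (primary_k, cfg["primary"]),
            (extended_k, cfg["extended"]),
            (consoles, cfg["console"]),
            (spus, cfg["spu"]),
        ]
        in_range = all(low <= value <= high for value, (low, high) in checks)
        if cfg["cau"] == cau and cfg["ioau"] == ioau and in_range:
            matches.append(name)
    return matches


rtp = supported(cau=2, ioau=1, primary_k=97, extended_k=262,
                consoles=1, spus=0)
print(rtp)
```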
Minimum Peripheral Complement
The following list of peripheral equipment is the
minimum available with the UNIVAC 1110 System. This
minimum has been established to ensure an adequate
complement for customer engineering and software
support.
1. Minimum Complement: Communications/Symbiont Processor (C/SP)
with card reader and high-speed printer
Alternate: UNIVAC 9300 Subsystem with card reader and
integral printer, or Multisubsystem Adapter with
high-speed printer subsystem and card subsystem
2. Drum Subsystem*
Minimum Complement: FH-432/1782 Drum Subsystem with two
FH-432 drums
Alternate: FH-432/1782 Drum Subsystem with one
FH-1782 drum
3. Mass Storage Subsystem
Minimum Complement: UNIVAC 8414 Disc Subsystem with two
8414 disc drives
Alternate: UNIVAC 8424/8425 Disc Subsystem with two
8424/8425 disc drives, or UNIVAC 8440 Disc
Subsystem with one 8440 disc drive, or UNIVAC
8460 Disc Subsystem with one disc file unit
4. Magnetic Tape Subsystem
Minimum Complement: UNISERVO 12/16 Magnetic Tape
Subsystem with four magnetic tape units
Alternate: UNISERVO VIIIC Magnetic Tape
Subsystem with four magnetic tape units
*Not required for disc resident systems (1x1, 2x1, and 2x2)
RTP COMMUNICATIONS SUPPORT
The UNIVAC Communications Terminal Module
Controller (CTMC) Subsystem enables the
UNIVAC 1110 System to receive and transmit data by
way of any common carrier at any of the standard rates
of transmission up to 50,000 bits per second. It can
receive data from or transmit data to low-speed (up to 300
bits per second), medium-speed (up to 1800 bits per
second), or high-speed (2000 to 50,000 bits per second)
lines in any combination. The RTP system contains two
CTMCs giving a total of 64 ports for terminal activity.
Figure 3 shows how these 64 ports are configured on the
RTP system.
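The three CTMC line-speed classes described under RTP Communications Support can be expressed as a simple classification. The sketch below follows the limits given in the text; the function name is hypothetical, and rates between 1800 and 2000 bits per second are reported as unspecified because the text does not assign them to a class.

```python
def ctmc_speed_class(bits_per_second: float) -> str:
    """Classify a line rate using the CTMC ranges quoted in the text."""
    if bits_per_second <= 0 or bits_per_second > 50_000:
        raise ValueError("rate outside the range supported by the CTMC")
    if bits_per_second <= 300:
        return "low-speed"
    if bits_per_second <= 1800:
        return "medium-speed"
    if bits_per_second >= 2000:
        return "high-speed"
    return "unspecified"  # 1800 < rate < 2000 is not covered by the text


for rate in (134.5, 300, 1200, 2000, 50_000):
    print(rate, ctmc_speed_class(rate))
```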
-------
[Figure 1: block diagram of the installed UNIVAC 1110 system; legible labels include storage and storage expansion, extended main storage, channel expansion, and UNIVAC 1110 CAU]
-------
[Block diagram with boxes for extended storage (131K minimum), multiple access interface (MAI), primary storage (32K minimum), command/arithmetic unit (CAU), input/output access unit (IOAU), I/O channels (8, 16 or 24), peripheral subsystems, and system console]
Figure 2
UNIVAC 1110 System
Basic Configuration (1x1)
-------
64 Ports
  6 - Dedicated High Speed
      5 - 9200 RJE
      1 - U-100 Multiplexer
  20 - RJE @ 2000 bps
      10 - 2780 Type
      10 - 1004 Type
  12 - 1200 bps Demand
  26 - Demand
      18 - TTY Type @ 300 bps
      8 - 2741 Type @ 134.5 bps
Line assignments shown in the figure:
  3 Commercial
  2 WATS, 2 Commercial, 1 Dedicated Mux., 5 FTS
  2 WATS, 1 Commercial, 3 Dedicated Mux., 4 FTS
  2 WATS, 2 Commercial, 4 Dedicated Mux., 10 FTS
  2 WATS, 4 Dedicated Mux., 2 FTS
Figure 3
Data Communications Support
-------
RTCC SOFTWARE AND ACCESSIBILITY
By Maureen Johnson
Research Triangle Park offers a full range of data
processing software on the UNIVAC 1110 including
scientific, statistical and data management packages as
well as many compiler languages. This software falls into
three general categories:
1. UNIVAC supplied and supported
2. Packages obtained from private vendors or
other government agencies
3. Packages converted from the IBM/360 system
replaced by the UNIVAC 1110.
1. UNIVAC SOFTWARE
The following is a brief description of available
UNIVAC supported software.
1100 Series Assembler
The 1100 Series Assembler contains the following
features:
Mnemonic codes describe hardware function
of each instruction
Multiple location counters provide for pro-
gram segmentation and control address
generation.
ASCII COBOL
Based on American National Standard plus
CODASYL and UNIVAC extensions, ASCII COBOL
contains the following features:
STRING and UNSTRING statements
providing powerful character manipulation
Multiple data formats:
- ASCII
- FIELDATA
- EBCDIC (reading and writing IBM files)
- Single and double precision floating point
binary
Variable length DISPLAY items
MONITOR and EXHIBIT (MONITOR is a tremendous
aid in program debugging)
Cross-reference of all data and paragraph
names
Index-sequential file organization on mass
storage (some differences from IBM)
Reentrant object code and libraries, i.e., one
copy in memory accessed by multiple users
concurrently
Interprogram and interlanguage com-
munications
Internal sort capability
FORTRAN V
FORTRAN V, containing all features of American
National Standard FORTRAN V plus extensions, has the
following characteristics:
Up to seven subscripts on variables
Extended subscript expressions, e.g., NAME (I+1)
Forward and backward DO loops
FLD intrinsic function: used for extraction
and insertion of bit fields, i.e., bit manipulation
NAMELIST: may be used instead of a LIST
on an INPUT/OUTPUT (I/O) statement and
associated FORMAT statement; provides data
characteristic information
DELETE: provides facility to prevent compilation
of a section of source code
Free field input FORMAT ( )
NTRAN: reads and writes blocks of data
ERTRAN: means of dynamically executing
ECL from a FORTRAN program.
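The kind of bit-field extraction and insertion that the FLD intrinsic provides can be sketched in Python for a 36-bit word. The argument order (starting bit, field length, word) and the left-to-right bit numbering from 0 used here are assumptions for illustration and should be checked against the UNIVAC FORTRAN V reference before being relied on.

```python
WORD_BITS = 36
WORD_MASK = (1 << WORD_BITS) - 1


def fld(start: int, length: int, word: int) -> int:
    """Extract `length` bits beginning at bit `start` (bit 0 = leftmost)."""
    if start < 0 or length < 1 or start + length > WORD_BITS:
        raise ValueError("field does not fit in a 36-bit word")
    shift = WORD_BITS - start - length
    return (word >> shift) & ((1 << length) - 1)


def fld_insert(start: int, length: int, word: int, value: int) -> int:
    """Return `word` with the given field replaced by `value`."""
    if start < 0 or length < 1 or start + length > WORD_BITS:
        raise ValueError("field does not fit in a 36-bit word")
    shift = WORD_BITS - start - length
    mask = ((1 << length) - 1) << shift
    return (word & ~mask & WORD_MASK) | ((value << shift) & mask)


word = 0o123456701234
print(oct(fld(0, 6, word)), oct(fld(30, 6, word)))
print(oct(fld_insert(0, 6, word, 0o77)))
```

A common use of such a function on this class of machine is pulling individual 6-bit characters out of a packed word without resorting to assembler.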
-------
ASCII FORTRAN
RTCC is a designated test site for UNIVAC ASCII
FORTRAN. This implies that the compiler is available
for user testing with the recognition that it is still in a
prerelease stage and compiler problems are likely to be
encountered. It features:
ASCII file formats compatible with ASCII
COBOL
List-directed I/O
Enhanced debugging capabilities including
TRACE & DISPLAY.
STAT-PACK AND MATH-PACK
Comprehensive libraries of statistical and mathe-
matical subprograms callable from FORTRAN (com-
parable to IBM Scientific Subroutine Package).
The STAT-PACK subprograms are grouped into
13 categories:
Descriptive statistics
Elementary population statistics
Distribution fitting and plotting
Chi-square tests
Significance tests
Confidence intervals
Analysis of variance
Regression analysis
Time series analysis
Multivariate analysis
Distribution functions
Inverse distribution functions
Miscellaneous.
The MATH-PACK subprograms are grouped into
14 categories:
Interpolation
Numerical integration
Solution of equations
Differentiation
Polynomial manipulation
Matrix manipulation: real matrices
Matrix manipulation: complex matrices
Matrix manipulation: eigenvalues and
eigenvectors
Matrix manipulation: miscellaneous
Ordinary differential equations
Systems of Equations
Curve fitting
Pseudo-random number generators
Specific functions.
Conversational Time Sharing (CTS)
Conversational Time Sharing (CTS) features the
following:
Extensive file creation and editing capabilities
FORTRAN prescan capability
Desk calculator facilities enabling user to
evaluate expressions and mathematical func-
tions without any programming.
Functional Mathematical Programming System
The Functional Mathematical Programming System
(FMPS) has the following characteristics:
-------
Procedures commonly used to solve linear pro-
gramming problems
Generalized matrix generator
Procedures for saving the basis, restoring the
basis, and the procedures for obtaining error
estimates and sensitivity analysis on the solu-
tion.
SORT/MERGE
The standalone, parameter-driven sort/merge pro-
cessor features the following:
Will process ASCII COBOL fixed length tape
records
Will process FASTRAND-format mass storage
card files
Will process FORTRAN V formatted
(80 character) records and unformatted files.
PERT
PERT is a generalized applications program for
project/program planning and control. It contains both
TIME and COST modules.
Language Conversion Programs
COBOL language conversion programs include:
IBM PL/1 to UNIVAC ASCII COBOL
IBM ANSI and COBOL F to UNIVAC ASCII
COBOL.
Symbolic Stream Generator
The symbolic stream generator creates a symbolic
stream of data and/or control statements with great
flexibility and powerful modification capabilities.
Directions and models for building stream images are
written in SYMSTREAM, an extensive manipulative
language.
CPDMPH
CPDMPH is a utility which can be used to print,
punch and copy tape or mass storage files.
TAPETRAN
TAPETRAN translates tapes written on other
operating systems (primarily EBCDIC oriented) to tapes
which are UNIVAC 1110 compatible.
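The character-code translation at the heart of a utility like TAPETRAN can be illustrated with Python's built-in EBCDIC codec. This is only a sketch of the idea: the choice of code page cp037 and of ASCII as the target are assumptions, and nothing here reflects TAPETRAN's actual tables, record handling, or options.

```python
def ebcdic_to_ascii(record: bytes, codepage: str = "cp037") -> bytes:
    """Translate one EBCDIC record to ASCII bytes.

    cp037 is a common IBM EBCDIC code page; the tapes handled at RTCC
    may have used a different variant.
    """
    return record.decode(codepage).encode("ascii", errors="replace")


ebcdic_record = bytes.fromhex("C8C5D3D3D640F1F1F1F0")  # "HELLO 1110" in EBCDIC
print(ebcdic_to_ascii(ebcdic_record))
```

Binary and packed-decimal fields cannot be converted this way, since a character translation would corrupt them; a real conversion needs the record layout.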
ED
ED is a processor with powerful capabilities in
creating and modifying data and program files and
elements. It features:
FIND subcommand for scanning entire file for
particular character string
CHANGE subcommand for modifying one or
all entries in a file.
DOC
DOC is a processor for creating, maintaining and
retrieving text-type data files.
TPD
TPD provides the facility to do a Directory Listing
of a tape or dump it in Alpha or Octal format. FASTRAND
files or elements can also be used as input.
2. PACKAGES OBTAINED FROM PRIVATE
VENDORS OR OTHER GOVERNMENT
AGENCIES
SPSS
SPSS is a Statistical Package for the Social Sciences
and features:
Flexibility in format of the data
Routines commonly required by the social
scientist including:
Descriptive statistics
Frequency distributions
Simple and partial correlation
Multiple regression
Factor analysis
Guttman Scaling
Cross tabulation
Does not require programming experience.
-------
CALCOMP Graphic/Plotting Routines
CALCOMP Graphic/Plotting Routines are basic
plotter subroutines including:
PLOT
SYMBOL
WHERE
PLOTS
NUMBER
SCALE
AXIS
FACTOR
NEWPEN
LINE
THREE-D
GPCP.
A Tektronix interface providing interactive graphics capability
has also been developed.
SYMAP
SYMAP is a graphics system designed for making
presentations to nontechnical people and has the
following characteristics:
SYMAP uses 10 intervals plus high and low to
present the information as a shaded map pro-
duced on a printer
The symbol printed can be changed and the
interval can be controlled in such a way that
shading density does not increase with
increasing values of the variable
Legends can be included.
SYSTEM 2000
SYSTEM 2000 is a general purpose data base
management system and features:
Creating and modifying data base definitions
Highly selective and flexible capabilities for
retrieving and updating values in these data
bases
Interactive access capabilities
Report Writer
Program Language Interface, i.e., user-written
COBOL and FORTRAN programs can access
the data base.
APL
APL is an interactive computer implementation of
a language defined by Kenneth Iverson and named "A
Programming Language." The version implemented at
RTCC was obtained from the University of Maryland.
This implementation is designed to be as nearly like
IBM APL/360 as possible.
SNOBOL/SPITBOL
SNOBOL/SPITBOL is a string manipulating
language especially well-suited to the processing of non-
numeric data. It provides a means for searching through
arbitrary character strings in order to find patterns, to
rearrange the strings and form new strings.
PL/1 Subset
PL/1 Subset was developed at the University of
Maryland and is only available on a testing basis.
UNIVAC plans announcement of a PL/1 compiler
during the first quarter of 1975.
3. SOFTWARE CONVERTED FROM PACKAGES
AVAILABLE ON THE IBM/360
Statistical Analysis System
The Statistical Analysis System (SAS) features:
Ease and flexibility in defining input data
Wide range of available statistical routines
Simple command language requires no pro-
gramming knowledge
Data management capabilities.
Time Sharing Library - Interactive Statistical Routines
The Time Sharing Library (TSL) is composed of
interactive statistical routines and includes the following
characteristics:
Allows user to enter data and retrieve results
through interactive terminals
Requires no programming experience
Particularly well-suited for obtaining quick
results on relatively small amounts of data.
Keyword In Context
Keyword in Context (KWIC) provides powerful
indexing and retrieval capabilities for text storage
applications.
FAST
FAST provides the ability to retrieve selected
portions of information from a data file and print it in a
usable format and/or make it available for further processing.
The control cards required are quite simple,
allowing non-data-processing-oriented people to use
FAST.
ACCESSING THE RTCC UNIVAC 1110
Runs submitted for processing on the
UNIVAC 1110 require a valid account number and
project code. New users must register via forms provided
by Users Services (FTS 919-549-2501) or found in the
RTCC Users Reference Manual. The information needed
includes user's DIPS Organization Code and Project
Element Code, user's mailing address, brief project description
and facilities requirements, and a project code
made up by the user for each project being registered.
To obtain SITE-IDs for remote and demand terminals,
contact Users Services. An information form
must be completed describing the terminal, its location
and indicating the person to be contacted in matters
concerning that terminal. Data Systems will then
complete the form with a SITE-ID and phone numbers
to be used, and mail a copy to the user.
RTCC SCIENTIFIC SOFTWARE AVAILABLE
RTCC scientific software available includes the
following:
CALCOMP Graphics
SAS - Statistical Analysis System
SPSS - Statistical Package for the Social
Sciences
STAT PACK - UNIVAC supplied subprogram
callable from FORTRAN
MATH PACK - UNIVAC supplied subprogram
callable from FORTRAN
TSL - Time Sharing Library of interactive statistical
routines
FMPS - Functional Mathematical Programming
System, procedures for solving linear programming
problems
SYMAP - Graphics system
APL - Interactive computer implementation of
a language defined by Kenneth Iverson
requiring special character set or digraphs
FORTRAN V - Fieldata FORTRAN
ASCII FORTRAN (RTCC is a designated
test site for ASCII FORTRAN).
DATA MANAGEMENT SOFTWARE AVAILABLE
Data management software available includes the
following:
COBOL (ASCII)
CTS - Conversational Time Sharing
PERT - Project/Program Planning Control
Language Converters
- IBM COBOL to UNIVAC ASCII COBOL
- IBM PL/1 to UNIVAC ASCII COBOL
SSG - Symbolic Stream Generator
CPDMPH - Prints, punches and copies files
TAPETRAN - Translates EBCDIC tapes to
UNIVAC compatible tapes
ED - Processor for creating and editing data
and programs
DOC - Processor for creating, maintaining, and
retrieving text-type data
TPD - Dumps tapes and FASTRAND mass
storage files and elements
FLUSH - Flowcharting package
SYSTEM 2000 - Data Base Management
System
SNOBOL/SPITBOL - String Manipulating
Language
KWIC - Key Word in Context
FAST - Generalized data retrieval system.
-------
TECHNICAL AND ENVIRONMENTAL INFORMATION SYSTEM
By Donald L. Worley
INTRODUCTION
The gross incompatibility of the software system
(CFSS) that supported APTIC with the new En-
vironmental Protection Agency/Research Triangle Park
(EPA/RTP) computer system necessitated a thorough
evaluation of APTIC's ADP needs and alternative
methods of support.
Results of this evaluation clearly indicated that the
most cost effective and recommended course of action
was to develop a storage and retrieval system designed to
meet APTIC's specific needs. The Data Systems Division
(DSD) joined with APTIC in the development and im-
plementation of a system.
The following sections of this paper will describe
the basic design of the system developed by DSD for
support of APTIC's current requirements.
DESIGN GOALS AND BENEFITS
The Technical and Environmental Information
System (TENIS) was designed to provide the following
features:
1. The support of the activities which APTIC cur-
rently performs on a regular batch processing basis
including:
The processing of input data received from
Franklin Institute
The facility to retrieve selected technical or
environmental information through the use of
descriptor terms and/or specific document
numbers
The use of a dictionary to control the
vocabulary of descriptors either in document
input or in the searching process
The postback dictionary function currently in
use by APTIC
Output may be provided in the form of
printed reports or formatted magnetic tapes to
be processed by the Government Printing
Office
The maintaining of the current concept and
form for document storage and line
construction. This will enable the continued
use of the programs which use the report tape
output of CFSS. These programs will require
conversion to Univac compatible code
A remote terminal search capability.
2. The later expansion of the system to provide
capabilities which are desirable but not a necessity at
this time. This expansion may include but is not limited
to:
A file maintenance capability for updating and
deleting all or selected parts of documents
An extended search system to provide the ca-
pability of the Identification File
Special reports such as Inverted File Statistics,
Output Terms report, and others which are
desirable.
3. A system which is developed and programmed
using a high-order language (COBOL) that is an industry
standard and is understandable by the user.
The benefits of this approach to APTIC and DSD
are numerous but include:
The removal of agency dependency on an out-
side contractor for diagnostic, maintenance
and development support
The reduced developmental costs of a
government-produced system
A system developed specifically for the needs
of APTIC
A system developed by the people who are
responsible for using and maintaining it
The reduced operational cost of a system
which meets the specific needs of APTIC
A modular system which will be expandable as
needs dictate.
-------
TENIS SYSTEM DESIGN
An overview of the initial TENIS is shown in
Figure 1. The Secondary Search File (previously referred
to as the Identification File) is shown by dotted lines as
an extension. With the files and systems (groups of
programs) reflected on the overview as a reference, a
general description of the system follows. A more detailed
systems flow chart of programs is included as
Attachment A and a sample run of the terminal system
is included as Attachment B.
1. New Document Input.—Input to TENIS will be
structured in the same format as that presently received
from Franklin Institute. In addition, the structure and
content of documents and lines of text will remain
consistent with the present design.
2. Dictionary.-The dictionary will be organized as
a vocabulary control file, and as an indexer for entering
the search file. The postback capability currently
provided will be available. One listing of dictionary
contents is included as well as the ability to add and
delete terms. The ability to delete terms will be provided
when the file maintenance extension is implemented.
The dictionary will be supported on a direct access
device.
3. Table of Contents.-A one bit per APTIC
Number (1 to 200,000) file to indicate the status of the
number. This file shall reside on a direct access device.
4. Text File.—The text file will contain data
types 01, 035, 036, and 045. One author (06), the
journal (09), and the year (10) will be included for
sorting output. The text file is designed for efficient use
in a batch environment and will be a sequential file stored
on magnetic tape. Data items are summarized in
Figure 2.
5. Search File.-The search file shall include all the
descriptors necessary for searching. Data types include:
ID No. Data Item
05 Method of support
10 Year of publication (for internal
storage and for searching, it will
be prefixed by a P)
13 Language
14 Translated (yes only)
15 Primary category
17 Secondary category
18 Document attributes
19 Indexing descriptors
All entries to the search file must first pass a vocabulary
check by the dictionary. The search file will also be
directly accessible.
6. Secondary Search File (not implemented).-The
secondary search file will include the additional
descriptors which do not require a vocabulary check.
The file will include authors (06,07) and journals (09).
7. Search Input.-The search input shall use an
80 column card format that is shown in Figure 3.
Columns 1 to 4 are fixed and other columns are free
form with punctuation.
8. Document List.-This is a temporary file used
in the text retrieval process. It will consist of APTIC
identification numbers of all documents that have met
the search criteria.
9. Report Tape.-The report tape consists of all
selected text material (data types 035, 036, and 045)
that has been sorted into appropriate reporting se-
quences. The format will be consistent with the present
report tape and may be used as input to many of the
current APTIC programs written in COBOL.
10. Selected Documents.-Documents that have
been searched and retrieved according to the presented
criteria will be formatted and printed according to present
standards.
11. Edit and File Update System.—This subsystem
consists of six individual COBOL programs and several
utility sorts.
12. Search System.-These two COBOL programs
will provide the Boolean logic necessary to effect the
descriptor search of the data files.
13. Text Retrieval System and Report
Program.-This subsystem consists of two new COBOL
programs and several of the publishing programs that
will be converted from the present APTIC system.
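The relationship between the dictionary, the table of contents, and the search file described in items 2, 3, 5, and 12 can be illustrated with a short sketch. It is written in Python for readability, not in the COBOL used by TENIS, and it is not the TENIS implementation: the class and function names, the in-memory sets standing in for direct access files, and the sample terms are all assumptions made for illustration.

```python
"""Sketch of a vocabulary-checked inverted search file (hypothetical names)."""

MAX_APTIC_NUMBER = 200_000  # range given in the paper for the table of contents


class TableOfContents:
    """One bit per APTIC number; a set bit means the number is in use."""

    def __init__(self, size=MAX_APTIC_NUMBER):
        self.size = size
        self.bits = bytearray(size // 8 + 1)

    def mark(self, number):
        if not 1 <= number <= self.size:
            raise ValueError(f"APTIC number out of range: {number}")
        self.bits[number // 8] |= 1 << (number % 8)

    def is_used(self, number):
        return bool(self.bits[number // 8] & (1 << (number % 8)))


def add_document(search_file, dictionary, toc, number, descriptors):
    """Post a document under each descriptor after the vocabulary check."""
    unknown = [term for term in descriptors if term not in dictionary]
    if unknown:
        raise KeyError(f"terms not in dictionary: {unknown}")
    toc.mark(number)
    for term in descriptors:
        search_file.setdefault(term, set()).add(number)


def search_and(search_file, *terms):
    """Documents posted under every one of the terms."""
    postings = [search_file.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()


def search_or(search_file, *terms):
    """Documents posted under at least one of the terms."""
    return set().union(*(search_file.get(term, set()) for term in terms))


def search_not(search_file, candidates, term):
    """Candidates that are not posted under the term."""
    return set(candidates) - search_file.get(term, set())
```

Here the year descriptor would be stored with its P prefix (for example `P1973`), as the search file description states. The result of a search is the set of APTIC numbers that item 8 calls the document list.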
-------
[Diagram. Recoverable box labels: Edit & File Update System; Secondary Search File; Text Retrieval System; GPO; Report Program; Selected Documents.]
Figure 1
TENIS System Overview
-------
ID    DATA ITEM                    MAXIMUM NUMBER
01    APTIC Number
05    Method of Support
06    First Personal Author
07    Other Authors                35
09    Title of Publication
10    Year of Publication
13    Language of Publication
14    Translated
16    Primary Category
17    Secondary Categories
18    Document Attributes          5
19    Indexing Descriptors
35    APTIC Number and Authors
36    Bibliographic Citation
45    Abstract
[Receiving data file columns: Text (7 items marked), Search (7 items marked), Secondary Search (4 items marked). The alignment of the marks with the data items is not recoverable from this copy.]
Figure 2
Data Items To Be Accepted
by TENIS
-------
1. General Format
Column 1 to 2 - A two character search identification number which occurs on every card for a search and
which identifies the output.
Column 3 to 4 - Card Type
    010 to 015    Title Card                            1 to 6 occurrences
    020 to 025    Control Card                          0 to 6 occurrences
  * 300 to 599    APTIC Number Search                   0 to 300 occurrences
  * 100 to 299    Boolean Search                        0 to 200 occurrences
    600           Reserved for Secondary File Search
  * When used as a sequence number, it must be unique and ascending.
Columns 5 to a column containing a $ - Control Information
Column containing $ to column 75 - Comments not processed by Search Program

COL.    1-2          3-5          6 to C$      C$ to 76     77-80
USE     search ID    card type    control      comments     Reserved

2. Title Card
From 0 to 71 characters of title information, to be used on printed report. Printed in order of
occurrence for up to 3 lines. The title card may not contain comments (if included, they will be
printed as part of title).
3. Control Card
Fields are identified by key term and concluded by ';'. Preset value is used if not entered.
a. Maximum Number of Documents
   Key is M
   Form is M=NNNNNN;
   Preset is M = 1000;
b. Output Control
   Key is P
   Form is P = A, B, T, N, D, any combination which is meaningful.
   D - Text only (eliminate 35 & 36)
   A - Segment 35 & 36
   B - Segment 35 & 36 & 45
   T - Text Record
   N - Document Number Only
   Preset is P = B
c. Sort Key
   Key is S
   Form is S=SN, SN, SN;
   N=D - Document Number
   N=J - Journal Title (Ascending Only)
   N=A - Author Title (Ascending Only)
   N=Y - Year
   S=A - Ascending
   S=D - Descending
   Preset is S = DY, AD
Figure 3
Search Control Input Format
Figure 3 (Page 2)
d. Date of Entry
   D = MMDDYY; everything after the date
   D = -MMDDYY; everything up to the date
   D = MMDDYY - MMDDYY; between these two dates but not including them
   Preset is D = 000000 - 123199
e. Report Number
   Key is R
   Form is R=NN;
   Preset is R=01;
Note: Control cards are continued by a ';'.
If all presets are acceptable, a control card is not required.
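The control card rules above (keyed fields ended by ';', presets used for anything not entered, comments introduced by '$') can be sketched in a few lines. The sketch below is in Python, not in the COBOL of the Search Program, and it rests on several assumptions: that the first five columns carry the search identification and the three-digit card type, as in the sample cards `AA010` and `AA100`; that the date key is D; and that values are kept as uninterpreted strings. The function name is hypothetical.

```python
"""Sketch of control card parsing with preset values (hypothetical names)."""

# Presets as listed in Figure 3; the date key "D" is an assumption.
PRESETS = {
    "M": "1000",           # maximum number of documents
    "P": "B",              # output control
    "S": "DY,AD",          # sort key
    "D": "000000-123199",  # date of entry range
    "R": "01",             # report number
}


def parse_control_card(card):
    """Return (search_id, settings) for one control card image."""
    search_id, card_type = card[:2], card[2:5]
    if not "020" <= card_type <= "025":
        raise ValueError(f"not a control card: type {card_type!r}")
    body = card[5:76]                # control information and comments
    body = body.split("$", 1)[0]     # text after '$' is a comment
    settings = dict(PRESETS)         # start from the presets
    for field in body.split(";"):
        field = field.strip()
        if not field:
            continue
        key, separator, value = field.partition("=")
        key = key.strip().upper()
        if not separator or key not in PRESETS:
            raise ValueError(f"unrecognized control field: {field!r}")
        settings[key] = value.replace(" ", "")
    return search_id, settings
```

For example, `parse_control_card("AA020M=50; S=AY; $ oldest first")` would override the document limit and sort key and leave the output control, date range, and report number at their presets.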
4. Document Number Card
Six digit numbers separated by ';'. Space may occur only between ';' and first digit; and leading zeros
are not required.
Ranges may be selected by placing a hyphen between two numbers.
Example: 5178; 06034; 5178-06034;
5. Search File
Combinations of operators
   ( - open parenthesis
   ) - close parenthesis
   ? - end of search
   : - end of Boolean search
   & - and
   * - or
   _ - not (underline)
   operands selected from the dictionary
   $ - truncation
   ; - document or year range selection;
Range selections are made using:
   EQ - Equal
   GT - Greater than
   GE - Greater than or Equal
   LT - Less than
   LE - Less than or Equal
A document range is selected by 'DOC' and year by 'YR'.
The format is:
   ;DOC GT 5000 LT 10000;
   ; YR EQ GT;
The second condition is not required.
Note: Two range tests may not be combined with themselves.
Acceptable - R1 & A & R2
Not Acceptable - (R1 & R2) & A
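The range selection described in Figure 3 (a field name, DOC or YR, followed by one or two comparison tests) can be sketched as follows. This is a Python illustration only, under stated assumptions: the field names and operator codes are taken from the figure, while the function names and the dictionary used to represent a document are hypothetical, and the original Search Program was written in COBOL.

```python
"""Sketch of the DOC / YR range selection test (hypothetical names)."""
import operator

# Operator codes from Figure 3.
OPERATORS = {
    "EQ": operator.eq,
    "GT": operator.gt,
    "GE": operator.ge,
    "LT": operator.lt,
    "LE": operator.le,
}


def parse_range_selection(text):
    """Parse ';DOC GT 5000 LT 10000;' into ('DOC', [(op, 5000), (op, 10000)])."""
    tokens = text.strip().strip(";").split()
    # One test needs 3 tokens; the optional second condition makes 5.
    if len(tokens) not in (3, 5) or tokens[0] not in ("DOC", "YR"):
        raise ValueError(f"bad range selection: {text!r}")
    tests = []
    for name, value in zip(tokens[1::2], tokens[2::2]):
        if name not in OPERATORS or not value.isdigit():
            raise ValueError(f"bad range test: {name} {value}")
        tests.append((OPERATORS[name], int(value)))
    return tokens[0], tests


def matches(document, selection):
    """True when the document passes every test in the selection."""
    field, tests = selection
    key = "number" if field == "DOC" else "year"
    return all(compare(document[key], bound) for compare, bound in tests)
```

A search such as `ACROLEIN & ; YR GE 1973 ; ?` would then combine the descriptor result with `parse_range_selection("; YR GE 1973 ;")` applied to each candidate document.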
-------
TENIS SYSTEM FLOW CHART
Attachment A(1)
MAINTENANCE SUBSYSTEM
[Flow chart. Recoverable box labels: Code translate & convert, edit content & limits; Dictionary and table of contents edit and format output; Inverted file update & key update in the dictionary file.]
-------
SEARCH AND REPORT
SUBSYSTEM
Attachment A(2)
[Flow chart. Recoverable box labels: Search; Errors; Sort 1; Text Search 2; Finds; Sort; Report Programs 2; Report.]
-------
DICTIONARY UPDATE
Attachment A(3)
[Flow chart. Recoverable box labels: Update cards; Term no.; Parameter card; Called by update pgm; Dictionary listing; Dict. load; Dict. mass-storage.]
-------
Attachment B(l)
APTIC Using TENIS
The current implementation of APTIC includes 2800 terms in the
dictionary and over 60,000 documents in the master text file. The
master file is growing at over 700 documents per month.
Three methods may be used for obtaining results from APTIC:
1. Call Mr. /lalpth*s group as in the past - and wait for
results via mail.
2. Use terminal and review Citation File directly.
3. Use terminal - review Citation File and submit search for
results to be returned by mail .
A sample terminal session follows.
06 NOV 74 10:54:0
SEARCH:
>AA010 INITIAL ENTRY
>AA100 ACROLEIN?
NUMBER OF DOCUMENTS SELECTED = [illegible]
INSTRUCT?
>Y
[instruction lines largely illegible; options G, R, and T are listed]
DOCUMENTS TO BE DISPLAYED = [illegible]
APTIC NO = [illegible]
VENGERSKAYA, KH. YA., [remainder of citation illegible]
-------
Attachment B(2)
APTIC NO = [illegible]
[citation largely illegible; marked TRANSLATED]
APTIC NO = [illegible]
CAMPBELL, [remaining authors largely illegible]
INHALATION TOXICITY OF THE AIR POLLUTANT PEROXYACETYL NITRATE:
[remainder of title illegible]. NATIONAL AIR
POLLUTION CONTROL ADMINISTRATION, [remainder of citation illegible]
[prompt illegible]
>AA010 ACROLEIN WITH LIMITING PARAMETER
>AA100 ACROLEIN & ; YR GE 1973 ; ?
NUMBER OF DOCUMENTS SELECTED = [illegible]
>N
[prompt illegible]
>B
SUBMITTED TO BATCH COLLECTOR
NEW SEARCH? >@FIN
[run accounting summary illegible]
-------
THE CONVERSION OF CHESS AND OTHER SYSTEMS
By Andrea Kelsey, Gene R. Lowrimore,* and Jane Smith
THE CHESS PROGRAM
The CHESS research program is a program of
epidemiological studies designed to determine the effects
of elevated levels of air pollutants on human health. This
program provides the bulk of the information upon
which current ambient air quality standards are based.
The data processing problem presented by CHESS is the
collection and analysis of data on certain disease
symptoms, data on demographic characteristics, and
data on air pollutant concentrations, from which esti-
mates of the effects of air pollution on health are
computed.
Some background on the evolution of the CHESS
program will explain the situation that prevailed when
the selection of the Univac 1110 system was announced.
The CHESS program was conceived about four years ago
as a series of replicated studies in several metropolitan
areas. When originally planned, these studies were to be
conducted independently. Replication was planned from
area to area as well as from time to time in the same
area. One replication of the study in an area is referred
to as a round. Several indicators of human health were
measured:
Chronic Respiratory Disease (CRD)
Acute Respiratory Disease (ARD)
Pulmonary Function (PFT)
Acute Episodes
Asthma Panel
Aggravation of Chronic Symptoms (Adult
Panels)
Lower Respiratory Disease (LRD).
By the end of the first year, researchers realized
that: (1) ARD panelists should only be selected from the
CRD population, (2) data on children in the PFT study
should be related to the CRD data for the family from
which they came, and (3) any other possible linkages
between studies should be made. In this way information
could be shared between studies and all available
information relevant to the analysis could be obtained.
*Speaker
A short time later an additional extension was
made to require us to link together the data of those
who participated in more than one round. The extension
was made to enable researchers to block out constant
person-to-person differences when making comparisons
over time. These latter changes in the CHESS program
were made in the summer of 1972. Figure 1 shows the
relationships between the various CHESS studies for a
single round. Partial or total overlapping of clouds
implies that data collected on individuals in both studies
are linked together. With the exception of CRD, one
round of all studies is conducted each year. One round
of CRD is done every three years.
The data processing system was redesigned to
handle the CHESS concept as it existed in the summer
of 1972. Two parallel efforts were thus begun, one to
implement the redesigned system, and the other to make
past data conform in format and structure to those of
the redesigned system.
THE SYSTEMS DESIGN
For the purposes of this paper, the CHESS data
processing system is composed of several smaller sub-
systems, each of which handles one of the health indi-
cators mentioned earlier. A typical subsystem includes
processes to:
Edit background questionnaires
Establish Master Files
Print optical mark reader forms
Read batches of mark reader forms
Edit (detect errors) data files of periodic
responses
Purge (correct errors) data files of periodic
responses
Combine all periods of responses to update
Master Files.
All these operations are standard business-type data
processing applications, and they are best done in
COBOL. After the Master Files have been updated, they
are used in large-scale statistical analyses. Standard
packages are used, such as Statistical Package for the
Social Sciences (SPSS) and Statistical Analysis System
(SAS), along with other user-written packages and
routines which are considered less standard. These user-
written programs are almost always written in
FORTRAN.
Master Files are also used to update what is called a
Linkage File. The Linkage File contains all the informa-
tion collected in all rounds on an individual or a family,
as the case may be. This file will also be used for statisti-
cal analyses of health status changes over time, but no
statistical analyses have actually been done on the
Linkage File.
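The idea behind the Linkage File, gathering everything collected on one person across studies and rounds so that constant person-to-person differences can be blocked out, can be shown with a small sketch. It is in Python for illustration and is not the CHESS implementation, which used COBOL and FORTRAN; the record layout, the `person_id` key, and the `fev` measurement name are assumptions.

```python
"""Sketch of a linkage file keyed by person, study, and round (hypothetical)."""


def update_linkage(linkage, study, round_number, master_records):
    """Add one round of a study's Master File records to the linkage file."""
    for record in master_records:
        person = linkage.setdefault(record["person_id"], {})
        person.setdefault(study, {})[round_number] = record
    return linkage


def paired_differences(linkage, study, field, first_round, second_round):
    """Per-person change in a field between two rounds.

    Only people present in both rounds contribute, so each difference is
    taken within one person and constant personal effects cancel out.
    """
    differences = {}
    for person_id, studies in linkage.items():
        rounds = studies.get(study, {})
        if first_round in rounds and second_round in rounds:
            differences[person_id] = (
                rounds[second_round][field] - rounds[first_round][field]
            )
    return differences
```

A statistical analysis of health status changes over time would then operate on the per-person differences instead of on the raw values from each round.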
Figure 2 shows an overall system flow indicating
the language resources used. Figure 3 is a more detailed
description of the subsystem for performing the data
processing for the Acute Respiratory Disease indicator.
Manual processes are necessary to complete the sub-
system but are not shown.
IMPLEMENTATION
Through a work agreement with the General
Services Administration (GSA), the Human Studies
Laboratory has access to programming and data clerical
services furnished by Data Processing Associates (DPA),
a GSA contractor. To implement the system, the
Laboratory gave DPA a written specification for each
process, and DPA delivered a tested program to perform
the process along with complete program documenta-
tion. The Laboratory maintained control by requiring
DPA to obtain approval at several steps in the develop-
ment process.
The implementation of the redesigned system had
been in progress for about seven months when it became
necessary to begin the conversion effort. Laboratory
project commitments also required the continuation of
the development of the IBM version of the system.
CONVERSION PLAN
The following conversion plan was announced
concurrently with the procurement announcement of
the Univac 1110 computer system:
Step 1: The user would submit source code, sys-
tems flow charts and sample run streams for all programs
which the user wanted to have converted.
Step 2: The user would submit file descriptions of
all files to be converted.
Step 3: The user would submit test data and
sample results for the program as run on the
IBM 360-50.
Step 4: The contractor would submit complete
programs and successful test runs to the user for
concurrence.
Step 5: Data Systems Division would convert all
user production files. This promise was later withdrawn.
No direct contact between the user and the converting
contractor was planned. This procedure was expected to
provide the major means for converting user systems
from the IBM 360 system to the Univac 1110 system.
For reasons discussed below, this plan did not work for
the CHESS system. Yet, it seemed to work fairly well
for other systems in which the Laboratory has an
interest, such as SAS.
CONVERSION, GOOD AND BAD
It has been concluded that contracted conversion
for CHESS did not work well for two reasons:
The CHESS system, like many others, is a
changing thing which refuses to remain con-
stant long enough for someone to convert it.
The users all tend to underestimate how much
a system depends upon the environment in
which it was developed.
The Laboratory staff submitted about 60,000 lines
of source code to be converted under the conversion
plan. Between the time of submission of these programs
and the time they were ready for checkout, changed functional
requirements, program modifications, and incompatible
architectural changes made by the converting contractor
made it possible to test only about 10,000 lines of
source code. The remainder were accepted in the clean-compile
state.
After consideration of the above events, the staff
became skeptical that "turnkey" conversion is even
possible for the majority of systems. Any application
system has to be thought of as being embedded in an
operating environment. The scope of the system is
determined in part by this environment. Capabilities
essential to the successful operation of the system are
provided by the environment in one operating system
but would have to be part of the application system in
another. Cases in point are the ability to concatenate
files through Job Control Language and the ability to
specify blocking factors at the execution time. Both
these features are available in the IBM system and absent
in the Univac system. Suppose that the files labeled F7
in Figure 3 had been created using several different
blocking factors and had all been converted
using the original blocking factors. There would then be
no way that the next process could be run successfully
for all these valid inputs. A process also has to be inserted
to do the concatenation. In IBM COBOL there is
a utility assignment which can be either tape or disk,
specifiable at run time. Univac COBOL does not have
this capability. The person converting the program is
forced to make this decision although it is obviously a
systems decision and is greatly influenced by the policy
decision made by central site personnel.
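The extra process mentioned above, one that concatenates input files and brings them to a single blocking factor, might look like the following sketch. It is written in Python as an illustration only; the function names are hypothetical, in-memory byte streams stand in for tape or mass storage files, and on such streams a "block" is simply the amount read or written at one time.

```python
"""Sketch of concatenating files written with different blocking factors."""
import io


def read_records(stream, record_length, blocking_factor):
    """Yield fixed-length records from a stream read one block at a time."""
    block_size = record_length * blocking_factor
    while True:
        block = stream.read(block_size)
        if not block:
            return
        if len(block) % record_length:
            raise ValueError("block is not a whole number of records")
        for start in range(0, len(block), record_length):
            yield block[start:start + record_length]


def concatenate(inputs, record_length, output_factor, output):
    """Copy every record from (stream, blocking_factor) pairs to output.

    Records are rewritten in blocks of output_factor records, so the next
    process sees one file with one fixed blocking factor.
    """
    pending = []
    count = 0
    for stream, factor in inputs:
        for record in read_records(stream, record_length, factor):
            pending.append(record)
            count += 1
            if len(pending) == output_factor:
                output.write(b"".join(pending))
                pending.clear()
    if pending:
        output.write(b"".join(pending))  # final short block
    return count
```

Fixing `output_factor` once, at design time, is the kind of decision the paragraph above says the Univac environment forces on the person converting the program.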
Following the decision of the Laboratory staff to
perform the conversion of CHESS, the system was re-
designed to account for the differences in Univac and
IBM environments; these changes were translated into
modifications for existing programs and specifications
for new programs. No attempt was made to optimize the
design of the system with respect to the Univac environ-
ment. The staff made all blocking factor and device type
decisions and began to convert existing files to Univac
media using these factors; they also assigned the con-
version of the programs to the Data Processing
Associates (DPA) and monitored the effort according to
methods employed in any other program development.
Since then the conversion of CHESS to the Univac has
gone smoothly. Of course, there are problems, such as a
bug in ASCII COBOL, which prevents the conversion of
the variable blocked files. Comprehensive training was
made available to the users by the Data Systems Division
and was one of the big factors which enabled the staff to
confidently assume the conversion effort.
Some of the problems encountered in the conversion
of CHESS are evident even for the conversion of
SAS which, except for some delays, has gone well. While
the official goal for the conversion of SAS was that all
features available in the IBM version would also be
available in the Univac version, fairly restrictive require-
ments were, in fact, placed on input by the converting
programmers. The input now must be either cards or a
catalogued tape or disk file written one record per block
in unformatted FORTRAN. A record may not exceed
3200 characters. The IBM version had a record length
restriction of 32000 characters and accepted files with
any blocking factors. Also, IBM SAS provided its own
sort routine for small sorts and interfaced with the
system sort utility for large sorts. The Univac version has
no interface with a system sort utility. Of course, many
of these restrictions are made inevitable by the Univac
operating environment, but had these decisions not been
left to the discretion of the converting programmer,
there would almost certainly be fewer input restrictions
in the Univac version.
INCREASED CAPABILITIES
Increased capabilities resulting from the design
changes are as follows:
Fortran V is more powerful than Fortran IV,
particularly in the area of string manipulation.
The extra precision provided by the longer
word of Fortran V in many cases would obvi-
ate the use of double precision.
Run stream construction is somewhat simpler,
although this may be partially due to the decreased
flexibility.
Management of program files is probably the
single most impressive feature of the
Univac 1110 system. Source programs, run
streams, and executable programs, are all
easily established as program files. Powerful
software is available to edit these files. Source
decks will probably no longer be used except
as a means to establish the initial version of a
file. Up to five updates of a given file can be
kept and easily referenced.
DECREASED CAPABILITIES
Decreased capabilities resulting from the design
changes are as follows:
The Univac System is very weak in utility
support such as copy routines for data files
and stand-alone sorts. Particularly if one
writes in COBOL and it is not feasible to use
internal sort capabilities, programs may be
built just to perform the sort steps. There is a
file dump program called CPDMPH but it does
no formatting and is useful only as a last
resort.
Many decisions which were previously made at
run time, must now be made at the time the
program is designed. Some of these decisions
concern: (1) the number of input files for a
given external name, (2) the blocking factor,
and (3) the storage device. Fixing blocking
factors at program design time will probably
be a blessing once the conversion period has
passed.
There is no adequate batch terminal support.
Univac batch terminals do not provide any
indication that a job was successfully entered.
Likewise there is no way to determine through
the batch terminal what the status of a job is.
The absence of such feedback has hampered
the process of learning to use the terminals.
Files written by different processors are
incompatible. In the Univac operating system
concept, much of the file management has
been delegated to the language processors.
Each language processor has its own way of
writing files. As a result, users must introduce
at least one more process into the converted
systems just to reformat the file for statistical
programs which follow.
Input formats of analysis packages are incom-
patible. The input to SAS is FORTRAN
unformatted binary. The input to SPSS is
FORTRAN formatted. These two packages
are not totally interchangeable in function,
and if both are needed, parallel files must be
provided.
Management of data files is inadequate.
Putting multiple files on tape, identifying
them, and using them seems to be an un-
reasonably difficult task.
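The need for parallel files can be illustrated with a minimal sketch in Python. The record layout (family id, week, symptom count), the FORMAT(3I6) text layout, and the big-endian 32-bit binary image are all assumptions made for illustration; the actual Univac 1110 FORTRAN record formats are not described in this paper.

```python
import struct

# Hypothetical records: (family_id, week, symptom_count). Not from the paper.
RECORDS = [(101, 2, 0), (101, 4, 3), (205, 2, 1)]

def to_formatted(records):
    """Fixed-width text lines, as a FORTRAN FORMAT(3I6) read would expect."""
    return "".join(f"{a:6d}{b:6d}{c:6d}\n" for a, b, c in records)

def from_formatted(text):
    """Parse the fixed-width text back into integer triples."""
    return [tuple(int(line[i:i + 6]) for i in (0, 6, 12))
            for line in text.splitlines()]

def to_binary(records):
    """Illustrative binary image: three big-endian 32-bit integers per record."""
    return b"".join(struct.pack(">3i", *r) for r in records)

def from_binary(blob):
    """Unpack the binary image back into integer triples."""
    return [struct.unpack_from(">3i", blob, off)
            for off in range(0, len(blob), 12)]
```

Both files carry the same information, so one reformatting step can produce either from a single master copy; the cost described above is that both copies must be written and kept in step.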
MIXED BLESSING
Using a large fixed mass storage device is a new
concept for the Laboratory staff. Instead of having
many small files on tape, they may now be stored on
mass storage as catalogued files. If the system runs out
of disk space, it rolls some of the less frequently refer-
enced files out to tapes, a process which is almost trans-
parent to the user. This concept greatly improves the
ability to perform production data processing. But in the
process of rolling files in and out, files have been lost.
Sometimes the latest good system backup is one week
old. The whole concept of file backup has to be recon-
sidered to keep from permanently losing data.
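A roll-out policy of this kind can be sketched as follows. This is only an illustration of the idea of moving the less frequently referenced files first; the function name, the catalogue structure, and the tie-breaking rule are assumptions, not the Univac 1110 algorithm.

```python
def choose_rollout(files, space_needed):
    """Pick files to roll out to tape, least-referenced first.

    files: dict mapping file name -> (size, reference_count)  (hypothetical catalogue)
    space_needed: amount of mass storage that must be freed
    Returns the list of file names selected, in roll-out order.
    """
    chosen, freed = [], 0
    # Sort by reference count, then by name so the result is deterministic.
    for name, (size, refs) in sorted(files.items(),
                                     key=lambda kv: (kv[1][1], kv[0])):
        if freed >= space_needed:
            break
        chosen.append(name)
        freed += size
    return chosen
```

For example, with files A (size 100, 5 references), B (300, 1) and C (200, 2), a request to free 400 units selects B and then C. Given the losses reported above, a production version would presumably confirm that the tape copy is readable before releasing the mass storage copy.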
CHANGES TO CHESS SYSTEM
One of the biggest factors in the development of
future systems is the availability of a powerful Data Base
Management System, System 2000, on the Univac 1110
System. Establishment of the Linkage File is being con-
sidered as well as the Master File shown in Figure 2 as
System 2000 Data Bases. Such a move would offer in-
creased flexibility in terms of responding to uncertain
requirements for statistical analyses. However, many
additional people, mostly analysis programmers, would
have to be trained in the use of System 2000 to allow
the change to be of any real benefit.
Several other design changes are being considered,
such as the on-line input and correction to data.
RECOMMENDATIONS
For similar conversions in the future, the following
approaches are recommended:
1. Those systems which are reasonably stable and
have a clearly-defined function, such as SAS, can be
turned over to a contractor to convert as a system. Even
for these systems, the primary communication channel
must be between the user and the contractor converting
the system. The user must approve all design changes
before they are implemented.
2. For other systems, the user must redesign the
system to be compatible with the new operating system.
A level-of-effort contract can be helpful to convert and
modify programs and to convert files. But the important
point is that this contract is an extension of the user and
not an alternative route for the conversion.
3. Put first priority on making the batch terminals
intelligent. This in itself would have a surprisingly posi-
tive effect on getting off the old machine and onto the
new.
-------
[Figure: box labels: Chronic Respiratory Disease, Lower Respiratory (CRD-LRD); Acute Respiratory Disease (ARD); Acute Episodes; Asthma Panels; Adult Panels; Pulmonary Function (PFT)]
-------
[Figure 2. Diagram labels: Inputs; Master File; COBOL; Linkage File; SPSS; SAS; FORTRAN; Analyses]
-------
1. This first step in the ARD system depends upon the CRD
Master File having been completed. The forms printed
are readable on a Sentry 70 Optical Mark Reader.
Interviewers visit the families selected and attempt
to enroll them in the study. Any listing errors in the
information about the family are corrected at that time.
2. The Master File has two functions. First it is
used to print all interview forms and secondly it
will contain all background and response information
for a family. Direct access is required because
interview forms are randomized and variable length
records are used because of the distribution of
family sizes.
3. All identifying information is printed on the form and
also slugged in machine readable format. Each
family is called every two weeks for 32 weeks.
4. The forms are read on a Sentry 70 Optical Mark Reader
to 9 track 1600 bpi EBCDIC tape.
5. Logical errors are spotted and recorded in file F6
along with each record somewhat reformatted. There
is one record per form.
6. The output F7 is the corrected file and is optional.
It is only created after the deck of corrections has
been cleaned up and there are no remaining errors
which could be corrected.
7. Again the output is optional. If there are any
discrepancies one more cycle back through Step 6
is done. All periods are batched together for this
process. When the data is as clean as is
reasonable, the output file is updated.
8. The records are reformatted to be consistent with
standard packages. It is also necessary to compute
certain intermediate values.
9. Canned packages are used wherever possible. Some
user written analysis programs are usually required.
10. The data for the current round is added to the data
for previous rounds. Information for families in the
study for more than one year is tied together before
storing.
Figure 3
Description of the Acute Respiratory Disease Subsystem
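The edit and correction cycle of Steps 5 through 7 can be sketched as below. The field names and the two checks are hypothetical; the only figure taken from the description is that a family is called every two weeks for 32 weeks, which gives 16 reporting periods.

```python
PERIODS = 32 // 2  # one call every two weeks for 32 weeks

def find_errors(record):
    """Return a list of logical errors for one form record (hypothetical checks)."""
    errors = []
    if not 1 <= record["period"] <= PERIODS:
        errors.append("period out of range")
    if record["members_ill"] > record["family_size"]:
        errors.append("more ill members than family members")
    return errors

def edit_pass(records):
    """Step 5: map record index -> errors, for records failing any check."""
    return {i: errs for i, r in enumerate(records) if (errs := find_errors(r))}

def apply_corrections(records, corrections):
    """Step 6: corrections maps record index -> replacement field values."""
    return [{**r, **corrections.get(i, {})} for i, r in enumerate(records)]
```

The cycle repeats until edit_pass returns an empty result, or until the remaining errors cannot be corrected, which corresponds to the point at which the output file is updated in Step 7.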
-------
PRESENT STATE OF EDP POLICY AND
DEVELOPMENT IN EPA*
By Theodore R. Harris
This article presents some items currently under-
taken by the Management Information and Data
Systems Division (MIDSD) of the Environmental Protec-
tion Agency (EPA).
AGENCY-WIDE EDP PLAN
The House of Representatives in Report
No. 93-1120 has given EPA a directive to conduct a
detailed Agency-wide study of its ADP requirements and
to report its findings to the House and Senate Appropri-
ation committees. No additional ADP equipment is to be
purchased until the study is completed. The milestones
for the plan are shown below. The completed plan is to
be submitted January 2, 1975.
Agency - Wide EDP Plan Development
8/ 7/74 Statement of work for assistance and con-
sultation in development of plan
9/ 1/74 Award of contract
12/11/74 Review draft of EDP Plan from contractor
12/28/74 Delivery of final EDP Plan from contractor
1/ 2/75 Submittal of EDP Plan to the Whitten
Committee
WASHINGTON COMPUTER CENTER PROJECT
This project was established to determine the ADP
requirements of the Agency. The milestones for the
project are shown below.
Washington Computer Center Project
†† 9/ 2/74 Feasibility Study, first draft
††10/ 1/74 Feasibility Study incorporating Tele-
communications, second draft (mail out
to members of Task Force)
††10/16/74 Fourth meeting of Task Force (to approve
study)
††11/20/74 EPA review and approval of study (Messner
and Alm)
1/ 2/75 Whitten approval of Feasibility Study
4/ 1/75 Development of RFP
5/ 1/75 EPA review and approval of RFP
6/ 1/75 GSA review and approval of RFP
7/ 1/75 Issue RFP to public
10/ 1/75 Question and answer period, receipt of
proposals
5/ 1/76 Review and evaluation of proposals
8/ 1/76 Cost evaluation and audit of proposals
10/ 1/76 Selection and award
4/ 1/77 Delivery and acceptance of equipment/
service
7/ 1/77 Test and acceptance
11/ 1/77 Conversion of work load (vendor dependent
from 6 to 18 months)
INTERIM EDP RESOURCES PROJECT
The purpose of this project is to procure an interim
computer resource to fill the gap between Optimum
Systems Inc. (OSI) and the Washington Computer
Center. The milestones for this project are shown below.
Interim EDP Resources Project
8/ 5/74 Decision to establish project
12/ 1/74 Development of RFP
* Dates have been updated as of December 5, 1974.
† Project has been inactivated. The Interim EDP Resources Project has taken precedence to provide computer resources by July 1975.
The Washington Computer Center Project will be activated after conversion to Interim Computer Resource.
†† Milestones were not met.
-------
1/ 2/75 EPA review and approval of RFP
2/ 1/75 GSA review and approval of RFP
2/15/75 Issue RFP to public
4/ 1/75 Question and answer period, receipt of
proposals
5/ 1/75 Review and evaluation of proposals
6/ 1/75 Cost evaluation and audit of proposals
7/ 1/75 Selection and award
11/ 1/75 Delivery and acceptance of equipment/service
12/ 1/75 Test and acceptance
4/ 1/76 Conversion of work load
STANDARD EDP COMMUNICATING TERMINAL
PROJECT
The goal of this project is to standardize types of
terminals and procurement of the terminals for the
Agency. Preliminary specifications were sent out in June
with responses coming back in late July. The milestones
are shown below. As of October 1, issue of the RFP was
imminent.
Standard EDP Communicating Terminal Project
8/15/74 EPA/MIDSD review and approval of RFP
9/24/74 GSA review and approval of RFP
12/18/74 Issue RFP to public
2/ 1/75 Question and answer period, receipt of
proposals
3/ 1/75 Review and evaluation of proposals for all
Categories I, II, III and IV
4/ 1/75 Selection and award Categories I and II
5/ 1/75 Benchmark period for Categories III and IV
6/15/75 Selection and award Categories III and IV
SUMMARY OF TERMINAL DESCRIPTION
Category  Description
I.A       Low-speed typewriter style
          Quality impact print
          Off-line text editing
I.B       Low-speed typewriter style
          General purpose portable
I.C       Low-speed typewriter style
          General purpose nonportable
II.A      Display terminals
          General purpose A/N display
II.B      Display terminals
          Graphics
III.A     Medium-speed remote job entry terminal
III.B     High-speed remote job entry terminal
IV.A      Remote job entry with concurrent processing capabilities
          Data entry oriented
IV.B      Remote job entry with concurrent processing capabilities
          Scientific oriented
EDP DATA COMMUNICATIONS NETWORK
A feasibility study contract was awarded to deter-
mine EPA network feasibility. The study has been
completed. A common network with a design linking all
users to the two major computer resources of the
Agency has been recommended.
Figure 1 shows the EPA network WATS telephone
service. Figure 2 shows the multiplexor placement.
Figure 3 shows a typical multiplexor configuration.
Figure 4 shows a functional representation of the
network. The milestones of the project are shown below.
-------
EDP Data Communications Network
6/10/74 Date of contract award for "Feasibility of
an EPA Data Communications Network"
9/18/74 Delivery of Data Communications Feasi-
bility Study Report
12/ 1/74 Initiation of Phase II, produce RFP specs
12/15/74 Delivery of RFP from contractor
1/ 2/75 EPA review and approval of RFP
1/ 2/75 Whitten approval of feasibility study
2/ 1/75 GSA review and approval of RFP
2/15/75 Issue RFP to public
4/ 1/75 Question and answer period, receipt of
proposals
5/ 1/75 Evaluate proposals
6/ 1/75 Cost evaluation and audit of proposals
7/ 1/75 Selection and award
11/ 1/75 Delivery and acceptance of equipment/
service
12/ 1/75 Test and acceptance
4/ 1/76 Conversion of work load
-------
-------
-------
[Figure: multiplexor configuration diagram. Labels: High Speed Channel; Modem; Network Communications Controller (Front-end); Medium Speed Modem; Texas; Washington, D.C.]
-------
[Figure: Remote EPA Multiplexed Users. The diagram groups the cities serviced by channel type (9600 baud channels (14), a 7200 baud channel (1), and medium speed channels) feeding the remote Network Communications Controller in Washington, D.C., together with local telephone service. Cities legible in the original: Seattle, Denver, Chicago, San Francisco, Portland, Ore., Corvallis, Ore., Las Vegas, Athens, Ga., Atlanta, RTP, Cincinnati, Philadelphia, Grosse Ile, Mich., New York City, Boston, Narragansett, R.I., Rochester, N.Y., Roseville, Minn., Dallas, Madison, Wis., Jackson, Miss., Kansas City, Mo.]
-------
STATUS OF THE WASHINGTON COMPUTER CENTER PROCUREMENT
By Denise Swink
The Washington Computer Center (WCC) Task
Force was established by the Office of Planning and
Management to evaluate the need for and the resources
required to implement and maintain an Agency-wide
consolidated ADP facility. The membership of the task
force is composed of a representative from each of the
Assistant Administrators' and Regional Administrators'
offices.
The first meeting of the WCC Task Force was held
February 4, 1974, to review the General Electric (GE)
Study for content and acceptability. The GE Study de-
fined the Agency's data processing requirements and
recommended alternative courses of action to support
the requirements. Members were to produce a response
to the GE Study by the next meeting in terms of
selecting an alternative, planning necessary personnel
support for the alternative, and planning for additional
ADP funding to support the alternative.
The second meeting of the WCC Task Force was
held March 15, 1974. At that time, the consensus of
opinion was that the GE Study should be rejected and
that a new feasibility study should be developed by the
Task Force. Reasons for eliminating the GE Study from
consideration and evaluation were based on the facts
that the study was out-of-date (two years old at the
time) and that the workload information presented in
the study was inaccurate. Consequently, a Validation
Committee was established to perform an Agency-wide
workload survey.
The third meeting of the WCC Task Force was held
April 29, 1974. The Validation Committee presented the
findings of the workload survey noting that at the time
they had only a 50 percent response. A revalidation
cycle was suggested by the committee to complete the
survey and correct any erroneous data. Michael Springer,
Chairman of the WCC Task Force, then established five
working groups with functions as follows:
1. General coordination of other groups' activities
and higher-management interaction.
2. Workload, Expansion, and Feasibility
Study-production of a formal document (WCC Feasi-
bility Study) which defines the Agency's data processing
workload requirements now and in the future, and rec-
ommends a course of action to support the
requirements.
3. Hardware, Software, and Telecommunications-
production of RFP specifications to implement action
recommended in the WCC Feasibility Study in the areas
of hardware, software, and telecommunications.
4. Facilities Management and Security-
production of RFP specifications to implement action
recommended in the WCC Feasibility Study in the areas
of facilities management and security.
5. Conversion and Benchmark-production of
RFP specifications to convert and benchmark the EPA
workload on the action recommended in the WCC
Feasibility Study.
The fourth meeting of the WCC Task Force hinges
on the completion of the WCC Feasibility Study since
three of the working groups cannot produce documents
until the results of the feasibility study are available.
The Workload, Expansion and Feasibility Study
Group began meeting May 16, 1974. After completing
the Validation Survey, the group designed, developed,
and wrote the Feasibility Study. The group met through
July at three-week intervals, five days in August, and
four days in September. The final draft of the Feasibility
Study is now in the review process by the group. The
study includes the following:
Definition of present services including
workload data and reasons for using a par-
ticular service
Definition of problems with present service
arrangements
Definition of requirements
Definition of feasible alternatives
Analysis of alternatives (technical, managerial,
and cost)
Recommended action.
-------
It should be noted that the Washington Computer
Center was considered a misnomer by the group since
there is no requirement that the facility for the im-
plementation of possible alternatives be physically
located in the Washington, D.C. area.
The WCC Feasibility Study along with the Tele-
communications Feasibility Study (directed by Manage-
ment Information and Data Systems Division), and the
EDP Plan required by Congress will be used by higher
management to procure, beginning in fiscal year 1978,
the ADP services required to support EPA programs for
a life cycle of eight to ten years.
-------
APPENDIX A
LIST OF ATTENDEES
Allen, Ralph G.
Senior Programmer
D. P. Associates
Grosse Ile Laboratory
National Environmental Research Center
9311 Groh Road
Grosse Ile, Michigan 48138
Barrow, David R.
Systems Programmer
Surveillance and Analysis Division
Southeastern Environmental Research Laboratory
Environmental Protection Agency
College Station Road
Athens, Georgia 30601
Bliss, James D.
Environmentalist
Monitoring Operations Laboratory
National Environmental Research Center
Environmental Protection Agency
P.O. Box 15027
Las Vegas, Nevada 89114
Borthwick, Patrick W.
Biologist/ADP Coordinator
Gulf Breeze Environmental Research Laboratory
Gulf Breeze Environmental Research Laboratory
National Environmental Research Center
Environmental Protection Agency
Sabine Island
Gulf Breeze, Florida 32561
Broadway, Jon A.
Supervisor, Computer Services Section
Oil and Special Materials Division, Office of Radiation Programs
Environmental Protection Agency
P.O. Box 3009
Montgomery, Alabama 36109
-------
Brooks, Dorothy
Computer Programmer
D. P. Associates
Grosse Ile Laboratory
National Environmental Research Center
9311 Groh Road
Grosse Ile, Michigan 48138
Bryan, Sam D.
Mathematician
Clinical Studies Branch
Human Studies Laboratory
National Environmental Research Center
Environmental Protection Agency
Research Triangle Park, North Carolina 27711
Budde, Bill
Chief, Advanced Instrumentation Section
Advanced Instrumentation Section
Methods Development and Quality Assurance Research Laboratory
National Environmental Research Center
Environmental Protection Agency
Cincinnati, Ohio 45268
Burton, Judy K.
Computer Programmer
Laboratory Services Branch (PNERL)
National Environmental Research Center
Environmental Protection Agency
200 S.W. 35th Street
Corvallis, Oregon 97330
Byram, Kenneth V.
Computer Specialist
Laboratory Services Branch (PNERL)
National Environmental Research Center
Environmental Protection Agency
200 S.W. 35th Street
Corvallis, Oregon 97330
-------
Bystroff, Roman I.
Chemist
Lawrence Livermore Laboratory
Lawrence Livermore Laboratory
P.O. Box 808, L-404, Chemistry Department
Livermore, California 94550
Cline, David M.
Computer Systems Analyst
Southeast Environmental Research Laboratory
Southeast Environmental Research Laboratory
National Environmental Research Center
Environmental Protection Agency
College Station Road
Athens, Georgia 30601
Conger, Charles S.
Chief, Information Access and User Assistance Branch
Monitoring and Data Support Division
Office of Water and Hazardous Materials (AW-453)
Environmental Protection Agency
Washington, D.C. 20460
Dell, Robert
Computer Engineer
Central Regional Laboratory
Region V-CRL
Environmental Protection Agency
1819 Pershing Road
Chicago, Illinois 60609
Dipert, Merlin H.
Chief, Data Systems Branch, Management Division
Data Systems Branch, Management Division
Region V
Environmental Protection Agency
1 North Wacker Drive
Chicago, Illinois 60606
-------
Fairless, William
Deputy Director and Chief, Chemistry Branch
Central Regional Laboratory
Region V-CRL
Environmental Protection Agency
1819 W. Pershing Road
Chicago, Illinois 60609
Feldmann, Richard
Computer Specialist
Computer Center Branch
National Institutes of Health
Division of Computer Research and Technology
Bethesda, Maryland 20014
Florence, Cecil E.
Chief, Data Processing Branch
Data Processing Branch, Management Division
Region VII
Environmental Protection Agency
1735 Baltimore
Kansas City, Missouri 64108
Friedland, Michael J.
Systems Analyst/Programmer
Data Services Branch, (TSL)
National Environmental Research Center
Environmental Protection Agency
P.O. Box 15027
Las Vegas, Nevada 89114
Gangler, James
Programmer/Analyst
Vitro Laboratories
Vitro Laboratories
14000 Georgia Avenue
Silver Spring, Maryland 20910
-------
Goldberg, Neal
Computer Programmer
Ecological Research Branch
National Marine Water Quality Laboratory
National Environmental Research Center
Environmental Protection Agency
P.O. Box 277
West Kingston, Rhode Island 02982
Greaves, John O. B.
Assistant Professor
Southeastern Massachusetts University
Department of Electrical Engineering
Southeastern Massachusetts University
North Dartmouth, Massachusetts 02747
Harris, Theodore R.
Computer Specialist
Management Information & Data Systems Division
Office of Planning and Management (PM-218)
Environmental Protection Agency
Washington, D.C. 20460
Hart, John J.
Chief, Systems Analysis and Programming Branch
Computer Services and System (OA)
National Environmental Research Center
Environmental Protection Agency
5555 Ridge Avenue
Cincinnati, Ohio 45268
Heller, Stephen R.
Computer Specialist
Management Information and Data Systems Division
Office of Planning and Management (PM-218)
Environmental Protection Agency
Washington, D.C. 20460
Hertz, Marvin
Chief, Systems Engineering Section
Bio-Environmental Measurement Branch
Human Studies Laboratory
National Environmental Research Center
Environmental Protection Agency
Research Triangle Park, North Carolina 27711
-------
Holsomback, Will F.
Computer Specialist
Surveillance and Analysis Division
Southeastern Research Laboratory
Environmental Protection Agency
College Station Road
Athens, Georgia 30601
Johnson, Maureen M.
Computer Specialist
Data Systems Division (OA)
National Environmental Research Center
Environmental Protection Agency
Research Triangle Park, North Carolina 27711
Johnson, Richard
Technical Information Specialist
Air Pollution Technology Information Center
Air Pollution Technology Information Center
Room 255
Chemstrand Building
Research Triangle Park, North Carolina 27711
Jurgens, Robert B.
Physicist
Regional Air Pollution Studies Branch
Meteorology Laboratory
National Environmental Research Center
Environmental Protection Agency
Research Triangle Park, North Carolina 27711
Kinnison, Robert R.
Mathematical Statistician
Monitoring Systems Analysis Staff (MSRDL)
National Environmental Research Center
Environmental Protection Agency
P.O. Box 15027
Las Vegas, Nevada 89114
Knight, John E.
Technical Information Specialist
Special Studies Staff
National Environmental Research Center
Environmental Protection Agency
Research Triangle Park, North Carolina 27711
-------
Kyle, Kirby D.
Physicist
Bio-Environmental Measurement Branch
Human Studies Laboratory
National Environmental Research Center
Environmental Protection Agency
Research Triangle Park, North Carolina 27711
Lackey, Curtis S.
Chief, Data Systems Branch
Data Systems Branch, Management Division
Region IV
Environmental Protection Agency
1421 Peachtree Street
Atlanta, Georgia 30309
Laurie, Vernon J.
Physical Science Administrator
Office of Monitoring Systems
Office of Research and Development (RD-687)
Environmental Protection Agency
Washington, D.C. 20460
Lowrimore, Gene R.
Chief, Data Processing Section
Biometry Branch
Human Studies Laboratory
National Environmental Research Center
Environmental Protection Agency
Research Triangle Park, North Carolina 27711
Madsen, Mark J.
Systems Analyst
Data Services Branch (TSL)
National Environmental Research Center
Environmental Protection Agency
P.O. Box 15027
Las Vegas, Nevada 89114
-------
Male, Larry M.
Supervisory Operations Research Analyst
National Ecological Research Laboratory
National Environmental Research Center
Environmental Protection Agency
200 S.W. 35th Street
Corvallis, Oregon 97330
McCarthy, William N., Jr.
Chemical Engineer
Office of Program Management
Office of Research and Development (RD-674)
Environmental Protection Agency
Washington, D.C. 20460
McGuire, John M.
Chief, Chromatography and Mass Spectrometry Section
Analytical Chemistry Staff
Southeast Environmental Research Laboratory
National Environmental Research Center
Environmental Protection Agency
College Station Road
Athens, Georgia 30601
Myers, Melvin L.
Center Staff Officer
Program Coordination Staff
National Environmental Research Center
Environmental Protection Agency
Research Triangle Park, North Carolina 27711
Nime, Edward J.
Director, Computer Services and Systems
Computer Services and Systems (OA)
National Environmental Research Center
Environmental Protection Agency
5555 Ridge Avenue
Cincinnati, Ohio 45268
-------
Ott, Wayne R.
Systems Analyst
Office of Monitoring Systems
Office of Research and Development (RD-687)
Environmental Protection Agency
Washington, D.C. 20460
Poole, Elijah L.
Computer Specialist
Management Information and Data Systems Division
Office of Planning and Management (PM-218)
Environmental Protection Agency
Washington, D.C. 20460
Richardson, William L.
Environmentalist
Large Lakes Branch
Grosse Ile Laboratory
National Environmental Research Center
Environmental Protection Agency
Rockwell, David C.
Acting Chief, Data Management Section
Surveillance and Analysis Division
Region V
Environmental Protection Agency
1 North Wacker Drive
Chicago, Illinois 60606
Rogers, Tommie L.
Chief, Hardware and Communications Management Section
Data Systems Division (OA)
National Environmental Research Center
Environmental Protection Agency
Research Triangle Park, North Carolina 27711
Schuk, Walter W.
Electronics Technician
Technology Development Support Branch
EPA-DC Pilot Plant
5000 Overlook Avenue S.W.
Washington, D.C. 20032
-------
Scotton, John W.
Technical Information Specialist
Office of Monitoring Systems
Office of Research and Development (RD-689)
Environmental Protection Agency
Washington, D.C. 20460
Shew, D. Craig
Research Chemist
Subsurface Environmental Branch
Robert S. Kerr Environmental
Research Laboratory
National Environmental Research Center
Environmental Protection Agency
P.O. Box 1198
Ada, Oklahoma 74820
Snelling, Robert N.
Chief, Data Services Branch
Data Services Branch (TSL)
National Environmental Research Center
Environmental Protection Agency
P.O. Box 15027
Las Vegas, Nevada 89114
Swink, Denise
ADP Coordinator
Office of Program Management
Office of Research and Development (RD-674)
Environmental Protection Agency
Washington, D.C. 20460
Teuschler, Jack
Electrical Engineer
Instrumentation Development Branch
Methods Development and Quality
Assurance Research Laboratory
National Environmental Research Center
Environmental Protection Agency
Cincinnati, Ohio 45268
-------
Tiffany, William C.
Computer Programmer
Eutrophication Survey Branch (PNERL)
National Environmental Research Center
Environmental Protection Agency
200 S.W. 35th Street
Corvallis, Oregon 97330
Tittle, Catherine
Manager, Customer Services
Bowne Timesharing, Inc.
Bowne Timesharing, Inc.
1025 Connecticut Avenue N.W.
Washington, D.C. 20036
Uchrin, Christopher
Sanitary Engineer
Surveillance and Analysis Division
Region II
Environmental Protection Agency
21 Stonehenge Drive
Lincroft, New Jersey 07738
Webb, Ronald H.
Senior Account Representative
Bowne Timesharing, Inc.
Bowne Timesharing, Inc.
1025 Connecticut Avenue, N.W.
Washington, D.C. 20036
Williams, Edward R.
Program Manager: Comprehensive Analyses
Forecasting and Analysis Branch
Washington Environmental Research Center (RD-691)
Environmental Protection Agency
Washington, D.C. 20460
-------
Williams, Robert T.
Chief, Waste Identification and Analysis
Waste Identification and Analysis Section
Advanced Waste Treatment Research Laboratory
National Environmental Research Center
Environmental Protection Agency
Cincinnati, Ohio 45268
Worley, Donald L.
Chief, Analysis and System Design Section
Data Systems Division (OA)
National Environmental Research Center
Environmental Protection Agency
Research Triangle Park, North Carolina 27711
-------
APPENDIX B
AREAS OF EXPERTISE
Air Monitoring Systems
Hertz, Marvin
Jurgens, Robert B.
Kyle, Kirby D.
Analysis, Design, and Programming
Allen, Ralph G.
Barrow, David R.
Brooks, Dorothy
Bryan, Sam D.
Burton, Judy K.
Byram, Kenneth V.
Cline, David M.
Dipert, Merlin H.
Friedland, Michael J.
Gangler, James
Goldberg, Neal
Knight, John E.
Lackey, Curtis S.
Lowrimore, Gene R.
Madsen, Mark J.
Male, Larry M.
McCarthy, William N., Jr.
Ott, Wayne R.
Tiffany, William C.
Analytical Methods Development
Budde, Bill
Automated Chromatography
McGuire, John M.
Uchrin, Christopher
Basin Planning
CHAMP/CHESS
Hertz, Marvin
Kyle, Kirby D.
Lowrimore, Gene R.
Chemical Structure Searching
Feldmann, Richard
Heller, Stephen R.
CLEVER/CLEANS
Lowrimore, Gene R.
Coordination of EPA's Use of Word/One
Tittle, Catherine
Webb, Ronald H.
Data Base Management
Byram, Kenneth V.
Bystroff, Roman I.
Dell, Robert
Florence, Cecil E.
Gangler, James
Jurgens, Robert B.
Knight, John E.
Lowrimore, Gene R.
Madsen, Mark J.
Rockwell, David C.
Scotton, John W.
Snelling, Robert N.
Worley, Donald L.
Data Storage and Retrieval
Bliss, James D.
Friedland, Michael J.
Holsomback, Will F.
Knight, John E.
Richardson, William L.
Worley, Donald L.
Development of Library Reference Spectra
Heller, Stephen R.
McGuire, John M.
-------
Direction and Coordination of ADP Needs
Laboratory Automation
Byram, Kenneth V.
Harris, Theodore R.
Johnson, Maureen M.
Knight, John E.
Nime, Edward J.
Snelling, Robert N.
Swink, Denise
Effects of Pollutants on Organism Behavior
Borthwick, Patrick W.
Greaves, John O.B.
Engineering
Myers, Melvin L.
Scotton, John W.
ENVIR/EDMPAS
Budde, Bill
Byram, Kenneth V.
Bystroff, Roman I.
Cline, David M.
Dell, Robert
Fairless, William
Goldberg, Neal
Heller, Stephen R.
Laurie, Vernon J.
Nime, Edward J.
Ott, Wayne R.
Teuschler, Jack
Modeling
Dipert, Merlin H.
Hart, John J.
Ott, Wayne R.
Richardson, William L.
Williams, Edward R.
Ocean Monitoring
Uchrin, Christopher
Equipment Utilization
McCarthy, William N., Jr.
Facilities Management
McCarthy, William N., Jr.
Forecasts and Costs of Pollution Control
Williams, Edward R.
Pollution Monitoring Systems
Rockwell, David C.
Cline, David M.
Feldmann, Richard
Goldberg, Neal
Jurgens, Robert B.
Lackey, Curtis S.
Male, Larry M.
Geophysics
Graphics
Broadway, Jon A.
Hertz, Marvin
Jurgens, Robert B.
Kinnison, Robert R.
Bliss, James D.
Johnson, Richard
Lackey, Curtis S.
Lowrimore, Gene R.
Williams, Edward R.
Budde, Bill
Heller, Stephen R.
McGuire, John M.
Shew, D. Craig
Kyle, Kirby D.
Scotton, John W.
Uchrin, Christopher
Williams, Robert T.
SAS
SEAS
Spectrometry
-------
Statistical Analysis
Borthwick, Patrick W.
Byram, Kenneth V.
Dell, Robert
Kinnison, Robert R.
Lowrimore, Gene R.
Male, Larry M.
Ott, Wayne R.
Poole, Elijah L.
STORET
Bliss, James D.
Conger, Charles S.
Friedland, Michael J.
Holsomback, Will F.
Telecommunications Development
Harris, Theodore R.
McGuire, John M.
Rogers, Tommie L.
UNIVAC 1110 System
Johnson, Maureen
Rogers, Tommie L.
Worley, Donald L.
Waste Treatment Research and Analysis
Schuk, Walter W.
Williams, Robert T.
Water Quality Analysis
Dipert, Merlin H.
Hart, John J.
Richardson, William L.
-------
APPENDIX C
AREAS OF INTEREST
ADP Utilization/Management
Byram, Kenneth V.
Dipert, Merlin H.
Florence, Cecil E.
Johnson, Maureen M.
Knight, John E.
Lackey, Curtis S.
Broadway, Jon A.
Jurgens, Robert B.
Ott, Wayne R.
Myers, Melvin L.
Nime, Edward J.
Poole, Elijah L.
Snelling, Robert N.
Swink, Denise
Air Modeling
Johnson, Richard
Jurgens, Robert B.
Knight, John E.
Lackey, Curtis S.
Harris, Theodore R.
Scotton, John W.
Snelling, Robert N.
Worley, Donald L.
EDP Hardware
Effective Use of Word/One System by OR&D
Tittle, Catherine
Webb, Ronald H.
Air/Water Surveillance and Monitoring
Barrow, David R.
Bliss, James D.
Lackey, Curtis S.
Ott, Wayne R.
Rockwell, David C.
Uchrin, Christopher
Chemical/Organic Analysis of Pollutants
Budde, Bill
Fairless, William
Heller, Stephen R.
Shew, D. Craig
Williams, Robert T.
Data Base Management
Barrow, David R.
Byram, Kenneth V.
Conger, Charles S.
Gangler, James
Hertz, Marvin
Laurie, Vernon J.
Lowrimore, Gene R.
Madsen, Mark J.
Ott, Wayne R.
Rockwell, David C.
Environmental Residuals and Land Use Modeling
Williams, Edward R.
Graphics
Borthwick, Patrick W.
Cline, David M.
Conger, Charles S.
Feldmann, Richard
Florence, Cecil E.
Goldberg, Neal
Harris, Theodore R.
Jurgens, Robert B.
Lackey, Curtis S.
Madsen, Mark J.
McCarthy, William N., Jr.
Snelling, Robert N.
Swink, Denise
Hardware Dependability
Rogers, Tommie L.
Information Storage and Retrieval Systems
Johnson, Richard
Knight, John E.
Nime, Edward J.
Scotton, John W.
C-1
-------
Laboratory Automation
Programming Techniques
Borthwick, Patrick W.
Broadway, Jon A.
Budde, Bill
Byram, Kenneth V.
Bystroff, Roman I.
Cline, David M.
Fairless, William
Hart, John J.
Heller, Stephen R.
Laurie, Vernon J.
McCarthy, William N., Jr.
Ott, Wayne R.
Richardson, William L.
Scotton, John W.
Snelling, Robert N.
Swink, Denise
Teuschler, Jack
Williams, Robert T.
Mathematical Analysis, Simulation, and Modeling
Byram, Kenneth V.
Cline, David M.
Dipert, Merlin H.
Johnson, Richard
Kinnison, Robert R.
Lowrimore, Gene R.
McCarthy, William N., Jr.
Nime, Edward J.
Ott, Wayne R.
Poole, Elijah L.
Richardson, William L.
Schuk, Walter W.
Uchrin, Christopher
Williams, Edward R.
Minicomputers
Allen, Ralph G.
Bryan, Sam D.
Bystroff, Roman I.
Cline, David M.
Dell, Robert
Feldmann, Richard
Greaves, John O.B.
Hertz, Marvin
Knight, John E.
Kyle, Kirby D.
Shew, D. Craig
Swink, Denise
Operating Systems and Analysis
Bryan, Sam D.
Cline, David M.
Dell, Robert
Greaves, John O.B.
Kyle, Kirby D.
Uchrin, Christopher
Operations Support
Rogers, Tommie L.
Byram, Kenneth V.
Gangler, James
Madsen, Mark J.
Quality Control in Laboratories
Borthwick, Patrick W.
Fairless, William
Florence, Cecil E.
Hertz, Marvin
Ott, Wayne R.
Schuk, Walter W.
Real-Time Data Acquisition and Handling
Brooks, Dorothy
Cline, David M.
Dell, Robert
Feldmann, Richard
Goldberg, Neal
Swink, Denise
Scientific/Statistical Applications and Packages
Borthwick, Patrick W.
Brooks, Dorothy
Bryan, Sam D.
Burton, Judy K.
Byram, Kenneth V.
Friedland, Michael J.
Johnson, Richard
Jurgens, Robert B.
Kinnison, Robert R.
Lowrimore, Gene R.
Male, Larry M.
Nime, Edward J.
Poole, Elijah L.
Richardson, William L.
Swink, Denise
Software Development
Harris, Theodore R.
Johnson, Maureen M.
Scotton, John W.
Spectrometry
Broadway, Jon A.
Budde, Bill
Heller, Stephen R.
McGuire, John M.
Shew, D. Craig
C-2
-------
STORET
Allen, Ralph G.
Barrow, David R.
Bliss, James D.
Brooks, Dorothy
Byram, Kenneth V.
Florence, Cecil E.
Friedland, Michael J.
Holsomback, Will F.
Richardson, William L.
Rockwell, David C.
Tiffany, William C.
Telecommunications
Bryan, Sam D.
Conger, Charles S.
Dell, Robert
Harris, Theodore R.
Rogers, Tommie L.
-------