EPA
                                 United States
                                 Environmental Protection
                                 Agency
                                 Industrial Environmental Research
                                 Laboratory
                                 Cincinnati OH 45268
                                 Research and Development
                                 EPA-600/PS2-80-164  Sept. 1980
Project Summary
                                 The Revised Organic Chemical
                                 Producers  Data  Base  System


                                 G. E. Wilkins, C. H. Tucker, and E. D. Gibson
                                   This report describes the revised
                                 Organic  Chemical  Producers  Data
                                 Base  (OCPDB),  an  automated
                                 chemical industry information system
                                 developed  in  1976 for  the  U.S.
                                 Environmental   Protection  Agency
                                 (EPA).
                                   Improvements by Radian  Corpora-
                                 tion, Austin, Texas, have been made in
                                 two ways:  (1) expansion of the data
                                 base to include more chemicals and
                                 more information about  each
                                 chemical, and (2) implementation of
                                 the system through a  data  base
                                 management system.
                                   The revised   data base  includes
                                 almost 600 chemicals and their more
                                 than 1300  producers. Chemicals are
                                 described by Chemical Abstracts
                                 Service (CAS) registry number,
                                 Wiswesser Line Notation (WLN),
                                 industrial process descriptions,
                                 chemical uses, synonyms, toxicity
                                 data, economic data, and producers.
                                 Priority pollutants identified as a result
                                 of Natural Resources Defense Council
                                 (NRDC) vs. EPA are marked, and
                                 process descriptions are cross-
                                 referenced with another EPA
                                 reference source, the Industrial
                                 Process Profiles for Environmental
                                 Use (IPPEU), Chapter 6. Locations of
                                 producers are described by city, state,
                                 EPA region, and river basin.  The
                                 chemicals that  are produced at each
                                 location are listed, along with name-
                                 plate capacities, when available.
                                   Retrieval  is possible through use of
                                 any of  a  number of  "key"  data
                                 elements: chemical name, synonyms,
                                 OCPDB number, CAS numbers, WLN,
                                 priority pollutant markers, process ID
                                 number,  IPPEU numbers, producer
                                 company  name,  parent  company
                                 name, city, state, river basin, and EPA
                                 region.


                                 Introduction
                                   This report describes the  revised
                                 Organic Chemical Producers Data Base
                                 (OCPDB) system. The original OCPDB
                                 was developed in 1976 for EPA's Indus-
                                 trial Environmental Research  Labora-
                                 tory (IERL) in Cincinnati under EPA
                                 Contract 68-02-1319, Task 51.
                                   The computerized data base that was
                                 established  in 1976  provided easy
                                 access to data concerning  organic
                                 chemicals  and their production in a
                                 format that facilitated comparisons of
                                 various aspects of the industry. It served
                                 as a tool for understanding the organic
                                 chemical industry, for guiding EPA work
                                 in a knowledgeable and systematic
                                 manner, and for increasing work effort
                                 efficiency.
                                   Since 1977  Radian has updated the
                                 data in the OCPDB and increased the
                                 size and capabilities of the system. This
                                 interim report describes progress made
                                 toward  this  objective. The  revised
                                 OCPDB was made fully operational in
                                 1979.  While this report describes  the
                                 basic form and substance of the system,
                                 it is not meant to imply that the system is
                                 static. Changes, expansions, and
                                 improvements are expected as
                                 needs arise. The new system is
                                 expected to be even more responsive to
                                 changes in program needs and will
                                 allow more flexibility in operation.

Discussion and Procedure
  The original OCPDB  consisted of  a
matrix of about 300 chemicals and their
610 production sites. The chemical list
was  begun with the one compiled by
Monsanto Research  Corporation under
EPA   Contract  68-02-1320.  Several
additions were  made to complete the
list. The basic petrochemical feedstocks
were added: toluene, xylene, ethylene,
propylene, and C2-C4 hydrocarbons. Also
added were  chemicals  that  had
production volumes equal to or greater
than  those chemicals already included
(about 10  million  pounds  per  year).
Prioritized lists of toxic chemicals were
examined, and chemicals not in the data
base  were added.  The  list was  then
compared to the list generated in the
Source  Assessment  Program  (EPA
Contract  68-02-1874)  to  check  for
omissions.  Production   sites  for  the
chemicals were obtained from the open
literature, and this file formed the other
dimension of the computerized matrix.
The   mechanical  structures  of  the
chemical and producer  data files are
shown in Figures 1 and 2.
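
  The chemical and producer files described above can be pictured as nested records, with the producer file supplying the second dimension of the chemical-by-site matrix. The following is a minimal sketch in Python, not the System 2000 schema: the class names, field names, and sample records are assumptions made for illustration, and the companies, sites, and capacities are invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ProcessRoute:
    process_id: str
    description: str
    ippeu_numbers: List[str] = field(default_factory=list)


@dataclass
class ChemicalEntry:
    entry_id: int
    name: str
    cas_number: str
    wln: str = ""
    synonyms: List[str] = field(default_factory=list)
    process_routes: List[ProcessRoute] = field(default_factory=list)


@dataclass
class ProducerEntry:
    entry_id: int
    company: str
    parent_company: str
    city: str
    state: str
    river_basin: str
    epa_region: int
    # chemical entry_id -> nameplate capacity; None when not available
    capacities: Dict[int, Optional[float]] = field(default_factory=dict)


def chemical_site_matrix(chemicals, producers):
    """Map each chemical name to the companies that list it at a site."""
    names = {c.entry_id: c.name for c in chemicals}
    matrix = {name: [] for name in names.values()}
    for site in producers:
        for chem_id in site.capacities:
            matrix[names[chem_id]].append(site.company)
    return matrix


# Illustrative records only; the companies, sites, and capacities are made up.
toluene = ChemicalEntry(1, "toluene", "108-88-3", synonyms=["methylbenzene"])
xylene = ChemicalEntry(2, "xylene", "1330-20-7")
site_a = ProducerEntry(101, "Example Chemical Co.", "Example Holdings",
                       "Houston", "TX", "Hypothetical Basin", 6,
                       capacities={1: 250.0, 2: None})
site_b = ProducerEntry(102, "Sample Petrochemicals", "Sample Corp.",
                       "Baton Rouge", "LA", "Hypothetical Basin", 6,
                       capacities={1: 90.0})

matrix = chemical_site_matrix([toluene, xylene], [site_a, site_b])
print(matrix)
```

  Storing capacity as an optional value mirrors the statement that nameplate capacities are listed only when available.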
  New adjectives or descriptors were
added to  describe  chemical  entries.
These include Chemical Abstracts
Service (CAS) registry numbers,
process routes, additional toxicity data,
use descriptions, sales, and synonyms.
New  data files describing production
sites include parent company name and
river  basins.  The data files from the
original OCPDB have been updated in
cases in which new  data have become
available.
  Table 1 lists  all  of  the data  files
included in the  OCPDB system. It also
shows the number of unique data items,
the total number of occurrences of data
items within a file, and the amount of
computer storage required for each file.
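
  One way to read the occurrence and storage columns of Table 1 together is to divide bytes by occurrences for a given file. The sketch below does this for six rows whose figures are copied from Table 1; the function name is an assumption. Each ratio comes out to a whole number, which is consistent with fixed-width fields, although this summary does not state how the fields are stored.

```python
# (data file, number of occurrences, storage in bytes), copied from Table 1
ROWS = [
    ("OCPDB Chemicals", 597, 14925),
    ("CAS Numbers", 525, 5250),
    ("Wiswesser Line Notation", 506, 20240),
    ("Process Description", 1133, 28325),
    ("Uses Description", 2763, 138150),
    ("Synonyms", 5426, 217040),
]


def bytes_per_occurrence(occurrences, storage_bytes):
    """Average storage used by one occurrence of a data item."""
    return storage_bytes / occurrences


widths = {name: bytes_per_occurrence(occ, size) for name, occ, size in ROWS}

for name, width in widths.items():
    print(f"{name:<26}{width:>6.1f} bytes per occurrence")
```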
  The  revised  data base  has been
implemented  with  a  data  base
management system (DBMS): System
2000®. A DBMS was chosen to elimi-
nate   inefficiencies  in  the  original
system. System 2000® was selected
because  it is well-proven and widely
used in commercial  and governmental
institutions, including EPA.
Figure 1.  Hierarchical structure of the chemical data files of the OCPDB.
  [Diagram not reproduced. Elements shown: Entry (Entry Type, *Entry ID #);
  Chemical Description (*Name, *CAS #, *WLN); Toxicity Data; Process Routes
  (*Process ID, Process Description, *IPPEU #'s); Annual Production Data
  (Year, Volume, Unit Cost, Sales); End Uses (End Use Description, Amount %,
  Domestic); *Synonyms.]
  *Key Data Element

Figure 2.  Hierarchical structure of the producer data files of the OCPDB.
  [Diagram not reproduced. Level 0: Entry (Entry Type = 2, *Entry ID #).
  Level 1: *Name of Company, City, *Parent Company, *State, *River Basin,
  *EPA Region.]
  *Key Data Elements

System 2000® is a registered trademark of MRI Systems Corporation.

  Flexibility in reporting is another major improvement in the revised
system. Both interactive and batch modes of access are possible, and
additional report formats may be defined at any time. Analytical report
capabilities such as the "minimum sites search" have been retained and
expanded through the possibility of linking the DBMS to other data
handling systems.
  The sections of the report describe in more detail the system contents
and its workings. Section 1 is an introduction to the system and its
revision. Section 2 is a summary provided for those who want a quick
overview of the OCPDB. A description of the data contained in the data
base is presented in Section 3. The system mechanics and structure are
detailed in Section 4. The capabilities of the system and an
understanding of its uses can best be gained from Section 5, which
contains sample reports and example access modes. Several long tables
and technical sections have been appended to facilitate reading the
report. Appendix A is a listing of OCPDB chemicals; Appendix B is a
listing of
-------
Table 1.  Data Tally for the OCPDB [a]

                               No. of Unique        No. of    Total Computer
                                      Values   Occurrences    Storage Volume
                                                             in No. of Bytes

Entry Type                                 2         1,944             1,944
Entry ID                               1,621         1,944             8,720
Chemical Related Data
  OCPDB Chemicals                        597           597            14,925
  CAS Numbers                            518           525             5,250
  New Chemical Markers                     1           224               224
  Priority Pollutant Markers               1           135               135
  Wiswesser Line Notation                500           506            20,240
  Process ID                               8         1,131             2,262
  Process Description                    640         1,133            28,325
  IPPEU Numbers                          224           318               954
  Uses Description                     1,794         2,763           138,150
  Use Volume                             [b]           784             6,272
  Use by % of Consumption                [b]           711             4,266
  Use IPPEU Numbers                       90           106               318
  Synonyms                             5,285         5,426           217,040
Toxicity Data
  NIOSH Registry Number                  [b]           439             3,512
  LD50 Mode                                6           335             1,005
  LD50 Species                             6           335             1,005
  LD50 Amount                            [b]           335             2,680
  LD50 Units                               4           597             5,970
  LCLO Mode                                3           134               402
  LCLO Species                             9           134               402
  LCLO Amount                            [b]           135             1,080
  LCLO Units                              37           597             5,970
  AQTX                                   [b]           170             2,496
  TLV                                    [b]           185             1,850
  TLV Units                                5           597             5,970
  Sax Ratings                            [b]         1,470             1,470
Economic Data
  Year                                    15           649             1,298
  Production Volume                      [b]           287             2,296
  Unit Cost                              [b]           348             2,784
  Sales                                  [b]           176             1,408
Producer Related Data
  Plant ID                             1,246         3,703            18,515
  Plant Capacity                         [b]         1,190            10,710
  Company Names                          615         1,346            33,650
  Cities                                 748         1,346            26,920
  States                                  49         1,346             2,692
  River Basin                            321         1,082            32,460
  River Basin Code                       326         1,081             8,648
  Parent Companies                       182           340             8,500
TOTALS                                14,853        36,604           632,718 [c]

Table 2.  List of Key Data Elements in OCPDB
Chemical Product Entries
Name
OCPDB Number
CAS Number
Wiswesser Line Notation
Use Description
Original OCPDB Chemical Indicator
Priority Pollutant Indicator
Synonym
Process OCPDB Number
Process IPPEU Number
Producer Entries
OCPDB Number
Parent Company Name
Producing Company Name
City
State
River Basin
EPA Region
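
  The key data elements in Table 2 act as the fields on which entries can be looked up directly. The sketch below illustrates the general idea with a small inverted index over the producer keys. It is an assumption-laden illustration, not System 2000 syntax or its internal indexing method: the field names, function names, and producer records are all hypothetical.

```python
from collections import defaultdict

# Producer key data elements from Table 2, written as hypothetical field names.
PRODUCER_KEYS = ("parent_company", "company", "city", "state",
                 "river_basin", "epa_region")


def build_index(records, key_fields=PRODUCER_KEYS):
    """Map each (field, value) pair to the positions of matching records."""
    index = defaultdict(set)
    for position, record in enumerate(records):
        for name in key_fields:
            index[(name, record[name])].add(position)
    return index


def retrieve(records, index, **criteria):
    """Return the records that match every key=value criterion given."""
    hits = None
    for name, value in criteria.items():
        matches = index.get((name, value), set())
        hits = matches if hits is None else hits & matches
    return [records[i] for i in sorted(hits or ())]


# Made-up producer records, for illustration only.
producers = [
    {"company": "Example Chemical Co.", "parent_company": "Example Holdings",
     "city": "Houston", "state": "TX", "river_basin": "Basin A",
     "epa_region": 6},
    {"company": "Sample Petrochemicals", "parent_company": "Sample Corp.",
     "city": "Baton Rouge", "state": "LA", "river_basin": "Basin B",
     "epa_region": 6},
    {"company": "Demo Organics", "parent_company": "Demo Industries",
     "city": "Cincinnati", "state": "OH", "river_basin": "Basin C",
     "epa_region": 5},
]

index = build_index(producers)
texas_sites = retrieve(producers, index, state="TX")
region6_basin_b = retrieve(producers, index, epa_region=6, river_basin="Basin B")
```

  Combining criteria by set intersection is what lets any mix of keys, such as EPA region together with river basin, narrow a search without scanning every record.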


Notes to Table 1:
[a] As of the date of this interim report.
[b] For non-key data files, the number of unique values is not known.
[c] With indices and other internal data base tables, the total number for the
    entire data base is approximately 2,000,000 bytes.

OCPDB producers. River basins in which OCPDB producers are located are
listed in Appendix C.
  Information about accessing the system may be obtained by contacting
the project officer. This publication is a summary of the complete
project report, which can be purchased from the National Technical
Information Service.
  Three major types of directed retrieval are possible using the OCPDB:
information about chemical products, information about producers, and
relationships between producers and products. Retrieval is possible
through any of a number of "key" data elements, which are listed in
Table 2. With these retrieval keys, the files can be sorted, filed, and
reported in a wide variety of ways.
U.S. GOVERNMENT PRINTING OFFICE: 1981-757-064/0233

-------
G. E. Wilkins, C. H. Tucker, and E. D. Gibson are with Radian Corporation, Austin,
  TX 78766.
Ms. Audrey McBath is the EPA Project Officer (see below).
The complete report, entitled "The Revised Organic Chemical Producers Data
  Base System," (Order No. PB 1 99 805; Cost: $11.00, subject to change) will be
  available from:
      National Technical Information Service
      5285 Port Royal Road
      Springfield, VA 22161
      Telephone:  703-557-4650
The EPA Project Officer can be contacted at:
      Industrial Environmental Research Laboratory
      U.S. Environmental Protection Agency
      Cincinnati, OH 45268

-------