EPA
United States
Environmental Protection
Agency
Industrial Environmental Research
Laboratory
Cincinnati OH 45268
Research and Development
EPA-600/PS2-80-164 Sept. 1980
Project Summary
The Revised Organic Chemical
Producers Data Base System
G. E. Wilkins, C. H. Tucker, and E. D. Gibson
This report describes the revised
Organic Chemical Producers Data
Base (OCPDB), an automated
chemical industry information system
developed in 1976 for the U.S.
Environmental Protection Agency
(EPA).
Improvements by Radian Corpora-
tion, Austin, Texas, have been made in
two ways: (1) expansion of the data
base to include more chemicals and
more information about each
chemical, and (2) implementation of
the system through a data base
management system.
The revised data base includes
almost 600 chemicals and their more
than 1300 producers. Chemicals are
described by Chemical Abstracts
Service (CAS) registry number,
Wiswesser Line Notation (WLN),
industrial process descriptions,
chemical uses, synonyms, toxicity
data, economic data, and producers.
Priority pollutants identified as a result
of Natural Resources Defense Council
(NRDC) vs. EPA are marked, and
process descriptions are cross-referenced
with another EPA reference source,
the Industrial Process Profiles for
Environmental Use (IPPEU), Chapter 6.
Locations of producers are described by
city, state, EPA region, and river basin. The
chemicals that are produced at each
location are listed, along with name-
plate capacities, when available.
Retrieval is possible through use of
any of a number of "key" data
elements: chemical name, synonyms,
OCPDB number, CAS numbers, WLN,
priority pollutant markers, process ID
number, IPPEU numbers, producer
company name, parent company
name, city, state, river basin, and EPA
region.
Introduction
This report describes the revised
Organic Chemical Producers Data Base
(OCPDB) system. The original OCPDB
was developed in 1976 for EPA's Indus-
trial Environmental Research Labora-
tory (IERL) in Cincinnati under EPA
Contract 68-02-1319, Task 51.
The computerized data base that was
established in 1976 provided easy
access to data concerning organic
chemicals and their production in a
format that facilitated comparisons of
various aspects of the industry. It served
as a tool for understanding the organic
chemical industry, for guiding EPA work
in a knowledgeable and systematic
manner, and for increasing work effort
efficiency.
Since 1977 Radian has updated the
data in the OCPDB and increased the
size and capabilities of the system. This
interim report describes progress made
toward this objective. The revised
OCPDB was made fully operational in
1979. While this report describes the
basic form and substance of the system,
it is not meant to imply that the system is
static. Changes, expansions, and
improvements are expected, as the
needs arise. The new system is
expected to be even more responsive to
changes in program needs and will
allow more flexibility in operation.
Discussion and Procedure
The original OCPDB consisted of a
matrix of about 300 chemicals and their
610 production sites. The chemical list
was begun with the one compiled by
Monsanto Research Corporation under
EPA Contract 68-02-1320. Several
additions were made to complete the
list. The basic petrochemical feedstocks
were added: toluene, xylene, ethylene,
propylene, and C2-C4 hydrocarbons. Also
added were chemicals that had
production volumes equal to or greater
than those chemicals already included
(about 10 million pounds per year).
Prioritized lists of toxic chemicals were
examined, and chemicals not in the data
base were added. The list was then
compared to the list generated in the
Source Assessment Program (EPA
Contract 68-02-1874) to check for
omissions. Production sites for the
chemicals were obtained from the open
literature, and this file formed the other
dimension of the computerized matrix.
The mechanical structures of the
chemical and producer data files are
shown in Figures 1 and 2.
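
As a rough illustration of the hierarchical layout that Figures 1 and 2 describe, the following Python sketch models a chemical entry and a producer entry as nested records. The field names follow the figure labels; the class names, types, and sample values are assumptions made for illustration and are not part of System 2000 or of the OCPDB itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProcessRoute:
    """One process route for a chemical (a repeating group in Figure 1)."""
    process_id: int
    description: str
    ippeu_numbers: List[str] = field(default_factory=list)


@dataclass
class ChemicalEntry:
    """Top-level chemical record with its repeating child groups."""
    entry_id: int
    name: str
    cas_number: Optional[str] = None
    wln: Optional[str] = None
    synonyms: List[str] = field(default_factory=list)
    process_routes: List[ProcessRoute] = field(default_factory=list)


@dataclass
class ProducerEntry:
    """Top-level producer record (elements named in Figure 2)."""
    entry_id: int
    company: str
    city: str
    state: str
    parent_company: Optional[str] = None
    river_basin: Optional[str] = None
    epa_region: Optional[int] = None


# Hypothetical sample values, for illustration only.
toluene = ChemicalEntry(
    entry_id=1,
    name="toluene",
    cas_number="108-88-3",
    synonyms=["methylbenzene"],
    process_routes=[ProcessRoute(process_id=1, description="hypothetical route")],
)
plant = ProducerEntry(entry_id=2, company="Example Chemical Co.",
                      city="Houston", state="TX", epa_region=6)

print(toluene.name, len(toluene.process_routes), plant.state)
```

Nesting the repeating groups (process routes, synonyms) under a single entry mirrors the parent-child arrangement of the figures; a relational layout would be an equally reasonable way to hold the same data.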
New adjectives or descriptors were
added to describe chemical entries.
These include Chemical Abstracts
Service (CAS) registry numbers,
process routes, additional toxicity data,
use descriptions, sales, and synonyms.
New data files describing production
sites include parent company name and
river basins. The data files from the
original OCPDB have been updated in
cases in which new data have become
available.
Table 1 lists all of the data files
included in the OCPDB system. It also
shows the number of unique data items,
the total number of occurrences of data
items within a file, and the amount of
computer storage required for each file.
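
The relationship between the occurrence counts and the storage figures in Table 1 can be checked with a little arithmetic. The sketch below divides total bytes by occurrences for a few rows copied from the table. Reading the quotient as a fixed field width is an assumption, not something the report states, and the function name is hypothetical.

```python
# (occurrences, total bytes) copied from Table 1 for a few data files.
TALLY = {
    "OCPDB Chemicals": (597, 14_925),
    "CAS Numbers": (525, 5_250),
    "Synonyms": (5_426, 217_040),
    "States": (1_346, 2_692),
}


def bytes_per_occurrence(occurrences, total_bytes):
    """Average storage per stored value; assumed here to be a field width."""
    return total_bytes / occurrences


widths = {name: bytes_per_occurrence(*row) for name, row in TALLY.items()}

for name, width in widths.items():
    print(f"{name}: {width:g} bytes per occurrence")
```

For these four rows the quotient is a whole number (25, 10, 40, and 2 bytes), which is consistent with fixed-width storage of each data item.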
The revised data base has been
implemented with a data base
management system (DBMS): System
2000®. A DBMS was chosen to elimi-
nate inefficiencies in the original
system. System 2000® was selected
because it is well-proven and widely
used in commercial and governmental
institutions, including EPA.
Flexibility in reporting is another
major improvement in the revised
system. Both interactive and batch
modes of access are possible, and
additional report formats may be defined at
any time. Analytical report capabilities
such as the "minimum sites search"
have been retained and expanded
through the possibility of linking the
DBMS to other data handling systems.

System 2000® is a registered trademark of MRI
Systems Corporation.

[Figure 1. Hierarchical structure of the chemical data files of the OCPDB. Elements shown: Entry (Entry Type, Entry ID); Chemical Description (Name, CAS number, WLN); Toxicity Data; Process Routes (Process ID, Process Description, IPPEU numbers); Annual Production Data (Year, Volume, Unit Cost, Sales); End Uses (End Use Description, Amount); Synonyms. Key data elements are marked with an asterisk in the original figure.]

The sections of the report describe in
more detail the system contents and its
workings. Section 1 is an introduction to
the system and its revision. Section 2 is
a summary provided for those who want
a quick overview of the OCPDB. A
description of the data contained in the
data base is presented in Section 3. The
system mechanics and structure are
detailed in Section 4. The capabilities of
the system and its uses are best
understood from Section 5,
which contains sample reports and
example access modes. Several long
tables and technical sections have been
appended to facilitate reading the
report. Appendix A is a listing of OCPDB
chemicals; Appendix B is a listing of
OCPDB producers.

[Figure 2. Hierarchical structure of the producer data files of the OCPDB. Level 1 elements shown: Name of Company, City, Parent Company, State, River Basin, EPA Region. Key data elements are marked with an asterisk in the original figure.]

Table 1. Data Tally for the OCPDB(a)

                                  No. of        No. of     Total Computer
                                  Unique   Occurrences     Storage Volume
                                  Values                  in No. of Bytes
Entry Type                             2         1,944              1,944
Entry ID                           1,621         1,944              8,720
Chemical Related Data
  OCPDB Chemicals                    597           597             14,925
  CAS Numbers                        518           525              5,250
  New Chemical Markers                 1           224                224
  Priority Pollutant Markers           1           135                135
  Wiswesser Line Notation            500           506             20,240
  Process ID                           8         1,131              2,262
  Process Description                640         1,133             28,325
  IPPEU Numbers                      224           318                954
  Uses Description                 1,794         2,763            138,150
  Use Volume                         (b)           784              6,272
  Use by % of Consumption            (b)           711              4,266
  Use IPPEU Numbers                   90           106                318
  Synonyms                         5,285         5,426            217,040
Toxicity Data
  NIOSH Registry Number              (b)           439              3,512
  LD50 Mode                            6           335              1,005
  LD50 Species                         6           335              1,005
  LD50 Amount                        (b)           335              2,680
  LD50 Units                           4           597              5,970
  LCLO Mode                            3           134                402
  LCLO Species                         9           134                402
  LCLO Amount                        (b)           135              1,080
  LCLO Units                          37           597              5,970
  AQTX                               (b)           170              2,496
  TLV                                (b)           185              1,850
  TLV Units                            5           597              5,970
  Sax Ratings                        (b)         1,470              1,470
Economic Data
  Year                                15           649              1,298
  Production Volume                  (b)           287              2,296
  Unit Cost                          (b)           348              2,784
  Sales                              (b)           176              1,408
Producer Related Data
  Plant ID                         1,246         3,703             18,515
  Plant Capacity                     (b)         1,190             10,710
  Company Names                      615         1,346             33,650
  Cities                             748         1,346             26,920
  States                              49         1,346              2,692
  River Basin                        321         1,082             32,460
  River Basin Code                   326         1,081              8,648
  Parent Companies                   182           340              8,500
TOTALS                            14,853        36,604            632,718(c)

Table 2. List of Key Data Elements in OCPDB

Chemical Product Entries
  Name
  OCPDB Number
  CAS Number
  Wiswesser Line Notation
  Use Description
  Original OCPDB Chemical Indicator
  Priority Pollutant Indicator
  Synonym
  Process OCPDB Number
  Process IPPEU Number
Producer Entries
  OCPDB Number
  Parent Company Name
  Producing Company Name
  City
  State
  River Basin
  EPA Region

Three major types of directed retrieval
are possible using the OCPDB: informa-
tion about chemical products, informa-
tion about producers, and relationships
between producers and products.
Retrieval is possible through use of any
of a number of "key" data elements,
which are listed in Table 2. Because any of
these retrieval "keys" can be used to access
the files, a wide range of sorting, filing,
and reporting operations is possible.
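
To make the idea of key-based retrieval concrete, the sketch below builds simple lookup indexes over a handful of producer records and answers two of the query types described above: producers by EPA region, and producers of a given chemical. The records, company names, and the `build_index` helper are hypothetical; the real system performs these lookups through the DBMS, whose query interface is not shown in this summary.

```python
from collections import defaultdict

# Hypothetical records; company names and product lists are invented.
producers = [
    {"company": "Example Chemical Co.", "state": "TX", "epa_region": 6,
     "chemicals": ["toluene", "xylene"]},
    {"company": "Sample Petrochemicals", "state": "LA", "epa_region": 6,
     "chemicals": ["ethylene"]},
    {"company": "Demo Organics", "state": "OH", "epa_region": 5,
     "chemicals": ["toluene"]},
]


def build_index(records, key):
    """Map each value of a key data element to the records that carry it."""
    index = defaultdict(list)
    for record in records:
        values = record[key]
        if not isinstance(values, list):
            values = [values]  # treat single-valued keys like one-item lists
        for value in values:
            index[value].append(record)
    return index


by_region = build_index(producers, "epa_region")
by_chemical = build_index(producers, "chemicals")

# Producer query: which companies operate in EPA Region 6?
region_6 = [r["company"] for r in by_region[6]]

# Producer-product relationship: who makes toluene?
toluene_makers = [r["company"] for r in by_chemical["toluene"]]

print(region_6)
print(toluene_makers)
```

Indexing only the designated key elements keeps lookups fast without indexing every field, which is one plausible reason a system would distinguish key from non-key data.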
(a) As of the date of this interim report.
(b) For non-key data files, the number of unique values is not known.
(c) With indices and other internal data base tables, the total number for the entire data base
is approximately 2,000,000 bytes.
River basins in
which OCPDB producers are located are
listed in Appendix C.
Information about accessing the
system may be obtained by contacting
the project officer. This publication is a
summary of the complete project report,
which can be purchased from the
National Technical Information Service.
U.S. GOVERNMENT PRINTING OFFICE: 1981-757-064/0233
G. E. Wilkins, C. H. Tucker, and E. D. Gibson are with Radian Corporation, Austin,
TX 78766.
Ms. Audrey McBath is the EPA Project Officer (see below).
The complete report, entitled "The Revised Organic Chemical Producers Data
Base System," (Order No. PB 1 99 805; Cost: $11.00, subject to change) will be
available from:
National Technical Information Service
5285 Port Royal Road
Springfield, VA 22161
Telephone: 703-557-4650
The EPA Project Officer can be contacted at:
Industrial Environmental Research Laboratory
U.S. Environmental Protection Agency
Cincinnati, OH 45268