United States Environmental Protection Agency
Industrial Environmental Research Laboratory
Cincinnati OH 45268
Research and Development
EPA-600/PS2-80-164
Sept. 1980

Project Summary

The Revised Organic Chemical Producers Data Base System

G. E. Wilkins, C. H. Tucker, and E. D. Gibson

This report describes the revised Organic Chemical Producers Data Base (OCPDB), an automated chemical industry information system developed in 1976 for the U.S. Environmental Protection Agency (EPA). Improvements by Radian Corporation, Austin, Texas, have been made in two ways: (1) expansion of the data base to include more chemicals and more information about each chemical, and (2) implementation of the system through a data base management system.

The revised data base includes almost 600 chemicals and their more than 1300 producers. Chemicals are described by Chemical Abstracts Service (CAS) registry number, Wiswesser Line Notation (WLN), industrial process descriptions, chemical uses, synonyms, toxicity data, economic data, and producers. Priority pollutants identified as a result of Natural Resources Defense Council (NRDC) vs. EPA are marked, and process descriptions are cross-referenced with another EPA reference source, the Industrial Process Profiles for Environmental Use (IPPEU), Chapter 6.

Locations of producers are described by city, state, EPA region, and river basin. The chemicals that are produced at each location are listed, along with nameplate capacities, when available.

Retrieval is possible through use of any of a number of "key" data elements: chemical name, synonyms, OCPDB number, CAS numbers, WLN, priority pollutant markers, process ID number, IPPEU numbers, producer company name, parent company name, city, state, river basin, and EPA region.

Introduction

This report describes the revised Organic Chemical Producers Data Base (OCPDB) system.
The original OCPDB was developed in 1976 for EPA's Industrial Environmental Research Laboratory (IERL) in Cincinnati under EPA Contract 68-02-1319, Task 51. The computerized data base that was established in 1976 provided easy access to data concerning organic chemicals and their production in a format that facilitated comparisons of various aspects of the industry. It served as a tool for understanding the organic chemical industry, for guiding EPA work in a knowledgeable and systematic manner, and for increasing work effort efficiency.

Since 1977 Radian has updated the data in the OCPDB and increased the size and capabilities of the system. This interim report describes progress made toward this objective. The revised OCPDB was made fully operational in 1979. While this report describes the basic form and substance of the system, it is not meant to imply that the system is static. Changes, expansions, and improvements are expected as the needs arise. The new system is expected to be even more responsive to changes in program needs and will allow more flexibility in operation.

Discussion and Procedure

The original OCPDB consisted of a matrix of about 300 chemicals and their 610 production sites. The chemical list was begun with the one compiled by Monsanto Research Corporation under EPA Contract 68-02-1320. Several additions were made to complete the list. The basic petrochemical feedstocks were added: toluene, xylene, ethylene, propylene, C2-C4 hydrocarbons. Also added were chemicals that had production volumes equal to or greater than those chemicals already included (about 10 million pounds per year). Prioritized lists of toxic chemicals were examined, and chemicals not in the data base were added. The list was then compared to the list generated in the Source Assessment Program (EPA Contract 68-02-1874) to check for omissions.
Production sites for the chemicals were obtained from the open literature, and this file formed the other dimension of the computerized matrix.

The mechanical structures of the chemical and producer data files are shown in Figures 1 and 2. New adjectives, or descriptors, were added to describe chemical entries. These include Chemical Abstracts Service (CAS) registry numbers, process routes, additional toxicity data, use descriptions, sales, and synonyms. New data files describing production sites include parent company name and river basins. The data files from the original OCPDB have been updated in cases in which new data have become available. Table 1 lists all of the data files included in the OCPDB system. It also shows the number of unique data items, the total number of occurrences of data items within a file, and the amount of computer storage required for each file.

The revised data base has been implemented with a data base management system (DBMS): System 2000®. A DBMS was chosen to eliminate inefficiencies in the original system. System 2000® was selected because it is well-proven and widely used in commercial and governmental institutions, including EPA.

[Figure 1. Hierarchical structure of the chemical data files of the OCPDB. Elements shown: Entry (Entry Type, Entry ID #); Chemical Description (Name, CAS #, WLN); Toxicity Data; Process Routes (Process ID, Process Description, IPPEU #'s); Annual Production Data (Year, Volume, Unit Cost, Sales); End Uses (End Use Description, Amount, % Domestic); Synonyms. Key data elements are marked with an asterisk in the figure.]

System 2000® is a registered trademark of MRI Systems Corporation.

Flexibility in reporting is another major improvement in the revised system. Both interactive and batch modes of access are possible, and additional report formats may be defined at any time.
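The hierarchical layout of Figures 1 and 2 and the idea of retrieval through "key" data elements can be illustrated with a minimal sketch. This is not System 2000® code and does not reproduce the OCPDB schema. The class names, field names, the helper build_index, the OCPDB numbers, and the "Company A" / "City X" producers are all hypothetical placeholders chosen for illustration. Only the CAS registry numbers for toluene and ethylene are real values.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Producer:
    """One production site; fields loosely follow Figure 2 (names are assumed)."""
    company: str
    city: str
    state: str
    epa_region: int
    river_basin: str = ""
    parent_company: str = ""


@dataclass
class Chemical:
    """One chemical entry; a small, assumed subset of the Figure 1 elements."""
    ocpdb_number: int  # hypothetical numbering, not actual OCPDB numbers
    name: str
    cas_number: str
    synonyms: list = field(default_factory=list)
    producers: list = field(default_factory=list)


def build_index(chemicals, key_fn):
    """Map every key value yielded by key_fn to the chemicals carrying it."""
    index = defaultdict(list)
    for chem in chemicals:
        for key in key_fn(chem):
            index[key].append(chem)
    return index


# Placeholder records: the producers are invented for the example.
site_a = Producer("Company A", "City X", "TX", 6)
site_b = Producer("Company B", "City Y", "LA", 6)
chemicals = [
    Chemical(1, "toluene", "108-88-3", ["methylbenzene"], [site_a]),
    Chemical(2, "ethylene", "74-85-1", ["ethene"], [site_a, site_b]),
]

# One index per key data element; a set avoids listing a chemical twice
# when it has several producers in the same state.
by_cas = build_index(chemicals, lambda c: [c.cas_number])
by_name = build_index(chemicals, lambda c: [c.name, *c.synonyms])
by_state = build_index(chemicals, lambda c: {p.state for p in c.producers})

for chem in by_state["TX"]:
    print(chem.ocpdb_number, chem.name, chem.cas_number)
```

Keeping a separate index for each key element mirrors, in a simplified way, how a keyed field lets a query start from a chemical attribute or from a producer attribute and reach the same records.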
Analytical report capabilities such as the "minimum sites search" have been retained and expanded through the possibility of linking the DBMS to other data handling systems.

The sections of the report describe in more detail the system contents and its workings. Section 1 is an introduction to the system and its revision. Section 2 is a summary provided for those who want a quick overview of the OCPDB. A description of the data contained in the data base is presented in Section 3. The system mechanics and structure are detailed in Section 4. The capabilities of the system and an understanding of its uses can best be gained in Section 5, which contains sample reports and example access modes. Several long tables and technical sections have been appended to facilitate reading the report. Appendix A is a listing of OCPDB chemicals; Appendix B is a listing of OCPDB producers. River basins in which OCPDB producers are located are listed in Appendix C.

Information about accessing the system may be obtained by contacting the project officer. This publication is a summary of the complete project report, which can be purchased from the National Technical Information Service.

Three major types of directed retrieval are possible using the OCPDB: information about chemical products, information about producers, and relationships between producers and products. Retrieval is possible through use of any of a number of "key" data elements, which are listed in Table 2. Using these retrieval "keys" to access the files, the sorting, filing, and reporting possibilities are virtually limitless.

[Figure 2. Hierarchical structure of the producer data files of the OCPDB. Level 0: Entry (Entry Type = 2, Entry ID #). Level 1: Name of Company, City, Parent Company, State, River Basin, EPA Region. Key data elements are marked with an asterisk in the figure.]

Table 1. Data Tally for the OCPDB (a)

                                   No. of        No. of       Storage
Data File                       Unique Values  Occurrences    (Bytes)
---------------------------------------------------------------------
Entry Type                              2         1,944         1,944
Entry ID                            1,621         1,944         8,720
Chemical Related Data
  OCPDB Chemicals                     597           597        14,925
  CAS Numbers                         518           525         5,250
  New Chemical Markers                  1           224           224
  Priority Pollutant Markers            1           135           135
  Wiswesser Line Notation             500           506        20,240
  Process ID                            8         1,131         2,262
  Process Description                 640         1,133        28,325
  IPPEU Numbers                       224           318           954
  Uses Description                  1,794         2,763       138,150
  Use Volume                          (b)           784         6,272
  Use by % of Consumption             (b)           711         4,266
  Use IPPEU Numbers                    90           106           318
  Synonyms                          5,285         5,426       217,040
Toxicity Data
  NIOSH Registry Number               (b)           439         3,512
  LD50 Mode                             6           335         1,005
  LD50 Species                          6           335         1,005
  LD50 Amount                         (b)           335         2,680
  LD50 Units                            4           597         5,970
  LCLO Mode                             3           134           402
  LCLO Species                          9           134           402
  LCLO Amount                         (b)           135         1,080
  LCLO Units                           37           597         5,970
  AQTX                                (b)           170         2,496
  TLV                                 (b)           185         1,850
  TLV Units                             5           597         5,970
  Sax Ratings                         (b)         1,470         1,470
Economic Data
  Year                                 15           649         1,298
  Production Volume                   (b)           287         2,296
  Unit Cost                           (b)           348         2,784
  Sales                               (b)           176         1,408
Producer Related Data
  Plant ID                          1,246         3,703        18,515
  Plant Capacity                      (b)         1,190        10,710
  Company Names                       615         1,346        33,650
  Cities                              748         1,346        26,920
  States                               49         1,346         2,692
  River Basin                         321         1,082        32,460
  River Basin Code                    326         1,081         8,648
  Parent Companies                    182           340         8,500
---------------------------------------------------------------------
TOTALS                             14,853        36,604       632,718 (c)

(a) As of the date of this interim report.
(b) For non-key data files, the number of unique values is not known.
(c) With indices and other internal data base tables, the total number for the entire data base is approximately 2,000,000 bytes.

Table 2. List of Key Data Elements in OCPDB

Chemical Product Entries
  Name
  OCPDB Number
  CAS Number
  Wiswesser Line Notation
  Use Description
  Original OCPDB Chemical Indicator
  Priority Pollutant Indicator
  Synonym
  Process OCPDB Number
  Process IPPEU Number

Producer Entries
  OCPDB Number
  Parent Company Name
  Producing Company Name
  City
  State
  River Basin
  EPA Region

G. E. Wilkins, C. H. Tucker, and E. D. Gibson are with Radian Corporation, Austin, TX 78766.

Ms. Audrey McBath is the EPA Project Officer (see below).

The complete report, entitled "The Revised Organic Chemical Producers Data Base System" (Order No. PB 1 99 805; Cost: $11.00, subject to change), will be available from:

  National Technical Information Service
  5285 Port Royal Road
  Springfield, VA 22161
  Telephone: 703-557-4650

The EPA Project Officer can be contacted at:

  Industrial Environmental Research Laboratory
  U.S. Environmental Protection Agency
  Cincinnati, OH 45268

U.S. GOVERNMENT PRINTING OFFICE: 1981-757-064/0233