United States Environmental Protection Agency
Industrial Environmental Research Laboratory
Cincinnati OH 45268

Research and Development
EPA-600/S2-84-034
May 1984

Project Summary

Organic Chemical Producers Data Base Development and Update

Robert Soklow

This report describes modification, content expansion and update activities performed on the Organic Chemical Producers Data Base (OCPDB). The primary function of this data base is to provide a means of quick access to reliable chemical, producer, process and end use data for the industrial organic chemicals industry. A brief description is given of the OCPDB system structure as implemented under System 2000®, a data base management system (DBMS). A discussion of OCPDB data is presented, describing the types of information available in the data base. Revisions made to the OCPDB schema for the incorporation of additional data are described and a newly developed standard report for chemical uses is presented. Content expansion, verification and update activities for all data are discussed in detail and appropriate reference material is cited specifically for each component of data.

This Project Summary was developed by EPA's Industrial Environmental Research Laboratory, Cincinnati, OH, to announce key findings of the research project that is fully documented in a separate report of the same title (see Project Report ordering information at back).

Introduction

The original OCPDB was developed in 1976 for EPA's Industrial Environmental Research Laboratory in Cincinnati (IERL-Ci). Initially, the OCPDB consisted of 380 chemicals and 610 producing plants, which were incorporated into a computerized information system in order to provide ready access to data. Since 1976 the OCPDB has been revised and updated, and it now contains chemical, economic, process, end use, and producer related data for 605 chemicals.

® System 2000 is a registered trademark of Intel Corporation.

During this developmental period, the capabilities of the OCPDB were expanded by the addition of chemical and producer related data. These additions include:

• Chemical Abstract Services (CAS) Registry Numbers
• Process Data
• River Basin Codes
• EPA Region Numbers for Producers
• Chemical Use Data
• Standard Nomenclature for Use Descriptions
• Standard Industrial Classification Codes

The OCPDB was implemented in 1979 with System 2000. This data base management system (DBMS) has widespread availability and offers substantial programming flexibility and excellent capabilities for retrieving, reporting and analyzing data. The OCPDB was revised and updated during this project using the System 2000 DBMS.

A major focus of this project was to enhance the chemical use presentation capabilities of the OCPDB. In this regard, data expansion and update activities, modifications to the schema and the creation of a new chemical use report have added new utility to the OCPDB. With sources of data used during previous updates (including the latest editions of these sources) and with additional sources, a content update was performed during which producer locational data and chemical use data were added to the data base. Several components of chemical toxicity data were removed.

Technical Discussion

The structure and information retrieval capability of the OCPDB are defined by the "Key Data Elements" shown in Table 1. Data pointers exist in the OCPDB, creating links between these key data elements and other types of entries. Data within each entry are organized into a logical hierarchical structure, as presented in Figure 1.

Table 1. Key Data Elements

  Chemical Product Entries
    Chemical Name
    Chemical ID Number
    CAS Number
    Wiswesser Line Notation
    New Chemical Indicator
    Priority Pollutant Indicator
    Production Year
    Chemical Use Name
    Chemical Use IPPEU Number
    Chemical Use SIC
    Chemical/Industrial Use Name
    Chemical/Industrial Use SIC
    Name Synonym
    Process OCPDB Number
    Process ICPDB Number
    Reactant Chemical Name
    Reactant OCPDB Number
    Reactant ICPDB Number
    Reactant SIC Number

  Producer Entries
    OCPDB and ID Number
    Parent Company Name
    Producing Company Name
    City
    County
    State
    Zip Code
    River Basin Name
    River Basin Code
    EPA Region
    AQCR Code

Figure 1. General Hierarchical Structure of the OCPDB. [Diagram not reproduced; surviving labels: Entry Type = 1, Entry Type = 2, Level 0, Level 2, Level 3.]

Modifications performed to the OCPDB schema facilitated the inclusion of additional data to the producers record group (a group of data elements which may occur more than once under the same entry on a given level). Zip code, Air Quality Control Region (AQCR) code, and chemical use Standard Industrial Classification (SIC) code are three non-KEY data components added to the OCPDB during this activity.

In another modification, a Chemical Use Report was created to enable users to locate chemicals according to use. In this report, a list of OCPDB chemicals is generated for a given use. Each chemical is listed along with its corresponding CAS registry number, amount produced yearly for each use, and the percent of total domestic production of each chemical for each use. Reference IPPEU numbers indicating the process(es) used to manufacture each chemical, as well as a list of producing plants for each of these chemicals, are also presented in the Chemical Use Report.

A knowledgeable System 2000 user can employ Natural Language commands to retrieve pertinent data for the broad spectrum of industrial and environmental information contained in the OCPDB. However, the System 2000 Natural Language commands are limited in their capability to perform computations and produce special output formatting. Moreover, to engage the System 2000 Natural Language commands efficiently, the user must have previous experience in the use of the System 2000 DBMS and other computer systems.

To alleviate such user restrictions and limitations, a program library of several standard report formats was developed during an earlier OCPDB development activity to retrieve and display key OCPDB chemical and producer data. S-CUBED continued the development of this program library using the System 2000 Procedural Language Interface methodology. Currently, the OCPDB program library comprises the eleven standard report formats listed in Table 2.

Table 2. Standard Data Report Formats

  Description
    Plants and Product Slates
    Plants and Product Slates (by EPA Region)
    Plants and Product Slates (by River Basin)
    Product Slate
    Chemicals and Production Sites - Nationwide
    Chemicals and Production Sites - Statewide
    Product Data Report - (PDR)
    Chemical Producers
    Chemical Use Report (Original Version)
    Minimum Site Search (Input Required)
    Chemical Use Report (S-CUBED Version)

The Product Data Report (PDR), one of the standard reports available in the OCPDB program library, presents all the information contained in the OCPDB. This information is presented in five categories: chemical, economic, chemical use, process and chemical producer information. In Table 3, data components contained within each of these categories are listed.

Table 3. OCPDB Data Components

  Chemical Data
    Chemical Identification Number
    Priority Pollutant Flag
    CAS Registry Number
    Wiswesser Line Notation
    Synonym
    NIOSH Registry Number

  Use Data
    Use Description
    Use Amount
    Percent Domestic Use
    IPPEU Reference Number for Uses
    Use SIC Code

  Economic Data
    Annual Production Volume
    Annual Sales
    Unit Cost

  Product Process Data
    Process Description
    IPPEU Reference Number
    Reaction Components
    Industrial Origin of Reactants SIC Code
    Ancillary Process Material

  Producer Data
    Name
    Identification Number
    City
    State
    Zip Code
    County
    River Basin Code
    River Basin Name
    AQCR
    Process Capacity

The selection of literature used to update the OCPDB was based on an investigation to determine the most accurate, complete and readily available information. The most up-to-date versions of literature utilized during previous maintenance activities, as well as previously unused literature, were assessed. Literature selected to perform the update was used to supplement incomplete information and construct the new data. A discussion of work performed to update chemical, economic, use, product, process, and producer-related information is presented in the project report. The literature used to perform this update should be referred to when developing an appropriate maintenance protocol for the OCPDB.

The main benefit of this most recent data expansion, update and modification to the OCPDB is that this data base can now be referred to when drawing chemical-chemical lines of evolution (trees). In this way, process chemicals used during the manufacture of a given OCPDB chemical are accounted for. Also accounted for are those chemicals for which a given OCPDB chemical is used as a process chemical. This tree concept can be developed further to produce a standard report for major evolutionary roots, thus enabling specific queries regarding finer levels of the derivation and fate of OCPDB chemicals. Additional software should be developed for a procedure to conduct specific queries by interactive dialogue with the OCPDB at a Video Display Terminal.

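The chemical-chemical tree concept described above can be illustrated with a short Python sketch. It is not part of the OCPDB software, which was written against the System 2000 Procedural Language Interface; the record layout, the function names (build_links, trace, format_tree) and the four sample product/reactant pairs are assumptions made for illustration only and are not taken from OCPDB records. The sketch follows reactant links upstream to show the derivation of a chemical and downstream to show its fate.

```python
from collections import defaultdict

# Hypothetical (product, reactant) pairs standing in for OCPDB process
# records; these sample rows are illustrative and not taken from the OCPDB.
PROCESS_RECORDS = [
    ("ethylene oxide", "ethylene"),
    ("ethylene glycol", "ethylene oxide"),
    ("ethylene dichloride", "ethylene"),
    ("vinyl chloride", "ethylene dichloride"),
]


def build_links(records):
    """Index the records in both directions.

    reactants_of[product] -> chemicals used to manufacture it (derivation)
    products_of[reactant] -> chemicals manufactured from it (fate)
    """
    reactants_of = defaultdict(set)
    products_of = defaultdict(set)
    for product, reactant in records:
        reactants_of[product].add(reactant)
        products_of[reactant].add(product)
    return reactants_of, products_of


def trace(chemical, links, seen=None):
    """Return a nested dict of everything reachable from `chemical`.

    Chemicals already on the current path are skipped, so a circular
    process chain cannot cause endless recursion.
    """
    seen = (seen or frozenset()) | {chemical}
    return {
        nxt: trace(nxt, links, seen)
        for nxt in sorted(links.get(chemical, ()))
        if nxt not in seen
    }


def format_tree(tree, indent=0):
    """Flatten a nested tree into indented text lines for a report."""
    lines = []
    for name, subtree in tree.items():
        lines.append("  " * indent + name)
        lines.extend(format_tree(subtree, indent + 1))
    return lines


reactants_of, products_of = build_links(PROCESS_RECORDS)
derivation = trace("vinyl chloride", reactants_of)  # upstream: what it is made from
fate = trace("ethylene", products_of)               # downstream: what it is made into
print("\n".join(format_tree(fate)))
```

Keeping both indexes means one traversal routine serves derivation and fate queries alike, which is the kind of two-way lookup an interactive query procedure would need.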
The project report for this effort provides detailed information on the following:

• Historical background of the OCPDB and a discussion of the most recent modification and update activities.
• Recommendations pertinent to ongoing and future OCPDB maintenance and update efforts.
• Discussion of the System 2000 DBMS structure and the information retrieval capabilities of the OCPDB.
• Operations conducted for the incorporation of additional data.
• Operations performed to increase the OCPDB's data reporting capabilities.
• Description of the OCPDB data.
• Activities performed to update outdated information, supplement missing information and derive information for new data.
• Extramural requests for OCPDB standard reports.
• Discussion of each of the eleven standard reports available in the OCPDB library along with a sample excerpt for each report type to illustrate format and content.

Finally, the report provides appendices which present a complete list of all OCPDB chemicals and a list of the producers of the 605 chemicals in the OCPDB.

Robert Soklow is with S-Cubed, San Diego, CA 92121.
Mark J. Stutsman is the EPA Project Officer (see below).

The complete report, entitled "Organic Chemical Producers Data Base Development and Update," (Order No. PB84-148 204; Cost: $13.00, subject to change) will be available only from:

  National Technical Information Service
  5285 Port Royal Road
  Springfield, VA 22161
  Telephone: 703-487-4650

The EPA Project Officer can be contacted at:

  Industrial Environmental Research Laboratory
  U.S. Environmental Protection Agency
  Cincinnati, OH 45268

U.S. GOVERNMENT PRINTING OFFICE: 1984 - 759-015/7697