U.S. Environmental Protection Agency
                      Office of Toxic Substances
                             401 M Street S.W.
                        Washington, DC 20460
Toxic Release Inventory
   Productivity Review
        Final Report
                                   July 14, 1990
                 Contract #68-W9-0037 Delivery Order #037
                         Lucille C. Henschel, D.O.P.O.
                                     Prepared by

                        Booz•Allen & Hamilton
                            4330 East West Highway
                       Bethesda, Maryland 20814-4455

                         TABLE OF CONTENTS

I. Executive Summary	1
      I.1. Study Approach	1
      I.2. Description of TRI	2
      I.3. Assessment Summary	2
            I.3.1. Data Accuracy, Timeliness and Cost	2
            I.3.2. Conclusion	3
      I.4. Summary of Issues and Recommendations	3
            I.4.1. Management	3
            I.4.2. Technical	5
II. Introduction	7
      II.1. Background	7
            II.1.1. History of TRI	7
            II.1.2. Summary Description of TRI	8
            II.1.3. TRI Future Possibilities	10
      II.2. Purpose of Study and Report	10
      II.3. Study Approach	11
III. Productivity Assessment	13
      III.1. Productivity Status Summary	13
            III.1.1. Data Accuracy	13
            III.1.2. Timeliness	14
            III.1.3. Cost	14
            III.1.4. Conclusion	15
      III.2. Productivity Issues	16
            III.2.1. Management Issues	16
                  III.2.1.1. Introduction	16
                  III.2.1.2. Changes in Regulatory Environment	16
                  III.2.1.3. Productivity Standards	17
                  III.2.1.4. Planning Environment	17
                  III.2.1.5. Contractor Organization	18
                  III.2.1.6. Contractor Statements of Work	20
                  III.2.2.1. Introduction	20
                  III.2.2.2. Data Input Technology	20
                  III.2.2.3. Form Storage and Access	22
                  III.2.2.4. Form Design	22
                  III.2.2.5. Develop Integrated Facility File	23
            III.2.3. Conclusion	23
                  III.2.3.1. Management Effectiveness	23
                  III.2.3.2. Technical Effectiveness	24
IV. Current EPA Initiatives	27
      IV.1. Introduction	27
      IV.2. Productivity Standards	27
            IV.2.1. SOW modifications	27
            IV.2.2. Form verification by reporting parties	27
      IV.3. Planning Environment	28
            IV.3.1. Hardware Replacement Planning Efforts	28
      IV.4. Data Input Technology	29
            IV.4.1. Magnetic Media Improvements	29
            IV.4.2. OCR Data Input Pilot	30
      IV.5. Facility File Development	30
      IV.6. Conclusion	31
V. Recommendations to Improve TRI	33
      V.1. Management	34
            V.1.1. Impact of Change in Regulatory Environment	34
                  V.1.1.1. Issue Summary	34
                  V.1.1.2. Alternatives	34
                        V.1.1.2.1. Minor System Modifications	34
                        V.1.1.2.2. Reassess the System	35
                  V.1.1.3. Recommendation	35
            V.1.2. Clarification of Productivity Standards	36
                  V.1.2.1. Issue Summary	36
                  V.1.2.2. Alternatives	36
                        V.1.2.2.1. Status Quo	36
                        V.1.2.2.2. Define Productivity Standards	36
                  V.1.2.3. Recommendations	37
            V.1.3. Strengthen Planning Environment	37
                  V.1.3.1. Issue Summary	37
                  V.1.3.2. Alternatives	37
                        V.1.3.2.1. Plan for ad hoc requirements	37
                        V.1.3.2.2. Establish a formal software development process	38
                        V.1.3.2.3. Establish a deliberate hardware replacement process	39
                  V.1.3.3. Recommendation	39
            V.1.4. Change Contractor Organization	39
                  V.1.4.1. Issue Summary	39
                  V.1.4.2. Alternatives	40
                        V.1.4.2.1. Status Quo	40
                        V.1.4.2.2. Separate Contractors for Facility Operations and Data Reconciliation	40
                        V.1.4.2.3. One Contractor for Facility Operations, Development, and Reconciliation	41
                  V.1.4.3. Recommendation	42
            V.1.5. Revise Contractor SOWs	42
                  V.1.5.1. Issue Summary	42
                  V.1.5.2. Alternatives	42
                        V.1.5.2.1. Status Quo	42
                        V.1.5.2.2. General Revision of SOWs	42
                  V.1.5.3. Recommendation	43
      V.2. Technical	43
            V.2.1. Upgrade Data Input Technology	44
                  V.2.1.1. Issue Summary	44
                  V.2.1.2. Alternatives	44
                        V.2.1.2.1. Keyboard Entry Enhancements	44
                        V.2.1.2.2. Magnetic Media	45
                        V.2.1.2.3. OCR Scanning	45
                  V.2.1.3. Recommendation	46
            V.2.2. Upgrade Form Storage and Access	46
                  V.2.2.1. Issue Summary	46
                  V.2.2.2. Alternatives	46
                        V.2.2.2.1. Paper Document Storage and Retrieval	46
                        V.2.2.2.2. Microfiche	47
                        V.2.2.2.3. Microfilm	48
                        V.2.2.2.4. Electronic Image Capture and Optical Storage	48
                  V.2.2.3. Recommendation	49
            V.2.3. Redesign Form R	49
                  V.2.3.1. Issue Summary	49
                  V.2.3.2. Alternatives	49
                        V.2.3.2.1. Status Quo	49
                        V.2.3.2.2. Redesign Form R	50
                  V.2.3.3. Recommendation	51
Appendix I Glossary of Terms	53
Appendix II Management Information	55
      EPA Organization and Responsibilities	55
      Contractor Organization and Responsibilities	55
Appendix III. Technical Information	59
      Hardware	59
           LAN System	59
           Detailed Hardware Assessment	59
                 LAN	60
                 Storage Capacity	60
                 Security	60
                 Uninterruptible Power Supplies	60
                 Disk Mirroring	61
                 Back-Up Hardware	61
                 Access Time	61
           Mainframe System	62
      Software	62
           LAN Software	62
           Mainframe Software	62
           Detailed Software Description and Assessment	62
                 LAN Software	62
                 Data Transfer Software	63
      Process flows	64
      Data Flow Diagrams	69
Appendix IV Documents Reviewed	77
Appendix V Data Collection Participants	79

                      I.  EXECUTIVE SUMMARY

   In February 1990, the Office of Toxic Substances (OTS) of the
Environmental Protection Agency (EPA) requested that Booz•Allen & Hamilton
Inc. conduct a productivity review of the national Toxic Release Inventory
(TRI). TRI is a public database which resides on EPA's mainframe at Research
Triangle Park, N.C. and on the National Library of Medicine's (NLM) Toxicology
Network (TOXNET). The purpose of the study is to provide an independent
assessment of the strengths and weaknesses of the program as it exists in June
1990 and to provide recommendations for improvement.

   This report, the final deliverable  in this task,  is an assessment of the
current productivity of TRI, including software,  hardware, and management
activities.  It also evaluates current improvements EPA is making to increase
the effectiveness of the system.  Finally, this report contains
recommendations for  further improvement to TRI.

   I.1. STUDY APPROACH


   The mission for TRI as defined by Title III of the Superfund Amendments
and Reauthorization Act (SARA) of 1986 is to collect and make available toxic
chemical release information to the public.  However, for this study, Booz,
Allen utilized the following standards to define productivity for TRI:

      •    Accuracy: The degree to which the data entered into the database
           is correct, both in terms  of data entry accuracy and detection and
           correction of reporter errors.

      •    Timeliness:  The extent to which EPA was able to provide the TRI
           database to NLM in a timely manner.

      •    Cost:  The adequacy of funding for TRI and the extent to which
           the above criteria impact funding requirements.

   These three criteria represent the fundamental performance dynamics that
surround TRI and its user requirements and provide a useful framework in
which to evaluate the  overall productivity of the system.

   Productivity for TRI is determined by the quality of the system architecture
and the management operations performance that supports TRI.  The
interplay between the criteria that define productivity and the system and
management actions that constitute TRI is complex.  Short-term productivity
of TRI is derived from effective management exercised within the
technological constraints of the system as it currently exists. Long-term
productivity is related to successful refinement of the existing system,
including the introduction of new technology to improve productivity and to
meet new, emerging requirements.

   I.2. DESCRIPTION OF TRI


   The Emergency Planning and Community Right-to-Know Act, Title III of
the Superfund Amendments and Reauthorization Act (SARA) of 1986,
requires facilities which manufacture, process, or use any of the specified toxic
chemicals to report annually the amounts of these chemicals released directly
to air, water, or land or that are transported to off-site dumping facilities.
These reports are due on July 1 of each year.  The same law requires EPA to
establish a national TRI and to make this information available to the public
annually via  telecommunications and other means.

   Reports are submitted by industry on a Form R to the Title III Reporting
Center (TRC) in Washington, D.C. The data is processed onto a LAN at the
center, verified for accuracy, and then periodically uploaded to a database
located on EPA's mainframe at Research Triangle Park. The data is then
analyzed for accuracy, and erroneous records in the database are corrected by
EPA and contractor personnel. Finally, when an acceptable level of data
quality is reached, the data is delivered to NLM.

   I.3. ASSESSMENT SUMMARY


   This section provides an overview of TRI productivity as defined by the
productivity standards.

   I.3.1. DATA ACCURACY, TIMELINESS AND COST


   In TRI's first reporting year (RY87), EPA achieved a data entry accuracy
level of 97.5% for all data fields in all records in the system. In RY88, EPA
reached 98.5%.  However, EPA staff do not consider this level of accuracy
sufficient to make the data completely useful to users, and additional efforts
are underway to ensure a higher level of accuracy for RY89. Currently, EPA
attempts not only to correct data input errors but also to ensure that
information provided by reporters is as accurate as possible. TRI
management's goal is to achieve "near 100% accuracy for certain key data
fields," particularly release values. EPA has expended a large amount of effort
to ensure data accuracy. As a result, the quality of the data is high
and will improve further as additional modifications to the system are made.

   TRI's RY87 data was released to NLM on June 19, 1989, and the RY88 data
was released on May 29, 1990. The Information Management Division (IMD),
which is responsible for TRI operations, has established an internal goal of
having the data ready for release to the public through NLM nine months
after the filing deadline. So far,  this goal has not been achieved.
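
   As a rough illustration of how far each release fell from the nine-month
goal, the elapsed time can be computed directly from the dates given above.
The sketch below assumes that the filing deadline for each reporting year is
July 1 of the following calendar year (consistent with the first Form R being
due in July 1988). The function and variable names are illustrative only and
are not part of any TRI system.

```python
from datetime import date

GOAL_MONTHS = 9  # IMD's internal goal: release nine months after the filing deadline


def add_months(d, months):
    """Return the date `months` calendar months after `d` (safe when d.day == 1)."""
    total = d.month - 1 + months
    return date(d.year + total // 12, total % 12 + 1, d.day)


# (filing deadline, release to NLM); the deadlines are an assumption, see above
RELEASES = {
    "RY87": (date(1988, 7, 1), date(1989, 6, 19)),
    "RY88": (date(1989, 7, 1), date(1990, 5, 29)),
}


def days_past_goal(deadline, released):
    """Days between the nine-month goal date and the actual release date."""
    return (released - add_months(deadline, GOAL_MONTHS)).days


for year, (deadline, released) in RELEASES.items():
    elapsed = (released - deadline).days
    late = days_past_goal(deadline, released)
    print(f"{year}: released {elapsed} days after deadline, {late} days past goal")
```

   Under these assumptions, the RY87 release came 79 days after the goal date
and the RY88 release 58 days after it, which suggests the gap narrowed between
the two reporting years even though the goal was missed in both.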

   Initial planning for TRI projected variable processing costs for TRI at
$18.97 per form, assuming only very basic data quality controls.  TRI has been
given recurring funding for data entry at a rate of $12.00 per form and was
given approximately $250,000 (a little over $3.00 per form, assuming 80,000
forms) for additional data quality activities.  In FY90, TRI was given a
Congressional supplemental (non-recurring funds) of $240,000 for IMD's data
normalization activities and was funded $25,000 for reporter verification of
the forms (Chemical Manufacturers Association members).  Including this
one-time funding, TRI is still operating with less funding than originally
projected in 1987 to be necessary while simultaneously trying to meet more
stringent data quality objectives.
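
   The per-form comparison in the preceding paragraph can be checked with a
short calculation. This is a minimal sketch using only the figures quoted
above. Spreading the one-time FY90 funds over the same 80,000-form volume is
an assumption made here for comparison purposes, and all names are
illustrative.

```python
FORMS_PER_YEAR = 80_000            # form volume assumed in the text
PROJECTED_COST_PER_FORM = 18.97    # initial planning projection, dollars per form

RECURRING_DATA_ENTRY_PER_FORM = 12.00   # recurring data entry funding
RECURRING_DATA_QUALITY = 250_000        # approximate recurring data quality funding
ONE_TIME_FY90 = 240_000 + 25_000        # supplemental plus reporter verification


def per_form(total_dollars, forms=FORMS_PER_YEAR):
    """Convert a lump-sum amount into dollars per form."""
    return total_dollars / forms


recurring = RECURRING_DATA_ENTRY_PER_FORM + per_form(RECURRING_DATA_QUALITY)
with_one_time = recurring + per_form(ONE_TIME_FY90)  # assumption: same form volume
shortfall = PROJECTED_COST_PER_FORM - with_one_time

print(f"Recurring funding per form:        ${recurring:.2f}")
print(f"Including one-time FY90 funding:   ${with_one_time:.2f}")
print(f"Shortfall against 1987 projection: ${shortfall:.2f}")
```

   On these figures, funding remains a little below the original projection
even when the one-time amounts are included, which is consistent with the
statement above.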

   I.3.2. CONCLUSION

   EPA has placed an extraordinary level of emphasis on data accuracy in TRI
due to the public nature of the database. This level of accuracy has been
achieved at significant cost as resources and attention have been diverted
from other activities to focus on extensive data quality/accuracy.  In
addition, this emphasis on accuracy has resulted in significant delays in
releasing the database to the public through NLM, preventing EPA from
meeting timeliness standards.  This situation is exacerbated by the lack of

   Ultimately, EPA must achieve a balance between data accuracy, timeliness,
and cost.  Establishing unambiguous, realistic, and achievable levels for these
standards is necessary if management and technical stability for the program
is to be achieved.

   I.4. SUMMARY OF ISSUES AND RECOMMENDATIONS


   This section identifies critical areas where improvements can be made to
enhance overall performance and provides our recommendations in these
areas.

   I.4.1. MANAGEMENT

   The following issues were identified during the study as areas where
improvement could be made to EPA's management approach to TRI:

      •    Impact of Change in Regulatory Environment: The potential for
           changes in regulations, which will  significantly increase the
           number of data fields stored in the database and/or the number of
           reporting parties, is high.  Should this occur, EPA will need to
           determine the most appropriate means of adapting the system to
           meet the new requirements.  Alternatives  include modifying  the
           current system to meet the new requirements or completely
           redesigning the system. Recommendation: Selecting a specific
           alternative in this case is not possible until the full impact of the
           changes is known.

      •     Clarification of Productivity Standards: In order for TRI to know
           whether or not it is accomplishing production goals, productivity
           standards must be clearly defined for data accuracy levels and
           timeliness.  Although an internal timeliness goal of 9 months
           from the reporting deadline has been selected by IMD, an
           unambiguous goal has not been identified for data accuracy.
           Selecting a specific goal for data accuracy and articulating these
           goals would enable EPA to measure productivity against an
           accurate standard, and facilitate the prioritization of resources.  It
           would also focus staff and contractor efforts on meeting a specific
           productivity goal. Recommendation: Establish unambiguous
           goals for data accuracy and timeliness.

      •     Strengthen Planning Environment:   A formal planning process
           within TRI does exist, but plans are often not followed through
           due to the responsibility of the public system to respond in a
           timely manner to demands from  external parties.  Although this
           response capability is critical for TRI, the time and funds spent
           responding to ad hoc requests often impede  long-term growth of
           the system. Three planning areas are critical in improving the
           planning process: planning for ad hoc requirements, establishing
           a formal software development process, and establishing a formal
           hardware replacement process. Recommendation: All three
           alternatives should be implemented.

      •     Change Contractor Organization: A fundamental issue that
           affects TRI operations  is the functional responsibilities that have
           been assigned to the contractors who are working on TRI.
           Currently, three contractors share the operational responsibilities
           for TRI, and this results  in a complex structure with some
           functional overlap. Three practical alternatives exist for TRI:
           maintain current tasking arrangement, separate contractors for
           facility operations and data reconciliation, or utilize one
           contractor for facility operations,  software development and data
           reconciliation.  Recommendation: EPA should utilize one
           contractor for TRI operations.

      •     Revise Contractor Statements of Work (SOW):  The SOWs of
           TRI's two primary contractors were written to support TRI during
           its start-up phase. Therefore, several activities listed are obsolete
           and current standards  for productivity are not reflected.
           Although the present contractors are performing well, this
           recommendation addresses the benefits to be gained by modifying
           these SOWs in either present or future contracts to further protect
           the government's interest and to provide incentives for
           continued high performance. Recommendation:  EPA should
           rewrite its contractors' SOWs.

   I.4.2. TECHNICAL

   Fundamentally, EPA's technical approach to TRI's information system is
sound, but enhancements can be made in the following areas:

      •    Data Input Technology:  A primary area where application of
           new technology has the potential to significantly increase TRI
           productivity is the data input process. Enhancements in the
           software utilized for manual keying, as well as magnetic media
           submissions and OCR scanning could have potentially large
           impacts.  Recommendation:  EPA is already addressing this issue
           with regard to magnetic media and OCR scanning and should
           continue its efforts to improve data input speed and accuracy.
           Since the data entry software is so crucial to TRI success, we
           recommend continued efforts to improve the design and
           functionality of the current data input software.

      •    Upgrade Form Storage and Access: With over 25,000 TRI
           facilities submitting more than 80,000 five-page forms annually to
           the TRC, document filing, storage, and retrieval have become a
           prominent issue for the overall success of the system. EPA has
           recognized that the current practice of storing paper forms in
           filing cabinets is neither cost effective nor practical and is
           investigating alternative storage means.  Recommendation: EPA
           should select and implement optical disk storage for Form Rs.

      •    Redesign Form R:  There are many concerns regarding the
           inefficient design of the present Form R as it is difficult for
           reporters to understand and hinders data entry and OCR
           scanning. This issue is time critical because present OMB
           approval for the form expires in January of  1991.  EPA will need
           to start the re-approval process  very soon to ensure that it is
           completed before the form expires. Two alternatives exist for this
           issue: retain existing form design or redesign the form.
           Recommendation:  EPA should redesign Form R and the
           instructions on how to complete the form.

   Booz, Allen feels that implementation of these recommendations will
result in substantial improvement in TRI's productivity in terms of
timeliness and data accuracy. Adoption of the short-term recommendations
from this report and continued refinement of the information system and
procedures should result in tapes being ready for NLM in less than nine
months for RY89 without reducing data quality standards.  Completion of
actions on long-term recommendations from this study should result in TRI
data entry taking six months or less within three years with high accuracy

                         II.  INTRODUCTION
      This section provides background information on the Toxic Release
Inventory System, including a summary of the history of TRI, an overall
high-level description of TRI operations, and a discussion of potential future
legislative and regulatory developments which will impact the system.

   II.1. BACKGROUND

   II.1.1. HISTORY OF TRI


   The Emergency Planning and Community Right-to-Know Act, Title III of
the Superfund Amendments and Reauthorization Act (SARA) of 1986,
requires facilities which manufacture, process, or use any of the specified toxic
chemicals to report annually the amounts of these chemicals released directly
to air, water, or land or that are transported to off-site dumping facilities.  This
legislation is based on the belief that the public has a "right-to-know" about
toxic chemicals in their communities.  Specifically, the legislation has two
main purposes:

      •    Encourage response planning for chemical emergencies

      •    Provide the public with information on potential chemical
           hazards in  their communities.

   The same law requires EPA to establish a national Toxic Release Inventory
(TRI) and to make this  information available to the public annually via
computer telecommunications and other means.

   Although SARA was passed on October 17, 1986, administrative rules
governing TRI were not finalized until February 1988.  These rules required
facilities to submit their first Form R in July of the same year.  Although
preliminary planning and staffing for TRI began prior to February 1988, the
system design could not be finalized until after rulemaking was completed
and specific reporting requirements were formalized. Consequently, EPA had
approximately six months to finalize its plans and procedures, to establish a
facility for processing the documents, and to implement an information
system. Given the daunting task of designing a system with so many
unknowns (e.g., number of reporting facilities, complexity of release
estimates, and other difficulties), the successful establishment and
organization of TRI was a remarkable accomplishment.

   During this initial planning phase, EPA chose the National Library of
Medicine's (NLM's) Toxicology Network (TOXNET) as the vehicle to satisfy
the Congressional telecommunications requirements.  Other methods of
public access have also been developed, including publications, diskettes, and
magnetic tape.

   TRI has continued to evolve operationally from its first year. The
following sections provide a high level view of TRI operations as they exist at
the time of this study.

   II.1.2. SUMMARY DESCRIPTION OF TRI


   The process by which data is received and entered into the TRI database is
summarized below and in Exhibit 1. This process has five main components:

      •    Industry submits Form Rs, containing the required reporting
           information, to the Title III Reporting Center (TRC), where they
           are processed and entered into a Local Area Network (LAN)
           database, and a sample is verified for data accuracy.

      •    Records from the LAN database are periodically uploaded to
           EPA's mainframe computer (an IBM 3090) located at the National
           Data Processing Division in Research Triangle Park (RTP), North
           Carolina.

      •    TRC staff, assisted by other contractors and EPA personnel, use
           terminals connected to the mainframe to run data quality reports,
           analyze the data for accuracy, correct records in the database, and
           standardize data (such as parent company names) across the
           database.

      •    Submitting facilities are contacted, when necessary, by EPA and
           TRC staff to notify them of noncompliance with the regulation
           and to resolve technical errors in the data.

      •    Data is transferred to NLM in segments of approximately 20,000
           records each after the data reconciliation process has been
           completed for that section. NLM indexes and loads the data on
           TOXNET. Data is also available on various other media, such as
           CD ROM, diskette, magnetic tape, and through the National
           Report and other publications.  The hardcopy forms for the
           current year are available for review by the public as soon as the
           data is entered into the network.

                          Exhibit 1: TRI Components

   [Figure not reproduced. Components shown: Required Section 313 Reporting
   (industries throughout the U.S.); EPA Processing; Release to Public;
   Public Access. Shaded area represents focus of this study.]

   Responsibility for TRI operations has been primarily delegated to two
divisions within OTS, the Information Management Division (IMD) and the
Economics and Technology Division (ETD), although other divisions also play
a role in outreach, analysis, and other areas. IMD has the responsibility for
data management implementation and operations, while ETD is responsible
for overall program guidance, regulation development, and regulatory
interpretation. Three main contractors assist EPA in TRI operations: one
with responsibility for the TRC, another which deals primarily with software
development and maintenance, and the third which assists with data quality

   More detailed information on the management and organization of TRI
(including an organizational chart) is found in Appendix II. Technical details
are located in Appendix III.

   II.1.3. TRI FUTURE POSSIBILITIES


      TRI was mandated by Congress to be a public system. The nature of
this mandate as well as the topical nature of the Form R data has resulted in
TRI receiving extensive public attention. This attention is continuing to
grow, and the public is demanding more information.  This momentum
may, in turn, result in additional reporting requirements for TRI. For
example, EPA is considering requiring new industries, such as federal
facilities, mining, agriculture, or utility companies, to comply with the
reporting requirements of Title III. Also under consideration is the addition
of certain data fields under the waste minimization section of the form or
requirements to report peak release information.

   Additionally, TRI will soon be encountering two situations which will
require strategic decisions by management.  The first situation involves the
contracts with the TRC facility contractor and the system development
contractor. Both of these contracts must be recompeted within the next
eighteen months. Additionally, in January 1991, Form R approval must be
once again obtained from the Office of Management and Budget (OMB).
These situations combined with the dynamic regulatory environment
provide an appropriate opportunity  for TRI to examine its current operations
in order to ensure that they are adequate to meet future requirements and


   The purpose of this study was to conduct a productivity review of TRI.
Booz•Allen & Hamilton Inc. was tasked with identifying current TRI
strengths and weaknesses and targeting specific areas where operations may
be improved to lessen costs, strengthen data reliability, and reduce the time
required to release data to NLM.  The study focused in particular on TRI data
collection and receipt, data entry, quality control and assurance, storage,
retrieval, tracking, and retention of data submissions.

   This report, the final deliverable in this task, is an assessment of the
current productivity of TRI, including management, software, and hardware
activities. It also evaluates current initiatives that EPA is taking to increase
the effectiveness of the system. Finally, this report contains
recommendations for further improvement to TRI.

   II.3. STUDY APPROACH


   This section describes the methodology utilized by Booz, Allen in
conducting the study through describing: the key standards utilized to
measure productivity for TRI, the management and technical perspective
used to evaluate TRI productivity, and our approach to understanding these

   Productivity for TRI is defined in this study by the following standards:

      •    Accuracy: The degree to which the data entered into the database
           is correct, both in terms of data entry accuracy and detection and
           correction of reporter errors.

      •    Timeliness: The extent to which EPA is able to provide the
           database to NLM for public use in a timely manner.

      •    Cost: The level of funding for TRI and the extent to which the
           above criteria impact funding requirements.

   These three criteria, which are interrelated, represent the fundamental
performance dynamics that surround TRI and its user requirements.  Users,
public and private sector, would like to have access to the data as soon after
the end of the reporting period as possible.  At the same  time, it is  critical that
the data in TRI be as accurate as possible as toxic chemical release numbers are
a key public indicator of environmental compliance and performance.
Locational consistency and accuracy are also important if fundamental
environmental decisions are  to be made for particular counties and cities
based upon this information. Finally, the amount of effort and resources
required to input and reconcile the data for TRI is high, and is of significant
concern to EPA. Therefore, as just described, there is a high degree of
interdependence between these criteria, and the overall performance of TRI is
strongly affected by the balance achieved between them.

   As shown by Exhibit 2, productivity for TRI is determined by the quality of
the technical  system architecture and the management operations that
surround TRI.  Productivity  issues that are fundamentally driven by
technology considerations  and the limitations imposed by the current
information system architecture were viewed from a technical perspective.
Issues that revolve around planning activities,  the organization of human
resources, procedures that control the behavior  of those  who interact with the
system and the day to day operation of the information system were
evaluated from a management perspective.

                                Exhibit 2

        Relationship between TRI Operations and Performance

        [Figure not reproduced: relates TRI Operations to TRI Performance.]


   The interplay between these two perspectives is complex and fundamental
to effective operation of TRI. Short-term productivity of the system consists
of effective management exercised within the technological constraints of the
current system. Long-term productivity is related to successful refinement of
the existing system, including the introduction of new technology to improve
productivity and to meet new, emerging requirements.

   Our approach to this study was to understand the current productivity of
TRI in terms of data accuracy, timeliness and cost through assessing the
strengths and weaknesses of the management and technical aspects of TRI
operations. Additionally, we evaluated the tradeoffs that were necessary in
order to meet program requirements within budget constraints. Finally, we
developed  management  and technical recommendations to enhance the
performance of TRI.

                  III.  PRODUCTIVITY ASSESSMENT

   This section summarizes our assessment of the management and
technical operational effectiveness of TRI based upon the standards of data
accuracy, timeliness, and cost. We also discuss key management and
technical issues which are presently driving TRI productivity and in which
improvements can be made.


   An assessment of TRI implementation must recognize that TRI, a public
database mandated by Federal law, was implemented under very tight time
constraints and that operational pressures to rapidly improve the system
have been intense. Unlike other information  systems which have an
internal, well defined set of users and user requirements, public access
systems are required to meet the sometimes conflicting requirements of a
broad range of users. This makes it difficult to establish consensus on goals
and objectives.

   Booz, Allen also realized that the performance expectations for TRI in
terms of data accuracy, timeliness, and cost are interdependent.  Efforts to
improve any one area are likely to result  in decreased performance in one or
both of the other areas.  A key aspect of TRI performance, therefore, is
understanding the tradeoffs that have been made between the performance
criteria.

   The following sections provide discussion  on TRI's current level of
productivity as measured by accuracy, timeliness, and cost.

   III.1.1. DATA ACCURACY


   EPA's original policy for data accuracy in TRI was defined as entering into
the database information exactly as it appeared on Form Rs submitted by
reporting parties, with a goal of achieving 92% accuracy for data entry.
Experiences during and after RY87 data input  demonstrated that this level of
accuracy was insufficient to meet the needs of users. As a result, EPA has
internally refined these data quality objectives significantly and has extended
the concept of data correction to include the interpretation of data submitted
in a non-standard format (e.g., chemical names with minor misspellings or
transposed latitudes and longitudes) by reporting parties on Form Rs as
well as errors made in the data entry process.

   Currently, procedures to ensure data accuracy are extensive.  EPA has
expended a large amount of effort in the data accuracy area, and the result is
that the quality of the data is high. TRI achieved an overall data input
accuracy level of 97.5% for all data fields in all records in the system in RY87
and 98.5% in RY88. At this point, data accuracy standards require near 100%
accuracy (no specific target percentage has been defined by EPA) in critical data
fields, such as release amounts.  Where reporter errors are obvious and can be
changed, they are corrected by TRI personnel, and in other cases the reporting
party is contacted and required to provide correct data.
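
   To give a sense of scale for these percentages, the following sketch
converts the overall accuracy levels into an approximate count of incorrectly
entered fields. It assumes roughly 60 fields per form (the approximate size of
the current Form R cited elsewhere in this report) and uses the RY87 and RY88
submission counts reported in the contractor discussion. The function name and
the assumption that every form carries the same number of fields are
illustrative only.

```python
def estimated_field_errors(forms: int, fields_per_form: int, accuracy: float) -> int:
    """Approximate number of incorrectly entered fields at a given accuracy level."""
    total_fields = forms * fields_per_form
    return round(total_fields * (1.0 - accuracy))


# Assumption: about 60 fields per Form R, applied uniformly to every form.
FIELDS_PER_FORM = 60

ry87_errors = estimated_field_errors(79_784, FIELDS_PER_FORM, 0.975)
ry88_errors = estimated_field_errors(82_123, FIELDS_PER_FORM, 0.985)

print(f"RY87: about {ry87_errors:,} fields in error")
print(f"RY88: about {ry88_errors:,} fields in error")
```

   Under these assumptions, a one percentage point gain in accuracy still
leaves tens of thousands of fields to be found and corrected, which is
consistent with the extensive reconciliation effort described above.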


   III.1.2. TIMELINESS

   IMD, which is responsible for TRI operations, has established an internal
goal of having the data ready for release to the public via NLM nine months
after the filing deadline. So far, this deadline has not been met. TRI's RY87
data was released to TOXNET on June 19, 1989, and the RY88 data was made
available to NLM on May 29, 1990.  In both years, start-up software, hardware,
and procedural problems as well as the need to correct errors in the data have
all significantly delayed completion of  the process.  Data entry procedure and
software changes were minimized between RY88 and RY89 in order to
stabilize the system, which should result in earlier public availability for the
1989 data assuming stable funding.
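
   The size of the shortfall against the nine month goal can be worked out
directly from the dates above. The sketch below assumes that the filing
deadline for each reporting year is July 1 of the following calendar year;
the helper names are hypothetical.

```python
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole months (safe here because the day is the 1st)."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, d.day)


def days_late(deadline: date, released: date, goal_months: int = 9) -> int:
    """Days between the internal release goal and the actual release date."""
    return (released - add_months(deadline, goal_months)).days


# Assumption: filing deadline is July 1 of the year after the reporting year.
releases = {
    "RY87": (date(1988, 7, 1), date(1989, 6, 19)),
    "RY88": (date(1989, 7, 1), date(1990, 5, 29)),
}

lateness = {ry: days_late(deadline, released) for ry, (deadline, released) in releases.items()}
for ry, days in lateness.items():
    print(f"{ry}: released {days} days after the nine month goal")
```

   On these assumptions the goal was missed in both years, although the RY88
release came closer to the goal than the RY87 release.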

   External expectations concerning the time that should be necessary to
produce the database for a particular year for release to NLM vary from three
to six months. Given EPA's decision to ensure high data quality prior to
release of the database, the volume of data to be entered into the database,  and
the current level of staffing and funding, EPA has not been able to meet these
expectations.

   III.1.3. COST

   During initial planning done prior to system implementation in 1987,
variable data processing costs for TRI were projected to be $18.97 per form*
assuming only very basic data quality controls. TRI was funded for data entry
(including data verification) at $12.00 per form and was given approximately
$250,000 (a little over $3.00 per form, assuming 80,000 forms) for  automated
data quality activities - Notices of Noncompliance (NONs) and Notices of
Technical Error (NOTEs). Subsequently, as indicated in the previous section,
data quality standards have been increased significantly.

* [...] 1986, Contract No. 68-02-4235, Task Order No. 2-24, May 1987, Debra Harper, Economics and
  Technology Division, Office of Toxic Substances, Washington, D.C. 20460. This cost estimate
  includes $1.15 for microfiche preparation, processing, and storage of Form Rs and $1.15 for
  retrieval and refile of microfiche. Currently, EPA is storing paper copies of Form Rs and does
  not use any advanced storage technology. These estimates have not been removed from the
  overall estimate because we believe that the actual costs of processing, storing, and retrieving
  paper copies of Form Rs are at least as expensive as utilizing microfiche technology.

   In FY90, TRI was given a Congressional supplemental (non-recurring funds)
of $240,000 for
IMD's data normalization activities and was funded $25,000 for reporter
verification of the forms (Chemical Manufacturer's Association members).
Including this one-time funding,  TRI is still operating with less funding than
originally projected to be necessary and simultaneously trying to meet more
stringent data quality objectives.  The following circumstances reflect the
stress imposed upon the system by this situation:

      •    TRI staff have routinely exerted extremely high levels of personal
           effort to satisfy regular TRI operational demands as well as special
           requests. EPA realizes that the current level of personal effort
           cannot be sustained in the long run without causing burnout for
           key individuals.

      •    TRI periodically has to redirect resources to critical data quality
           areas.  An example of this is when significant numbers of EPA
           personnel assist the TRC with data reconciliation. This has a
           disruptive  effect on other responsibilities.

      •    Lower priority tasks, such as documentation of procedures, tend
           to slip significantly. As a result, communications and
           coordination between contractors suffers.
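
   The funding figures quoted in this section can be put on a common per form
basis. The sketch below is a rough comparison only: it assumes that all
amounts can be spread over 80,000 forms and that the one-time FY90 funds are
simply added to the base funding. Neither assumption comes from EPA budget
documents.

```python
PROJECTED_PER_FORM = 18.97        # 1987 projection, basic data quality controls only
DATA_ENTRY_PER_FORM = 12.00       # funded data entry, including verification
DATA_QUALITY_BUDGET = 250_000     # NONs and NOTEs
ONE_TIME_FY90 = 240_000 + 25_000  # supplemental plus reporter verification funds
FORMS = 80_000                    # assumption: same form count used for every amount


def per_form(total_dollars: float, forms: int = FORMS) -> float:
    """Spread a lump sum evenly over the assumed number of forms."""
    return total_dollars / forms


funded_base = DATA_ENTRY_PER_FORM + per_form(DATA_QUALITY_BUDGET)
funded_with_one_time = funded_base + per_form(ONE_TIME_FY90)

print(f"Base funding:          ${funded_base:.2f} per form")
print(f"With one-time funding: ${funded_with_one_time:.2f} per form")
print(f"Remaining gap:         ${PROJECTED_PER_FORM - funded_with_one_time:.2f} per form")
```

   Even with the one-time funds included, the per form figure stays below the
original projection, before any allowance for inflation or for the higher
data quality standards.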

   Even if original cost estimates for TRI were overstated, the combination of
funding the program at less than the original estimates, increases in costs due
to inflation, utilizing non-recurring funds, and significantly increasing data
quality standards has resulted in a level of funding which is insufficient to
allow TRI management to fully accomplish its task. The current funding
level also gives TRI very little flexibility in refining the system to make it
more productive.

   III.1.4. CONCLUSION


   EPA has placed an extraordinary level of emphasis on data accuracy in
TRI. The current objective of achieving "near 100% accuracy" in critical data
fields without setting a specific, finite accuracy goal has resulted in
continually increasing efforts to improve data accuracy.  A new initiative to
mail release information back to all reporting parties for review and
correction demonstrates that EPA desires to achieve an even higher level of
data accuracy for these critical fields for RY89.

   This emphasis on data accuracy has been achieved at a significant cost as
resources and attention have been diverted from other activities to focus on
extensive data quality/accuracy. In addition, this emphasis on accuracy has
resulted in significant delays in releasing the database to the public through
NLM, and as a result EPA has not met timeliness standards.

   Ultimately, EPA must achieve a balance between data accuracy and
timeliness.  Establishing realistic and achievable levels for these standards is
necessary if management and technical stability for the program is to be
achieved.

   The next section of the report discusses the management and technical
configuration issues which affect the productivity of TRI.



   III.2.1.1.  Introduction
   TRI is a system in transition from an intense and creative start-up period
to a more stable, institutionalized mode of operation. The first two years of
operation have been  filled with one challenge after another as original plans
and standards were changed to meet reality, as problems in the system were
discovered and overcome, and as EPA adapted to the demands of a public
access system. Flexibility and the ability to solve critical problems as they
occurred were of prime importance during the last two reporting years.

   Now, circumstances are changing. Staff and contractor personnel are
beginning to experience burnout and are no longer as capable of high levels of
sustained effort as they once were.  Errors, delays, and mistakes which were
expected in a start-up mode are no longer tolerable. In addition, expectations
for more rapid turnaround of data, better quality, and budget constraints are
exerting  additional pressure on the organization.

   The management issues discussed below must be addressed to change and
improve the system  in the present environment. Careful attention to
potential regulatory changes combined  with planning of long  term
software/hardware enhancement efforts, refinements  in management
control, and a reassessment of contracting strategies are all necessary if the
system is to meet current and future expectations.

   III.2.1.2.  Changes in Regulatory Environment
      The present regulatory environment includes several proposals which
could substantially increase the number of reporting parties and/or the
number of data elements collected.  For example,  new pollution prevention
requirements, which are being considered within the Agency,  could increase
the number of data elements by as many as 30 fields per form (the current
form has approximately 60 fields), resulting in a substantial increase in the
amount of data maintained in the TRI system and significantly altering the
database structure. Other proposals being considered would significantly
increase the number of reporting parties by adding SIC codes of facilities or by
adding new chemicals to the list.  Requirement changes of this type will
require careful evaluation and planning to ensure that they are met in a
timely manner without disrupting the stability of the system and to make
certain that the ability to meet Congress's public access goals is maintained.
See Chapter V, Section 2.1 for our recommendations in this area.
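
   As a rough indication of what the pollution prevention proposal alone would
mean for data volume, the sketch below compares the number of data elements
per year at about 60 and about 90 fields per form. It uses the RY88 form count
reported later in this chapter and assumes that the number of forms stays
constant, which the other proposals described above would not guarantee.

```python
def total_fields(forms: int, fields_per_form: int) -> int:
    """Total data elements captured in one reporting year."""
    return forms * fields_per_form


RY88_FORMS = 82_123     # forms processed for RY88
CURRENT_FIELDS = 60     # approximate size of the current form
ADDED_FIELDS = 30       # upper end of the proposed increase

current_volume = total_fields(RY88_FORMS, CURRENT_FIELDS)
expanded_volume = total_fields(RY88_FORMS, CURRENT_FIELDS + ADDED_FIELDS)
growth = expanded_volume / current_volume - 1

print(f"Current:  {current_volume:,} fields per year")
print(f"Expanded: {expanded_volume:,} fields per year ({growth:.0%} increase)")
```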

   III.2.1.3.  Productivity Standards
   Currently, significant confusion exists concerning productivity standards
for TRI.  Although IMD has established an internal timeliness standard for
TRI of nine months from the reporting deadline, expectations for earlier
release remain high, even within EPA. This creates additional pressure on
TRI management to attempt to satisfy a variety of timeliness goals.

   In the case of data accuracy, productivity standards have evolved from the
original level of 92% as  identified in a contractor's SOW to EPA's current
stated desire of achieving "near 100% accuracy for certain key fields,"
particularly release figures. This standard is ambiguous because "near 100%"
can be defined in many different ways, which causes confusion.
Additionally, other than release data, "key fields" have not been clearly
identified, and an explicit standard for non-key fields has not been specifically
articulated.  Furthermore, this revised accuracy standard has not been
formalized or communicated  to all personnel as evidenced by the previously
mentioned contractor's SOW  which still contains the original standard.  (EPA
is aware of this issue and is responding to it.  Further discussion on the
response is provided in Chapter IV, Section 2, Productivity Standards.)

   This lack of a clear standard has not translated into a low level of accuracy
for TRI as the actual level of data accuracy is quite high as stated earlier.
However, the confusion has caused several other problems:

      •    Data accuracy "success" for TRI is undefined. Without finite data
           accuracy goals, it is not possible to determine when TRI has met
           program objectives.

      •    Prioritization of enhancement efforts based upon their impact on
           productivity cannot be made in a quantifiable manner.

      •    Budgeting decisions are not being made based upon explicit
           productivity increases measured relative to a quantifiable goal.

   Chapter V, Section 2.2 discusses our recommendation in this area.

   III.2.1.4.  Planning Environment
   Management direction for TRI is often subject to numerous compromises
with regard to the implementation of originally scheduled plans.
Unexpected, ad hoc requests have frequently diverted resources from planned
activities. These requests are often a result of TRI's need, as a public access
system, to respond to demands from external parties, especially Congress. TRI
management, therefore, feels pressure from the public as well as within the
Agency to meet these numerous ad hoc requests and to also produce the
database more rapidly, accurately, and cost efficiently at the same time.

   This responsiveness to ad hoc activities combined with TRI's need to focus
on solutions to short-term challenges during system start-up have often
resulted in a failure to meet planned development deadlines  and  to
adequately plan and execute system enhancements.  The penalties for failing
to meet internal deadlines are often very high as software has not always been
developed in a timely manner and enhancements have had to be postponed
for later reporting years.

   Another major planning issue for TRI involves the LAN hardware.  The
majority of the LAN hardware is at least two years old and soon will begin to
experience failures due to age. Additionally, the increased data load on the
LAN is adding stress.  Hardware failures during the critical data input period
following the July 1 reporting deadline can have a direct day for day impact
on timeliness and a very  disruptive effect on the data entry process. Positive
steps need to be taken to  ensure that reliability of the LAN does not
deteriorate due to age.

   Our recommendations for planning initiatives are found in Chapter V,
Sections, -

   III.2.1.5.  Contractor Organization
   A number of factors, including existing EPA contract utilization policy
and a lack of in-depth operational understanding that could only be gained
after the system was online, resulted in a contractor organization which has
significant operational overlap between contractors.  This, in turn, has
resulted in some duplication of effort as well as an increased  need for
coordination and direction to keep joint operations proceeding smoothly.

   Two contractors share the primary operational load for TRI and another
contractor plays a secondary role.  Computer Based Systems, Inc. (CBSI) is
responsible for operation of the TRC. Major activities performed  by this
contractor include receipt and storage of Form Rs (for RY87 they processed
79,784 submissions and for RY88 they processed 82,123 forms), data entry, data
searching and retrieval, and data quality/reconciliation support.

   SYCOM Inc., a subcontractor to Planning Research Corporation (PRC), is
mainly responsible for software development and maintenance.  This
contractor initially developed the data entry  software on the LAN and
currently provides ongoing support for software development and special
programming requirements for developing system enhancements, generating
reports from the mainframe, and responding to ad hoc requests for assistance.
SYCOM also completes certain day to day operational functions, such as
uploading data from the LAN to the IBM mainframe, and performs
troubleshooting activities on the LAN and for magnetic media data loads.

   Finally, PEL supports data quality efforts by providing expert advice on
chemical nomenclature to data entry operators as Form Rs are keyed into the
LAN system. Furthermore, they assist with the review of NOTEs and NONs.
Further information on contractor organization and function can be found in
Appendix II.

   As a result of the current structure, multiple organizations are responsible
for overlapping functions.  Some examples of this are listed below:

      •    Data quality activities in the past have involved at least two
           contractors in order to run and analyze routine data
           reconciliation reports.  This has resulted in delays and errors
           which have caused data reconciliation to not be performed on
           some records in a timely manner. This issue is being addressed
           by EPA for RY89 by giving CBSI control of the data reconciliation
           process.

      •    Data input software that runs on the LAN is sensitive to the
           hardware and system software configuration. It is necessary for
           the TRC contractor, who operates the LAN, to coordinate
           hardware and software changes with the software development
           contractor in order to ensure that changes in hardware or system
           software do not cause failures in the application software.

      •    Some data revision activities require synchronized work by
           two contractors in order to successfully delete records from both
           the LAN and mainframe databases and reenter modified
           information in cases where it is necessary to revise CAS numbers.

      •    Magnetic media submissions that do not load properly are shifted
           to a different contractor for analysis and troubleshooting. The
           impact is that employees from the second contractor are pulled
           from normal responsibilities to assist with magnetic media.

   Within the next two years, the PRC/SYCOM and CBSI contracts are up for
renewal. At this point, there is sufficient time to evaluate the possible
options for contractor organization and tasking and ensure that any new
contracts issued reflect the best possible organizational match with TRI
operational requirements.

   Our recommendations for this issue are found in Chapter V, Section 2.4.

   III.2.1.6.  Contractor Statements of Work
   The SOWs for the two primary contractors were written during initial
system development, when knowledge of operational procedures was limited
and there was a significant need for flexibility in contractor tasking to respond
to unforeseen circumstances. As a result, the SOWs, particularly in the
case of the software development and maintenance contractor, are written in
very general language with broad tasking.  Additionally, some initial
operational concepts, that were included in the SOWs but were either not
implemented or were subsequently altered significantly, have never been

   This contractor SOW issue is critical for TRI as two of its contracts will
need to be rebid during the next two years. Our specific recommendations for
this area are found in Chapter V, Section 2.5.

   III.2.2.1.  Introduction
   With a short initial start-up period and limited funding, EPA was
significantly constrained in its choice of a system architecture. Excessive
charges to utilize EPA's mainframe in RTP for online data entry as well as
slow response time during peak periods led TRI staff to consider a PC LAN or
a mini-computer based system as the only practical data entry architectures to
pursue at the time.  Although a mini-computer solution was viable, it was
not chosen in order to maintain compliance with overall agency hardware
architecture requirements.  The LAN, however, was not a suitable platform to
serve as the repository for the TRI database for external connectivity,
reliability, and capacity reasons. Therefore, EPA selected a LAN for the data
entry operations and utilized the mainframe at RTP as the TRI database
repository.
   To meet its software needs, EPA selected industry standard tools to support
TRI development. Some data entry and data transfer software problems were
encountered during the start-up phase.  These problems have since been
overcome, and  over the two years of TRI production,  the application has
stabilized and steady progress is being made in upgrading the software.

   Fundamentally, this architecture, software, and EPA's overall technical
approach to TRI's information system are sound; however, there are several
areas where technical issues exist and where enhancements and
improvements to the system are possible.  These issues are discussed in  the
following section.

   III.2.2.2.  Data Input Technology
   The data input technology utilized by EPA has a major impact on the level
of timeliness and data accuracy which can  be achieved in TRI. Of the several
technologies which exist to transfer data to the computer from paper
collection media, EPA is currently using two methods, keyboard entry and
magnetic media entry, and will be testing a third technology this year: Optical
Character Recognition (OCR) systems.

   EPA primarily relies upon manual keying of data by data entry operators
to enter information into the database.  Fundamentally, the data entry
software is sound although alternative mechanisms could be utilized to
enhance data quality checks.  However, manual entry of data is relatively
slow and provides many opportunities to introduce errors into the database.

   To improve upon the speed and accuracy of the data entry process, TRI
staff have encouraged the use of magnetic media for reporting.* When
reporters provide submissions in the correct format, data entry is much
faster than manual data entry,** although loading time is still not
insignificant (e.g., the loading process can take over two hours for one disk
with 49 submissions), and data entry accuracy is 100%. However, data entry from magnetic media
submissions has suffered in the past due to a lack of adherence  by reporting
facilities to EPA published standards.  The instructions for submitting
Form Rs on magnetic media were interpreted differently resulting in a
variety of submitted formats.
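
   The speed comparison in this paragraph can be made concrete with the
figures already given. The sketch below takes the TRIS keying rate from the
footnote (8 records per hour) and the magnetic media example above (49
submissions on one disk). It assumes that one keyed record corresponds to one
submission and treats "over two hours" as exactly two hours, so the resulting
speedup is an upper bound.

```python
KEYING_MIN_PER_RECORD = 7.5   # footnote: 8 records per hour
DISK_LOAD_MINUTES = 120       # "over two hours", taken as a lower bound
DISK_SUBMISSIONS = 49         # submissions on the example disk

# Assumption: one keyed TRIS record is equivalent to one magnetic media submission.
magnetic_min_per_submission = DISK_LOAD_MINUTES / DISK_SUBMISSIONS
keying_minutes_for_disk = KEYING_MIN_PER_RECORD * DISK_SUBMISSIONS
speedup = keying_minutes_for_disk / DISK_LOAD_MINUTES

print(f"Magnetic media: {magnetic_min_per_submission:.2f} minutes per submission")
print(f"Manual keying:  {keying_minutes_for_disk / 60:.1f} hours for the same disk")
print(f"Speedup:        at most {speedup:.1f}x")
```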

   Several commercial vendors have developed software to assist reporting
facilities in filling out Form Rs and creating properly formatted submissions.
However, even these commercially available products generated incorrectly
formatted files which caused errors when loading the data. This slowed data
input severely because the incorrectly formatted submissions had to be
deciphered and corrected prior to loading the data into the database.  Because
EPA staff recognize the substantial productivity gains which can be realized
with correctly formatted magnetic media submissions, they are seeking to
improve this process.  This initiative is discussed in Chapter IV, Section 4.1.

* EPA only accepts magnetic media Form R submissions on nine track magnetic tape or
  microcomputer diskettes (either 5.25 or 3.5 inch formats) formatted in DOS 2.1 or higher from
  an IBM PC/XT/AT or compatible microcomputer.
** Tracking system data can be entered at the rate of 20-25 records per hour or 2.5 to 3 minutes per
  record. TRIS keying can be done at the rate of 8 records per hour or 7.5 minutes per record.

   Finally, EPA is testing OCR technology to further attempt to speed data
entry and improve accuracy.  This technology will allow reporters to still
submit on the paper form, although Form Redesign would need to be
considered (see Chapter V, Section 3.3), as OCR can input typed or
handwritten reports by reading data directly from the form and entering it
into the database. To be an effective alternative for manual data entry, the
OCR scanner must be faster and more accurate than a human.  However, if
the scanner does not accurately capture the data on the form, its speed and
cost benefits are quickly eroded due to the monitoring and intervention
required by an operator to correct the scanner's mistakes. Chapter IV, Section
4.2 provides further detail on EPA's test of OCR technology.
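
   The tradeoff between scanner speed and operator intervention can be
expressed as a simple break-even calculation. The sketch below is purely
illustrative: the scan time and the correction time per misread field are
hypothetical values, not results from EPA's pilot test. Only the manual keying
time of 7.5 minutes per record and the approximate form size of 60 fields are
taken from this report.

```python
def ocr_minutes_per_form(scan_minutes: float, fields: int,
                         field_error_rate: float, fix_minutes_per_error: float) -> float:
    """Scan time plus the operator time needed to correct misread fields."""
    return scan_minutes + fields * field_error_rate * fix_minutes_per_error


def break_even_error_rate(manual_minutes: float, scan_minutes: float,
                          fields: int, fix_minutes_per_error: float) -> float:
    """Field error rate at which OCR takes as long as manual keying."""
    return (manual_minutes - scan_minutes) / (fields * fix_minutes_per_error)


MANUAL_MINUTES = 7.5    # from this report: TRIS keying time per record
FIELDS = 60             # from this report: approximate fields per form
SCAN_MINUTES = 1.0      # hypothetical
FIX_MINUTES = 0.5       # hypothetical operator time per misread field

limit = break_even_error_rate(MANUAL_MINUTES, SCAN_MINUTES, FIELDS, FIX_MINUTES)
print(f"OCR is slower than keying above a {limit:.1%} field error rate")
for rate in (0.05, 0.30):
    minutes = ocr_minutes_per_form(SCAN_MINUTES, FIELDS, rate, FIX_MINUTES)
    print(f"  {rate:.0%} misread fields: {minutes:.1f} minutes per form")
```

   The point of the sketch is the shape of the relationship rather than the
numbers: correction time grows with both the error rate and the number of
fields on the form, so a longer form lowers the error rate the scanner can
tolerate.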

   Booz, Allen's recommendations in this area are discussed in Chapter V,
Section 3.1.

   III.2.2.3.  Form Storage and Access
   With over 25,000 TRI facilities submitting more than 80,000 five page
forms annually to the TRC, document filing, storage and retrieval becomes a
prominent issue for the overall success of the system.

   Even though all  the data on the Form R is  captured and entered into the
database, the paper  Form R must be retained to verify the database, answer
FOIA queries and ad hoc requests, and support administrative, civil, and
criminal actions by the EPA against TRI facilities. The current storage and
retrieval system is entirely manual as well as extremely laborious and slow.
Additionally, storing a year's worth of TRI submissions requires about
1,200 square feet of storage space, including sufficient working room around
file cabinets.*
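
   A simple projection shows how quickly paper storage accumulates. The sketch
below assumes a constant 80,000 five page forms per year (the report gives
"more than 80,000", so this is a lower bound), 1,200 square feet per reporting
year, and no disposal or off-site transfer of older forms.

```python
FORMS_PER_YEAR = 80_000     # lower bound on annual submissions
PAGES_PER_FORM = 5
SQ_FT_PER_YEAR = 1_200      # includes working room around file cabinets


def cumulative_storage(years: int) -> dict:
    """Pages on file and floor space after the given number of reporting years."""
    return {
        "pages": FORMS_PER_YEAR * PAGES_PER_FORM * years,
        "square_feet": SQ_FT_PER_YEAR * years,
    }


for years in (1, 3, 5):
    totals = cumulative_storage(years)
    print(f"{years} year(s): {totals['pages']:,} pages, {totals['square_feet']:,} sq ft")
```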

   EPA staff has recognized that their practice of storing the Form Rs on-site
in filing cabinets will quickly overrun their current personnel and building
resources.  A January 1990 report by Mathtech, Inc for IMD examines the
feasibility of alternative records management  technologies to assist TRI
managers in condensing the volume and improving the accessibility of
Form R reports.

   A synopsis of the technologies examined in the Mathtech report is provided,
along with our recommendations, in Chapter V, Section 3.2.

   III.2.2.4.  Form Design
   One of the basic barriers to faster data entry is improperly completed
forms. The TRI Reporting Package for 1989 contains over 100 pages of
instructions, including examples and answers to common questions, on how
to complete the five page Form R. Still, reporting errors are common and
dramatically disrupt the data input process. The data entry software has
incorporated several edit checks  to trap  common errors; however, there are
far too many errors that cannot be classified and many require a supervisor's
attention to correct.

* This figure came from Mathtech Inc's "Feasibility Study for Alternatives for EPA Form R
  Records Management" January 1990.

   Fundamentally, the form's instructions are confusing and the form is
awkward to fill out. In the interest of reducing the number of pages a facility
must fill out, common items from the report were grouped together on the
first two pages. Consequently, when filling out the last three pages, the
reporter must constantly refer to the first two pages.  All too often, the
submitter overlooks or erroneously enters required references to data on
pages one and two.  A strong case can be made that whatever time is saved by
placing common items on the first two pages is lost in the confusion and page
flipping required when filling out the remainder of the form.

   One potentially dangerous flaw on the form is the lack of an identifier on
each page of the form.  Should any of the remaining pages be separated from
the first page, there is no identifier to match it with its proper Form R. Strict
document handling procedures implemented by TRC staff have thus far
prevented the loss or misfiling of Form R pages. However, the potential for
chaos is high.

   In addition, the present design of the form may impact on the
effectiveness of the OCR technology  in the data input process.  EPA's current
pilot test will allow TRI to determine the extent of these impacts, if any.

   Our recommendations for  Form R can be found in Chapter V, Section 3.3.

   III.2.2.5.  Develop Integrated Facility File
   In the first year of TRI operations, the data entry software required
operators to enter complete name and address information for each reporting
facility. Furthermore, for facilities that submit multiple Form Rs, the facility's
entire address was keyed into the TRI database for each Form R submitted
even though the address did not change. Also, facilities used a variety of
abbreviations for company names, as well as for city and county names.
These variations in abbreviations made accurate retrieval and use of data
very difficult.  For example, a  report requesting all releases for the city of San
Francisco would not pick up data where the city name was abbreviated SF,
San Fran, or any other variation.
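
   The retrieval problem described above can be illustrated with a small
sketch. The records, release amounts, alias table, and function names below
are invented for illustration and are not TRI data or EPA software. The sketch
shows only why an exact match on the city name misses records and how mapping
variants to one standard spelling recovers them.

```python
# Hypothetical alias table mapping known variants to one standard spelling.
CITY_ALIASES = {
    "sf": "SAN FRANCISCO",
    "san fran": "SAN FRANCISCO",
    "san francisco": "SAN FRANCISCO",
}


def normalize_city(raw: str) -> str:
    """Strip periods, collapse whitespace, and map known variants to one spelling."""
    key = " ".join(raw.replace(".", "").lower().split())
    return CITY_ALIASES.get(key, key.upper())


def total_releases(records: list, city: str) -> int:
    """Sum releases for a city after normalizing both the query and the records."""
    target = normalize_city(city)
    return sum(r["release_lbs"] for r in records if normalize_city(r["city"]) == target)


# Invented sample records, for illustration only.
records = [
    {"city": "San Francisco", "release_lbs": 100},
    {"city": "SF", "release_lbs": 50},
    {"city": "San Fran.", "release_lbs": 25},
    {"city": "Oakland", "release_lbs": 10},
]

exact_match = sum(r["release_lbs"] for r in records if r["city"] == "San Francisco")
normalized = total_releases(records, "San Francisco")
print(f"Exact match: {exact_match} lbs; normalized match: {normalized} lbs")
```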

   Additionally, facilities may change names from year to year.  Without an
integrated facility file, there is no way of tracking these name changes. As TRI
data may be used by the EPA as evidence in administrative, civil or criminal
actions against TRI facilities, EPA absolutely requires the ability to precisely
prepare release reports for a facility in order to successfully execute a case
against a facility.
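
   One way to picture what an integrated facility file provides is a stable
identifier that stays with a facility while its reported name changes. The
sketch below is a hypothetical data structure written for this discussion. It
does not describe the facility file EPA is implementing, and the identifier
and facility names are invented.

```python
from dataclasses import dataclass, field


@dataclass
class Facility:
    facility_id: str
    names_by_year: dict = field(default_factory=dict)   # reporting year -> reported name


@dataclass
class FacilityFile:
    facilities: dict = field(default_factory=dict)      # facility_id -> Facility

    def register(self, facility_id: str, year: int, name: str) -> None:
        """Record the name a facility reported under in a given year."""
        fac = self.facilities.setdefault(facility_id, Facility(facility_id))
        fac.names_by_year[year] = name

    def find_id(self, name: str, year: int):
        """Look up the stable identifier from a name as reported in one year."""
        for fac in self.facilities.values():
            if fac.names_by_year.get(year) == name:
                return fac.facility_id
        return None

    def history(self, facility_id: str) -> list:
        """All names used by one facility, ordered by reporting year."""
        return sorted(self.facilities[facility_id].names_by_year.items())


ff = FacilityFile()
ff.register("F001", 1987, "Acme Chemical")     # invented example names
ff.register("F001", 1988, "Acme Industries")
print(ff.find_id("Acme Industries", 1988), ff.history("F001"))
```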

   EPA is in the process of implementing a facility file to improve this
situation. See Chapter IV, Section 5  for a description of this effort.


   III.2.3.1.  Management Effectiveness
   Overall management effectiveness for TRI must be considered in the
context of the start-up of a new and unique system. As a public access system,
the managers of TRI were faced with the requirement to rapidly design and
implement an information system with unique requirements. A high level
of commitment and personal effort has enabled TRI to:

      •    Make the database available to the public, fundamentally
           fulfilling the legal mandate under SARA.

      •    Make significant improvements to data accuracy and the overall
           effectiveness of the data entry software for RY88 and RY89.

      •    Implement several long term initiatives to improve  the data
           input process and upgrade the capability to do data reconciliation.

   At this point, the program is fulfilling its basic mission of providing public
access to toxic release data, and is providing a very high quality database to the
public.

   Although TRI management has been successful  in implementing TRI,
there are several areas where improvements can be  made:

      •    TRI is beginning its third reporting year and is still being
           managed in a manner more appropriate for a start-up system.
           Ad hoc activities interfere significantly with long term planned
           activities and contractor task redirection is commonplace.

      •    Productivity standards have not been definitively established,
           and, in the case of data accuracy, the standard has changed
           significantly over time without an explicit management decision.

      •    The distribution of tasks among contractors is complex, as
           discussed previously, and tasking overlaps require unnecessary
           coordination and communication to accomplish routine
           operational activities.

   The fundamental challenge for TRI management is to find the time and
perspective to continue to meet  operational responsibilities, and at the same
time capitalize on upcoming opportunities for improvement to  the system.

   III.2.3.2.  Technical Effectiveness
   TRI has experienced growing pains typical of an emerging production
system. However, the architecture that was selected remains
fundamentally sound. As PC-based LANs do not have the hardware and
software reliability present in mini-computer and mainframe computer
systems, TRC operations have experienced and will continue to experience
equipment breakdowns and other reliability problems, ranging from the merely
annoying (such as jammed printers) to those which interfere significantly
with production (complete LAN failures). These reliability problems are not
insurmountable if careful system and configuration management practices
are followed.

   So long as data entry requirements for the system do not change
substantially, in terms of the number of fields or the number of records, the
combination LAN/Mainframe architecture should be satisfactory. Constant
attention to good system management, hardware replacement, and data
upload procedures is necessary in order to keep the LAN functioning.

   The following two sections in this report will address ongoing EPA
initiatives to upgrade TRI, and Booz, Allen's specific recommendations to
improve the system.

                   IV.  CURRENT EPA INITIATIVES

   EPA is already aware of many of the management and technical issues
which were discussed in the previous chapter and their effects on overall TRI
performance. To attempt to improve performance, EPA is specifically
addressing some of these issues through management initiatives. These
initiatives, which come in the form of either operational modifications or
pilot projects, seek to enhance TRI productivity through improving data
quality and timeliness.


   The following EPA initiatives are intended primarily to address issues
raised in Chapter III Section 2.1.3., Productivity Standards.


   EPA has recognized the need  to modify the TRC contractor's current SOW
so that the stated data accuracy requirement will reflect revised data quality
standards. This initiative is a positive step in translating  system experience
into clearly defined requirements.

   As of July 13, 1990, this modified SOW has not been finalized; therefore, it
could not be fully analyzed. Nonetheless, Booz, Allen strongly encourages
the development of precisely worded modifications (see Chapter V, Section
2.5.), as in this particular case, in order to clearly define revised activities,
roles, and responsibilities for TRI contractors.


   The reporter verification initiative (under discussion at the beginning of
RY89) is also intended to enhance overall data quality.  As of the drafting of
this report, the details of this plan are still being discussed. However, the
basic idea of reporter verification is to send copies of release data back to all
reporters for verification that the data entered is accurate. In one proposed
scenario, the reporter would then have fifteen days to
respond with any necessary modifications. Modifications would be returned
to the TRC, and revisions would be made to the data in the system.

   There are both advantages and disadvantages  to reporter verification,
resulting primarily  from the increase in the number of players involved in
the data quality process.  First, reporter verification would prevent most
subsequent data quality complaints by reporters, as they would have an
opportunity to review the data entered into the system, assuming that the
data correction process was performed accurately.

   The other major effect of the reporter verification initiative will be in the
area of additional revision processing and analysis. Even if reporter
verification response is only 10% to 20%, this increase in volume will still
result in additional operational requirements impacting mailroom
operations, document retrieval, data input/revision, and data verification.  In
general, these additional operational requirements will increase costs by a
little over $1.00 per form ($100,000 is being allocated for RY89) and could also
introduce additional delays at the TRC. EPA should
carefully assess this initiative to ensure that it does not unduly increase
processing burdens for the TRC and that it is contributing significantly to data
quality objectives. In view of TRI's already extensive data quality program,
Booz, Allen suggests that EPA assess the cost/benefits of this program after a
one year period.


   The following initiative addresses the planning issues discussed in
Chapter III, Section 2.1.4.


   The TRC and software development contractors have both made
recommendations to TRI management for hardware upgrades to improve
overall system reliability and performance.  In June 1990, the two contractors
and EPA staff met to discuss possible upgrade paths for the existing TRI
hardware and software.

   Several recommendations were made and EPA is currently reviewing
each.  For the short term, the upgrades will focus  on relatively low cost but
high impact improvements. For example, to speed the printing of
reconciliation reports, the TRC recommends purchasing a high-speed impact
printer to replace the low speed laser printer currently in use.

   The TRC also recommends upgrading its storage hardware with the
purchase of an optical disk archival system. The 300MB hard disks on the
LAN are quickly approaching capacity and TRC management is planning to
remove RY87 data from the active file server. Tracking data for RY87 and
RY88 cannot be removed from the LAN because the TRC is still receiving
RY87 and RY88 reports and corrections. However, in the upcoming years,
even tracking data will have to be removed from the active server. The
optical disk archival system under consideration by the TRC uses Write-Once,
Read-Many (WORM) technology. Thus, it cannot be used for data that must
be modified.  The WORM optical disk is intended as a high volume, online
storage device for data from past reporting years.

   Long-term recommendations were also made concerning upgrading the
file servers with faster, more powerful PC compatibles or even replacing
them with microcomputer-based database management hardware that would
be solely responsible for database management while the existing servers
would handle routine LAN requests.

   These solutions should improve overall system reliability and thus
decrease the potential for system failures and lost data entry time. See
Chapter V, Section 2.3. for our recommendations.


   Through  the following initiatives, TRI management is seeking to address
certain data input technology issues raised in Chapter III Section 2.2.2., Data
Input Technology.


   Currently only a limited number of facilities use magnetic media to report
their release data, and data entry from these submissions has not substantially
contributed to improvements in TRI performance due to a lack of adherence
by reporting facilities to EPA published standards.

   However, for submissions in the correct format, data entry speed has been
swift and accuracy has been 100%. Because of this promising potential, EPA is
aggressively working towards increasing magnetic media submissions with a
sharp focus on preventing incorrectly formatted submissions.

   To this end, EPA is considering issuing a solicitation in 1990 for a contract
to write a software program that will accept Form R data and save it to a disk
in the proper format.  The strategy is for EPA to send a diskette, free of charge,
with the Form R software to facilities interested in magnetic media submissions.
The facility would run the software, answer the appropriate questions, and fill in
the forms on their computer. The software would properly format the
answers and write the Form R to the disk.  The reporting facility would then
simply return the disk to the TRC for data entry.

   This initiative should greatly increase the efficiency of magnetic media
submissions  and could also encourage use of this data input technology by
other submitters.  The appropriateness of magnetic media versus other input
options is further discussed in Chapter V, Section 3.1.

   To improve the efficiency with which Form R information is entered into
the TRI database, EPA will be testing the feasibility of OCR hardware and
software to read Form Rs, and to enter the report information into the TRI
database. This pilot program, involving approximately 6,000 submissions and
over a thousand facilities, will allow TRI management to evaluate the
effectiveness of this technology in improving the speed and quality of data
entry.

   The scanning software used in this pilot required a slight adjustment to
data element boxes on the Form R; however, the form was not redesigned.
Besides the adjustments, the form's background color was changed to red so
that it would not be picked up by the OCR scanner. The hardware/software
combination being used in the pilot program will only read one page of the
Form R at a time.  Therefore, an additional burden has been placed on the
TRC staff to separate Form R pages before scanning and then recollate them
after scanning.  EPA recognizes that this limitation will create additional
processing overhead but asserts that the pilot is still a valuable test of the
capability of OCR hardware to scan numbers and text.  If the initial pilot
shows promise, an expanded pilot may be performed with a hardware and
software combination that will scan the entire form at one time. The OCR
vendor is projecting the ability to scan multiple forms within the next eight
to ten months.

   The first test of scanning forms is not planned until August of 1990.
Consequently, for this report, the effectiveness of the OCR pilot cannot be
assessed. However, the appropriateness of this technology is further
evaluated in Chapter V, Section  In addition, other issues which
impact the effectiveness of the OCR pilot, such as the redesign of Form R and
using optical media for form storage and retrieval, are discussed in Chapter V,
Sections 3.3.2. and 3.2.2.


   As discussed in Chapter III, Section 2.2.5., there is a need to provide facility
information that can be tracked from year to year in order to reduce data entry
keystrokes, improve data accuracy, and to enable TRI to track facility
performance.  EPA has designed and is implementing enhancements to the
data entry software through the addition of a facility file. The file includes the
proper name and  address for facilities which submitted release data during
RY88. Each address in the facility file is identified by a unique TRI Facility
Identification Number (TRI ID) which is also included on the facility's
Form R when the form is mailed to the facility. Therefore, the data entry
operator has only to type the TRI ID and the data entry software automatically
retrieves and displays the name and address associated with that number.
This single improvement to the data entry software increases data accuracy by
standardizing facility addresses and decreases input time by significantly
reducing keystrokes for data entry operators.
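
   As a rough sketch of the lookup described above, the data entry operator
keys only the TRI ID and the software fills in the stored name and address. The
ID format, field names, and sample entry below are assumptions made for
illustration; this report does not describe the actual layout of the facility file.

```python
# Hypothetical facility file keyed by TRI ID; the ID and address are invented.
FACILITY_FILE = {
    "TRI-000001": {
        "name": "Example Chemical Co.",
        "address": "100 Main St, San Francisco, CA",
    },
}


def start_form_entry(tri_id, facility_file=FACILITY_FILE):
    """Begin a Form R record from a keyed TRI ID, reusing the stored name and address."""
    key = tri_id.strip().upper()
    facility = facility_file.get(key)
    if facility is None:
        # Unknown ID: the operator would have to key the facility data by hand.
        return None
    return {"tri_id": key, **facility}


record = start_form_entry(" tri-000001 ")
```

   Because every record for a facility carries the same stored name and
address, variations in spelling and abbreviation never enter the database.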

   The facility file is currently only operational on the LAN but will be added
and tested on the mainframe system in November  of 1990.  With the
mainframe capability and the addition of the TRI ID to release records, EPA
will be able to perform year-to-year trend analysis on individual facilities.
The trend analysis will be able to detect reporting inconsistencies* as well as
show the improvements a certain facility is making in reducing toxic releases.
Because of the legal importance of data in the TRI database, the facility file is a
significant addition to improving accuracy of the TRI system. Therefore,
Booz, Allen strongly endorses this initiative.
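
   The kind of consistency check that such trend analysis makes possible can
be sketched as follows. The 50% threshold and the function name are
assumptions chosen for illustration; this report does not state what rule EPA
would apply. The sample quantities are those given in the footnote to this
section.

```python
def flag_inconsistent_years(releases_by_year, max_change=0.5):
    """Return reporting years whose release differs sharply from the prior year.

    releases_by_year maps a reporting year label to pounds released.
    max_change is a hypothetical threshold (0.5 means a 50% change).
    """
    flagged = []
    years = sorted(releases_by_year)
    for prev, curr in zip(years, years[1:]):
        before = releases_by_year[prev]
        after = releases_by_year[curr]
        if before > 0 and abs(after - before) / before > max_change:
            flagged.append(curr)
    return flagged


# Quantities from the report's footnote example, keyed by reporting year.
history = {"RY87": 25_000, "RY88": 24_500, "RY89": 2_400}
to_recheck = flag_inconsistent_years(history)
```

   A flagged year is only a prompt to recheck the entry against the original
form. A sharp drop may equally reflect a genuine reduction in releases.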


   The EPA initiatives discussed in this section will successfully address
many of the areas where improvements are necessary in TRI operations.  If
implemented correctly, these improvements should allow TRI data to be
available through the NLM system in a more timely and accurate fashion.
With the exception of the reporter verification initiative which will have a
recurring cost, the remainder of the  improvements will demand mainly
upfront investments.  However, these investments will pay off in the long
run with an improvement in overall TRI productivity.
* For example, if a company reports that 25,000 pounds of a certain chemical were released in
  RY87, 24,500 pounds in RY88, but only 2,400 pounds in RY89, the data would be rechecked
  against the original form to ensure accuracy.


   This section presents Booz, Allen's recommendations to improve TRI
operations. The recommendations are presented in two sections:
management recommendations and technical recommendations.

   Each alternative is described and evaluated separately for its contribution
to the primary criteria of data accuracy, timeliness, and cost.  In essence, this
evaluation measures the long-term contribution of each alternative to the
program. Each alternative is also evaluated against the following criteria,
which assess the practical feasibility of implementation:

      •    Implementation cost: The estimated cost to implement a
           particular option.

      •    Implementation time: The estimated time to implement a
           particular option.

      •    Implementation risk: An assessment of the likelihood of
           successful implementation of a particular option.

   The priority of a particular alternative is based upon a combination of the
overall contribution to long-term productivity, coupled with the short-term
cost and ease of implementation.  The recommendation summary provides
an overview of the alternatives and a graphic illustration of their
contribution to both the primary and ease-of-implementation standards.
Additionally, we have indicated areas where our recommendations agree
with ongoing EPA initiatives.

                                   Recommendations to  Improve  TRI

   Management recommendations will be discussed in this section along
with the impacts of the recommendations on TRI.


   V.1.1.1. Issue Summary
      As previously discussed in Chapter III, Section 2.1.2., there is a high
potential for changes in the regulations that would significantly increase the
number of data fields stored in the database and/or the number of reporting
parties.  Should this occur, EPA will need to determine the most appropriate
means of adapting the system to meet the new requirements. This
recommendation will address potential courses of action should these new
regulations become reality.

   V.1.1.2. Alternatives
      V.1.1.2.1. Minor System Modifications
   One option is to adapt the existing system to meet the new requirements.
To accommodate the new reporting requirements, EPA would need to
perform an impact study to evaluate the additional workload and subsequent
impacts on the system. If additional fields are added to the form, the database
and data input applications would need to be modified to handle the changes.
The results of this study would then be utilized to determine the necessity for
any increase in system size or requirement to add hardware, including the
possibility of adding a second LAN to handle the additional workload.

   Cost, Timeliness, and Data Quality Impacts: If the new laws or regulations
necessitate the addition of significant amounts of data  to TRI, this option
would have an adverse impact on cost and timeliness  (in direct correlation to
the additional data load).  These adverse impacts would arise due to an
increased amount of data which would have to be entered, verified, and
reconciled by approximately the same number of data entry operators. If
manual keying remained the primary data input method, the time and cost
impacts would be even more extreme. The impact of this option on data
quality would depend primarily on the complexity of the new data fields.

   If the new laws  or regulations only require the addition of minor amounts
of data, this option would have only minimal negative impacts on time and
cost. These impacts again would be directly related to  the amount of extra
keying which would be required, and data quality would basically remain
unaffected unless the new data fields were much more complex.

   Implementation Time, Cost, and Risk Impacts:  This alternative would
initially have a negative effect on time and cost, as both would be required to
perform the impact study and to purchase any new hardware determined
to be necessary. In addition, time and money would have to be allocated to
modify application and output software as well as public access products to
accommodate additional data fields.  The amount of time and money
required for that task will depend upon the complexity and number of
modifications which are necessary.  The risk involved with implementing
this alternative is extremely low as TRI operations and architecture will be
fundamentally the same.

      V.1.1.2.2. Reassess the System
   Another option is to reassess the architecture for the system. To
accomplish this, EPA would perform life cycle analysis in accordance with
OIRM policy* and assess the costs and benefits of redesigning the system to
accommodate the additional requirements.  This assessment would include
evaluating alternative hardware and software approaches to meeting system
requirements.

   Cost, Timeliness and Data Quality Impacts:  If this particular option is
selected, it provides the ability to optimize all three performance standards in
the long run as the new system would be designed utilizing operational
knowledge gained during the first three reporting years.  The system could
also be developed to handle additional data loads  in a more efficient manner.

   Implementation Time, Cost and Risk: The time and cost required to
implement this recommendation would be significant. If a life-cycle
reassessment of the system is necessary, then a minimum of two years should
be allowed to design and build a new system. The implementation costs will
depend upon the new hardware and software which are determined to be
necessary.  Implementation risk is relatively low in this case as the system can
utilize commercial off-the-shelf hardware and software development tools.

   V.1.1.3.   Recommendation
   The appropriate path for TRI to take depends upon the extent of the
changes mandated by the new requirements. If the new  regulations do not
significantly increase the number of data elements to be stored in  the
database, resulting in the need for only minimal changes to the data input
and database software, the most effective approach would be to perform a
brief impact study and to add hardware to the LAN as necessary to meet the
additional data input requirements.  However, if the number of additional
data fields to be added to the form is sufficient to require extensive
modifications to the data input and database software, then EPA will almost
certainly need to design a new system and should, therefore, begin with a
reassessment of the entire system.  If the application software must be
redesigned and coded in any case, then EPA should take advantage of newer
technology and operational experience gained with the present system to
redesign and rebuild the system. Any reassessment should also take into
consideration changes in data input/storage technology (e.g., the use of OCR
or optical storage).
* EPA System Design & Development Guidance, Office of Information Resource Management,
  June 1989.


   V.1.2.1.  Issue Summary
   In order for TRI to measure success in terms of its production goals,
productivity standards must be clearly defined for timeliness and data
accuracy levels. Currently, there is no consensus between EPA and the public
on a realistic database production schedule, so TRI is continuously
plagued by complaints that it is not realizing timeliness objectives.
Additionally, the data accuracy objective is too vague and is hindering EPA's
ability to determine which data quality activities are most appropriate based
on explicit, quantifiable productivity increases.

   V.1.2.2.  Alternatives
      V.1.2.2.1. Status Quo
   In this alternative, TRI continues to function with a variety of timeliness
expectations, including IMD's nine month NLM release deadline.  TRI's data
quality objective remains at "near  to 100%" in key fields.

   Cost, Timeliness and Data Quality Impacts:  Maintaining the status quo
will adversely impact cost and timeliness.  Effective management of TRI is
inhibited by the difficulties of operating according to ambiguous productivity
goals. Because the data accuracy goal is to strive for near to 100%, EPA has
dedicated itself to this end, usually at the expense of other activities and, in
the first reporting years, specifically at the expense of timeliness.  This is a
pattern which will be sustained as new methods of improving data accuracy
will, most likely, be proposed with each new reporting year. Although this
environment leads to somewhat greater accuracy, the small gains in accuracy
result in significant, negative impacts on both cost and timeliness.

   Implementation Time, Cost and Risk Impacts: There are no
implementation impacts associated with the status quo alternative.

      V.1.2.2.2. Define Productivity Standards
   In this option, more specific, realistic productivity standards would be
defined through a consensus of key TRI parties.  The creation of a concrete
target for data accuracy would provide a measure for success. Similar
agreement on  timeliness would allow for true assessments of TRI
accomplishments  in regard to the pace of database production.

   Cost, Timeliness and Data Quality Impacts: This alternative would have
positive overall impacts on productivity. A finite, stated data quality goal
would lessen the pressure to reach perfection (while still maintaining high
data accuracy) so costs associated with data quality would be stabilized.  This
goal would also allow TRI to quantitatively measure success in terms of data
accuracy thus strengthening funding requests during the budgeting process.
At the same time, a sharp focus on production schedules would be possible
with fewer interruptions for additional or emergency data accuracy activities,
thus improving timeliness of the database.

   Implementation Time, Cost and Risk Impacts: The time and cost required to
implement this alternative would depend upon the amount of time
necessary to reach agreement on reasonable productivity goals. One potential
risk is that the creation of specific goals will place pressure on TRI to meet
these goals.  However, as TRI is already under pressure to meet widespread
expectations, establishing specific goals will not significantly add to the risk.

   V.1.2.3.   Recommendations
   Booz, Allen recommends that EPA define productivity goals. Specific
accuracy and time goals would provide TRI with a concrete means of
measuring success and would allow EPA to clearly communicate its priorities
to contractors and to the public. These goals should be incorporated into the
contractors' SOWs.


   V.1.3.1.  Issue Summary
   A formal planning process within TRI does exist, but plans are often not
followed due to the responsibility of the public system to respond in a timely
manner to demands from external parties.  Although this response capability
is critical for TRI, the time and funds spent responding to ad hoc requests
often impede long-term growth of the system.  The planning process is
further complicated by the rigid time frames in which TRI must operate.  If
deadlines  are not strictly adhered to, improvements in the system (especially,
data entry software and hardware) must be postponed until the next reporting
year.  Specific issues related to strengthening the planning process and
addressing these concerns will be presented in this section.

   V.1.3.2.  Alternatives
      V.1.3.2.1. Plan for ad hoc requirements
   During the annual planning process, the resources needed to respond to
the expected volume of ad hoc requests should be estimated.
These requirements would then be incorporated into the budgeting process so
that specific resources could be allocated in support of these activities. The
resources that are utilized to respond to ad hoc requests should be tracked to
determine the amount of funding employed.  Armed with this knowledge,
management will then be able to better understand the tradeoffs which are
being made between ad hoc activities and day-to-day operations and make
deliberate decisions as to the most appropriate utilization of resources.

   It could also be determined from this process whether the number of
inquiries received necessitates dedicating FTEs (at least partially) to
responding to these requests.  This would allow the remaining staff members
to concentrate on day-to-day tasks and long-range planning and
improvement of TRI operations.

   Cost, Timeliness and Data Quality Impacts:  This option would help
monitor and control costs as EPA managers would be more aware of the
resources which are spent on ad hoc requests and could determine the level
of funding which they feel is appropriate.  In the long run, the inclusion of
ad hoc activities in the planning process will improve both timeliness and
data quality as certain staff would be dedicated to responding to requests, thus
allowing regular staff to continue with regular operations.

   Implementation Time, Cost and Risk Impacts: There are only very
minimal implementation effects associated with this option.  Extra time and
resources would be required in the planning process to include ad hoc
activities; however, this should not be significant. There is really no risk
associated with this option as the planning process is not being substantially
changed.

      V.1.3.2.2. Establish a formal software development process
   In the second option, a formal software development planning process
lasting at least a year and a half would be established to help define software
requirements and desired improvements and to ensure that improvements
are implemented in a timely fashion. This planning  process would allow
EPA to determine when software modifications can best be made and tested
in order to meet operational deadlines and to anticipate the workload
required to complete these modifications.

   Cost, Timeliness and Data Quality Impacts: The establishment of a formal
software development process would provide improvements  in all three
areas.  If software modifications are planned over a period of a year and a half,
then data entry software should be ready, including complete testing, to start
data entry at the designated time.  Planning time for complete testing of the
software should ensure that "bugs" have been identified and corrected so that
problems do not occur and data entry is not delayed.  Although it will not  be
possible to predict all requirements eighteen months in advance, those which
can be ascertained can be completed first, reserving time for any last minute
development.  Timely data entry also allows the data verification and
reconciliation teams to begin their work immediately; therefore, feedback on
errors can be provided to keyers and alterations can be made while data is still
being entered. When delays are eliminated, costs are reduced.  Additionally, a
formal planning process will help optimize the utilization of funds which
TRI has to spend on software development.

   Implementation Time, Cost and Risk Impacts: This option would have
minimal implementation impacts. The establishment of this formal
planning process may initially require additional staff and contractor time.
However, this would not be extensive as staff members already deal
informally with software development issues.  Because this is not a new area,
there is no risk associated with formalizing this process.

      V.1.3.2.3. Establish a deliberate hardware replacement process
   In this option, EPA would establish a formal hardware replacement
process. In this process, current hardware would be assessed each year from
an overall system reliability and performance standpoint and necessary
replacements/upgrades would be identified.  This information would then be
utilized as an input to the overall planning process. This process is presently
occurring within EPA in an informal manner as discussed in Chapter IV,
Section 3., Hardware Replacements. However, under this option this process
would  be formalized so that it occurs annually.

   Cost, Timeliness and Data Quality Impacts:  This option would allow EPA
to plan for hardware replacements/upgrades. Although TRI will still have to
expend funds to replace the hardware, these funds will have been planned for
and budgeted. Additionally, available funds can be optimized through this
planning process. A formal replacement process should improve timeliness
and data quality  as the potential for system failures and lost data entry time
will decrease as overall system reliability and performance improves.

   Implementation Time, Cost and Risk Impacts:  As previously discussed,
EPA is already assessing hardware needs.  Therefore, the formalization of this
function and its  incorporation into the planning process should have no
significant implementation impacts.

   V.1.3.3.    Recommendation
   Booz, Allen's recommendation is that all three of these options should be
adopted. The implementation costs associated with each option are small, and
each option would have a high payback. The benefits of strengthening the
planning process through these options are substantial, as they will allow TRI
management to better anticipate and plan for ad hoc requests, software
modifications, and hardware replacements and to optimize the use of
available funding. This improved planning results in fewer delays and
improved data quality, and therefore in stronger overall TRI performance.


   V.1.4.1.   Issue Summary
      As discussed in Chapter III, Section 2.1.5., a  fundamental  issue that
affects  TRI operations is the functional responsibilities that have been
assigned to the contractors who are working on TRI.  Based on EPA's
contracting policies, three contractors currently share the operational
responsibilities for TRI, and this results in a complex structure with
significant functional overlap.

   V.1.4.2.  Alternatives
      V.1.4.2.1. Status Quo
   EPA can continue to operate with the current division of responsibility
between software maintenance and TRC facilities management contracts with
the same functional responsibilities as contained in current contracts. This
option would require no evaluation of current operational procedures, but
would also mean that the current overlap in functions would continue into
the future. This option was effective during system start-up when
development and debugging of initial software was a critical activity.
However, at this point, having been utilized for two data input cycles, the
software is more stable and is likely to remain so unless major changes to
program requirements occur.

   Cost, Timeliness and Data Quality Impacts:  Maintaining the current
contractor tasking would provide no improvement to program cost,
timeliness or data quality as present overlap, coordination and
communications difficulties would persist.

   Implementation Time, Cost and Risk Impacts: As little or no change is
contemplated with this option, impact on implementation time, cost and risk
for this option is negligible.

      V.1.4.2.2. Separate Contractors for Facility Operations and Data Reconciliation
   Another viable alternative would be to have one contractor provide TRI
operations including data input and form verification as well as software
development and maintenance support, and another contractor provide data
reconciliation services.  This would align contractors' work with the primary
performance criteria for TRI as the operations contractor would be responsible
for timeliness, and the reconciliation contractor primarily responsible for data
quality. It would be particularly important under this scenario to ensure that
an adequate feedback loop existed between the data quality contractor and the
data input contractor to ensure that the causes of data entry problems detected
in the data reconciliation process are reported back and corrected by the data entry
contractor.

   Should program requirements change so that a major rebuild of the
application software is necessary, it may be necessary under this option to
utilize a separate contractor to develop the application. If this is necessary,
care must be taken to acquire detailed programming documentation so that
the resulting application is maintainable by  the data entry contractor.

   Cost, Timeliness and Data Quality Impacts:  This option would
considerably improve timeliness and data quality due to the focused approach
that each contractor would have.  Cost improvements under this option
would be achieved by focusing contractors on primary productivity goals (data
quality and timeliness).  This option would also improve timeliness and data
quality by maintaining checks and balances in the system by having
independent responsibilities for data input and data quality review.  If the
functional boundaries were carefully engineered, many of the coordination
and communications issues could be minimized. However, there would still
be a need for EPA coordination and supervision to ensure that both
contractors worked together effectively.

   Implementation Time, Cost and Risk Impacts: This option would require
EPA staff or contractor analysis to structure new SOWs in such a way as to
comprehensively cover current TRI operations.   This process should be done
carefully so that start up problems are minimized. Since approximately one
year remains before the first contract expires, sufficient time exists to
accomplish this task without disrupting ongoing management activities.
Relative to overall TRI operations, the cost to accomplish this analysis would
be low.  If done properly so that all aspects of TRI operations are covered, the
risk of implementing this option is low.

      V.1.4.2.3. One Contractor for Facility Operations, Development and Reconciliation
   A third option is to have one contractor perform all three functions of
center operations, software development and maintenance, and data
reconciliation. This option places total responsibility for all operational
work under one contractor. EPA's role as a coordinator between contractors
would be simplified significantly, as all responsibility for timeliness and
data quality would rest with one contractor.

   Cost, Timeliness and Data Quality Impacts:  This option provides
complete responsibility for TRI operations in the hands of one contractor.
This minimizes coordination and communication problems,  and would
allow the contractor  to integrate all TRC operational activities in the most
efficient way possible. System development and support activities would be
integrated and would focus on providing comprehensive support to TRC
activities. The feedback loop between users and software developers would be
shortened and made responsible to the same internal management structure.
This will result in more efficient data entry software, and reduce the
complexity of the software development planning process.  EPA staff
personnel would be able to focus their energy on planning and directing the
activities of a single contractor. Because overlap potential is eliminated, costs
would be reduced. The key to this option is good contractor performance, and
that requires careful evaluation of bidders against a well-written SOW.

   Implementation Time, Cost and Risk Impacts:  Implementation time and
cost are essentially the same for this option as for the one previous. This
option places all responsibility for TRI operations in the hands of one
contractor and thus slightly raises implementation risk.

   V.1.4.3.  Recommendation
   Booz, Allen's recommendation is that utilizing a single contractor for all
TRI operational functions would provide the most effective contractor
organization for TRI. This option relieves EPA staff of most contract
management and coordination details so that they can devote more time to
other management issues related to TRI. On the operational level,
configuration management issues on the LAN which affect software
development and support would be simplified, and the need  to coordinate
activities in order to run and analyze reconciliation reports would be


   V.1.5.1. Issue Summary
   TRI contractors' current SOWs were written during system start-up when
the need for considerable flexibility to respond to problems and broad tasking
were important requirements.  Additionally, some initial operational
concepts that were included in the SOWs, but were either not implemented
or were subsequently altered significantly, have never been deleted.

   This issue is especially timely for TRI as two of its contracts will be
renewed during the next two years.

   V.1.5.2. Alternatives
      V.1.5.2.1. Status Quo
   One alternative is for TRI to continue to operate with the SOWs as they
were originally written.

   Cost, Timeliness and Data Quality Impacts: In the long-run,
improvements in the cost, timeliness, and data quality areas would not be
shown.  Contractor operations would still be dictated by SOWs which were
written  for the system start-up phase, rather than reflecting the requirements
of the current operational reality.  This is not to imply that the current
contractors are not performing well as both contractors have been providing
services in excess of what is required in their current SOWs. However,
during the contract rebidding process, another contractor could be chosen. If
this occurs and the current SOWs are utilized, there would be no mechanism
for ensuring that this contractor meets appropriate timeliness and data quality
standards.

   Implementation Time, Cost and Risk Impacts:  There are no
implementation impacts associated with the status quo.

      V.1.5.2.2. General Revision of SOWs
   Under this alternative, the current SOWs would be rewritten to delete
obsolete activities and to more accurately and specifically describe and
delegate current TRI roles and responsibilities.  Furthermore, EPA would
incorporate current productivity goals into the SOWs as is presently being
done with data accuracy (see Chapter IV, Section 2.1 for additional
information).

   Additionally, the SOWs would be modified to provide separate tasking for
ad hoc requests  so that performance on these requests does not adversely
impact routine operations or software development efforts without
management approval by EPA. This applies particularly to the current
software development and maintenance contract.

   Cost, Timeliness and Data Quality Impacts:  Implementation of this
alternative will  have a positive impact on all three criteria. SOWs with
clearly defined productivity goals and management controls to enforce these
goals  should allow TRI management to more closely control costs, while
monitoring progress in terms of deadlines and accuracy levels. In addition,
the government's interests are more appropriately protected in case of a
contractor change.

   By providing separate tasking for ad hoc requests, resources planned for
designated activities will not be diverted without management consciously
deciding to do so.  Therefore, overall timeliness and data quality should
improve if special requests are not allowed to divert extensive resources from
day-to-day operations or long-range operational improvements.

   Implementation Time, Cost and Risk Impacts:  Implementation of this
option would require a minimal time/financial investment. To rewrite the
SOWs, the lessons learned during TRI operations and the specific
productivity standards would have to be incorporated.  Therefore, consensus
would first need to be reached on the productivity standards before
implementation of this alternative.  The risk associated with rewriting
contractor SOWs is low as many of the necessary modifications are known to
TRI staff and are actually already being adhered to by present contractor staff.

   V.1.5.3.  Recommendation
   Booz, Allen  recommends that TRI pursue the second alternative and
rewrite the SOWs.  This option would not involve extensive resources or risk
to implement, but it would allow EPA to ensure that the SOWs more closely
match the operational reality and performance expectations. As TRI will
soon be required to rebid two contracts anyway, this is an ideal time for the
Agency to pursue this option.


   Technical recommendations will be provided along with the impacts of
the recommendations.

   V.2.1.1.  Issue Summary
   New data input technology has great potential for affecting data accuracy
and timeliness.  However, the demand for fast, accurate data input must be
balanced by cost.  Different methods of data input offer different levels of
speed and accuracy at different costs.  The following recommendations discuss
available data input technologies and the desirability of each balanced against
costs. Additionally, there remain considerable productivity gains to be made
by increasing the effectiveness of the manual keying process, which will
always be necessary for some portion of the submissions.

   V.2.1.2.   Alternatives
      V.2.1.2.1. Keyboard Entry Enhancements
   After two years of use and improvements, the current keyboard data entry
software is robust and effective. There is, however, one area where
improvements can be made.  One method EPA has devised to improve  data
entry accuracy is to place "stops" in the program  for critical fields. After
keying data in certain fields, the program "stops" and asks the entry operator
to visually check the entry for accuracy. To verify the data field, the entry
operator must locate  the data on the form, memorize the data, then locate the
entry on the screen and compare the data.  The entry operator must then
find their place on the form to continue.

   Our opinion is that this type of data checking is cumbersome and
counterproductive. The simple motions required to perform this check are very
disruptive and, multiplied over the course of a day, result in reduced data
entry throughput.  Additionally, operators are able to bypass the intention of
the check by simply pressing the continue key without performing a visual
check of the data.

   A variation of the "stop" check would be to replace the "stops" in the data
entry software with double entry keying. After keying data  in certain fields,
the program hides the data and asks the operator to key it in again.  After
rekeying the data, the software compares the two entries.  If the two entries
match, the program continues. A skilled touch typist can type a number twice
without lifting their eyes from the form much faster than they can type the
number, stop and compare the number on the form with the number on the
screen.
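
   The double entry keying check described above can be sketched as follows.
This is an illustration only and not the TRI data entry software: the function
name, the prompt text, the retry limit and the behavior on a mismatch (rekey
both entries) are all assumptions made for the example.

```python
from typing import Callable


def double_entry(read_key: Callable[[str], str], field: str,
                 max_attempts: int = 3) -> str:
    """Key a critical field twice; accept it only when both entries match.

    read_key is any function that shows a prompt and returns the keyed text
    (hypothetical interface; the built-in input() fits it).
    """
    for _ in range(max_attempts):
        first = read_key(f"{field}: ").strip()
        # A real entry screen would hide the first value at this point so the
        # operator has to rekey it from the form rather than copy the screen.
        second = read_key(f"Re-key {field}: ").strip()
        if first == second:
            return first
        # Mismatch: fall through and ask for both entries again.
    raise ValueError(f"{field}: entries did not match after {max_attempts} attempts")


# Interactive use (assumed field name):
#     quantity = double_entry(input, "Release quantity")
```

Unlike the "stop" check, the operator cannot advance without producing two
matching entries, so the check cannot be skipped with the continue key.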

   Cost, Timeliness and Data Quality Impacts:  This option will bring an
additional quality control check to data input which cannot be bypassed  by the
data entry operator.  Furthermore, it allows data  entry to proceed smoothly so
timeliness will benefit. By ensuring better data quality during the entry stage,
the reconciliation staff would expend less funds  on corrections, thus
decreasing overall costs for TRI.

   Implementation Time, Cost and Risk Impacts:  Implementing this change
would require only minor programming changes to a minimal number of
fields in the data entry software, and hence has low implementation impact
in time, cost and risk.

      V.2.1.2.2. Magnetic Media
   Another option for data entry is magnetic media submissions in which
the required Form R submission is accepted on a floppy disk instead of a
paper form.  As stated in Chapter IV, Section 4.1, EPA is considering an
initiative as a follow-up to RY88's magnetic media pilot project, to
implement magnetic media through the development of standard software.
Booz, Allen supports this initiative and other activities which would increase
the number of magnetic media submissions.  The pilot project proved that if
the data input software and the format of the data on the disk could be
standardized, this approach provided rapid, error-free data input.  However,
data normalization activities still need to be performed.

   Cost, Timeliness and Data Quality Impacts:  This option will bring
additional quality control checks to data input. Furthermore, by design, it
enforces EPA standards that have been ignored or overlooked in the past.
With a 100% guarantee that the magnetic media data file is properly
formatted, both timeliness and data entry quality will benefit. Also, by
ensuring better data quality during the entry stage,  the reconciliation staff
would expend less funds on corrections, thus decreasing overall costs for TRI.
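
   One example of a field-level check that standard submission software could
apply before a disk is accepted is validation of the CAS Number check digit.
The sketch below is illustrative only: this report does not specify which
checks EPA's software performs, and the function name is hypothetical. It
relies on the published CAS rule that the final digit equals the weighted sum
of the preceding digits (weights 1, 2, 3, ... counted from the right) modulo 10.

```python
import re


def is_valid_cas(cas: str) -> bool:
    """Return True if cas is formatted NNNNNNN-NN-N with a correct check digit."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", cas.strip())
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(weight * int(d)
                for weight, d in enumerate(reversed(digits), start=1))
    return total % 10 == check_digit


print(is_valid_cas("71-43-2"))    # benzene: True
print(is_valid_cas("71-43-3"))    # wrong check digit: False
```

A check of this kind catches single mistyped digits at the reporter's end,
before the submission reaches data entry or reconciliation.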

   Implementation Time, Cost and Risk Impacts: The cost and time required
for developing the software and implementing the program, while not
insignificant, are far outweighed by the potential reduction in manual data entry
and the improvement in data entry quality.  Hence timeliness, data quality and cost are
all positively affected. If EPA proceeds on schedule, software should be
available for RY90 entry. Because this option has already been explored by a
pilot program, risks are low.

      V.2.1.2.3. OCR Scanning
   To improve the efficiency with which Form R information is entered into
the TRI database, EPA is conducting a pilot program utilizing an Optical
Character Recognition (OCR) system.  This test will allow TRI management to
evaluate  the effectiveness of this technology to improve the speed and quality
of data input from typed or handwritten  forms.

   Full integration of OCR scanning into the data input process will require
significant additions to the data input software. Additionally, EPA may wish
to evaluate the possibility of integrating this option with the utilization of
optical storage for form access and retrieval.  Finally, if Form R is redesigned,
the requirements of optical scanning should be considered. See Chapter V,
Sections 3.2. and 3.3.

   Cost, Timeliness and Data Quality Impacts:  If successful, this option has
the potential of greatly reducing the cost and time required for data input.
Unlike magnetic media, which requires the reporter to possess and use a
computer to provide the required information, this option allows the use of
paper forms and hence would be attractive to more reporting facilities.  The
impact on timeliness and data quality is directly related to the number of
unreadable characters on the form.

   Implementation Time, Cost and Risk Impacts: Implementing scanning
technology would require a major investment in new hardware.  A seamless
interface between the scanning software and the data entry software would
require moderate modifications to the LAN data entry software. The major
risk with this option is with accuracy. It remains to be seen if the current
scanning technology can read the information with sufficient accuracy to be
effective. EPA's pilot program will provide pertinent data to make this

   EPA should proceed aggressively to improve the existing manual
keyboard entry software wherever possible, and based upon the results of
the magnetic media pilot test, encourage full implementation of that
program. Following the OCR Scanning Pilot, EPA should assess the viability
of implementing OCR scanning as the primary means of data entry.
Significant reduction of data entry effort will be necessary to offset the cost of
this option.  If OCR is a viable option, EPA should strongly encourage
submission of forms either through magnetic media or  OCR.


   V.2.2.1.   Issue  Summary
   The current Form R filing, storage and retrieval system is labor and
resource intensive. TRI documents must be securely  stored, available for
quick access and easily maintainable.  Therefore, it is crucial that EPA
evaluate alternatives to their current filing system.

   Data for the following alternatives was found in a study of "Alternatives
for EPA Form R Records Management." The study was prepared in January
1990, for EPA's Office of Pesticides and Toxic Substances by Mathtech, Inc.

   V.2.2.2.   Alternatives
      V.2.2.2.1. Paper Document Storage and Retrieval
   Currently, the TRC is storing forms in paper form in file cabinets. TRC
procedures require current year Form R reports be stored in the TRC, with the
prior year's reports stored in a storage room nearby. Data entry, error correction
and quality assurance activities are accomplished using the original paper
forms.

   Cost, Timeliness and Data Quality Impacts:  As the number of forms
increases, costs will increase as more storage space is required. If forms older
than two years are sent to the Federal Records Center (FRC), then the staff time
and delay involved in retrieving information from those forms
will increase significantly. The current data storage
and retrieval strategy will only serve to reduce efficiency, hinder data quality
efforts and drive up costs.  At the current rate of submissions, TRC operations
will require over 6,400 square feet of space to store five years of Form R
reports and to accommodate a work area for staff to file, retrieve and process
Form Rs.  Continuation of current operations implies an investment of
$875,700 for space to store and support five years' worth of Form R
submissions in one  facility.

   Implementation Time, Cost and Risk Impacts: If the decision is made to
store more forms on-site, the cost of storage space and staff effort to retrieve
forms will increase significantly.  The fundamental risk is that EPA will be
unable to access forms in a timely manner if storage requirements become

      V.2.2.2.2. Microfiche
   Microfiche is a relatively mature and inexpensive technology that could
significantly reduce storage costs for paper  forms. This technology has the
advantage over paper storage of being compact and dependable. Microfiche
can accommodate roughly 600 times the number of pages stored in hard copy
form, a savings in physical storage space of 99.8  percent.
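
   The 99.8 percent figure follows directly from the 600-fold reduction, as
the short check below shows. The function name is ours, introduced only for
this illustration.

```python
def space_savings(reduction_factor: float) -> float:
    """Fraction of storage space saved when media holds reduction_factor times more pages."""
    return 1.0 - 1.0 / reduction_factor


# Microfiche holds roughly 600 times the pages of hard copy.
print(f"{space_savings(600):.1%}")   # 99.8%
```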

   Cost, Timeliness and Data Quality Impacts:  Form R access would be much
faster than with the paper system, improving both ad hoc requests and daily
quality control efforts. Overall costs for TRI operations would drop as storage
requirements for Form Rs are reduced.  However, this technology introduces
additional labor costs by requiring documents to be processed onto film,
creating a lag between the time a group of documents is sent out for
processing and the time they are returned and available to the user.
Moreover, microfiche does not allow random access to a
document or group of documents.  Since several forms are stored on one
fiche, this technology lacks the ability to simultaneously share a collection of
documents.

   This system has the added disadvantage of relying on the integrity of the
users to maintain the fiche files  to prevent loss  and  misplacement.  The loss
of a single fiche will result in the loss  of approximately 16 complete forms.

   Implementation Time, Cost and Risk Impacts: The five year life cycle cost
for a microfiche system to support TRI operations is approximately $133,888.
This would produce a significant savings for TRI operations over the long
run. Several microfiche systems are available on GSA schedule allowing EPA
to immediately procure and install a microfiche system.  Because microfiche
is a well-established technology, the risk of using it in TRI operations is very
low.

      V.2.2.2.3. Microfilm
   Microfilm technology is very similar to microfiche technology except that
the media is a continuous strip of film. One 215-foot reel of microfilm has the
capacity to contain approximately 5,500 letter sized pages, or about 1,100
Form R reports.
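
   As a rough sizing aid, the capacity figures quoted for microfiche and
microfilm can be turned into an annual media count. The sketch below assumes
the volume of roughly 80,000 reports per year cited in the storage
recommendation and the approximate 16 forms per fiche cited above; the
variable and function names are ours, and the results are estimates only.

```python
import math

REPORTS_PER_YEAR = 80_000   # "more than 80,000 reports submitted annually"
FORMS_PER_FICHE = 16        # approximate figure from the microfiche discussion
FORMS_PER_REEL = 1_100      # one 215-foot reel
PAGES_PER_REEL = 5_500


def media_needed(reports: int, forms_per_unit: int) -> int:
    """Whole number of fiche or reels needed to hold the given number of reports."""
    return math.ceil(reports / forms_per_unit)


fiche_per_year = media_needed(REPORTS_PER_YEAR, FORMS_PER_FICHE)   # 5000
reels_per_year = media_needed(REPORTS_PER_YEAR, FORMS_PER_REEL)    # 73
pages_per_form = PAGES_PER_REEL / FORMS_PER_REEL                   # 5.0

print(fiche_per_year, reels_per_year, pages_per_form)
```

The implied five pages per report is consistent with the five-page Form R
discussed elsewhere in this chapter.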

   Cost, Timeliness and Data Quality Impacts: For the most part, microfilm
technology has the same productivity benefits and problems as microfiche,
but has the additional benefit of being able to locate a document faster than
microfiche. Microfilm does require a significant amount of space for storage.

   Implementation Time, Cost and Risk Impacts:  Costs for implementing a
microfilm filing and retrieval system are roughly equal to those of a microfiche
system. The five year life cycle cost for a microfilm system to support TRI
operations is approximately $129,973.  This would produce a significant
savings for TRI operations over the long run.  Microfilm systems are on the
GSA schedule allowing EPA to immediately procure and install a microfilm
system. Because of its reputation, the risk of using microfilm in TRI
operations is very low.

      V.2.2.2.4. Electronic Image Capture and Optical Storage
   This newer document storage and retrieval technology uses a scanner,
similar to a photocopy machine, that captures an electronic image of the
Form R.  The image of the document is stored on a computer disk. Optical
images stored on disk can be accessed using various search methods and can
be simultaneously viewed on multiple workstations.

   Disk storage requirements for optical images are high so electronic image
capturing systems are typically paired with high volume storage devices such
as optical disk drives and Write Once, Read Many (WORM) storage devices.

   Cost, Timeliness and Data Quality Impacts: Current electronic imaging
systems can locate and display a document in about 10 seconds. Compared to
TRC's present goal of locating a Form R within 10 days, this option offers vast
improvements in timeliness.  Furthermore, since documents are available
for multiple viewing as soon as they are scanned, data entry operators,
validation staff, EPA management, and the public can all simultaneously
view the same Form R with no delay.  Data quality staff would be free to
perform checks on any document at any time, thus increasing the quality
of the database.

   Overall costs for TRI operations would quickly drop as storage
requirements for Form R management are reduced. Five years of Form Rs
could be stored in as little as 100 square feet.  Further savings will be realized by
the reduction in effort required to retrieve and file Form R documents.

   Implementation Time, Cost and Risk Impacts: The costs for
implementing a document imaging and storage system, although not
insignificant, are small compared to the savings realized in storage and
management costs. The five year life cycle cost for an imaging system to
support TRI operations is approximately $106,135. Several imaging systems
are on GSA schedule allowing EPA to immediately procure and install an
imaging system.  Although this technology is comparatively new, from an
industry standpoint the technology is solid, so implementation risk is
relatively low.

   V.2.2.3.   Recommendation
   We strongly concur with Mathtech's recommendation  that EPA use a
WORM based electronic image management system to support TRI's
document storage efforts. Maintaining a paper document filing system for
the more than 80,000 reports submitted annually will become increasingly
difficult over time. Such practice will only serve to perpetuate operating
inefficiencies and will become the major obstruction to TRI productivity.
Although microfiche and microfilm provide reduced storage costs, these
technologies do not provide the retrieval capability provided by electronic
imaging systems.
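
   The five year cost figures quoted for the four storage alternatives can be
compared side by side. The sketch below uses only the dollar amounts stated in
this section; note that the paper figure is described as an investment in
space and support rather than a life cycle cost, so the comparison is
indicative only. The dictionary and variable names are ours.

```python
# Five year figures quoted in this section, in dollars.
FIVE_YEAR_COST = {
    "paper": 875_700,               # space to store and support five years of forms
    "microfiche": 133_888,          # five year life cycle cost
    "microfilm": 129_973,           # five year life cycle cost
    "electronic imaging": 106_135,  # five year life cycle cost
}

# Savings of each alternative relative to continuing paper storage.
savings_vs_paper = {
    name: FIVE_YEAR_COST["paper"] - cost
    for name, cost in FIVE_YEAR_COST.items()
    if name != "paper"
}
cheapest = min(FIVE_YEAR_COST, key=FIVE_YEAR_COST.get)

for name, saved in savings_vs_paper.items():
    print(f"{name}: ${saved:,} less than paper over five years")
print("lowest cost option:", cheapest)
```

On these figures electronic imaging is the least expensive alternative as
well as the one with the strongest retrieval capability.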

   Prior to implementing this technology, TRI will need to consult the Office
of Information Resources Management's Image Processing Systems Policy
which is currently  under development. Additionally, Electronic Records
Management Final Rules, which were published in the Federal Register on
May 8, 1990, contain guidance on the legal requirements for official records
maintained in electronic form.


   V.2.3.1.  Issue Summary
   As discussed in Chapter III, Section 2.2.4., there are concerns regarding the
inefficient design of the  Form R.  This  recommendation addresses these
concerns through discussion of modification of the form.

   V.2.3.2.  Alternatives
      V.2.3.2.1. Status Quo
      Under this option, EPA would continue to use the existing form
making only modifications that are necessary from year to year to comply
with regulatory changes.

   Cost, Timeliness and Data Quality Impacts: With this option, there will be
no additional cost or time implications.  In addition, data quality should
remain at its present level. However, a serious potential for data quality
problems with both the submitter and the TRC remains due to the lack of
submission identification information on form pages 2 through 5.
Additionally, the current form, which is optimized for reducing the number
of fields which must be filled out, does so at the expense of
simplicity, making the form complicated and difficult to understand and fill
out properly. This results in a large number of data entry errors on the part of
the reporter.  In the end, both EPA and the reporting party spend more time
on the form, as NOTEs and NONs are necessary to get the correct
information.  For reporter errors that are not detected, the result is poorer
data quality in the database.

   Consequently, with this option, there  is also no chance for improvements
in any of these areas. As the form is considered to be a major element leading
to TRI's present time and data quality problems, this option does not
capitalize on an opportunity to improve performance.

   Implementation Time, Cost and Risk  Impacts: As nothing would be
changed as a result of this option, there are no implementation impacts
associated with it.

      V.2.3.2.2. Redesign Form R
      Under this option, Form R and its instructions would be systematically
redesigned to make it easier for the reporting party to complete while keeping
data entry time as low as possible.  The form should also be designed to
facilitate its use for OCR scanning should that option be selected.  More
specifically, data elements would be more logically organized, and
instructions would be clarified.

   Cost, Timeliness and Data Quality Impacts: This option will improve
timeliness and data quality in the long run. Although  a new look may
initially distract previous years' reporters, a more logically organized form
would eventually reduce the time  they spend completing the form.  First
time reporters will also benefit from a more effective design. After reporters
achieve familiarity with the new form, the  number of  errors the reporter
commits should decrease. If the form is redesigned to  accommodate OCR
scanning, both timeliness and data quality should improve as the data would
be entered quickly and exactly as the reporters submit it.

   Implementation Time, Cost and Risk Impacts:  There are implementation
costs associated with this option as the form would have to initially be
redesigned. However, this cost should be fairly minimal and since OMB
approval must be obtained again in any case, this is an opportune time to
investigate this option.  The redesign of the form would also require a time
investment at the front end for design and  review.  The risk with this option
is that reporters who are familiar with the old form may initially have
difficulties with a new form. However, this risk should be minimal.

   V.2.3.3.  Recommendation
   Booz, Allen strongly recommends that EPA redesign Form R.  Although this
option would require EPA to initially expend time and funding, data input
time should decrease in the long run with a new form and data quality
should improve substantially, especially with the use of OCR technology.

   As already  stated, the current OMB approval for the form expires in
January of 1991.  EPA has made minor modifications to the form in the waste
minimization and range reporting sections for this year and is presently
beginning the approval process with OMB.  As this effort is already ongoing
and as a complete form redesign before January 1991 could be tremendously
disruptive to the system, EPA should begin redesigning the form for use in
the next reporting year.  The Form R Redesign effort should include input
from representative reporting parties, IMD, TRC, and system design staff.


                    APPENDIX I  GLOSSARY OF TERMS
   ADABAS	Database management system for an IBM Mainframe
                  computer. Used at RTP to manage TRI data.
   CAS Number	Chemical Abstract Services Number.  Unique number
                  that helps identify precise chemical nomenclature.
   CBSI, Inc	Computer Based Systems Inc. Facility contractor for TRI
                  operations. Processes forms, enters data, performs
                  quality assurance and responds to requests for
   CMA	Chemical Manufacturers Association.
   Configuration Management	Maintaining management control over hardware and
                  software elements.
   Data Reconciliation	Process by which TRI data is validated against original
                  Form Rs.
   ETD	Economics and Technology Division.  Responsible for
                  overall program guidance, regulation development, and
                  regulatory interpretation.
   Form R	Form on which required facilities annually submit
                  toxic chemical release information to EPA.
   IMD	Information Management Division.  Responsible for
                  TRI data management implementation and operations.
   LAN	Local Area Network. System of linking personal
                  computers so they can share data and printers.
   MB	Megabyte. One million bytes. Measurement of storage
                  capacity on a computer's hard disk.
   MS-DOS	Microsoft's Disk Operating System.  Operating system
                  used on most IBM PCs and compatible microcomputers.
   Natural	Computer language for entering and extracting data
                  from a database.
   NON	Notice of Noncompliance. Notification to the reporting
                  party that it has failed to provide or meet a critical
                  reporting requirement such as the reporting year or
                  signature. Impedes data input to the mainframe database.
   NOTE	Notice of Technical Error.  Logical error in the database
                  which is a less serious reporting difficulty than a NON.
                  Data can still be inputted to the mainframe database.
   Novell	Brand name for the LAN hardware and software in use
                  at the TRC
   OCR	Optical Character Recognition.
   PEI	Contractor who provides chemical expertise to data
                  entry and reconciliation staff.
   PRC	Planning Research Corporation. EPA's software
                  development  contractor.
   RAM	Random Access Memory.  Working or core memory in a
                  computer.
   RTP	Research Triangle Park. EPA's IBM Mainframe
                  computer site in North Carolina.
   RY	Reporting Year. Terminology for the calendar year of
                  Form Rs being processed.  In the summer of 1990, RY89
                  Form Rs will be processed.
   Statement of Work (SOW)	Document describing the exact tasks a contractor is
                  required to perform for a contract.
   SYCOM, Inc	A subcontractor to Planning Research Corporation
                  (PRC). Contractor responsible for TRI data entry
                  software development and maintenance.
   Title III	The Emergency Planning and Community Right-to-
                  Know Act of 1986, a free-standing section of the
                  SUPERFUND Amendments and Reauthorization Act of
                  1986, which mandates  that EPA collect and maintain TRI
   TOXNET	National Library of Medicine's Toxicology Network.
                  TRI data is available to the public through this network.
   TRC	Title III Reporting Center. Facility which receives TRI
                  forms and processes them into the TRI database.
   TRI	Toxic Release Inventory
   TRI ID	Toxic Release Inventory Facility Identification Number.
                  A unique number assigned to each reporting facility for
                  identification purposes.


      This appendix provides detail on TRI's management structure,
including both EPA's and its contractors' organization and responsibilities.


   The responsibility for operation of TRI was delegated to the Office of Toxic
Substances within EPA.  This office assigned two divisions - the Information
Management Division (IMD) and the Economics and Technology Division
(ETD) - actual responsibility for day-to-day planning and operation of TRI,
although other divisions provide some support to TRI. Specifically, IMD is
charged with managing and implementing TRI data and operations. Staff
within IMD's Public Data Branch fulfill these responsibilities, although they
do not work exclusively on TRI.  ETD provides overall program guidance,
regulation development, and regulatory interpretation.  Within ETD, staff in
three branches - the Chemical Engineering Branch, the Regulatory Impacts
Branch, and the Toxic Release Inventory Management (TRIM) Staff - have TRI
responsibilities. As with IMD staff, ETD personnel, with the exception of
TRIM staff, also have non-TRI responsibilities.

   EPA utilizes this multidivisional  approach as it allows TRI to draw upon a
diverse  mix of personnel with expertise in information management,
regulatory and policy issues, chemistry, and organizational management.  In
addition, this structure enables TRI to strategically tap EPA resources, both
personnel and financial.


   The other primary component of the TRI organizational structure is
contractor operations.  Exhibit 3 illustrates the relationships of the various
contractors to EPA staff.  EPA chose one contractor to operate the Title III
Reporting Center. This contractor, who reports to IMD, performs several
major activities, including receipt and storage of Form Rs (79,784 submissions
were processed in RY87 and 82,123 in RY88), data entry, data searching and
retrieval, and data quality control and reconciliation.  The organizational
structure of this contractor parallels  these tasks which are also defined in the
statement of work. All TRC contractor personnel are dedicated solely to TRI
operations. To fulfill its  obligations, this contractor employs additional
personnel during peak periods. This contractor operates under an award fee
type contract which expires in September of 1992.

   [Exhibit 3: TRI organization chart (graphic not reproduced).  Legible box
labels: Office of Toxic Substances; Economics & Technology Division (ETD);
Information Management Division (IMD); Regulatory Impacts; Toxics Release
... Management Staff; Public Data; ... Services Section; ... Based Systems.
*Shaded boxes denote contractor organizations.]

                                   Appendix  II   Management  Information
approximately twelve regular staff who are primarily dedicated to TRI. These
staff members are divided into two groups - a Local Area Network and a
mainframe group.  In addition to developing the initial application software,
this contractor provides ongoing support in software development and
maintenance, performs  data upload functions and special programming for
system enhancements, responds to ad hoc requests, and generates data
reconciliation and other reports from the mainframe. This contractor is a
subcontractor to another firm operating under a cost plus fixed fee type
contract which expires in July of 1991.

   A third contractor supports data quality efforts by providing expert
advice on chemical nomenclature to data entry operators as Form R data is
keyed into the LAN system. These contractor personnel also assist with the
review of Notices of Technical Error (NOTEs) and certain Notices of
Noncompliance (NONs). NOTEs are issued when logical errors in the data
are identified, and NONs serve as notification to the reporting party that it
failed to meet certain critical reporting requirements, such as chemical name,
reporting year, or a signature.  This contractor reports to ETD personnel.


   TRI operations are performed on two separate hardware platforms.  Data
entry and document tracking are performed on a LAN at the TRC.  TRI data is
collected at the TRC, typically in batches of 5,000 records, and uploaded into
the TRI database residing on the RTP mainframe.  Maintenance and
reconciliation of TRI data is done on the mainframe through terminals
located at the TRC by CBSI and EPA staff.
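
   The batch-and-upload cycle described above can be illustrated with a short
sketch. This is a present-day Python illustration only, not the TRC's actual
data entry or upload software; the helper name batch_records and the record
layout are hypothetical, and the batch size of 5,000 is simply the typical
figure cited in the text.

```python
from typing import Iterable, Iterator, List

BATCH_SIZE = 5000  # typical upload batch size cited in the text


def batch_records(records: Iterable[dict], size: int = BATCH_SIZE) -> Iterator[List[dict]]:
    """Yield successive lists of at most `size` records (hypothetical helper)."""
    batch: List[dict] = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, partially filled batch
        yield batch


# Example: 12,000 keyed records are split into three upload batches.
sizes = [len(b) for b in batch_records({"id": i} for i in range(12000))]
print(sizes)
```

   Under these assumptions, the last batch is simply whatever remains after
the full batches have been formed.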


   The data entry LAN is a group of forty-seven AST 286 personal computers
clustered together on a Novell Token Ring LAN. Each PC has a Proteon
communications card and cable to connect it to the LAN.  Three Compaq 386
personal computers, with 9MB of RAM each, act as network servers.
One server is for administrative computing, and the other two are for TRI
production.  Of the two production servers, one is for actual production and
the other is a back-up unit in case the primary server fails.  Each server is
connected to two 300MB hard disks to store data. Back-ups of the data are
done with two RAP-300 Emerald 150MB cartridge tape drives. The LAN also
supports four laser and seven dot matrix printers for printing NONs and
reports. Uninterruptible power supplies (UPS) provide electric power, free of
spikes and surges, to the PCs and the network servers and, in case of an
electrical blackout, approximately thirty minutes of battery power to enable a
controlled shutdown of TRI data entry operations.

   Access to the mainframe computer is provided by gateway hardware and
software running on two of the AST 286 PCs. One is equipped with a
synchronous data link communications card and emulates a sixteen-port IBM
controller. With only one session per emulated port, sixteen sessions are
available on the mainframe through this gateway. The other gateway server
is equipped with a multiplexor board that emulates an IBM 3299 multiplexor.
The board has eight ports and RTP has allocated four mainframe sessions per
port for a total of thirty-two mainframe sessions available through the second
gateway.  A total of forty-eight mainframe sessions are available through the
LAN, which, under current staffing, is more than enough for TRI operations.
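
   The session count in the preceding paragraph follows from simple
arithmetic, restated here as a short Python check. The variable names are
illustrative; the port and session figures are those given in the text.

```python
# Gateway 1: emulated sixteen-port IBM controller, one session per port.
gateway1_sessions = 16 * 1

# Gateway 2: eight-port multiplexor board, four mainframe sessions per port.
gateway2_sessions = 8 * 4

# Total mainframe sessions available through the LAN.
total_sessions = gateway1_sessions + gateway2_sessions
print(gateway1_sessions, gateway2_sessions, total_sessions)  # 16 32 48
```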


      The following section provides a detailed description and assessment
of the LAN  hardware environment.

                                   Appendix III Technical Information
   Data entry PCs create an almost continuous load of data bursts on the
LAN. The Novell LAN with its ten megabits per second transfer speed and
its token ring access mechanism is well suited for data entry operations.

   Storage Capacity
   The main TTS and TRIS data files for RY87 and RY88 average around
55MB each and fit tightly on a single 300MB drive. The production server is
equipped with two 300MB drives. The operating system, however, allows
access to only 250MB of a 300MB drive.  Furthermore, due to disk mirroring,
the effective storage is 250MB, not 500MB.  In addition to data files, the hard
disks also store temporary files for other processes such  as printing.
Occasionally, when the  disk has neared capacity, printing jobs have aborted
due to insufficient disk  space.
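
   The storage figures above can be worked through as follows. This is a
rough sketch: it assumes that the "main TTS and TRIS data files for RY87 and
RY88" amount to four files (two systems, two reporting years) of about 55MB
each, which is one reading of the text and not a figure stated in it.

```python
USABLE_PER_DRIVE_MB = 250   # the operating system exposes 250MB of each 300MB drive
DRIVES = 2                  # drives on the production server
MIRRORED = True             # the second drive mirrors the first

raw_mb = DRIVES * USABLE_PER_DRIVE_MB
effective_mb = raw_mb // 2 if MIRRORED else raw_mb

# Assumption: four main data files (TTS and TRIS, for RY87 and RY88).
FILE_COUNT = 4
AVG_FILE_MB = 55
data_mb = FILE_COUNT * AVG_FILE_MB

headroom_mb = effective_mb - data_mb
print(effective_mb, data_mb, headroom_mb)  # 250 220 30
```

   Under that assumption only about 30MB remains for temporary files such as
print spools, which is consistent with the aborted print jobs noted above.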

   The initial LAN architecture was designed with enough disk space to have
a complete duplicate set of the tracking data and the TRIS data prior to
upload. When RY89 data start entering the system, TRC management will
have to transfer RY87 tracking data to the back-up server, which has an
additional 250MB of storage available.  This temporarily solves the storage
problem at the sacrifice of having a complete set of data on the back-up
server. TRC management is keenly aware of the status  of each drive on the
LAN and has made recommendations for purchasing optical disk drives to
alleviate the storage problem.

   Security
   Several layers of security are implemented on all three LAN servers
preventing unauthorized access to both file servers and  the data files
themselves.  In addition to the LAN security, mainframe users must pass
through security checks when signing on to the mainframe.  TRC
management has taken  an extra step of security  by using software that
encrypts the TRI and TTS data on the LAN hard disks.  Data is automatically
unencrypted for authorized users. TRI data is well protected against
unauthorized viewing,  modifying, or deleting by  users  or mischievous

   Uninterruptible  Power  Supplies
   TRC has provided UPSs to the file servers and the data entry PCs to protect
the critical data entry operations from accidental loss of  power.  A UPS
provides two key protective measures. First, the UPS filters out dangerous
spikes and variances in electrical power that can damage delicate computer
equipment. Second, and more importantly, when building power fails, the
UPS will, without interruption, provide electrical  power to the entire LAN
for approximately thirty minutes to let each user finish  work in progress and
allow TRC management to gracefully shut down data entry operations.
Overall, TRI entry operations are well protected against unexpected power
outages or fluctuations.

   Disk Mirroring
   TTS and TRI data are both stored on one of the 300MB drives on the
primary production server.  The other 300MB drive is configured to be a mirror
image of the first. Data is written first to one drive and then, as a
precautionary measure, is automatically written to the second drive.  This
feature, known as disk mirroring, provides two fundamental benefits to TRI
production.  First, since the TRI data is written to two different drives, a
breakdown of one drive does not cripple the system. Production staff can
continue writing data to the working drive. Second, since both disks have the
same data, queries by reconciliation staff, for example, can be answered by
either disk.  Therefore, users spend less time "waiting in line" for responses
from the LAN.  Although disk mirroring uses twice as much disk space to
store data as a single drive, this practice is well regarded in the data
processing community as being a practical and prudent method of
safeguarding valuable data.
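
   The two benefits of disk mirroring described above can be shown with a toy
model. The sketch below is illustrative Python only; it does not represent
how Novell Netware implements mirroring, and the class name MirroredStore and
its methods are hypothetical. Two in-memory dictionaries stand in for the two
300MB drives.

```python
class MirroredStore:
    """Toy model of disk mirroring: every write goes to two 'drives'."""

    def __init__(self) -> None:
        self.drives = [{}, {}]        # in-memory stand-ins for the two drives
        self.online = [True, True]    # health flag for each drive

    def write(self, key: str, value: str) -> None:
        # Write to the first drive, then to the mirror; skip a failed drive.
        wrote = False
        for i, drive in enumerate(self.drives):
            if self.online[i]:
                drive[key] = value
                wrote = True
        if not wrote:
            raise IOError("no working drive available")

    def read(self, key: str) -> str:
        # Either working drive can answer a query.
        for i, drive in enumerate(self.drives):
            if self.online[i] and key in drive:
                return drive[key]
        raise KeyError(key)

    def fail(self, index: int) -> None:
        # Simulate the breakdown of one drive.
        self.online[index] = False


store = MirroredStore()
store.write("record-1", "Form R data")
store.fail(0)                           # first drive breaks down
first = store.read("record-1")          # still served by the mirror
store.write("record-2", "more data")    # production continues on the working drive
second = store.read("record-2")
print(first, "|", second)
```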

   Back-Up  Hardware
   CBSI uses the Emerald tape drive to schedule unattended back-ups each
night of the data entry LAN disk drive. The process includes both a complete
disk back-up and a verification of the tape. The back-up software also
provides for an audit trail that CBSI management reviews each morning for
back-up errors before starting production.  The Emerald tape drive has the
capability to write data to the tape, rewind the tape, then verify that the data
on the tape is readable. However, it does not perform a true verify whereby
the data on the tape is compared to the data on the disk.  Problems initially
associated with the back-up  hardware were subsequently diagnosed as either
software errors or operator errors.  As long as proper procedures are followed,
TRI managers can have a high confidence level in their back-up tapes.
However, as with any hardware, routine testing is prudent and will identify
problems well before they cripple a system.
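
   The distinction drawn above between a readability check and a "true
verify" can be made concrete. The sketch below is a present-day Python
illustration, not a description of the Emerald back-up software: it compares
a checksum of the source with a checksum of the copy, so that a copy that is
readable but altered is still detected. The function names and the use of
temporary files in place of the disk and the tape are assumptions made for
the example.

```python
import hashlib
import os
import tempfile


def file_digest(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def true_verify(source_path: str, backup_path: str) -> bool:
    """True only if the back-up contents match the source (compared by digest)."""
    return file_digest(source_path) == file_digest(backup_path)


# Temporary files stand in for the LAN disk and the back-up tape.
with tempfile.TemporaryDirectory() as workdir:
    disk = os.path.join(workdir, "disk.dat")
    tape = os.path.join(workdir, "tape.dat")
    with open(disk, "wb") as f:
        f.write(b"TRI data" * 1000)
    with open(tape, "wb") as f:
        f.write(b"TRI data" * 1000)
    good = true_verify(disk, tape)      # identical copy
    with open(tape, "r+b") as f:
        f.write(b"X")                   # alter the first byte; file is still readable
    bad = true_verify(disk, tape)
print(good, bad)
```

   A readability-only check would accept both copies; the comparison rejects
the altered one.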

   Access Time
   During peak processing months (July to September), response from  the
LAN's data entry server to the PCs is generally good.  Data entry personnel do
not experience overwhelming or unbearable delays and documents are
entered as fast as they can be typed. On the other hand, access to the
mainframe can be slow, especially during normal working hours.  During
midday hours (10:00 A.M. to 3:00 P.M. E.S.T.), as much as thirty to forty-five
seconds can elapse between a keypress and an acknowledgement from the
mainframe. (Normal access time averages ten to fifteen seconds, with
excellent access time being under five seconds.) This slow access time can be
attributed to the numerous demands on the mainframe by thousands of EPA
employees nationwide.


   The mainframe that stores the TRI data is an IBM 3090 mainframe
computer located at EPA's Computer Center at RTP. The RTP Computer
Center provides total support of mainframe activities, freeing TRI managers
from mainframe management concerns.  TRI shares the cost of supporting
RTP operations with other EPA programs that use the mainframe. TRI staff
can access the mainframe database through PCs on  their LAN. Transparent to
the users, the LAN accesses the mainframe through a standard gateway
located on two of the network PCs using Wide Area Network (WAN)


   Software on the LAN can be divided into two groups:  commercial
software for both LAN management and general use (word processing,
spreadsheets, and graphics) and EPA-developed software for TRI specific


   MS-DOS (version 3.1) is the operating system used by each of the LAN's
PCs with Novell Netware handling the LAN operations.  SYCOM wrote and
compiled both the TTS and TRI data entry software in Clipper as well as the
software that reads and enters magnetic media submissions of Form Rs.
Ad hoc reports and queries to the TTS and TRI databases are made with
dBase III+, Clipper, and other off-the-shelf database development products.
Other commercial software includes standard word processing, spreadsheet,
and graphics programs to support report writing, training programs, and
other administrative needs.


   The IBM mainframe runs the MVS/XA operating system with JES2 and
TSO/ISPF for telecommunications with the LAN PCs. TRI data is stored
using ADABAS, EPA's standard database management system, with Natural
language, and Natural Security as the programming and security interfaces.


   The study  of the system's software focused on the LAN software and
interfaces. The LAN software was analyzed from several aspects including
how it is written and how it performs.

   LAN Software
   Simple oversights by LAN system managers in configuring software
and hardware caused problems in the start-up phase of TRI operations,
ranging from unreadable back-up tapes to LAN crashes.  One by one these
problems were tracked down and corrected. Currently the software
controlling LAN operations is generally stable.

   Data Transfer Software
   Data transfers from the LAN to the mainframe were initially attempted
with the communication software Natural Connection.  Unexplained errors
and frequent disconnections between the LAN and the mainframe
interrupted numerous transfers. A night shift computer operator was
hired specifically to oversee the upload operation. Because they proved
unreliable, the direct transfer procedures were dropped in November 1988
and replaced with a safer, but more manually intensive, method to transfer
TRI data to the mainframe.  Presently reel-to-reel tapes are made of the data,
uploaded at the TRC, and sent to the WIC. The tapes are loaded onto a
computer at the WIC and transmitted via a dedicated cable to the RTP
mainframe.

                          PROCESS FLOWS
   The diagrams on the following pages illustrate the work flow involved
with producing the TRI database at a high level. These process flows start
with receipt of the Form Rs at the Title III Reporting Center, continue
through entry of TTS and TRIS data into the system and reconciliation.
Finally, the process ends with transfer of the data to TOXNET.

   [Process flow diagrams (graphics not reproduced).  Legible step labels, in
order of appearance: Receive, Date, and Sort; Additional Information;
Modified Submissions; New Submissions; NON Responses; Validate 100% of TTS;
Enter TRI Data into LAN; Validate 25% of Each ...; Upload 1 or 2 Times Per
...; Produce TRI ...; ... Older Than 2 Years Off-Site.]

                     DATA FLOW DIAGRAMS
   The following data flow diagrams illustrate a detailed account of the data
flow in TRI operations. The first diagram is at the context level; subsequent
diagrams are decompositions.

   [Data flow diagrams (graphics not reproduced).  Legible labels include:
Reporting Party; EPA (OTS); Public; TOXNET; Prepare Documents Received;
Receive, Date, and Sort Mail; LAN TRI Database Processes; TRI Data Entry
and 25% Review; LAN Searching and Reporting; Data Review and
Reconciliation; Store Documents; Transfer to TOXNET; LAN TRI Database;
Mainframe TRI Database; Annual Reports; Additional Information; Trade
Secret Petitions; Trade Secret Decisions; FOIA Requests; Recommended FOIA
Responses; FOIA Responses; Misc. Mail; Congressional Requests;
Congressional Responses; Data Clarification Requests; Data Clarification
Responses; Data Correction Requests; Data Correction Responses; Notices of
Noncompliance; NON for Review; Notices of Technical Error; Modification
Notices; Batches of New Submissions; Modified Submissions; Batch
Submissions To Be Re-keyed; Chem Data For Deletion; Deletion Confirmation;
Record to be Deleted; Downloaded Tracking Data; Tracking and TRI Data;
Transferred TRI Data; TRI Reports; Document Requests; Logged Requests;
Pulled Documents; Returned Documents; Archivable Documents; Box Request;
Pulled Box; Returned Box.]

     The following documents were reviewed for this study:

           CBSI's Statement of Work
           SYCOM's Statement of Work
           CBSI's TRC 1989 Annual Report
           Regulatory Impact Analysis in Support of Proposed Rulemaking
           Under Section 313 of SARA (ETD, May 1987)
           1987 Data Quality Report
           TRI Reporting Form R and Instructions (March 1988)
           TRI Reporting Package for 1989
           Alternatives for EPA Form R Records Management:
           A Feasibility Study (Mathtech, January 1990)
           TRC Operating Procedures (Draft, May 1988)
           TRIS Data Entry User's Guide (June 1988)
           TRIS Supervisory Operations Manual (September 1988)
           TRIS Logical Design (June 1988)
           TRIS Physical Design (June 1988)
           OTS Operating Plans FY89 and FY90
           FY89 and FY90 TRI Budgets
           Public Access: Two Case Studies of Federal Electronic
           Dissemination (GAO, May 1990).

              APPENDIX V Data Collection Participants

Thirteen staff members in the following organizations within EPA's Office of
Toxic Substances (OTS) were interviewed as part of this study.

•     OTS
            Information Management Division
            • •    Non-Confidential Information Services Section
            • •    Public Information Section
            • •    Non-Confidential Systems Section
            Economics and Technology Division
            • •    Regulatory Impacts Branch
            • •    Chemical Engineering  Branch
            • •    TRIM Staff

Functional areas and organizations interviewed for  the two main contractors

•     CBSI
            Operations Manager
            Program Manager
            LAN Administrator
            Data Processing
            Document Preparation and Storage
            Data Reconciliation
            Training Administrator

•     SYCOM
            Program Director
            Project Manager
            Mainframe Coordinator
            PC/LAN Coordinator