U.S. Environmental Protection Agency
Office of Toxic Substances
401 M Street S.W.
Washington, DC 20460
Toxic Release Inventory
Productivity Review
Final Report
July 14, 1990
Contract #68-W9-0037 Delivery Order #037
Lucille C. Henschel, D.O.P.O.
Prepared by
Booz•Allen & Hamilton
4330 East West Highway
Bethesda, Maryland 20814-4455
01)9512200
-------
TABLE OF CONTENTS
I. Executive Summary 1
I.1. Study Approach 1
I.2. Description of TRI 2
I.3. Assessment Summary 2
I.3.1. Data Accuracy, Timeliness and Cost 2
I.3.2. Conclusion 3
I.4. Summary of Issues and Recommendations 3
I.4.1. Management 3
I.4.2. Technical 5
II. Introduction 7
II.1. Background 7
II.1.1. History of TRI 7
II.1.2. Summary Description of TRI 8
II.1.3. TRI Future Possibilities 10
II.2. Purpose of Study and Report 10
II.3. Study Approach 11
III. Productivity Assessment 13
III.1. Productivity Status Summary 13
III.1.1. Data Accuracy 13
III.1.2. Timeliness 14
III.1.3. Cost 14
III.1.4. Conclusion 15
III.2. Productivity Issues 16
III.2.1. Management Issues 16
III.2.1.1. Introduction 16
III.2.1.2. Changes in Regulatory Environment 16
III.2.1.3. Productivity Standards 17
III.2.1.4. Planning Environment 17
III.2.1.5. Contractor Organization 18
III.2.1.6. Contractor Statements of Work 20
III.2.2.1. Introduction 20
III.2.2.2. Data Input Technology 20
III.2.2.3. Form Storage and Access 22
III.2.2.4. Form Design 22
III.2.2.5. Develop Integrated Facility File 23
III.2.3. Conclusion 23
III.2.3.1. Management Effectiveness 23
III.2.3.2. Technical Effectiveness 24
IV. Current EPA Initiatives 27
IV.1. Introduction 27
IV.2. Productivity Standards 27
IV.2.1. SOW modifications 27
IV.2.2. Form verification by reporting parties 27
IV.3. Planning Environment 28
IV.3.1. Hardware Replacement Planning Efforts 28
IV.4. Data Input Technology 29
IV.4.1. Magnetic Media Improvements 29
IV.4.2. OCR Data Input Pilot 30
IV.5. Facility File Development 30
IV.6. Conclusion 31
V. Recommendations to Improve TRI 33
V.1. Management 34
V.1.1. Impact of Change in Regulatory Environment 34
V.1.1.1. Issue Summary 34
V.1.1.2. Alternatives 34
V.1.1.2.1. Minor System Modifications 34
V.1.1.2.2. Reassess the System 35
V.1.1.3. Recommendation 35
V.1.2. Clarification of Productivity Standards 36
V.1.2.1. Issue Summary 36
V.1.2.2. Alternatives 36
V.1.2.2.1. Status Quo 36
V.1.2.2.2. Define Productivity Standards 36
V.1.2.3. Recommendations 37
V.1.3. Strengthen Planning Environment 37
V.1.3.1. Issue Summary 37
V.1.3.2. Alternatives 37
V.1.3.2.1. Plan for ad hoc requirements 37
V.1.3.2.2. Establish a formal software development process 38
V.1.3.2.3. Establish a deliberate hardware replacement process 39
V.1.3.3. Recommendation 39
V.1.4. Change Contractor Organization 39
V.1.4.1. Issue Summary 39
V.1.4.2. Alternatives 40
V.1.4.2.1. Status Quo 40
V.1.4.2.2. Separate Contractors for Facility Operations and Data Reconciliation 40
V.1.4.2.3. One Contractor for Facility Operations, Development, and Reconciliation 41
V.1.4.3. Recommendation 42
V.1.5. Revise Contractor SOWs 42
V.1.5.1. Issue Summary 42
V.1.5.2. Alternatives 42
V.1.5.2.1. Status Quo 42
V.1.5.2.2. General Revision of SOWs 42
V.1.5.3. Recommendation 43
V.2. Technical 43
V.2.1. Upgrade Data Input Technology 44
V.2.1.1. Issue Summary 44
V.2.1.2. Alternatives 44
V.2.1.2.1. Keyboard Entry Enhancements 44
V.2.1.2.2. Magnetic Media 45
V.2.1.2.3. OCR Scanning 45
V.2.1.3. Recommendation 46
V.2.2. Upgrade Form Storage and Access 46
V.2.2.1. Issue Summary 46
V.2.2.2. Alternatives 46
V.2.2.2.1. Paper Document Storage and
Retrieval 46
V.2.2.2.2. Microfiche 47
V.2.2.2.3. Microfilm 48
V.2.2.2.4. Electronic Image Capture and
Optical Storage 48
V.2.2.3. Recommendation 49
V.2.3. Redesign Form R 49
V.2.3.1. Issue Summary 49
V.2.3.2. Alternatives 49
V.2.3.2.1. Status Quo 49
V.2.3.2.2. Redesign Form R 50
V.2.3.3. Recommendation 51
Appendix I Glossary of Terms 53
Appendix II Management Information 55
EPA Organization and Responsibilities 55
Contractor Organization and Responsibilities 55
Appendix III. Technical Information 59
Hardware 59
LAN System 59
Detailed Hardware Assessment 59
LAN 60
Storage Capacity 60
Security 60
Uninterruptible Power Supplies 60
Disk Mirroring 61
Back-Up Hardware 61
Access Time 61
Mainframe System 62
Software 62
LAN Software 62
Mainframe Software 62
Detailed Software Description and Assessment 62
LAN Software 62
Data Transfer Software 63
Process flows 64
Data Flow Diagrams 69
Appendix IV Documents Reviewed 77
Appendix V Data Collection Participants 79
-------
I. EXECUTIVE SUMMARY
In February 1990, the Office of Toxic Substances (OTS) of the
Environmental Protection Agency (EPA) requested that Booz•Allen & Hamilton
Inc. conduct a productivity review of the national Toxic Release Inventory
(TRI). TRI is a public database that resides on EPA's mainframe at Research
Triangle Park, N.C., and on the National Library of Medicine's (NLM) Toxicology
Network (TOXNET). The purpose of the study is to provide an independent
assessment of the strengths and weaknesses of the program as it exists in June
1990 and to provide recommendations for improvement.
This report, the final deliverable in this task, is an assessment of the
current productivity of TRI, including software, hardware, and management
activities. It also evaluates current improvements EPA is making to increase
the effectiveness of the system. Finally, this report contains
recommendations for further improvement to TRI.
I.I. STUDY APPROACH
The mission for TRI as defined by Title III of the Superfund Amendments
and Reauthorization Act (SARA) of 1986 is to collect and make available toxic
chemical release information to the public. However, for this study, Booz,
Allen utilized the following standards to define productivity for TRI:
Accuracy: The degree to which the data entered into the database
is correct, both in terms of data entry accuracy and detection and
correction of reporter errors.
Timeliness: The extent to which EPA was able to provide the TRI
database to NLM in a timely manner.
Cost: The adequacy of funding for TRI and the extent to which
the above criteria impact funding requirements.
These three criteria represent the fundamental performance dynamics that
surround TRI and its user requirements and provide a useful framework in
which to evaluate the overall productivity of the system.
Productivity for TRI is determined by the quality of the system architecture
and the management operations performance that supports TRI. The
interplay between the criteria that define productivity and the system and
management actions that constitute TRI is complex. Short-term productivity
of TRI is derived from effective management exercised within the
technological constraints of the system as it currently exists. Long-term
productivity is related to successful refinement of the existing system,
including the introduction of new technology to improve productivity and to
meet new, emerging requirements.
I.2. DESCRIPTION OF TRI
The Emergency Planning and Community Right-to-Know Act, Title III of
the Superfund Amendments and Reauthorization Act (SARA) of 1986,
requires facilities which manufacture, process, or use any of the specified toxic
chemicals to report annually the amounts of these chemicals released directly
to air, water, or land or that are transported to off-site dumping facilities.
These reports are due on July 1 of each year. The same law requires EPA to
establish a national TRI and to make this information available to the public
annually via telecommunications and other means.
Reports are submitted by industry on a Form R to the Title III Reporting
Center (TRC) in Washington, D.C. The data is processed onto a LAN at the
center, verified for accuracy, and then periodically uploaded to a database
located on EPA's mainframe at Research Triangle Park. The data is then
analyzed for accuracy, and erroneous records in the database are corrected by
EPA and contractor personnel. Finally, when an acceptable level of data
quality is reached, the data is delivered to NLM.
I.3. ASSESSMENT SUMMARY
This section provides an overview of TRI productivity as defined by the
productivity standards.
I.3.1. DATA ACCURACY, TIMELINESS AND COST
In TRI's first reporting year (RY87), EPA achieved a data entry accuracy
level of 97.5% for all data fields in all records in the system. In RY88, EPA
reached 98.5%. However, this level of accuracy is not considered by EPA staff
sufficient to make the data completely useful to users, and additional efforts
are underway to ensure a higher level of accuracy for RY89. Currently, EPA
attempts not only to correct data input errors but also to ensure that
information provided by reporters is as accurate as possible. TRI
management's goal is to achieve "near 100% accuracy for certain key data
fields," particularly release values. EPA has expended a large amount of effort
to ensure data accuracy, and the result is that the quality of the data is high
and will improve more as further modifications to the system are made.
TRI's RY87 data was released to NLM on June 19, 1989, and the RY88 data
was released on May 29, 1990. The Information Management Division (IMD),
which is responsible for TRI operations, has established an internal goal of
having the data ready for release to the public through NLM nine months
after the filing deadline. So far, this goal has not been achieved.
Initial planning for TRI projected variable processing costs for TRI at
$18.97 per form, assuming only very basic data quality controls. TRI has been
given recurring funding for data entry at a rate of $12.00 per form and was
given approximately $250,000 (a little over $3.00 per form, assuming 80,000
forms) for additional data quality activities. In FY90, TRI was given a
Congressional supplemental (non-recurring funds) of $240,000 for IMD's data
normalization activities and was funded $25,000 for reporter verification of
the forms (Chemical Manufacturers Association members). Including this
one-time funding, TRI is still operating with less funding than originally
projected in 1987 to be necessary while simultaneously trying to meet more
stringent data quality objectives.
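The per-form arithmetic behind this comparison can be checked with a short calculation. The following Python sketch is illustrative only and is not part of the original analysis. It uses the dollar figures quoted above and the report's assumed volume of 80,000 forms. Spreading the one-time FY90 funds evenly across that same volume, and comparing total funding directly with the projected variable processing cost, are simplifying assumptions made here. All variable and function names are hypothetical.

```python
# Figures quoted in the report. The 80,000-form volume is the report's own assumption.
FORMS = 80_000
PROJECTED_PER_FORM = 18.97        # 1987 projection, basic data quality controls only
DATA_ENTRY_PER_FORM = 12.00       # recurring data entry funding
RECURRING_QUALITY = 250_000       # recurring funding for additional data quality work
ONE_TIME_FY90 = 240_000 + 25_000  # supplemental plus reporter verification funding


def per_form(total_dollars, forms=FORMS):
    """Spread a lump-sum amount evenly across the assumed form volume."""
    return total_dollars / forms


# Recurring funding only.
recurring = DATA_ENTRY_PER_FORM + per_form(RECURRING_QUALITY)

# Assumption: one-time funds are spread over the same 80,000 forms.
with_one_time = recurring + per_form(ONE_TIME_FY90)

# Gap between the 1987 projection and total funding under these assumptions.
shortfall = PROJECTED_PER_FORM - with_one_time

print(f"Recurring funding per form:     ${recurring:.4f}")
print(f"Including one-time FY90 funds:  ${with_one_time:.4f}")
print(f"Shortfall against projection:   ${shortfall:.4f}")
```

Under these assumptions, recurring funding comes to $15.125 per form, and adding the one-time FY90 funds brings the total to $18.4375 per form, roughly $0.53 below the 1987 projection of $18.97.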
I.3.2. CONCLUSION
EPA has placed an extraordinary level of emphasis on data accuracy in TRI
due to the public nature of the database. This level of accuracy has been
achieved at significant cost as resources and attention have been diverted
from other activities to focus on extensive data quality/accuracy. In
addition, this emphasis on accuracy has resulted in significant delays in
releasing the database to the public through NLM, preventing EPA from
meeting timeliness standards. This situation is exacerbated by the lack of
funding.
Ultimately, EPA must achieve a balance between data accuracy, timeliness,
and cost. Establishing unambiguous, realistic, and achievable levels for these
standards is necessary if management and technical stability for the program
is to be achieved.
I.4. SUMMARY OF ISSUES AND RECOMMENDATIONS
This section identifies critical areas where improvements can be made to
enhance overall performance and provides our recommendations in these
areas.
I.4.1. MANAGEMENT
The following issues were identified during the study as areas where
improvement could be made to EPA's management approach to TRI:
Impact of Change in Regulatory Environment: The potential for
changes in regulations, which will significantly increase the
number of data fields stored in the database and/or the number of
reporting parties, is high. Should this occur, EPA will need to
determine the most appropriate means of adapting the system to
meet the new requirements. Alternatives include modifying the
current system to meet the new requirements or completely
redesigning the system. Recommendation: Selecting a specific
alternative in this case is not possible until the full impact of the
changes is known.
Clarification of Productivity Standards: In order for TRI to know
whether or not it is accomplishing production goals, productivity
standards must be clearly defined for data accuracy levels and
timeliness. Although an internal timeliness goal of 9 months
from the reporting deadline has been selected by IMD, an
unambiguous goal has not been identified for data accuracy.
Selecting a specific goal for data accuracy and articulating these
goals would enable EPA to measure productivity against an
accurate standard, and facilitate the prioritization of resources. It
would also focus staff and contractor efforts on meeting a specific
productivity goal. Recommendation: Establish unambiguous
goals for data accuracy and timeliness.
Strengthen Planning Environment: A formal planning process
within TRI does exist, but plans are often not followed through
due to the responsibility of the public system to respond in a
timely manner to demands from external parties. Although this
response capability is critical for TRI, the time and funds spent
responding to ad hoc requests often impede long-term growth of
the system. Three planning areas are critical in improving the
planning process: planning for ad hoc requirements, establishing
a formal software development process, and establishing a formal
hardware replacement process. Recommendation: All three
alternatives should be implemented.
Change Contractor Organization: A fundamental issue that
affects TRI operations is the functional responsibilities that have
been assigned to the contractors who are working on TRI.
Currently, three contractors share the operational responsibilities
for TRI, and this results in a complex structure with some
functional overlap. Three practical alternatives exist for TRI:
maintain current tasking arrangement, separate contractors for
facility operations and data reconciliation, or utilize one
contractor for facility operations, software development and data
reconciliation. Recommendation: EPA should utilize one
contractor for TRI operations.
Revise Contractor Statements of Work (SOW): The SOWs of
TRI's two primary contractors were written to support TRI during
its start-up phase. Therefore, several activities listed are obsolete
and current standards for productivity are not reflected.
Although the present contractors are performing well, this
recommendation addresses the benefits to be gained by modifying
these SOWs in either present or future contracts to further protect
the government's interest and to provide incentives for
continued high performance. Recommendation: EPA should
rewrite its contractors' SOWs.
I.4.2. TECHNICAL
Fundamentally, EPA's technical approach to TRI's information system is
sound, but enhancements can be made in the following areas:
Data Input Technology: A primary area where application of
new technology has the potential to significantly increase TRI
productivity is the data input process. Enhancements in the
software utilized for manual keying, as well as magnetic media
submissions and OCR scanning could have potentially large
impacts. Recommendation: EPA is already addressing this issue
with regard to magnetic media and OCR scanning and should
continue its efforts to improve data input speed and accuracy.
Since the data entry software is so crucial to TRI success, we
recommend continued efforts to improve the design and
functionality of the current data input software.
Upgrade Form Storage and Access: With over 25,000 TRI
facilities submitting more than 80,000 five-page forms annually to
the TRC, document filing, storage, and retrieval have become a
prominent issue for the overall success of the system. EPA has
recognized that the current practice of storing paper forms in
filing cabinets is neither cost-effective nor practical and is
investigating alternative storage means. Recommendation: EPA
should select and implement optical disk storage for Form Rs.
Redesign Form R: There are many concerns regarding the
inefficient design of the present Form R as it is difficult for
reporters to understand and hinders data entry and OCR
scanning. This issue is time critical because present OMB
approval for the form expires in January of 1991. EPA will need
to start the re-approval process very soon to ensure that it is
completed before the form expires. Two alternatives exist for this
issue: retain existing form design or redesign the form.
Recommendation: EPA should redesign Form R and the
instructions on how to complete the form.
Booz, Allen feels that implementation of these recommendations will
result in substantial improvement in TRI's productivity in terms of
timeliness and data accuracy. Adoption of the short-term recommendations
from this report and continued refinement of the information system and
procedures should result in tapes being ready for NLM in less than nine
months for RY89 without reducing data quality standards. Completion of
actions on long-term recommendations from this study should result in TRI
data entry taking six months or less within three years with high accuracy
levels.
-------
II. INTRODUCTION
II.1. BACKGROUND
This section provides background information on the Toxic Release
Inventory System, including a summary of the history of TRI, an overall
high-level description of TRI operations, and a discussion of potential future
legislative and regulatory developments which will impact the system.
II.1.1. HISTORY OF TRI
The Emergency Planning and Community Right-to-Know Act, Title III of
the Superfund Amendments and Reauthorization Act (SARA) of 1986,
requires facilities which manufacture, process, or use any of the specified toxic
chemicals to report annually the amounts of these chemicals released directly
to air, water, or land or that are transported to off-site dumping facilities. This
legislation is based on the belief that the public has a "right-to-know" about
toxic chemicals in their communities. Specifically, the legislation has two
main purposes:
Encourage response planning for chemical emergencies
Provide the public with information on potential chemical
hazards in their communities.
The same law requires EPA to establish a national Toxic Release Inventory
(TRI) and to make this information available to the public annually via
computer telecommunications and other means.
Although SARA was passed on October 17, 1986, administrative rules
governing TRI were not finalized until February 1988. These rules required
facilities to submit their first Form R in July of the same year. Although
preliminary planning and staffing for TRI began prior to February 1988, the
system design could not be finalized until after rulemaking was completed
and specific reporting requirements were formalized. Consequently, EPA had
approximately six months to finalize its plans and procedures, to establish a
facility for processing the documents, and to implement an information
system. Given the daunting task of designing a system with so many
unknowns (e.g., number of reporting facilities, complexity of release
estimates, and other difficulties), the successful establishment and
organization of TRI was a remarkable accomplishment.
During this initial planning phase, EPA chose the National Library of
Medicine's (NLM's) Toxicology Network (TOXNET) as the vehicle to satisfy
the Congressional telecommunications requirements. Other methods of
public access have also been developed, including publications, diskettes, and
magnetic tape.
TRI has continued to evolve operationally from its first year. The
following sections provide a high level view of TRI operations as they exist at
the time of this study.
II.1.2. SUMMARY DESCRIPTION OF TRI
The process by which data is received and entered into the TRI database is
summarized below and in Exhibit 1. This process has five main components:
Industry submits Form Rs, containing the required reporting
information, to the Title III Reporting Center (TRC), where they
are processed and entered into a Local Area Network (LAN) database,
and a sample is verified for data accuracy.
Records from the LAN database are periodically uploaded to
EPA's mainframe computer (an IBM 3090) located at the National
Data Processing Division in Research Triangle Park (RTP), North
Carolina.
TRC staff, assisted by other contractors and EPA personnel, use
terminals connected to the mainframe to run data quality reports,
analyze the data for accuracy, correct records in the database, and
standardize data (such as parent company names) across the
database.
Submitting facilities are contacted, when necessary, by EPA and
TRC staff to notify them of noncompliance with the regulation
and to resolve technical errors in the data.
Data is transferred to NLM in segments of approximately 20,000
records each after the data reconciliation process has been
completed for that section. NLM indexes and loads the data on
TOXNET. Data is also available on various other media, such as
CD ROM, diskette, magnetic tape, and through the National
Report and other publications. The hardcopy forms for the
current year are available for review by the public as soon as the
data is entered into the network.
Responsibility for TRI operations has been primarily delegated to two
divisions within OTS, the Information Management Division (IMD) and the
Economics and Technology Division (ETD), although other divisions also play
a role in outreach, analysis, and other areas. IMD has the responsibility for
data management implementation and operations, while ETD is responsible
for overall program guidance, regulation development, and regulatory
interpretation. Three main contractors assist EPA in TRI operations: one
with responsibility for the TRC, another which deals primarily with software
development and maintenance, and the third which assists with data quality
efforts.
Exhibit 1
TRI Components
Required Section 313 Reporting (industries throughout the U.S.)
EPA Processing
Annual Release of Data to Public
Electronic Public Access
Shaded area represents focus of this study.
More detailed information on the management and organization of TRI
(including an organizational chart) is found in Appendix II. Technical details
are located in Appendix III.
II.1.3. TRI FUTURE POSSIBILITIES
TRI was mandated by Congress to be a public system. The nature of
this mandate as well as the topical nature of the Form R data has resulted in
TRI receiving extensive public attention. This attention is continuing to
grow, and the public is demanding more information. This momentum
may, in turn, result in additional reporting requirements for TRI. For
example, EPA is considering requiring new industries, such as federal
facilities, mining, agriculture, or utility companies to comply with the
reporting requirements of Title III. Also under consideration is the addition
of certain data fields under the waste minimization section of the form or
requirements to report peak release information.
Additionally, TRI will soon be encountering two situations which will
require strategic decisions by management. The first situation involves the
contracts with the TRC facility contractor and the system development
contractor. Both of these contracts must be recompeted within the next
eighteen months. Additionally, in January 1991, Form R approval must be
once again obtained from the Office of Management and Budget (OMB).
These situations combined with the dynamic regulatory environment
provide an appropriate opportunity for TRI to examine its current operations
in order to ensure that they are adequate to meet future requirements and
demands.
II.2. PURPOSE OF STUDY AND REPORT
The purpose of this study was to conduct a productivity review of TRI.
Booz Allen & Hamilton Inc. was tasked with identifying current TRI
strengths and weaknesses and targeting specific areas where operations may
be improved to lessen costs, strengthen data reliability, and reduce the time
required to release data to NLM. The study focused in particular on TRI data
collection and receipt, data entry, quality control and assurance, storage,
retrieval, tracking, and retention of data submissions.
This report, the final deliverable in this task, is an assessment of the
current productivity of TRI, including management, software, and hardware
activities. It also evaluates current initiatives that EPA is taking to increase
the effectiveness of the system. Finally, this report contains
recommendations for further improvement to TRI.
II.3. STUDY APPROACH
This section describes the methodology utilized by Booz, Allen in
conducting the study: the key standards utilized to
measure productivity for TRI, the management and technical perspectives
used to evaluate TRI productivity, and our approach to understanding these
issues.
Productivity for TRI is defined in this study by the following standards:
Accuracy: The degree to which the data entered into the database
is correct, both in terms of data entry accuracy and detection and
correction of reporter errors.
Timeliness: The extent to which EPA is able to provide the
database to NLM for public use in a timely manner.
Cost: The level of funding for TRI and the extent to which the
above criteria impact funding requirements.
These three criteria, which are interrelated, represent the fundamental
performance dynamics that surround TRI and its user requirements. Users,
public and private sector, would like to have access to the data as soon after
the end of the reporting period as possible. At the same time, it is critical that
the data in TRI be as accurate as possible as toxic chemical release numbers are
a key public indicator of environmental compliance and performance.
Locational consistency and accuracy are also important if fundamental
environmental decisions are to be made for particular counties and cities
based upon this information. Finally, the amount of effort and resources
required to input and reconcile the data for TRI is high, and is of significant
concern to EPA. Therefore, as just described, there is a high degree of
interdependence between these criteria, and the overall performance of TRI is
strongly affected by the balance achieved between them.
As shown by Exhibit 2, productivity for TRI is determined by the quality of
the technical system architecture and the management operations that
surround TRI. Productivity issues that are fundamentally driven by
technology considerations and the limitations imposed by the current
information system architecture were viewed from a technical perspective.
Issues that revolve around planning activities, the organization of human
resources, procedures that control the behavior of those who interact with the
system and the day to day operation of the information system were
evaluated from a management perspective.
Exhibit 2
Relationship between TRI Operations and Performance
TRI Operations: Management, Technical
TRI Performance: Timeliness, Accuracy, Cost
The interplay between these two perspectives is complex and fundamental
to effective operation of TRI. Short-term productivity of the system consists
of effective management exercised within the technological constraints of the
current system. Long-term productivity is related to successful refinement of
the existing system, including the introduction of new technology to improve
productivity and to meet new, emerging requirements.
Our approach to this study was to understand the current productivity of
TRI in terms of data accuracy, timeliness and cost through assessing the
strengths and weaknesses of the management and technical aspects of TRI
operations. Additionally, we evaluated the tradeoffs that were necessary in
order to meet program requirements within budget constraints. Finally, we
developed management and technical recommendations to enhance the
performance of TRI.
-------
III. PRODUCTIVITY ASSESSMENT
This section summarizes our assessment of the management and
technical operational effectiveness of TRI based upon the standards of data
accuracy, timeliness, and cost. We also discuss key management and
technical issues which are presently driving TRI productivity and in which
improvements can be made.
III.1. PRODUCTIVITY STATUS SUMMARY
An assessment of TRI implementation must recognize that TRI, a public
database mandated by Federal law, was implemented under very tight time
constraints and that operational pressures to rapidly improve the system
have been intense. Unlike other information systems which have an
internal, well-defined set of users and user requirements, public access
systems are required to meet the sometimes conflicting requirements of a
broad range of users. This makes it difficult to establish consensus on goals
and objectives.
Booz, Allen also realized that the performance expectations for TRI in
terms of data accuracy, timeliness, and cost are interdependent. Efforts to
improve any one area are likely to result in decreased performance in one or
both of the other areas. A key aspect of TRI performance, therefore, is
understanding the tradeoffs that have been made between the performance
criteria.
The following sections provide discussion on TRI's current level of
productivity as measured by accuracy, timeliness, and cost.
III.1.1. DATA ACCURACY
EPA's original policy for data accuracy in TRI was defined as entering into
the database information exactly as it appeared on Form Rs submitted by
reporting parties, with a goal of achieving 92% accuracy for data entry.
Experiences during and after RY87 data input demonstrated that this level of
accuracy was insufficient to meet the needs of users. As a result, EPA has
internally refined these data quality objectives significantly and has extended
the concept of data correction to include the interpretation of data submitted
in a non-standard format (e.g., chemical names with minor misspellings or
transposed latitudes and longitudes) by reporting parties on Form Rs as
well as errors made in the data entry process.
Currently, procedures to ensure data accuracy are extensive. EPA has
expended a large amount of effort in the data accuracy area, and the result is
that the quality of the data is high. TRI achieved an overall data input
accuracy level of 97.5% for all data fields in all records in the system in RY87
and 98.5% in RY88. At this point, data accuracy standards require near 100%
accuracy (no specific target percentage has been defined by EPA) in critical data
fields, such as release amounts. Where reporter errors are obvious and can be
changed, they are corrected by TRI personnel, and in other cases the reporting
party is contacted and required to provide correct data.
III.1.2. TIMELINESS
IMD, which is responsible for TRI operations, has established an internal
goal of having the data ready for release to the public via NLM nine months
after the filing deadline. So far, this deadline has not been met. TRI's RY87
data was released to TOXNET on June 19, 1989, and the RY88 data was made
available to NLM on May 29, 1990. In both years, start-up software, hardware,
and procedural problems as well as the need to correct errors in the data have
all significantly delayed completion of the process. Data entry procedure and
software changes were minimized between RY88 and RY89 in order to
stabilize the system, which should result in earlier public availability for the
1989 data assuming stable funding.
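The gap between the nine-month goal and the actual release dates can be made concrete with simple date arithmetic. The Python sketch below is illustrative and not part of the original study. It assumes that data for a reporting year is due on July 1 of the following year (so RY87 forms were due July 1, 1988), which follows from the filing schedule described in this report, and that the goal date falls exactly nine calendar months after that deadline. The function and variable names are hypothetical.

```python
from datetime import date

GOAL_MONTHS = 9  # IMD's internal goal: ready for release nine months after filing

# Release dates to NLM as stated in the report, keyed by reporting year.
releases = {1987: date(1989, 6, 19), 1988: date(1990, 5, 29)}


def filing_deadline(reporting_year: int) -> date:
    """Assumption: forms for a reporting year are due July 1 of the next year."""
    return date(reporting_year + 1, 7, 1)


def add_months(d: date, months: int) -> date:
    """Shift a date by whole calendar months.

    Safe for this use because every deadline falls on the first of a month,
    so the day number is always valid in the target month.
    """
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, d.day)


days_late = {}
for year, released in releases.items():
    goal = add_months(filing_deadline(year), GOAL_MONTHS)
    days_late[year] = (released - goal).days
    print(f"RY{year % 100}: goal {goal}, released {released}, "
          f"{days_late[year]} days past goal")
```

On these assumptions, the RY87 release came 79 days after the nine-month goal date and the RY88 release 58 days after it.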
External expectations concerning the time that should be necessary to
produce the database for a particular year for release to NLM vary from three
to six months. Given EPA's decision to ensure high data quality prior to
release of the database, the volume of data to be entered into the database, and
the current level of staffing and funding, EPA has not been able to meet these
expectations.
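As an illustration of how the timeliness gap could be quantified, the following Python sketch computes how many days each release fell past the nine-month internal goal. It assumes (this is not stated explicitly above) that Form Rs for a reporting year are due on July 1 of the following calendar year. The release dates are those given in the text, and the function names are hypothetical.

```python
from datetime import date

# Assumption: Form Rs for a reporting year are due July 1 of the following year.
FILING_DEADLINES = {"RY87": date(1988, 7, 1), "RY88": date(1989, 7, 1)}
# Release dates as reported in the text.
RELEASE_DATES = {"RY87": date(1989, 6, 19), "RY88": date(1990, 5, 29)}
GOAL_MONTHS = 9  # IMD's internal goal, in months after the filing deadline


def add_months(start: date, months: int) -> date:
    """Return the same day-of-month `months` later (safe here because day is 1)."""
    total = start.month - 1 + months
    return date(start.year + total // 12, total % 12 + 1, start.day)


def days_past_goal(reporting_year: str) -> int:
    """Days between the nine-month target date and the actual release date."""
    target = add_months(FILING_DEADLINES[reporting_year], GOAL_MONTHS)
    return (RELEASE_DATES[reporting_year] - target).days


for ry in ("RY87", "RY88"):
    print(ry, days_past_goal(ry), "days past the nine-month goal")
```

Under these assumptions the gap narrows from RY87 to RY88, but both releases fall after the nine-month target and well after the three-to-six-month external expectations.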
III.1.3. COST
During initial planning done prior to system implementation in 1987,
variable data processing costs for TRI were projected to be $18.97 per form*
assuming only very basic data quality controls. TRI was funded for data entry
(including data verification) at $12.00 per form and was given approximately
$250,000 (a little over $3.00 per form, assuming 80,000 forms) for automated
data quality activities - Notices of Noncompliance (NONs) and Notices of
Technical Error (NOTEs).
* REGULATORY IMPACT ANALYSIS IN SUPPORT OF PROPOSED RULEMAKING UNDER
SECTION 313 OF THE SUPERFUND AMENDMENTS AND REAUTHORIZATION ACT OF
1986, Contract No. 68-02-4235, Task Order No. 2-24, May 1987, Debra Harper, Economics and
Technology Division, Office of Toxic Substances, Washington, D.C. 20460. This cost estimate
includes $1.15 for microfiche preparation, processing, and storage of Form R's and $1.15 for
retrieval and refile of microfiche. Currently, EPA is storing paper copies of Form R's and does
not use any advanced storage technology. These estimates have not been removed from the
overall estimate because we believe that the actual costs of processing, storing, and retrieving
paper copies of Form R's are at least as expensive as utilizing microfiche technology.
Subsequently, as indicated in the previous section, data quality standards
have been increased significantly. In FY90, TRI was
given a Congressional supplemental (non-recurring funds) of $240,000 for
IMD's data normalization activities and was funded $25,000 for reporter
verification of the forms (Chemical Manufacturer's Association members).
Including this one-time funding, TRI is still operating with less funding than
originally projected to be necessary and simultaneously trying to meet more
stringent data quality objectives. The following circumstances reflect the
stress imposed upon the system by this situation:
TRI staff have routinely exerted extremely high levels of personal
effort to satisfy regular TRI operational demands as well as special
requests. EPA realizes that the current level of personal effort
cannot be sustained in the long run without causing burnout for
key individuals.
TRI periodically has to redirect resources to critical data quality
areas. An example of this is when significant numbers of EPA
personnel assist the TRC with data reconciliation. This has a
disruptive effect on other responsibilities.
Lower priority tasks, such as documentation of procedures, tend
to slip significantly. As a result, communications and
coordination between contractors suffers.
Even if original cost estimates for TRI were overstated, the combination of
funding the program at less than the original estimates, increases in costs due
to inflation, utilizing non-recurring funds, and significantly increasing data
quality standards has resulted in a level of funding which is insufficient to
allow TRI management to fully accomplish its task. The current funding
level also gives TRI very little flexibility in refining the system to make it
more productive.
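The funding comparison in this section can be checked with a short calculation. The Python sketch below is illustrative only. It assumes the 80,000-form volume used in the text, assumes the $12.00 per form data entry funding still applied in FY90, and spreads the FY90 one-time funds over that same volume. All variable names are hypothetical.

```python
# Dollar figures are taken from the text; names are illustrative.
PROJECTED_COST_PER_FORM = 18.97         # 1987 projection, basic quality controls
DATA_ENTRY_FUNDING_PER_FORM = 12.00     # includes data verification
DATA_QUALITY_FUNDING = 250_000          # approximate, for NONs and NOTEs
FY90_ONE_TIME_FUNDING = 240_000 + 25_000  # supplemental plus reporter verification
ASSUMED_FORMS = 80_000                  # volume assumed in the text


def per_form(total_dollars: float, forms: int = ASSUMED_FORMS) -> float:
    """Spread a lump-sum amount evenly across the assumed form volume."""
    return total_dollars / forms


recurring_per_form = DATA_ENTRY_FUNDING_PER_FORM + per_form(DATA_QUALITY_FUNDING)
with_one_time_per_form = recurring_per_form + per_form(FY90_ONE_TIME_FUNDING)

print(f"Recurring funding per form:     ${recurring_per_form:.2f}")
print(f"Including FY90 one-time funds:  ${with_one_time_per_form:.2f}")
print(f"Remaining gap versus projection: "
      f"${PROJECTED_COST_PER_FORM - with_one_time_per_form:.2f}")
```

Under these assumptions the per-form funding stays below the $18.97 projection even when the one-time funds are counted, which is consistent with the statement above.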
III.1.4. CONCLUSION
EPA has placed an extraordinary level of emphasis on data accuracy in
TRI. The current objective of achieving "near 100% accuracy" in critical data
fields without setting a specific, finite accuracy goal has resulted in
continually increasing efforts to improve data accuracy. A new initiative to
mail release information back to all reporting parties for review and
correction demonstrates that EPA desires to achieve an even higher level of
data accuracy for these critical fields for RY89.
This emphasis on data accuracy has been achieved at a significant cost as
resources and attention have been diverted from other activities to focus on
extensive data quality/accuracy. In addition, this emphasis on accuracy has
resulted in significant delays in releasing the database to the public through
NLM, and as a result EPA has not met timeliness standards.
Ultimately, EPA must achieve a balance between data accuracy and
timeliness. Establishing realistic and achievable levels for these standards is
necessary if management and technical stability for the program is to be
achieved.
The next section of the report discusses the management and technical
configuration issues which affect the productivity of TRI.
III.2. PRODUCTIVITY ISSUES
III.2.1. MANAGEMENT ISSUES
III.2.1.1. Introduction
TRI is a system in transition from an intense and creative start-up period
to a more stable, institutionalized mode of operation. The first two years of
operation have been filled with one challenge after another as original plans
and standards were changed to meet reality, as problems in the system were
discovered and overcome, and as EPA adapted to the demands of a public
access system. Flexibility and the ability to solve critical problems as they
occurred were of prime importance during the last two reporting years.
Now, circumstances are changing. Staff and contractor personnel are
beginning to experience burnout and are no longer as capable of high levels of
sustained effort as they once were. Errors, delays, and mistakes which were
expected in a start-up mode are no longer tolerable. In addition, expectations
for more rapid turnaround of data and better quality, together with budget
constraints, are exerting additional pressure on the organization.
The management issues discussed below must be addressed to change and
improve the system in the present environment. Careful attention to
potential regulatory changes combined with planning of long term
software/hardware enhancement efforts, refinements in management
control, and a reassessment of contracting strategies are all necessary if the
system is to meet current and future expectations.
III.2.1.2. Changes in Regulatory Environment
The present regulatory environment includes several proposals which
could substantially increase the number of reporting parties and/or the
number of data elements collected. For example, new pollution prevention
requirements, which are being considered within the Agency, could increase
the number of data elements by as many as 30 fields per form (the current
form has approximately 60 fields), resulting in a substantial increase in the
amount of data maintained in the TRI system and significantly altering the
database structure. Other proposals being considered would significantly
increase the number of reporting parties by adding SIC codes of facilities or by
adding new chemicals to the list. Requirement changes of this type will
require careful evaluation and planning to ensure that they are met in a
timely manner without disrupting the stability of the system and to make
certain that the ability to meet Congress's public access goals is maintained.
See Chapter V, Section 2.1 for our recommendations in this area.
III.2.1.3. Productivity Standards
Currently, significant confusion exists concerning productivity standards
for TRI. Although IMD has established an internal timeliness standard for
TRI of nine months from the reporting deadline, expectations for earlier
release remain high, even within EPA. This creates additional pressure on
TRI management to attempt to satisfy a variety of timeliness goals.
In the case of data accuracy, productivity standards have evolved from the
original level of 92% as identified in a contractor's SOW to EPA's current
stated desire of achieving "near 100% accuracy for certain key fields,"
particularly release figures. This standard is ambiguous, because "near 100%"
can be defined in many different ways, and it has therefore caused confusion.
Additionally, other than release data, "key fields" have not been clearly
identified, and an explicit standard for non-key fields has not been specifically
articulated. Furthermore, this revised accuracy standard has not been
formalized or communicated to all personnel as evidenced by the previously
mentioned contractor's SOW which still contains the original standard. (EPA
is aware of this issue and is responding to it. Further discussion on the
response is provided in Chapter IV Section 2, Productivity Standards.)
This lack of a clear standard has not translated into a low level of accuracy
for TRI as the actual level of data accuracy is quite high as stated earlier.
However, the confusion has caused several other problems:
Data accuracy "success" for TRI is undefined. Without finite data
accuracy goals, it is not possible to determine when TRI has met
program objectives.
Prioritization of enhancement efforts based upon their impact on
productivity cannot be made in a quantifiable manner.
Budgeting decisions are not being made based upon explicit
productivity increases measured relative to a quantifiable goal.
Chapter V, Section 2.2 discusses our recommendation in this area.
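To illustrate what a finite, quantifiable accuracy standard might look like in practice, the following Python sketch compares measured field accuracy against explicit targets. The target percentages and the field classification are purely hypothetical, since, as noted above, EPA has not defined specific values and only release data has been identified as a key field. The function names are likewise illustrative.

```python
# Hypothetical targets: the report notes that no specific percentages exist.
TARGETS = {"key": 0.995, "non_key": 0.97}
# Hypothetical classification: only release amounts are named as key in the text.
KEY_FIELDS = {"release_amount"}


def field_accuracy(checked: int, errors: int) -> float:
    """Fraction of checked field values found to be correct."""
    return (checked - errors) / checked


def meets_target(field: str, checked: int, errors: int) -> bool:
    """True when the measured accuracy reaches the target for the field's class."""
    kind = "key" if field in KEY_FIELDS else "non_key"
    return field_accuracy(checked, errors) >= TARGETS[kind]


# Example: the same 98.5% accuracy passes for a non-key field but not a key field.
print(meets_target("city", checked=1000, errors=15))
print(meets_target("release_amount", checked=1000, errors=15))
```

With explicit thresholds of this kind, "success" becomes a yes or no question per field class, and the effect of a proposed enhancement can be expressed as movement toward or past a stated target.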
III.2.1.4. Planning Environment
Management direction for TRI is often subject to numerous compromises
with regard to the implementation of originally scheduled plans.
Unexpected, ad hoc requests have frequently diverted resources from planned
activities. These requests are often a result of TRI's need, as a public access
system, to respond to demands from external parties, especially Congress. TRI
management, therefore, feels pressure from the public as well as within the
Agency to meet these numerous ad hoc requests and to also produce the
database more rapidly, accurately, and cost efficiently at the same time.
This responsiveness to ad hoc activities, combined with TRI's need to focus
on solutions to short-term challenges during system start-up, has often
resulted in a failure to meet planned development deadlines and to
adequately plan and execute system enhancements. The penalties for failing
to meet internal deadlines are often very high as software has not always been
developed in a timely manner and enhancements have had to be postponed
for later reporting years.
Another major planning issue for TRI involves the LAN hardware. The
majority of the LAN hardware is at least two years old and soon will begin to
experience failures due to age. Additionally, the increased data load on the
LAN is adding stress. Hardware failures during the critical data input period
following the July 1 reporting deadline can have a direct day for day impact
on timeliness and a very disruptive effect on the data entry process. Positive
steps need to be taken to ensure that reliability of the LAN does not
deteriorate due to age.
Our recommendations for planning initiatives are found in Chapter V,
Sections 2.3.2.1 - 2.3.2.3.
III.2.1.5. Contractor Organization
A number of factors, including existing EPA contract utilization policy
and a lack of in-depth operational understanding, which could only be gained
after the system was online, resulted in a contractor organization which has
significant operational overlap between contractors. This, in turn, has
resulted in some duplication of effort as well as an increased need for
coordination and direction to keep joint operations proceeding smoothly.
Two contractors share the primary operational load for TRI and another
contractor plays a secondary role. Computer Based Systems, Inc. (CBSI) is
responsible for operation of the TRC. Major activities performed by this
contractor include receipt and storage of Form Rs (for RY87 they processed
79,784 submissions and for RY88 they processed 82,123 forms), data entry, data
searching and retrieval, and data quality/reconciliation support.
SYCOM Inc., a subcontractor to Planning Research Corporation (PRC), is
mainly responsible for software development and maintenance. This
contractor initially developed the data entry software on the LAN and
currently provides ongoing support for software development and special
programming requirements for developing system enhancements, generating
reports from the mainframe, and responding to ad hoc requests for assistance.
SYCOM also completes certain day to day operational functions, such as
uploading data from the LAN to the IBM mainframe, and performs
troubleshooting activities on the LAN and for magnetic media data loads.
Finally, PEL supports data quality efforts by providing expert advice on
chemical nomenclature to data entry operators as Form Rs are keyed into the
LAN system. Furthermore, they assist with the review of NOTEs and NONs.
Further information on contractor organization and function can be found in
Appendix II.
As a result of the current structure, multiple organizations are responsible
for overlapping functions. Some examples of this are listed below:
Data quality activities in the past have involved at least two
contractors in order to run and analyze routine data
reconciliation reports. This has resulted in delays and errors
which have caused data reconciliation to not be performed on
some records in a timely manner. This issue is being addressed
by EPA for RY89 by giving CBSI control of the data reconciliation
reports.
Data input software that runs on the LAN is sensitive to the
hardware and system software configuration. It is necessary for
the TRC contractor, who operates the LAN, to coordinate
hardware and software changes with the software development
contractor in order to ensure that changes in hardware or system
software do not cause failures in the application software.
Some data revision activities require synchronized activities by
two contractors in order to successfully delete records from both
the LAN and mainframe databases and reenter modified
information in cases where it is necessary to revise CAS numbers.
Magnetic media submissions that do not load properly are shifted
to a different contractor for analysis and troubleshooting. The
impact is that employees from the second contractor are pulled
from normal responsibilities to assist with magnetic media.
Within the next two years, the PRC/SYCOM and CBSI contracts are up for
renewal. At this point, there is sufficient time to evaluate the possible
options for contractor organization and tasking and ensure that any new
contracts issued reflect the best possible organizational match with TRI
operational requirements.
Our recommendations for this issue are found in Chapter V, Section 2.4.
III.2.1.6. Contractor Statements of Work
The SOWs for the two primary contractors were written during initial
system development, when knowledge of operational procedures was limited
and a significant need for flexibility in contractor tasking to enable response to
unforeseen circumstances existed. As a result, the SOWs, particularly in the
case of the software development and maintenance contractor, are written in
very general language with broad tasking. Additionally, some initial
operational concepts that were included in the SOWs, but were either not
implemented or subsequently altered significantly, have never been
deleted.
This contractor SOW issue is critical for TRI as two of its contracts will
need to be rebid during the next two years. Our specific recommendations for
this area are found in Chapter V, Section 2.5.
III.2.2.1. Introduction
With a short initial start-up period and limited funding, EPA was
significantly constrained in its choice for a system architecture. Excessive
charges to utilize EPA's mainframe in RTP for online data entry as well as
slow response time during peak periods led TRI staff to consider a PC LAN or
a mini-computer based system as the only practical data entry architectures to
pursue at the time. Although a mini-computer solution was viable, it was
not chosen in order to maintain compliance with overall agency hardware
architecture requirements. The LAN, however, was not a suitable platform to
serve as the repository for the TRI database for external connectivity,
reliability, and capacity reasons. Therefore, EPA selected a LAN for the data
entry operations and utilized the mainframe at RTP as the TRI database
repository.
To meet its software needs, EPA selected industry standard tools to support
TRI development. Some data entry and data transfer software problems were
encountered during the start-up phase. These problems have since been
overcome, and over the two years of TRI production, the application has
stabilized and steady progress is being made in upgrading the software.
Fundamentally, this architecture, software, and EPA's overall technical
approach to TRI's information system are sound; however, there are several
areas where technical issues exist and where enhancements and
improvements to the system are possible. These issues are discussed in the
following section.
III.2.2.2. Data Input Technology
The data input technology utilized by EPA has a major impact on the level
of timeliness and data accuracy which can be achieved in TRI. Of the several
technologies which exist to transfer data to the computer from paper
collection media, EPA is currently using two methods, keyboard entry and
magnetic media entry, and will be testing a third technology this year: Optical
Character Recognition (OCR) systems.
EPA primarily relies upon manual keying of data by data entry operators
to enter information into the database. Fundamentally, the data entry
software is sound although alternative mechanisms could be utilized to
enhance data quality checks. However, manual entry of data is relatively
slow and provides many opportunities to introduce errors into the database.
To improve upon the speed and accuracy of the data entry process, TRI
staff have encouraged the use of magnetic media for reporting.* When
reporters provide submissions in the correct format, data entry is much
faster than manual data entry,** although the loading time is not insignificant
(e.g., the loading process can take over two hours for one disk with 49
submissions), and data entry accuracy is 100%. However, data entry from
magnetic media submissions has suffered in the past due to a lack of adherence
by reporting facilities to EPA published standards. The instructions for
submitting Form Rs on magnetic media were interpreted differently by different
reporters, resulting in a variety of submitted formats.
Several commercial vendors have developed software to assist reporting
facilities in filling out Form Rs and creating properly formatted submissions.
However, even these commercially available products generated incorrectly
formatted files which caused errors when loading the data. This slowed data
input considerably because the incorrectly formatted submissions had to be
deciphered and corrected prior to loading the data into the database. Because
EPA staff recognize the substantial productivity gains which can be realized with
correctly formatted magnetic media submissions, they are seeking to improve
this process. This initiative is discussed in Chapter IV, Section 4.1.
Finally, EPA is testing OCR technology to further attempt to speed data
entry and improve accuracy. This technology will allow reporters to still
submit on the paper form, although Form Redesign would need to be
considered (see Chapter V, Section 3.3), as OCR can input typed or
handwritten reports by reading data directly from the form and entering it
into the database. To be an effective alternative for manual data entry, the
OCR scanner must be faster and more accurate than a human. However, if
the scanner does not accurately capture the data on the form, its speed and
cost benefits are quickly eroded due to the monitoring and intervention
required by an operator to correct the scanner's mistakes. Chapter IV, Section
4.2 provides further detail on EPA's test of OCR technology.
Booz, Allen's recommendations in this area are discussed in Chapter V,
Section 3.1.
* EPA only accepts magnetic media Form R submissions on nine track magnetic tape or
microcomputer diskettes (either 5.25 or 3.5 inch formats) formatted in DOS 2.1 or higher from
an IBM PC/XT/AT or compatible microcomputer.
** Tracking system data can be entered at the rate of 20-25 records per hour or 2.5 to 3 minutes per
record. TRIS keying can be done at the rate of 8 records per hour or 7.5 minutes per record.
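The data entry rates quoted in this section can be put on a common per-record basis with a short calculation. The Python sketch below is a rough illustration only. It assumes an annual volume of 80,000 forms, all keyed manually, and treats the "over two hours" magnetic media load as exactly two hours, so that figure is a lower bound. The names are hypothetical.

```python
# Rates quoted in the text; names are illustrative.
TRIS_RECORDS_PER_HOUR = 8
TRACKING_RECORDS_PER_HOUR = (20, 25)
MAGNETIC_LOAD_MINUTES = 120      # "over two hours", so treated as a lower bound
MAGNETIC_SUBMISSIONS = 49
ASSUMED_FORMS = 80_000           # assumption: annual volume, all keyed manually


def minutes_per_record(records_per_hour: float) -> float:
    """Convert an hourly throughput into minutes spent per record."""
    return 60 / records_per_hour


tris_minutes = minutes_per_record(TRIS_RECORDS_PER_HOUR)
magnetic_minutes = MAGNETIC_LOAD_MINUTES / MAGNETIC_SUBMISSIONS
keying_hours = ASSUMED_FORMS / TRIS_RECORDS_PER_HOUR

print(f"TRIS keying: {tris_minutes:.1f} minutes per record")
print(f"Magnetic media load: at least {magnetic_minutes:.2f} minutes per submission")
print(f"Operator-hours to key {ASSUMED_FORMS:,} forms: {keying_hours:,.0f}")
```

Note that 25 records per hour works out to 2.4 minutes per record, slightly below the 2.5 minutes quoted for tracking system entry. The difference is presumably rounding in the original figures.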
III.2.2.3. Form Storage and Access
With over 25,000 TRI facilities submitting more than 80,000 five-page
forms annually to the TRC, document filing, storage, and retrieval become a
prominent issue for the overall success of the system.
Even though all the data on the Form R is captured and entered into the
database, the paper Form R must be retained to verify the database, answer
FOIA queries and ad hoc requests, and support administrative, civil, and
criminal actions by the EPA against TRI facilities. The current storage and
retrieval system is entirely manual as well as extremely laborious and slow.
Additionally, a year's worth of TRI submissions requires about 1,200 square
feet of storage space, including sufficient working room around file
cabinets.*
* This figure came from Mathtech Inc's "Feasibility Study for Alternatives for EPA Form R
Records Management," January 1990.
EPA staff has recognized that their practice of storing the Form Rs on-site
in filing cabinets will quickly overrun their current personnel and building
resources. A January 1990 report by Mathtech, Inc. for IMD examines the
feasibility of alternative records management technologies to assist TRI
managers in condensing the volume and improving the accessibility of
Form R reports.
A synopsis of the technologies examined in the Mathtech report is listed,
along with our recommendations, in Chapter V, Section 3.2.
III.2.2.4. Form Design
One of the basic barriers to faster data entry is improperly completed
forms. The TRI Reporting Package for 1989 contains over 100 pages of
instructions, including examples and answers to common questions, on how
to complete the five-page Form R. Still, reporting errors are common and
dramatically disrupt the data input process. The data entry software has
incorporated several edit checks to trap common errors; however, there are
far too many errors that cannot be classified, and many require a supervisor's
attention to correct.
Fundamentally, the form's instructions are confusing and the form is
awkward to fill out. In the interest of reducing the number of pages a facility
must fill out, common items from the report were grouped together on the
first two pages. Consequently, when filling out the last three pages, the
reporter must constantly refer to the first two pages. All too often, the
submitter overlooks or erroneously enters required references to data on
pages one and two. A strong case can be made that whatever time is saved by
placing common items on the first two pages is lost in the confusion and page
flipping required when filling out the remainder of the form.
One potentially dangerous flaw on the form is the lack of an identifier on
each page of the form. Should any of the remaining pages be separated from
the first page, there is no identifier to match it with its proper Form R. Strict
document handling procedures implemented by TRC staff have thus far
prevented the loss or misfiling of Form R pages. However, the potential for
chaos is high.
In addition, the present design of the form may impact on the
effectiveness of the OCR technology in the data input process. EPA's current
pilot test will allow TRI to determine the extent of these impacts, if any.
Our recommendations for Form R can be found in Chapter V, Section 3.3.
III.2.2.5. Develop Integrated Facility File
In the first year of TRI operations, the data entry software required
operators to enter complete name and address information for each reporting
facility. Furthermore, for facilities that submit multiple Form Rs, the facility's
entire address was keyed into the TRI database for each Form R submitted
even though the address did not change. Also, facilities used a variety of
abbreviations for company names, as well as for city and county names.
These variations in abbreviations made accurate retrieval and use of data
very difficult. For example, a report requesting all releases for the city of San
Francisco would not pick up data where the city name was abbreviated SF,
San Fran, or any other variation.
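The San Francisco example can be sketched in a few lines of Python. This is an illustration of the retrieval problem and of one possible remedy, not a description of the TRI software: the alias table, record layout, and function names are all hypothetical, and a real facility file would need a far more complete mapping.

```python
# Hypothetical alias table mapping keyed variants to one standard city name.
CITY_ALIASES = {
    "sf": "SAN FRANCISCO",
    "san fran": "SAN FRANCISCO",
    "san francisco": "SAN FRANCISCO",
}


def normalize_city(raw: str) -> str:
    """Collapse case and spacing, then map known variants to a standard name."""
    key = " ".join(raw.strip().lower().split())
    return CITY_ALIASES.get(key, key.upper())


def total_releases(records, city: str) -> int:
    """Sum release amounts for every record whose normalized city matches."""
    wanted = normalize_city(city)
    return sum(r["pounds"] for r in records if normalize_city(r["city"]) == wanted)


# Hypothetical records as they might have been keyed from separate Form Rs.
records = [
    {"city": "San Francisco", "pounds": 100},
    {"city": "SF", "pounds": 50},
    {"city": "San  Fran", "pounds": 25},
    {"city": "Oakland", "pounds": 10},
]

exact_match_total = sum(r["pounds"] for r in records if r["city"] == "San Francisco")
print("Exact-match query:", exact_match_total)
print("Normalized query: ", total_releases(records, "San Francisco"))
```

An exact-match query finds only the first record, while the normalized query also picks up the abbreviated entries, which is the behavior a standardized facility file is meant to provide.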
Additionally, facilities may change names from year to year. Without an
integrated facility file, there is no way of tracking these name changes. As TRI
data may be used by the EPA as evidence in administrative, civil or criminal
actions against TRI facilities, EPA absolutely requires the ability to precisely
prepare release reports for a facility in order to successfully execute a case
against a facility.
EPA is in the process of implementing a facility file to improve this
situation. See Chapter IV, Section 5 for a description of this effort.
III.2.3. CONCLUSION
III.2.3.1. Management Effectiveness
Overall management effectiveness for TRI must be considered in the
context of the start-up of a new and unique system. Because TRI is a public
access system, its managers were faced with the requirement to rapidly design and
implement an information system with unique requirements. A high level
of commitment and personal effort has enabled TRI to:
Make the database available to the public, fundamentally
fulfilling the legal mandate under SARA.
Make significant improvements to data accuracy and the overall
effectiveness of the data entry software for RY88 and RY89.
Implement several long term initiatives to improve the data
input process and upgrade the capability to do data reconciliation.
At this point, the program is fulfilling its basic mission of providing public
access to toxic release data, and is providing a very high quality database to the
public.
Although TRI management has been successful in implementing TRI,
there are several areas where improvements can be made:
TRI is beginning its third reporting year and is still being
managed in a manner more appropriate for a start-up system.
Ad hoc activities interfere significantly with long term planned
activities, and contractor task redirection is commonplace.
Productivity standards have not been definitively established,
and, in the case of data accuracy, the standard has changed
significantly over time without overt management decision.
The distribution of tasks among contractors is complex as
discussed previously and tasking overlaps require unnecessary
coordination and communication to accomplish routine
operational activities.
The fundamental challenge for TRI management is to find the time and
perspective to continue to meet operational responsibilities, and at the same
time capitalize on upcoming opportunities for improvement to the system.
III.2.3.2. Technical Effectiveness
TRI has experienced growing pains typical of an emerging production
system. However, the architecture that was selected remains
fundamentally sound. As PC-based LANs do not have the hardware and
software reliability present in mini-computer and mainframe computer
systems, TRC operations have experienced and will continue to experience
equipment breakdowns and other reliability problems, ranging from the merely
annoying (such as jammed printers) to those which interfere significantly
with production (complete LAN failures). These reliability problems are not
insurmountable if careful system and configuration management practices
are followed.
are followed.
So long as data entry requirements for the system do not change
substantially, in terms of the number of fields or the number of records, the
combination LAN/Mainframe architecture should be satisfactory. Constant
attention to good system management, hardware replacement, and data
upload procedures is necessary in order to keep the LAN functioning
properly.
The following two sections in this report will address ongoing EPA
initiatives to upgrade TRI, and Booz, Allen's specific recommendations to
improve the system.
IV. CURRENT EPA INITIATIVES
IV.1. INTRODUCTION
EPA is already aware of many of the management and technical issues
which were discussed in the previous chapter and their effects on overall TRI
performance. To attempt to improve performance, EPA is specifically
addressing some of these issues through management initiatives. These
initiatives, which come in the form of either operational modifications or
pilot projects, seek to enhance TRI productivity through improving data
quality and timeliness.
IV.2. PRODUCTIVITY STANDARDS
The following EPA initiatives are intended primarily to address issues
raised in Chapter III Section 2.1.3., Productivity Standards.
IV.2.1. SOW MODIFICATIONS
EPA has recognized the need to modify the TRC contractor's current SOW
so that the stated data accuracy requirement will reflect revised data quality
standards. This initiative is a positive step in translating system experience
into clearly defined requirements.
As of July 13, 1990, this modified SOW had not been finalized; therefore, it
could not be fully analyzed. Nonetheless, Booz, Allen strongly encourages
the development of precisely worded modifications (see Chapter V, Section
2.5), as in this particular case, in order to clearly define revised activities,
roles, and responsibilities for TRI contractors.
IV.2.2. FORM VERIFICATION BY REPORTING PARTIES
The reporter verification initiative (under discussion at the beginning of
RY89) is also intended to enhance overall data quality. As of the drafting of
this report, the details of this plan are still being discussed. However, the
basic idea of reporter verification is to send copies of release data back to all
reporters for verification that the data entered is accurate. In one proposed
scenario, the reporter would then have a period of time, fifteen days, to
respond with any necessary modifications. Modifications would be returned
to the TRC, and revisions would be made to the data in the system.
There are both advantages and disadvantages to reporter verification,
resulting primarily from the increase in the number of players involved in
the data quality process. First, reporter verification would prevent most
subsequent data quality complaints by reporters, assuming that the data
correction process was performed accurately, as they would have an
opportunity to actually review the data entered into the system.
The other major effect of the reporter verification initiative will be in the
area of additional revision processing and analyzing. Even if reporter
verification response is only 10% to 20%, this increase in volume will still
result in additional operational requirements impacting mailroom
operations, document retrieval, data input/revision, and data verification. In
general, these additional operational requirements will increase costs by a
little over $1.00 per form ($100,000 is being allocated for RY89) and could also
provide the opportunity for additional delays at the TRC. EPA should
carefully assess this initiative to ensure that it does not unduly increase
processing burdens for the TRC and that it is contributing significantly to data
quality objectives. In view of TRI's already extensive data quality program,
Booz, Allen suggests that EPA assess the cost/benefits of this program after a
one year period.
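The workload arithmetic behind this concern can be sketched as follows. This is only an illustrative calculation in Python: the 10% to 20% response range and the $100,000 allocation come from the text above, while the function name and the form volume used in the usage note are hypothetical assumptions, not figures from this report.

```python
def verification_workload(forms_mailed, budget_dollars, low_pct=10, high_pct=20):
    """Estimate returned revisions and cost per form for reporter verification.

    forms_mailed is a hypothetical input; the report does not state it here.
    low_pct and high_pct are the 10% to 20% response rates quoted in the text.
    """
    return {
        # Integer arithmetic keeps the counts exact.
        "revisions_low": forms_mailed * low_pct // 100,
        "revisions_high": forms_mailed * high_pct // 100,
        "cost_per_form": budget_dollars / forms_mailed,
    }
```

For an assumed volume of 80,000 forms and the $100,000 allocation, this gives between 8,000 and 16,000 returned revisions at $1.25 per form, which is the kind of estimate a one-year cost/benefit assessment could compare against actual TRC figures.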
IV.3. PLANNING ENVIRONMENT
The following initiative addresses the planning issues discussed in
Chapter III, Section 2.1.4.
IV.3.1. HARDWARE REPLACEMENT PLANNING EFFORTS
The TRC and software development contractors have both made
recommendations to TRI management for hardware upgrades to improve
overall system reliability and performance. In June 1990, the two contractors
and EPA staff met to discuss possible upgrade paths for the existing TRI
hardware and software.
Several recommendations were made and EPA is currently reviewing
each. For the short term, the upgrades will focus on relatively low cost but
high impact improvements. For example, to speed the printing of
reconciliation reports, the TRC recommends purchasing a high-speed impact
printer to replace the low-speed laser printer currently in use.
The TRC also recommends upgrading its storage hardware with the
purchase of an optical disk archival system. The 300MB hard disks on the
LAN are quickly approaching capacity and TRC management is planning to
remove RY87 data from the active file server. Tracking data for RY87 and
RY88 cannot be removed from the LAN because the TRC is still receiving
RY87 and RY88 reports and corrections. However, in the upcoming years,
even tracking data will have to be removed from the active server. The
optical disk archival system under consideration by the TRC uses Write-Once,
Read-Many (WORM) technology. Thus, it cannot be used for data that must
be modified. The WORM optical disk is intended as a high volume, online
storage device for data from past reporting years.
Long-term recommendations were also made concerning upgrading the
file servers with faster, more powerful PC compatibles or even replacing
them with microcomputer-based database management hardware that would
solely be responsible for the database management while the existing servers
would handle routine LAN requests.
These solutions should improve overall system reliability and thus
decrease the potential for system failures and lost data entry time. See
Chapter V, Section 2.3. for our recommendations.
IV.4. DATA INPUT TECHNOLOGY
Through the following initiatives, TRI management is seeking to address
certain data input technology issues raised in Chapter III, Section 2.2.2., Data
Input Technology.
IV.4.1. MAGNETIC MEDIA IMPROVEMENTS
Currently only a limited number of facilities use magnetic media to report
their release data, and data entry from these submissions has not substantially
contributed to improvements in TRI performance due to a lack of adherence
by reporting facilities to EPA published standards.
However, for submissions in the correct format, data entry speed has been
swift and accuracy has been 100%. Because of this promising potential, EPA is
aggressively working towards increasing magnetic media submissions with a
sharp focus on preventing incorrectly formatted submissions.
To this end, EPA is considering soliciting in 1990 for a contract to write a
software program that will accept Form R data and save it to a disk in proper
format. The strategy is for EPA to send a diskette, free of charge, with the
Form R software to facilities interested in magnetic media submissions. The
facility would run the software, answer the appropriate questions, and fill in
the forms on their computer. The software would properly format the
answers and write the Form R to the disk. The reporting facility would then
simply return the disk to the TRC for data entry.
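The formatting step that such a program would perform can be sketched as follows. This is a minimal illustration in Python only: the field names, the field widths, and the fixed-width layout itself are hypothetical stand-ins and do not represent EPA's published magnetic media standard.

```python
# Hypothetical fixed-width layout: (field name, width in characters).
# These fields and widths are assumptions for illustration only.
LAYOUT = [
    ("tri_id", 15),
    ("reporting_year", 4),
    ("chemical_name", 30),
    ("release_lbs", 10),
]


def format_record(answers):
    """Validate one set of answers and return it as a fixed-width line."""
    parts = []
    for name, width in LAYOUT:
        if name not in answers:
            raise ValueError(f"missing field: {name}")
        value = str(answers[name])
        if len(value) > width:
            raise ValueError(f"{name} exceeds {width} characters")
        # Pad on the right so every record has the same length.
        parts.append(value.ljust(width))
    return "".join(parts)


def write_submission(records, path):
    """Write all validated records to the submission file, one per line."""
    with open(path, "w", encoding="ascii") as handle:
        for answers in records:
            handle.write(format_record(answers) + "\n")
```

Because validation happens before anything is written, an incorrectly formatted record is rejected at the facility rather than discovered at the TRC, which is the failure mode the initiative is meant to prevent.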
This initiative should greatly increase the efficiency of magnetic media
submissions and could also encourage use of this data input technology by
other submitters. The appropriateness of magnetic media versus other input
options is further discussed in Chapter V, Section 3.1.
IV.4.2. OCR DATA INPUT PILOT
To improve the efficiency with which Form R information is entered into
the TRI database, EPA will be testing the feasibility of OCR hardware and
software to read Form Rs, and to enter the report information into the TRI
database. This pilot program, involving approximately 6,000 submissions and
over a thousand facilities, will allow TRI management to evaluate the
effectiveness of this technology to improve the speed and quality of data
input.
The scanning software used in this pilot forced a slight adjustment to
data element boxes on the Form R; however, the form was not redesigned.
Besides the adjustments, the form's background color was changed to red so
that it would not be picked up by the OCR scanner. The hardware/software
combination being used in the pilot program will only read one page of the
Form R at a time. Therefore, an additional burden has been placed on the
TRC staff to separate Form R pages before scanning and then recollate them
after scanning. EPA recognizes that this limitation will create additional
processing overhead but asserts that the pilot is still a valuable test of the
capability of OCR hardware to scan numbers and text. If the initial pilot
shows promise, an expanded pilot may be performed with a hardware and
software combination that will scan the entire form at one time. The OCR
vendor is projecting the ability to scan multiple forms within the next eight
to ten months.
The first test of scanning forms is not planned until August of 1990.
Consequently, for this report, the effectiveness of the OCR pilot cannot be
assessed. However, the appropriateness of this technology is further
evaluated in Chapter V, Section 3.1.2.3. In addition, other issues which
impact the effectiveness of the OCR pilot, such as the redesign of Form R and
using optical media for form storage and retrieval, are discussed in Chapter V,
Sections 3.3.2. and 3.2.2.
IV.5. FACILITY FILE DEVELOPMENT
As discussed in Chapter III, Section 2.2.5., there is a need to provide facility
information that can be tracked from year to year in order to reduce data entry
keystrokes, improve data accuracy, and to enable TRI to track facility
performance. EPA has designed and is implementing enhancements to the
data entry software through the addition of a facility file. The file includes the
proper name and address for facilities which submitted release data during
RY88. Each address in the facility file is identified by a unique TRI Facility
Identification Number (TRI ID) which is also included on the facility's
Form R when the form is mailed to the facility. Therefore, the data entry
operator has only to type the TRI ID and the data entry software automatically
retrieves and displays the name and address associated with that number.
This single improvement to the data entry software increases data accuracy by
standardizing facility addresses and decreases input time by significantly
reducing keystrokes for data entry operators.
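The lookup described above can be sketched as a keyed retrieval. This is an illustrative Python sketch, not the TRI data entry software: the identifier format, the sample entries, and the function name are all hypothetical.

```python
# Hypothetical facility file keyed by TRI ID. The identifiers and
# addresses below are made-up placeholders, not real TRI records.
FACILITY_FILE = {
    "TRI-0001": {"name": "Example Plant A", "address": "1 Sample Road, Anytown"},
    "TRI-0002": {"name": "Example Plant B", "address": "2 Sample Road, Anytown"},
}


def lookup_facility(tri_id):
    """Return the stored name and address for a TRI ID, or None if unknown."""
    # Normalize operator input so stray spaces or lowercase do not cause a miss.
    key = tri_id.strip().upper()
    return FACILITY_FILE.get(key)
```

Returning `None` for an unknown identifier lets the data entry screen fall back to manual keying of the name and address, for example for a facility that did not report in RY88.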
The facility file is currently only operational on the LAN but will be added
and tested on the mainframe system in November of 1990. With the
mainframe capability and the addition of the TRI ID to release records, EPA
will be able to perform year-to-year trend analysis on individual facilities.
The trend analysis will be able to detect reporting inconsistencies* as well as
show the improvements a certain facility is making in reducing toxic releases.
Because of the legal importance of data in the TRI database, the facility file is a
significant addition to improving accuracy of the TRI system. Therefore,
Booz, Allen strongly endorses this initiative.
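A year-to-year consistency check of the kind described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the 50% change threshold and the function name are hypothetical, and the report does not specify how EPA would define an inconsistency.

```python
def flag_inconsistencies(releases, max_change=0.5):
    """Return reporting years whose release changed sharply from the prior year.

    releases maps a reporting year to pounds released for one facility and
    one chemical. max_change is an assumed threshold (fraction of the prior
    year's value), not an EPA standard.
    """
    flagged = []
    years = sorted(releases)
    # Compare each year with the year immediately before it.
    for previous, current in zip(years, years[1:]):
        before, after = releases[previous], releases[current]
        if before > 0 and abs(after - before) / before > max_change:
            flagged.append(current)
    return flagged
```

Applied to the footnoted example (25,000 pounds in RY87, 24,500 in RY88, and 2,400 in RY89), the small RY88 change passes and only RY89 is flagged for rechecking against the original form. A flagged year may also reflect a genuine reduction, so the check identifies records to review rather than errors.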
IV.6. CONCLUSION
The EPA initiatives discussed in this section will successfully address
many of the areas where improvements are necessary in TRI operations. If
implemented correctly, these improvements should allow TRI data to be
available through the NLM system in a more timely and accurate fashion.
With the exception of the reporter verification initiative, which will have a
recurring cost, the improvements will demand mainly
upfront investments. However, these investments will pay off in the long
run with an improvement in overall TRI productivity.
* For example, if a company reports that 25,000 pounds of a certain chemical were released in RY87,
24,500 pounds were released in RY88, but only 2,400 pounds were released in RY89, the data
would be rechecked against the original form to ensure accuracy.
V. RECOMMENDATIONS TO IMPROVE TRI
This section presents Booz, Allen's recommendations to improve TRI
operations. The recommendations are presented in two sections:
management recommendations and technical recommendations.
Each alternative is described and evaluated separately for its contribution
to the primary criteria of data accuracy, timeliness, and cost. In essence, this
provides the long-term contribution of each alternative to the program.
Additionally, each alternative is also evaluated against the following criteria
which assess the practical feasibility of implementation:
Implementation cost: The estimated cost to implement a
particular option.
Implementation time: The estimated time to implement a
particular option.
Implementation risk: An assessment of the likelihood of
successful implementation of a particular option.
The priority of a particular alternative is based upon a combination of the
overall contribution to long-term productivity, coupled with the short-term
cost and ease of implementation. The recommendation summary provides
an overview of the alternatives and a graphic illustration of their
contribution to both the primary and ease-of-implementation standards.
Additionally, we have indicated areas where our recommendations agree
with ongoing EPA initiatives.
V.1. MANAGEMENT
Management recommendations will be discussed in this section along
with the impacts of the recommendations on TRI.
V.1.1. IMPACT OF CHANGE IN REGULATORY ENVIRONMENT
V.1.1.1. Issue Summary
As previously discussed in Chapter III, Section 2.1.2., there is a high
potential for changes in the regulations that would significantly increase the
number of data fields stored in the database and/or the number of reporting
parties. Should this occur, EPA will need to determine the most appropriate
means of adapting the system to meet the new requirements. This
recommendation will address potential courses of action should these new
regulations become reality.
V.1.1.2. Alternatives
V.1.1.2.1. Minor System Modifications
One option is to adapt the existing system to meet the new requirements.
To accommodate the new reporting requirements, EPA would need to
perform an impact study to evaluate the additional workload and subsequent
impacts on the system. If additional fields are added to the form, the database
and data input applications would need to be modified to handle the changes.
The results of this study would then be utilized to determine the necessity for
any increase in system size or requirement to add hardware, including the
possibility of adding a second LAN to handle the additional workload.
Cost, Timeliness, and Data Quality Impacts: If the new laws or regulations
necessitate the addition of significant amounts of data to TRI, this option
would have an adverse impact on cost and timeliness (in direct correlation to
the additional data load). These adverse impacts would arise due to an
increased amount of data which would have to be entered, verified, and
reconciled by approximately the same number of data entry operators. If
manual keying remained the primary data input method, the time and cost
impacts would be even more extreme. The impact of this option on data
quality would depend primarily on the complexity of the new data fields.
If the new laws or regulations only require the addition of minor amounts
of data, this option would have only minimal negative impacts on time and
cost. These impacts again would be directly related to the amount of extra
keying which would be required, and data quality would basically remain
unaffected unless the new data fields were much more complex.
Implementation Time, Cost, and Risk Impacts: This alternative would
initially have a negative effect on time and cost as both would be required to
perform the impact study as well as to purchase new hardware as determined
to be necessary. In addition, time and money would have to be allocated to
modify application and output software as well as public access products to
accommodate additional data fields. The amount of time and money
required for that task will depend upon the complexity and number of
modifications which are necessary. The risk involved with implementing
this alternative is extremely low as TRI operations and architecture will be
fundamentally the same.
V.1.1.2.2. Reassess the System
Another option is to reassess the architecture for the system. To
accomplish this, EPA would perform life cycle analysis in accordance with
OIRM policy* and assess the costs and benefits of redesigning the system to
accommodate the additional requirements. This assessment would include
evaluating alternative hardware and software approaches to meeting system
requirements.
Cost, Timeliness and Data Quality Impacts: If this particular option is
selected, it provides the ability to optimize all three performance standards in
the long run as the new system would be designed utilizing operational
knowledge gained during the first three reporting years. The system could
also be developed to handle additional data loads in a more efficient manner.
Implementation Time, Cost and Risk: The time and cost required to
implement this recommendation would be significant. If a life-cycle
reassessment of the system is necessary, then a minimum of two years should
be allowed to design and build a new system. The implementation costs will
depend upon the new hardware and software that are determined to be
necessary. Implementation risk is relatively low in this case as the system can
utilize commercial off-the-shelf hardware and software development
environments.
V.1.1.3. Recommendation
The appropriate path for TRI to take depends upon the extent of the
changes mandated by the new requirements. If the new regulations do not
significantly increase the number of data elements to be stored in the
database, resulting in the need for only minimal changes to the data input
and database software, the most effective approach would be to perform a
brief impact study and to add hardware to the LAN as necessary to meet the
additional data input requirements. However, if the number of additional
data fields to be added to the form is sufficient to require extensive
modifications to the data input and database software, then EPA will almost
certainly need to design a new system and should, therefore, begin with a
reassessment of the entire system. If the application software must be
redesigned and coded in any case, then EPA should take advantage of newer
technology and operational experience gained with the present system to
redesign and rebuild the system. Any reassessment should also take into
consideration changes in data input/storage technology (i.e., the use of OCR
or optical storage).
* EPA System Design & Development Guidance, Office of Information Resource Management,
June 1989.
V.1.2. CLARIFICATION OF PRODUCTIVITY STANDARDS
V.1.2.1. Issue Summary
In order for TRI to measure success in terms of its production goals,
productivity standards must be clearly defined for timeliness and data
accuracy levels. Currently, there is no consensus for a realistic database
production schedule between EPA and the public, so TRI is continuously
plagued by complaints that it is not realizing timeliness objectives.
Additionally, the data accuracy objective is too vague and is hindering EPA's
ability to determine which data quality activities are most appropriate based
on explicit, quantifiable productivity increases.
V.1.2.2. Alternatives
V.1.2.2.1. Status Quo
In this alternative, TRI continues to function with a variety of timeliness
expectations, including IMD's nine month NLM release deadline. TRI's data
quality objective remains at "near to 100%" in key fields.
Cost, Timeliness and Data Quality Impacts: Maintaining the status quo
will adversely impact cost and timeliness. Effective management of TRI is
inhibited by the difficulties of operating according to ambiguous productivity
goals. Because the data accuracy goal is to strive for near to 100%, EPA has
dedicated itself to this end, usually at the expense of other activities and, in
the first reporting years, specifically at the expense of timeliness. This is a
pattern which will be sustained as new methods of improving data accuracy
will, most likely, be proposed with each new reporting year. Although this
environment leads to somewhat greater accuracy, the small gains in accuracy
result in significant, negative impacts on both cost and timeliness.
Implementation Time, Cost and Risk Impacts: There are no
implementation impacts associated with the status quo alternative.
V.1.2.2.2. Define Productivity Standards
In this option, more specific, realistic productivity standards would be
defined through a consensus of key TRI parties. The creation of a concrete
target for data accuracy would provide a measure for success. Similar
agreement on timeliness would allow for true assessments of TRI
accomplishments in regard to the pace of database production.
Cost, Timeliness and Data Quality Impacts: This alternative would have
positive overall impacts on productivity. A finite, stated data quality goal
would lessen the pressure to reach perfection (while still maintaining high
data accuracy) so costs associated with data quality would be stabilized. This
goal would also allow TRI to quantitatively measure success in terms of data
accuracy thus strengthening funding requests during the budgeting process.
At the same time, a sharp focus on production schedules would be possible
with fewer interruptions for additional or emergency data accuracy activities,
thus improving timeliness of the database.
Implementation Time, Cost and Risk Impacts: The time and cost required to
implement this alternative would depend upon the amount of time
necessary to reach agreement on reasonable productivity goals. One potential
risk is that the creation of specific goals will place pressure on TRI to meet
these goals. However, as TRI is already under pressure to meet widespread
expectations, establishing specific goals will not significantly add to the risk.
V.1.2.3. Recommendations
Booz, Allen recommends that EPA define productivity goals. Specific
accuracy and time goals would provide TRI with a concrete means of
measuring success and would allow EPA to clearly communicate its priorities
to contractors and to the public. These goals should be incorporated into the
contractors' SOWs.
V.1.3. STRENGTHEN PLANNING ENVIRONMENT
V.1.3.1. Issue Summary
A formal planning process within TRI does exist, but plans are often not
followed due to the responsibility of the public system to respond in a timely
manner to demands from external parties. Although this response capability
is critical for TRI, the time and funds spent responding to ad hoc requests
often impede long-term growth of the system. The planning process is
further complicated by the rigid time frames in which TRI must operate. If
deadlines are not strictly adhered to, improvements in the system (especially,
data entry software and hardware) must be postponed until the next reporting
year. Specific issues related to strengthening the planning process and
addressing these concerns will be presented in this section.
V.1.3.2. Alternatives
V.1.3.2.1. Plan for ad hoc requirements
During the annual planning process, the necessary requirements to
respond to the estimated ad hoc response volume should be approximated.
These requirements would then be incorporated into the budgeting process so
that specific resources could be allocated in support of these activities. The
resources that are utilized to respond to ad hoc requests should be tracked to
determine the amount of funding employed. Armed with this knowledge,
management will then be able to better understand the tradeoffs which are
being made between ad hoc activities and day-to-day operations and make
deliberate decisions as to the most appropriate utilization of resources.
This process could also determine whether the number of
inquiries received warrants dedicating FTEs (at least partially) to
responding to these requests. This would allow the remaining staff members
to concentrate on day-to-day tasks and long-range planning and
improvement of TRI operations.
Cost, Timeliness and Data Quality Impacts: This option would help
monitor and control costs as EPA managers would be more aware of the
resources which are spent on ad hoc requests and could determine the level
of funding which they feel is appropriate. In the long run, the inclusion of
ad hoc activities in the planning process will improve both timeliness and
data quality as certain staff would be dedicated to responding to requests, thus
allowing regular staff to continue with regular operations.
Implementation Time, Cost and Risk Impacts: There are only very
minimal implementation effects associated with this option. Extra time and
resources would be required in the planning process to include ad hoc
activities; however, this should not be significant. There is essentially no risk
associated with this option as the planning process is not being substantially
altered.
V.1.3.2.2. Establish a formal software development process
In the second option, a formal software development planning process
lasting at least a year and a half would be established to help define software
requirements and desired improvements and to ensure that improvements
are implemented in a timely fashion. This planning process would allow
EPA to determine when software modifications can best be made and tested
in order to meet operational deadlines and to anticipate the workload
required to complete these modifications.
Cost, Timeliness and Data Quality Impacts: The establishment of a formal
software development process would provide improvements in all three
areas. If software modifications are planned over a period of a year and a half,
then data entry software should be ready, including complete testing, to start
data entry at the designated time. Planning time for complete testing of the
software should ensure that "bugs" have been identified and corrected so that
problems do not occur and data entry is not delayed. Although it will not be
possible to predict all requirements eighteen months in advance, those which
can be ascertained can be completed first, reserving time for any last minute
development. Timely data entry also allows the data verification and
reconciliation teams to begin their work immediately; therefore, feedback on
errors can be provided to keyers and alterations can be made while data is still
being entered. When delays are eliminated, costs are reduced. Additionally, a
formal planning process will help optimize the utilization of funds which
TRI has to spend on software development.
Implementation Time, Cost and Risk Impacts: This option would have
minimal implementation impacts. The establishment of this formal
planning process may initially require additional staff and contractor time.
However, this would not be extensive as staff members already deal informally
with software development issues. Because this is not a new area,
there is no risk associated with formalizing this process.
V.1.3.2.3. Establish a deliberate hardware replacement process
In this option, EPA would establish a formal hardware replacement
process. In this process, current hardware would be assessed each year from
an overall system reliability and performance standpoint and necessary
replacements/upgrades would be identified. This information would then be
utilized as an input to the overall planning process. This process is presently
occurring within EPA in an informal manner as discussed in Chapter IV,
Section 3.1., Hardware Replacement Planning Efforts. However, under this
option this process would be formalized so that it occurs annually.
Cost, Timeliness and Data Quality Impacts: This option would allow EPA
to plan for hardware replacements/upgrades. Although TRI will still have to
expend funds to replace the hardware, these funds will have been planned for
and budgeted. Additionally, available funds can be optimized through this
planning process. A formal replacement process should improve timeliness
and data quality as the potential for system failures and lost data entry time
will decrease as overall system reliability and performance improves.
Implementation Time, Cost and Risk Impacts: As previously discussed,
EPA is already assessing hardware needs. Therefore, the formalization of this
function and its incorporation into the planning process should have no
significant implementation impacts.
V.1.3.3. Recommendation
Booz, Allen recommends that all three of these options be
adopted. The implementation costs associated with each option are small and
the payback would be high. The benefits of strengthening the
planning process through these options are substantial as they will allow TRI
management to better anticipate and plan for ad hoc requests, software
modifications, and hardware replacements and to optimize the use of
available funding. This improved planning results in fewer delays and
improved data quality, and thus in stronger overall TRI performance.
V.1.4. CHANGE CONTRACTOR ORGANIZATION
V.1.4.1. Issue Summary
As discussed in Chapter III, Section 2.1.5., a fundamental issue that
affects TRI operations is the allocation of functional responsibilities
among the contractors who are working on TRI. Based on EPA's
contracting policies, three contractors currently share the operational
responsibilities for TRI, and this results in a complex structure with
significant functional overlap.
V.1.4.2. Alternatives
V.1.4.2.1. Status Quo
EPA can continue to operate with the current division of responsibility
between software maintenance and TRC facilities management contracts with
the same functional responsibilities as contained in current contracts. This
option would require no evaluation of current operational procedures, but
would also mean that the current overlap in functions would continue into
the future. This option was effective during system start-up when
development and debugging of initial software was a critical activity.
However, at this point, having been utilized for two data input cycles, the
software is more stable and is likely to remain so unless major changes to
program requirements occur.
Cost, Timeliness and Data Quality Impacts: Maintaining the current
contractor tasking would provide no improvement to program cost,
timeliness or data quality as present overlap, coordination and
communications difficulties would persist.
Implementation Time, Cost and Risk Impacts: As little or no change is
contemplated with this option, impact on implementation time, cost and risk
for this option is negligible.
V.1.4.2.2. Separate Contractors for Facility Operations and Data Reconciliation
Another viable alternative would be to have one contractor provide TRI
operations including data input and form verification as well as software
development and maintenance support, and another contractor provide data
reconciliation services. This would align contractors' work with the primary
performance criteria for TRI as the operations contractor would be responsible
for timeliness, and the reconciliation contractor primarily responsible for data
quality. It would be particularly important under this scenario to ensure that
an adequate feedback loop existed between the data quality contractor and the
data input contractor to ensure that the causes of data entry problems detected
in the data reconciliation process are reported back to and corrected by the data entry
contractor.
Should program requirements change so that a major rebuild of the
application software is required, it may be necessary under this option to
utilize a separate contractor to develop the application. In that case,
care must be taken to acquire detailed programming documentation so that
the resulting application is maintainable by the data entry contractor.
Cost, Timeliness and Data Quality Impacts: This option would
considerably improve timeliness and data quality due to the focused approach
that each contractor would have. Cost improvements under this option
would be achieved by focusing contractors on primary productivity goals (data
quality and timeliness). This option would also improve timeliness and data
quality by maintaining checks and balances in the system by having
independent responsibilities for data input and data quality review. If the
functional boundaries were carefully engineered, many of the coordination
and communications issues could be minimized. However, there would still
be a need for EPA coordination and supervision to ensure that both
contractors worked together effectively.
Implementation Time, Cost and Risk Impacts: This option would require
EPA staff or contractor analysis to structure new SOWs in such a way as to
comprehensively cover current TRI operations. This process should be done
carefully so that start up problems are minimized. Since approximately one
year remains before the first contract expires, sufficient time exists to
accomplish this task without disrupting ongoing management activities.
Relative to overall TRI operations, the cost to accomplish this analysis would
be low. If done properly so that all aspects of TRI operations are covered, the
risk of implementing this option is low.
V.1.4.2.3. One Contractor for Facility Operations, Development and Reconciliation
A third option is to have one contractor perform all three functions of
center operations, software development and maintenance and data
reconciliation. This option places total responsibility for all operational
work under one contractor. EPA's role as a coordinator between contractors
would be simplified significantly, as all responsibility for timeliness and
data quality would rest with one contractor.
Cost, Timeliness and Data Quality Impacts: This option places
complete responsibility for TRI operations in the hands of one contractor.
This minimizes coordination and communication problems, and would
allow the contractor to integrate all TRC operational activities in the most
efficient way possible. System development and support activities would be
integrated and would focus on providing comprehensive support to TRC
activities. The feedback loop between users and software developers would be
shortened and made responsible to the same internal management structure.
This will result in more efficient data entry software, and reduce the
complexity of the software development planning process. EPA staff
personnel would be able to focus their energy on planning and directing the
activities of a single contractor. Because overlap potential is eliminated, costs
would be reduced. The key to this option is good contractor performance, and
that requires careful evaluation of bidders against a well-written SOW.
Implementation Time, Cost and Risk Impacts: Implementation time and
cost are essentially the same for this option as for the one previous. This
option places all responsibility for TRI operations in the hands of one
contractor and thus slightly raises implementation risk.
V.1.4.3. Recommendation
Booz, Allen recommends utilizing a single contractor for all TRI
operational functions, as this would provide the most effective contractor
organization for TRI. This option relieves EPA staff of most contract
management and coordination details so that they can devote more time to
other management issues related to TRI. On the operational level,
configuration management issues on the LAN which affect software
development and support would be simplified, and the need to coordinate
activities in order to run and analyze reconciliation reports would be
eliminated.
V.1.5. REVISE CONTRACTOR SOWs
V.1.5.1. Issue Summary
TRI contractors' current SOWs were written during system start-up when
the need for considerable flexibility to respond to problems and broad tasking
were important requirements. Additionally, some initial operational
concepts, that were included in the SOWs but were either not implemented
or were subsequently altered significantly, have never been deleted.
This issue is especially timely for TRI as two of its contracts will be
renewed during the next two years.
V.1.5.2. Alternatives
V.1.5.2.1. Status Quo
One alternative is for TRI to continue to operate with the SOWs as they
were originally written.
Cost, Timeliness and Data Quality Impacts: In the long-run,
improvements in the cost, timeliness, and data quality areas would not be
shown. Contractor operations would still be dictated by SOWs which were
written for the system start-up phase, rather than reflecting the requirements
of the current operational reality. This is not to imply that the current
contractors are not performing well as both contractors have been providing
services in excess of what is required in their current SOWs. However,
during the contract rebidding process, another contractor could be chosen. If
this occurs and the current SOWs are utilized, there would be no mechanism
for ensuring that this contractor meets appropriate timeliness and data quality
goals.
Implementation Time, Cost and Risk Impacts: There are no
implementation impacts associated with the status quo.
V.1.5.2.2. General Revision of SOWs
Under this alternative, the current SOWs would be rewritten to delete
obsolete activities and to more accurately and specifically describe and
delegate current TRI roles and responsibilities. Furthermore, EPA would
incorporate current productivity goals into the SOWs as is presently being
done with data accuracy (see Chapter IV, Section 2.1 for additional
information).
Additionally, the SOWs would be modified to provide separate tasking for
ad hoc requests so that performance on these requests does not adversely
impact routine operations or software development efforts without
management approval by EPA. This applies particularly to the current
software development and maintenance contract.
Cost, Timeliness and Data Quality Impacts: Implementation of this
alternative will have a positive impact on all three criteria. SOWs with
clearly defined productivity goals and management controls to enforce these
goals should allow TRI management to more closely control costs, while
monitoring progress in terms of deadlines and accuracy levels. In addition,
the government's interests are more appropriately protected in case of a
contractor change.
By providing separate tasking for ad hoc requests, resources planned for
designated activities will not be diverted without management consciously
deciding to do so. Therefore, overall timeliness and data quality should
improve if special requests are not allowed to divert extensive resources from
day-to-day operations or long-range operational improvements.
Implementation Time, Cost and Risk Impacts: Implementation of this
option would require a minimal time/financial investment. To rewrite the
SOWs, the lessons learned during TRI operations and the specific
productivity standards would have to be incorporated. Therefore, consensus
would first need to be reached on the productivity standards before
implementation of this alternative. The risk associated with rewriting
contractor SOWs is low as many of the necessary modifications are known to
TRI staff and are actually already being adhered to by present contractor staff.
V.1.5.3. Recommendation
Booz, Allen recommends that TRI pursue the second alternative and
rewrite the SOWs. This option would not involve extensive resources or risk
to implement, but it would allow EPA to ensure that the SOWs more closely
match the operational reality and performance expectations. As TRI will
soon be required to rebid two contracts anyway, this is an ideal time for the
Agency to pursue this option.
V.2. TECHNICAL
Technical recommendations will be provided along with the impacts of
the recommendations.
V.2.1. UPGRADE DATA INPUT TECHNOLOGY
V.2.1.1. Issue Summary
New data input technology has great potential for affecting data accuracy
and timeliness. However, the demand for fast, accurate data input must be
balanced by cost. Different methods of data input offer different levels of
speed and accuracy at different costs. The following recommendations discuss
available data input technologies and the desirability of each balanced against
costs. Additionally, there remain considerable productivity gains to be made
by increasing the effectiveness of the manual keying process, which will
always be necessary for some portion of the submissions.
V.2.1.2. Alternatives
V.2.1.2.1. Keyboard Entry Enhancements
After two years of use and improvements, the current keyboard data entry
software is robust and effective. There is, however, one area where
improvements can be made. One method EPA has devised to improve data
entry accuracy is to place "stops" in the program for critical fields. After
keying data in certain fields, the program "stops" and asks the entry operator
to visually check the entry for accuracy. To verify the data field, the entry
operator must locate the data on the form, memorize the data, then locate the
entry on the screen and compare the data. The entry operator must then
find their place on the form to continue.
Our opinion is that this type of data checking is cumbersome and
counterproductive. The simple motions required to perform this check are very
disruptive and, multiplied over the course of a day, result in reduced data
entry throughput. Additionally, operators are able to bypass the intention of
the check by simply pressing the continue key without performing a visual
check of the data.
A variation of the "stop" check would be to replace the "stops" in the data
entry software with double entry keying. After keying data in certain fields,
the program hides the data and asks the operator to key it in again. After
rekeying the data, the software compares the two entries. If the two entries
match, the program continues. A skilled touch typist can type a number twice
without lifting their eyes from the form much faster than they can type the
number, stop and compare the number on the form with the number on the
screen.
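The double entry check described above can be sketched in a few lines of Python. This is an illustration only, not the TRI data entry software: the function names key_field_twice and scripted are hypothetical, and the handling of a mismatch (re-key both entries, then give up after a fixed number of attempts) is an assumption, since the text only states what happens when the two entries match.

```python
from typing import Callable, Iterable, Optional


def key_field_twice(read_entry: Callable[[], str],
                    max_attempts: int = 3) -> Optional[str]:
    """Return the field value once two consecutive keyings agree.

    read_entry is whatever supplies one keyed entry (keyboard, test script).
    Returns None if no pair of keyings matches within max_attempts
    (assumption: such a field would be referred for manual review).
    """
    for _ in range(max_attempts):
        first = read_entry().strip()
        # In a real entry screen the first keying would be hidden here.
        second = read_entry().strip()
        if first == second:
            return first  # entries match, so data entry continues
    return None


def scripted(entries: Iterable[str]) -> Callable[[], str]:
    """Stand-in for the keyboard: replays a fixed list of keyings."""
    it = iter(entries)
    return lambda: next(it)


# First pair disagrees (transposed digits), second pair agrees.
accepted = key_field_twice(scripted(["12345", "12354", "12345", "12345"]))
print(accepted)
```

Because the comparison is made by the program rather than by eye, the operator cannot skip it by pressing a continue key, which is the property the recommendation relies on.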
Cost, Timeliness and Data Quality Impacts: This option will bring an
additional quality control check to data input which cannot be bypassed by the
data entry operator. Furthermore, it allows data entry to proceed smoothly so
timeliness will benefit. By ensuring better data quality during the entry stage,
the reconciliation staff would expend less funds on corrections, thus
decreasing overall costs for TRI.
Implementation Time, Cost and Risk Impacts: Implementing this change
would require only minor programming changes to a minimal number of
fields in the data entry software, and hence has low implementation impact
in time, cost and risk.
V.2.1.2.2. Magnetic Media
Another option for data entry is magnetic media submissions in which
the required Form R submission is accepted on a floppy disk instead of a
paper form. As stated in Chapter IV, Section 4.1, EPA is considering an
initiative as a follow-up to RY88's magnetic media pilot project, to
implement magnetic media through the development of standard software.
Booz, Allen supports this initiative and other activities which would increase
the number of magnetic media submissions. The pilot project proved that, if
the data input software and the format of the data on the disk could be
standardized, this approach provided rapid, error-free data input. However,
data normalization activities still need to be performed.
Cost, Timeliness and Data Quality Impacts: This option will bring
additional quality control checks to data input. Furthermore, by design, it
enforces EPA standards that have been ignored or overlooked in the past.
With a 100% guarantee that the magnetic media data file is properly
formatted, both timeliness and data entry quality will benefit. Also, by
ensuring better data quality during the entry stage, the reconciliation staff
would expend less funds on corrections, thus decreasing overall costs for TRI.
Implementation Time, Cost and Risk Impacts: The cost and time required
for developing the software and implementing the program, while not
insignificant, are far exceeded by the potential reduction in manual data entry
and improved data entry quality. Hence timeliness, data quality and cost are
all positively affected. If EPA proceeds on schedule, software should be
available for RY90 entry. Because this option has already been explored by a
pilot program, risks are low.
V.2.1.2.3. OCR Scanning
To improve the efficiency with which Form R information is entered into
the TRI database, EPA is conducting a pilot program utilizing an Optical
Character Recognition (OCR) system. This test will allow TRI management to
evaluate the effectiveness of this technology to improve the speed and quality
of data input from typed or handwritten forms.
Full integration of OCR scanning into the data input process will require
significant additions to the data input software. Additionally, EPA may wish
to evaluate the possibility of integrating this option with the utilization of
optical storage for form access and retrieval. Finally, if Form R is redesigned,
the requirements of optical scanning should be considered. See Chapter V,
Sections 3.2. and 3.3.
Cost, Timeliness and Data Quality Impacts: If successful, this option has
the potential of greatly reducing the cost and time required for data input.
Unlike magnetic media, which requires the reporter to possess and use a
computer to provide the required information, this option allows the use of
paper forms and hence would be attractive to more reporting facilities. The
impact on timeliness and data quality is directly related to the number of
unreadable characters on the form.
Implementation Time, Cost and Risk Impacts: Implementing scanning
technology would require a major investment in new hardware. A seamless
interface between the scanning software and the data entry software would
require moderate modifications to the LAN data entry software. The major
risk with this option is with accuracy. It remains to be seen if the current
scanning technology can read the information with sufficient accuracy to be
effective. EPA's pilot program will provide pertinent data to make this
decision.
V.2.1.3. Recommendation
EPA should proceed aggressively to improve the existing manual
keyboard entry software wherever possible and, based upon the results of
the magnetic media pilot test, encourage full implementation of that
program. Following the OCR Scanning Pilot, EPA should assess the viability
of implementing OCR scanning as the primary means of data entry.
Significant reduction of data entry effort will be necessary to offset the cost of
this option. If OCR is a viable option, EPA should strongly encourage
submission of forms either through magnetic media or OCR.
V.2.2. UPGRADE FORM STORAGE AND ACCESS
V.2.2.1. Issue Summary
The current Form R filing, storage and retrieval system is labor and
resource intensive. TRI documents must be securely stored, available for
quick access and easily maintainable. Therefore, it is crucial that EPA
evaluate alternatives to their current filing system.
Data for the following alternatives was found in a study of "Alternatives
for EPA Form R Records Management." The study was prepared in January
1990, for EPA's Office of Pesticides and Toxic Substances by Mathtech, Inc.
V.2.2.2. Alternatives
V.2.2.2.1. Paper Document Storage and Retrieval
Currently, the TRC is storing forms in paper form in file cabinets. TRC
procedures require current year Form R reports be stored in the TRC, with the
prior year's reports stored in a storage room nearby. Data entry, error correction
and quality assurance activities are accomplished using the original paper
forms.
Cost, Timeliness and Data Quality Impacts: As the number of forms
increases, costs will increase as more storage space is required. If forms older
than two years are sent to the Federal Records Center (FRC), then the staff time
and delay involved in retrieving information from forms which must be
retrieved from the FRC will increase significantly. The current data storage
and retrieval strategy will only serve to reduce efficiency, hinder data quality
efforts and drive up costs. At the current rate of submissions, TRC operations
will require over 6,400 square feet of space to store five years of Form R
reports and to accommodate a work area for staff to file, retrieve and process
Form Rs. Continuation of current operations implies an investment of
$875,700 for space to store and support five years' worth of Form R
submissions in one facility.
Implementation Time, Cost and Risk Impacts: If the decision is made to
store more forms on-site, the cost of storage space and staff effort to retrieve
forms will increase significantly. The fundamental risk is that EPA will be
unable to access forms in a timely manner if storage requirements become
unmanageable.
V.2.2.2.2. Microfiche
Microfiche is a relatively mature and inexpensive technology that could
significantly reduce storage costs for paper forms. This technology has the
advantage over paper storage of being compact and dependable. Microfiche
can accommodate roughly 600 times the number of pages stored in hard copy
form, a savings in physical storage space of 99.8 percent.
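The 99.8 percent figure follows directly from the 600-to-1 compaction ratio. The short Python check below is a sketch of that arithmetic only; the function name space_saving is illustrative and does not come from the Mathtech study.

```python
def space_saving(compaction_ratio: float) -> float:
    """Fraction of storage space saved when the same pages take up
    1/compaction_ratio of the original space."""
    return 1.0 - 1.0 / compaction_ratio


saving = space_saving(600)          # 600 times as many pages per unit of space
print(f"{saving:.1%}")
```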
Cost, Timeliness and Data Quality Impacts: Form R access would be much
faster compared to the paper system, improving both ad hoc requests and daily
quality control efforts. Overall costs for TRI operations would drop as storage
requirements for Form Rs are reduced. However, this technology introduces
additional labor costs by requiring documents to be processed onto film,
creating a lag time between the time a group of documents is sent out for
processing and when they are returned and are available to the user.
Moreover, microfiche does not allow the ability to randomly access a
document, or group of documents. Since several forms are stored on one
fiche, this technology lacks the ability to simultaneously share a collection of
documents.
This system has the added disadvantage of relying on the integrity of the
users to maintain the fiche files to prevent loss and misplacement. The loss
of a single fiche will result in the loss of approximately 16 complete forms.
Implementation Time, Cost and Risk Impacts: The five year life cycle cost
for a microfiche system to support TRI operations is approximately $133,888.
This would produce a significant savings for TRI operations over the long
run. Several microfiche systems are available on GSA schedule allowing EPA
to immediately procure and install a microfiche system. Because microfiche
is a well established technology, the risk of using it in TRI operations is very
low.
V.2.2.2.3. Microfilm
Microfilm technology is very similar to microfiche technology except that
the medium is a continuous strip of film. One 215-foot reel of microfilm has the
capacity to contain approximately 5,500 letter sized pages, or about 1,100
Form R reports.
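The reel capacity figures imply about five pages per Form R report. As a rough sketch of what this means for annual volume, the Python below estimates the number of reels per reporting year from the submission counts given in Appendix II (79,784 for RY87 and 82,123 for RY88). It assumes every report takes the average number of pages and that reels are filled completely; the function name reels_needed is illustrative.

```python
PAGES_PER_REEL = 5_500   # from the text: one 215-foot reel
FORMS_PER_REEL = 1_100   # from the text: about 1,100 Form R reports


def reels_needed(form_count: int, forms_per_reel: int = FORMS_PER_REEL) -> int:
    """Whole reels needed to hold form_count reports (ceiling division)."""
    return -(-form_count // forms_per_reel)


pages_per_form = PAGES_PER_REEL / FORMS_PER_REEL
print(f"pages per Form R: {pages_per_form:g}")

# Submission counts as reported in Appendix II of this report.
for year, count in {"RY87": 79_784, "RY88": 82_123}.items():
    print(f"{year}: {count:,} reports -> {reels_needed(count)} reels")
```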
Cost, Timeliness and Data Quality Impacts: For the most part, microfilm
technology has the same productivity benefits and problems as microfiche,
but has the additional benefit of being able to locate a document faster than
microfiche. Microfilm does require a significant amount of space for storage.
Implementation Time, Cost and Risk Impacts: Costs for implementing a
microfilm filing and retrieval system are roughly equal to that of a microfiche
system. The five year life cycle cost for a microfilm system to support TRI
operations is approximately $129,973. This would produce a significant
savings for TRI operations over the long run. Microfilm systems are on the
GSA schedule allowing EPA to immediately procure and install a microfilm
system. Because of its reputation, the risk of using microfilm in TRI
operations is very low.
V.2.2.2.4. Electronic Image Capture and Optical Storage
This newer document storage and retrieval technology uses a scanner,
similar to a photocopy machine, that captures an electronic image of the
Form R. The image of the document is stored on a computer disk. Optical
images stored on disks can be accessed using various search methods and can
be simultaneously viewed on multiple workstations.
Disk storage requirements for optical images are high so electronic image
capturing systems are typically paired with high volume storage devices such
as optical disk drives and Write Once, Read Many (WORM) storage devices.
Cost, Timeliness and Data Quality Impacts: Current electronic imaging
systems can locate and display a document in about 10 seconds. Compared to
TRC's present goal of locating a Form R within 10 days, this option offers vast
improvements in timeliness. Furthermore, since documents are available
for multiple viewing as soon as they are scanned, data entry operators,
validation staff, EPA management, and the public can all simultaneously
view the same Form R with no delay. Data quality staff would be free to
perform checks on any document at any time, thus increasing the quality
of the database.
Overall costs for TRI operations would quickly drop as storage
requirements for Form R management are reduced. Five years of Form Rs
could be stored in as little as 100 square feet. Further savings will be realized by
the reduction in effort required to retrieve and file Form R documents.
Implementation Time, Cost and Risk Impacts: The costs for
implementing a document imaging and storage system, although not
insignificant, are small compared to the savings realized in storage and
management costs. The five year life cycle cost for an imaging system to
support TRI operations is approximately $106,135. Several imaging systems
are on GSA schedule allowing EPA to immediately procure and install an
imaging system. Although this technology is comparatively new, from an
industry standpoint the technology is solid, so implementation risk is
relatively low.
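The five year figures quoted for the four storage alternatives can be set side by side with a short Python sketch. The dollar amounts are those given in this section; the comparison itself is only a rough illustration, since the paper figure is described as an investment in space and support while the other three are described as life cycle costs, and the names used in the code are illustrative.

```python
# Five year figures quoted in this section, in dollars.
FIVE_YEAR_COST = {
    "paper": 875_700,
    "microfiche": 133_888,
    "microfilm": 129_973,
    "electronic imaging": 106_135,
}

baseline = FIVE_YEAR_COST["paper"]

# Cheapest alternative first.
ranked = sorted(FIVE_YEAR_COST.items(), key=lambda item: item[1])

for name, cost in ranked:
    saving = baseline - cost
    print(f"{name:<20} ${cost:>9,}  saving vs. paper ${saving:>9,} "
          f"({saving / baseline:.0%})")
```

On these figures the three alternatives to paper fall within about $28,000 of one another, so the choice among them turns mainly on retrieval capability rather than on cost.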
V.2.2.3. Recommendation
We strongly concur with Mathtech's recommendation that EPA use a
WORM based electronic image management system to support TRI's
document storage efforts. Maintaining a paper document filing system for
the more than 80,000 reports submitted annually will become increasingly
difficult over time. Such practice will only serve to perpetuate operating
inefficiencies and will become the major obstruction in TRI productivity.
Although microfiche and microfilm provide reduced storage costs, these
technologies do not provide the retrieval capability provided by electronic
imaging systems.
Prior to implementing this technology, TRI will need to consult the Office
of Information Resources Management's Image Processing Systems Policy
which is currently under development. Additionally, Electronic Records
Management Final Rules, which were published in the Federal Register on
May 8, 1990, contain guidance on the legal requirements for official records
maintained in electronic form.
V.2.3. REDESIGN FORM R
V.2.3.1. Issue Summary
As discussed in Chapter III, Section 2.2.4, there are concerns regarding the
inefficient design of the Form R. This recommendation addresses these
concerns through discussion of modification of the form.
V.2.3.2. Alternatives
V.2.3.2.1. Status Quo
Under this option, EPA would continue to use the existing form
making only modifications that are necessary from year to year to comply
with regulatory changes.
Cost, Timeliness and Data Quality Impacts: With this option, there will be
no additional cost or time implications. In addition, data quality should
remain at its present level. However, a serious potential for data quality
problems with both the submitter and the TRC remains due to the lack of
submission identification information on form pages 2 through 5.
Additionally, the current form, which is optimized for reducing the number
of fields which must be filled out on the form, does so at the expense of
simplicity, making the form complicated and difficult to understand and fill
out properly. This results in a large number of data entry errors on the part of
the reporter. In the end, both EPA and the reporting party spend more time
on the form as NOTEs and NONs are necessary to get the correct
information. For reporter errors that are not detected, the result is poorer
data quality in the database.
Consequently, with this option, there is also no chance for improvements
in any of these areas. As the form is considered to be a major element leading
to TRI's present time and data quality problems, this option does not
capitalize on an opportunity to improve performance.
Implementation Time, Cost and Risk Impacts: As nothing would be
changed as a result of this option, there are no implementation impacts
associated with it.
V.2.3.2.2. Redesign Form R
Under this option, Form R and its instructions would be systematically
redesigned to make it easier for the reporting party to complete while keeping
data entry time as low as possible. The form should also be designed to
facilitate its use for OCR scanning should that option be selected. More
specifically, data elements would be more logically organized, and
instructions would be clarified.
Cost, Timeliness and Data Quality Impacts: This option will improve
timeliness and data quality in the long run. Although a new look may
initially distract previous years' reporters, a more logically organized form
would eventually reduce the time they spend completing the form. First
time reporters will also benefit from a more effective design. After reporters
achieve familiarity with the new form, the number of errors the reporter
commits should decrease. If the form is redesigned to accommodate OCR
scanning, both timeliness and data quality should improve as the data would
be entered quickly and exactly as the reporters submit it.
Implementation Time, Cost and Risk Impacts: There are implementation
costs associated with this option as the form would have to initially be
redesigned. However, this cost should be fairly minimal and since OMB
approval must be obtained again in any case, this is an opportune time to
investigate this option. The redesign of the form would also require a time
investment at the front end for design and review. The risk with this option
is that reporters who are familiar with the old form may initially have
difficulties with a new form. However, this risk should be minimal.
V.2.3.3. Recommendation
Booz, Allen strongly recommends EPA redesign Form R. Although this
option would require EPA to initially expend time and funding, data input
time should decrease in the long run with a new form and data quality
should improve substantially, especially with the use of OCR technology.
As already stated, the current OMB approval for the form expires in
January of 1991. EPA has made minor modifications to the form in the waste
minimization and range reporting sections for this year and is presently
beginning the approval process with OMB. As this effort is already ongoing
and as a complete form redesign before January 1991 could be tremendously
disruptive to the system, EPA should begin redesigning the form for use in
the next reporting year. The Form R Redesign effort should include input
from representative reporting parties, IMD, TRC, and system design staff.
APPENDIX I. GLOSSARY OF TERMS
ADABAS Database management system for an IBM Mainframe
computer. Used at RTF to manage TRI data.
CAS Number Chemical Abstracts Service Number. Unique number
that helps identify precise chemical nomenclature.
CBSI, Inc Computer Based Systems Inc. Facility contractor for TRI
operations. Processes forms, enters data, performs
quality assurance and responds to requests for
information.
CMA Chemical Manufacturers Association.
Configuration Management Maintaining management control over hardware and
software elements.
Data Reconciliation Process by which TRI data is validated against original
Form Rs.
ETD Economics and Technology Division. Responsible for
overall program guidance, regulation development, and
regulatory interpretation.
Form R Form on which facilities required to report submit toxic
chemical release information to EPA annually.
IMD Information Management Division. Responsible for
TRI data management implementation and operations.
LAN Local Area Network. System of linking personal
computers so they can share data and printers.
MB Megabyte. One million bytes. Measurement of storage
capacity on a computer's hard disk.
MS-DOS Microsoft's Disk Operating System. Operating system
used on most IBM PCs and compatible microcomputers.
Natural Computer language for entering and extracting data
from a database.
NON Notice of Noncompliance. Notification to the reporting
party that it has failed to provide or meet a critical
reporting requirement such as the reporting year or
signature. Impedes data input to the mainframe
database.
NOTE Notice of Technical Error. Logical error in the database
which is a less serious reporting difficulty than a NON.
Data can still be inputted to the mainframe database.
Novell Brand name for the LAN hardware and software in use
at the TRC.
OCR Optical Character Recognition.
PEI Contractor who provides chemical expertise to data
entry and reconciliation staff.
PRC Planning Research Corporation. EPA's software
development contractor.
RAM Random Access Memory. Working or core memory in a
computer.
RTF Research Triangle Park. EPA's IBM Mainframe
computer site in North Carolina.
RY Reporting Year. Terminology for the calendar year of
Form Rs being processed. In the summer of 1990, RY89
Form Rs will be processed.
Statement of Work (SOW) Document describing the exact tasks a contractor is
required to perform for a contract.
SYCOM, Inc A subcontractor to Planning Research Corporation
(PRC). Contractor responsible for TRI data entry
software development and maintenance.
Title III The Emergency Planning and Community Right-to-
Know Act of 1986, a free-standing section of the
SUPERFUND Amendments and Reauthorization Act of
1986, which mandates that EPA collect and maintain TRI
information.
TOXNET National Library of Medicine's Toxicology Network.
TRI data is available to the public through this network.
TRC Title III Reporting Center. Facility which receives TRI
forms and processes them into the TRI database.
TRI Toxic Release Inventory
TRI ID Toxic Release Inventory Facility Identification Number.
A unique number assigned to each reporting facility for
identification purposes.
APPENDIX II. MANAGEMENT INFORMATION
This appendix provides detail on TRI's management structure,
including both EPA's and its contractors' organization and responsibilities.
EPA ORGANIZATION AND RESPONSIBILITIES
The responsibility for operation of TRI was delegated to the Office of Toxic
Substances within EPA. This office assigned two divisions - the Information
Management Division (IMD) and the Economics and Technology Division
(ETD) - actual responsibility for day-to-day planning and operation of TRI,
although other divisions provide some support to TRI. Specifically, IMD is
charged with managing and implementing TRI data and operations. Staff
within IMD's Public Data Branch fulfill these responsibilities although they
do not work exclusively on TRI. ETD provides overall program guidance,
regulation development, and regulatory interpretation. Within ETD, staff in
three branches - the Chemical Engineering Branch, the Regulatory Impacts
Branch, and the Toxic Release Inventory Management Staff - have TRI
responsibilities. As with IMD staff, ETD personnel, with the exception of
TRIM staff, also have non-TRI responsibilities.
EPA utilizes this multidivisional approach as it allows TRI to draw upon a
diverse mix of personnel with expertise in information management,
regulatory and policy issues, chemistry, and organizational management. In
addition, TRI is able to strategically tap EPA resources, both personnel and
financial, due to this structure.
CONTRACTOR ORGANIZATION AND RESPONSIBILITIES
The other primary component of the TRI organizational structure is
contractor operations. Exhibit 3 illustrates the relationships of the various
contractors to EPA staff. EPA chose one contractor to operate the Title III
Reporting Center. This contractor, who reports to IMD, performs several
major activities, including receipt and storage of Form Rs (79,784 submissions
were processed in RY87 and 82,123 in RY88), data entry, data searching and
retrieval, and data quality control and reconciliation. The organizational
structure of this contractor parallels these tasks which are also defined in the
statement of work. All TRC contractor personnel are dedicated solely to TRI
operations. To fulfill its obligations, this contractor employs additional
personnel during peak periods. This contractor operates under an award fee
type contract which expires in September of 1992.
Exhibit 3 (organization chart; figure not reproduced)
Office of Toxic Substances (OTS)
    Economics & Technology Division (ETD)
        Chemical Engineering Branch
        Regulatory Impacts Branch
        Toxics Release Inventory Management Staff (TRIMS)
    Information Management Division (IMD)
        Public Data Branch
Sections shown: Non-Confidential Systems Section; Non-Confidential
Information Services Section; Public Information Section
Contractor organizations shown: PEI; SYCOM, Inc.; Computer Based Systems
Inc. (CBSI)
*Shaded boxes denote contractor organizations.
approximately twelve regular staff who are primarily dedicated to TRI. These
staff members are divided into two groups: a Local Area Network (LAN) group
and a mainframe group. In addition to developing the initial application software,
this contractor provides ongoing support in software development and
maintenance, performs data upload functions and special programming for
system enhancements, responds to ad hoc requests, and generates data
reconciliation and other reports from the mainframe. This contractor is a
subcontractor to another firm operating under a cost-plus-fixed-fee
contract that expires in July 1991.
A third contractor supports data quality efforts by providing expert
advice on chemical nomenclature to data entry operators as Form R data is
keyed into the LAN system. These contractor personnel also assist with the
review of Notices of Technical Error (NOTEs) and certain Notices of
Noncompliance (NONs). NOTEs are issued when logical errors in the data
are identified, and NONs serve as notification to the reporting party that it
failed to meet certain critical reporting requirements, such as chemical name,
reporting year, or a signature. This contractor reports to ETD personnel.
-------
APPENDIX III. TECHNICAL INFORMATION
HARDWARE
TRI operations are performed on two separate hardware platforms. Data
entry and document tracking are performed on a LAN at the TRC. TRI data is
collected at the TRC, typically in batches of 5,000 records, and uploaded into
the TRI database residing on the RTP mainframe. CBSI and EPA staff maintain
and reconcile TRI data on the mainframe through terminals located at the TRC.
LAN SYSTEM
The data entry LAN is a group of forty-seven AST 286 personal computers
clustered together on a Novell Token Ring LAN. Each PC has a Proteon
communications card and cable to connect it to the LAN. Three Compaq 386
personal computers, with 9MB of RAM each, act as network servers.
One server is for administrative computing, and the other two are for TRI
production. Of the two production servers, one is for actual production and
the other is a back-up unit in case the primary server fails. Each server is
connected to two 300MB hard disks to store data. Back-ups of the data are
done with two RAP-300 Emerald 150MB cartridge tape drives. The LAN also
supports four laser and seven dot matrix printers for printing NONs and
reports. Uninterruptible power supplies (UPS) provide electric power, free of
spikes and surges, to the PCs and the network servers and, in case of an
electrical blackout, approximately thirty minutes of battery power to enable a
controlled shutdown of TRI data entry operations.
Access to the mainframe computer is provided by gateway hardware and
software running on two of the AST 286 PCs. One is equipped with a
synchronous data link communications card and emulates a sixteen-port IBM
controller. With only one session per emulated port, sixteen sessions are
available on the mainframe through this gateway. The other gateway server
is equipped with a multiplexor board that emulates an IBM 3299 multiplexor.
The board has eight ports, and RTP has allocated four mainframe sessions per
port, for a total of thirty-two mainframe sessions available through the second
gateway. In all, forty-eight mainframe sessions are available through the
LAN, which, under current staffing, is more than enough for TRI operations.
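The session count above is simple arithmetic, and the following minimal Python sketch makes it explicit. The class, function, and gateway names are hypothetical and chosen only for illustration; the port and session figures are the ones stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gateway:
    """One LAN-to-mainframe gateway PC (names here are illustrative only)."""
    name: str
    ports: int
    sessions_per_port: int

    @property
    def sessions(self) -> int:
        # Sessions offered by this gateway = ports x sessions allocated per port.
        return self.ports * self.sessions_per_port


def total_sessions(gateways):
    """Sum the mainframe sessions available across all gateways."""
    return sum(g.sessions for g in gateways)


# Figures taken from the text: a sixteen-port controller emulation with one
# session per port, and an eight-port multiplexor board with four per port.
GATEWAYS = [
    Gateway("controller emulation", ports=16, sessions_per_port=1),
    Gateway("3299 multiplexor emulation", ports=8, sessions_per_port=4),
]

if __name__ == "__main__":
    for g in GATEWAYS:
        print(f"{g.name}: {g.sessions} sessions")
    print(f"total: {total_sessions(GATEWAYS)} sessions")
```

Changing the per-port allocation on the second gateway is the only adjustment needed to see how a different allocation would change the total.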
DETAILED HARDWARE ASSESSMENT
The following section provides a detailed description and assessment
of the LAN hardware environment.
LAN
Data entry PCs create an almost continuous load of data bursts on the
LAN. The Novell LAN with its ten megabits per second transfer speed and
its token ring access mechanism is well suited for data entry operations.
Storage Capacity
The main TTS and TRIS data files for RY87 and RY88 average around
55MB each and fit tightly on a single 300MB drive. The production server is
equipped with two 300MB drives. The operating system, however, allows
access to only 250MB of a 300MB drive. Furthermore, due to disk mirroring,
the effective storage is 250MB, not 500MB. In addition to data files, the hard
disks also store temporary files for other processes such as printing.
Occasionally, when the disk has neared capacity, printing jobs have aborted
due to insufficient disk space.
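The capacity figures above can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and the count of four main data files (TTS and TRIS for each of RY87 and RY88) is an assumption read from the sentence above, not a figure the report states.

```python
def effective_capacity_mb(drive_count, usable_mb_per_drive, mirrored):
    """Return the storage available for data, in MB.

    With mirroring, drives are paired and each pair holds one copy of the
    data, so only half of the usable space counts as effective storage.
    """
    if mirrored and drive_count % 2 != 0:
        raise ValueError("mirroring pairs drives, so the count must be even")
    raw = drive_count * usable_mb_per_drive
    return raw // 2 if mirrored else raw


# From the text: two 300MB drives, of which the operating system can use 250MB.
USABLE_MB = 250
effective = effective_capacity_mb(2, USABLE_MB, mirrored=True)

# ASSUMPTION: four main data files averaging about 55MB each.
ASSUMED_FILES = 4
AVG_FILE_MB = 55
headroom = effective - ASSUMED_FILES * AVG_FILE_MB

if __name__ == "__main__":
    print(f"effective storage: {effective}MB")
    print(f"left for temporary files: {headroom}MB")
```

Under that assumption only a few tens of megabytes remain for temporary files such as print spools, which is consistent with the aborted print jobs described above.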
The initial LAN architecture was designed with enough disk space to have
a complete duplicate set of the tracking data and the TRIS data prior to
upload. When FY89 data start entering the system, TRC management will
have to transfer FY87 tracking data to the back-up server, which has an
additional 250MB of storage available. This temporarily solves the storage
problem at the sacrifice of having a complete set of data on the back-up
server. TRC management is keenly aware of the status of each drive on the
LAN and has made recommendations for purchasing optical disk drives to
alleviate the storage problem.
Security
Several layers of security are implemented on all three LAN servers,
preventing unauthorized access to both the file servers and the data files
themselves. In addition to the LAN security, mainframe users must pass
through security checks when signing on to the mainframe. TRC
management has taken the extra step of using software that
encrypts the TRI and TTS data on the LAN hard disks. Data is automatically
decrypted for authorized users. TRI data is well protected against
unauthorized viewing, modification, or deletion by users or mischievous
hackers.
Uninterruptible Power Supplies
TRC has provided UPSs to the file servers and the data entry PCs to protect
the critical data entry operations from accidental loss of power. A UPS
provides two key protective measures. First, the UPS filters out dangerous
spikes and variances in electrical power that can damage delicate computer
equipment. Second, and more importantly, when building power fails, the
UPS will, without interruption, provide electrical power to the entire LAN
for approximately thirty minutes to let each user finish work in progress and
allow TRC management to gracefully shut down data entry operations.
Overall, TRI data entry operations are well protected against unexpected power
outages or fluctuations.
Disk Mirroring
TTS and TRI data are both stored on one of the 300MB drives on the
primary production server. The other 300MB drive is configured to be a mirror
image of the first. Data is written first to one drive and then, as a
precautionary measure, is automatically written to the second drive. This
feature, known as disk mirroring, provides two fundamental benefits to TRI
production. First, since the TRI data is written to two different drives, a
breakdown of one drive does not cripple the system. Production staff can
continue writing data to the working drive. Second, since both disks hold the
same data, queries by reconciliation staff, for example, can be answered by
either disk. Therefore, users spend less time "waiting in line" for responses
from the LAN. Although disk mirroring uses twice as much disk space to
store data as a single drive, the practice is well regarded in the data
processing community as a practical and prudent method of
safeguarding valuable data.
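The two benefits described above can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not a description of how the Novell server implements mirroring: the class and method names are hypothetical, and each "drive" is simply a Python dictionary.

```python
class MirroredStore:
    """Toy model of a mirrored pair of drives (illustrative only)."""

    def __init__(self):
        self.drives = [{}, {}]       # two drives holding identical data
        self.online = [True, True]   # health flag for each drive
        self._next = 0               # which drive serves the next read

    def fail(self, index):
        """Simulate a breakdown of one drive."""
        self.online[index] = False

    def write(self, key, value):
        """Write to every working drive so the copies stay identical."""
        targets = [d for d, ok in zip(self.drives, self.online) if ok]
        if not targets:
            raise OSError("no working drive")
        for drive in targets:
            drive[key] = value

    def read(self, key):
        """Answer from either drive, alternating between working drives."""
        for offset in range(len(self.drives)):
            i = (self._next + offset) % len(self.drives)
            if self.online[i]:
                self._next = (i + 1) % len(self.drives)
                return self.drives[i][key]
        raise OSError("no working drive")


if __name__ == "__main__":
    store = MirroredStore()
    store.write("batch-1", "Form R records")
    store.fail(0)                       # first drive breaks down
    print(store.read("batch-1"))        # still served by the second drive
    store.write("batch-2", "more records")
    print(store.read("batch-2"))
```

After `fail(0)`, reads and writes continue against the remaining drive, which is the first benefit; while both drives are healthy, successive reads alternate between them, which is the second.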
Back-Up Hardware
CBSI uses the Emerald tape drive to run scheduled, unattended back-ups of
the data entry LAN disk drive each night. The process includes both a complete
disk back-up and a verification of the tape. The back-up software also
provides an audit trail that CBSI management reviews each morning for
back-up errors before starting production. The Emerald tape drive can
write data to the tape, rewind the tape, and then verify that the data
on the tape is readable. However, it does not perform a true verify, in which
the data on the tape is compared to the data on the disk. Problems initially
associated with the back-up hardware were subsequently diagnosed as either
software errors or operator errors. As long as proper procedures are followed,
TRI managers can have a high level of confidence in their back-up tapes.
However, as with any hardware, routine testing is prudent and will identify
problems well before they cripple a system.
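The distinction between a readability check and a true verify can be shown with a short Python sketch. It is illustrative only and does not represent the Emerald back-up software: the function names are hypothetical, in-memory byte streams stand in for the disk and the tape, and the comparison uses SHA-256 digests rather than a direct byte-by-byte comparison.

```python
import hashlib
import io

CHUNK = 64 * 1024  # read in 64KB pieces so large streams are not loaded whole


def is_readable(stream):
    """Readability check: True if the stream can be read to the end."""
    try:
        while stream.read(CHUNK):
            pass
        return True
    except OSError:
        return False


def digest(stream):
    """Return the SHA-256 digest of everything in the stream."""
    h = hashlib.sha256()
    for chunk in iter(lambda: stream.read(CHUNK), b""):
        h.update(chunk)
    return h.hexdigest()


def true_verify(disk_stream, tape_stream):
    """True verify: the back-up must match the source, not merely be readable."""
    return digest(disk_stream) == digest(tape_stream)


if __name__ == "__main__":
    disk = b"TRI record " * 1000
    bad_tape = disk[:-1] + b"X"   # readable, but one byte differs from the disk

    print(is_readable(io.BytesIO(bad_tape)))                   # True
    print(true_verify(io.BytesIO(disk), io.BytesIO(bad_tape)))  # False
    print(true_verify(io.BytesIO(disk), io.BytesIO(disk)))      # True
```

The altered copy passes the readability check but fails the comparison, which is the kind of error a read-back check alone would not catch.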
Access Time
During peak processing months (July to September), response from the
LAN's data entry server to the PCs is generally good. Data entry personnel do
not experience overwhelming or unbearable delays, and documents are
entered as fast as they can be typed. On the other hand, access to the
mainframe can be slow, especially during normal working hours. During
midday hours (10:00 A.M. to 3:00 P.M. E.S.T.), as much as thirty to forty-five
seconds can elapse between a keypress and an acknowledgement from the
mainframe. (Normal access time averages ten to fifteen seconds, with
excellent access time being under five seconds.) This slow access time can be
attributed to the numerous demands on the mainframe by thousands of EPA
employees nationwide.
MAINFRAME SYSTEM
The mainframe that stores the TRI data is an IBM 3090 mainframe
computer located at EPA's Computer Center at RTP. The RTP Computer
Center provides total support of mainframe activities, freeing TRI managers
from mainframe management concerns. TRI shares the cost of supporting
RTP operations with other EPA programs that use the mainframe. TRI staff
can access the mainframe database through PCs on their LAN. Transparent to
the users, the LAN accesses the mainframe through a standard gateway
located on two of the network PCs using Wide Area Network (WAN)
protocols.
SOFTWARE
Software on the LAN can be divided into two groups: commercial
software for both LAN management and general use (word processing,
spreadsheets, and graphics) and EPA-developed software for TRI-specific
needs.
LAN SOFTWARE
MS-DOS (version 3.1) is the operating system used by each of the LAN's
PCs, with Novell NetWare handling the LAN operations. SYCOM wrote and
compiled both the TTS and TRI data entry software in Clipper, as well as the
software that reads and enters magnetic media submissions of Form Rs.
Ad hoc reports and queries to the TTS and TRI databases are made with
dBase III+, Clipper, and other off-the-shelf database development products.
Other commercial software includes standard word processing, spreadsheet,
and graphics programs to support report writing, training programs, and
other administrative needs.
MAINFRAME SOFTWARE
The IBM mainframe runs the MVS/XA operating system with JES2 and
TSO/ISPF for telecommunications with the LAN PCs. TRI data is stored
using ADABAS, EPA's standard database management system, with the Natural
language and Natural Security as the programming and security interfaces.
DETAILED SOFTWARE DESCRIPTION AND ASSESSMENT
The study of the system's software focused on the LAN software and
interfaces. The LAN software was analyzed from several aspects including
how it is written and how it performs.
LAN Software
During the start-up phase of TRI operations, simple oversights by LAN
system managers in configuring software and hardware caused problems
ranging from unreadable back-up tapes to LAN crashes. One by one, these
problems were tracked down and corrected. Currently, the software
controlling LAN operations is generally stable.
Data Transfer Software
Data transfers from the LAN to the mainframe were initially attempted
with the communication software Natural Connection. Unexplained errors
and frequent disconnections between the LAN and the mainframe
interrupted numerous transfers. A night shift computer operator was
hired specifically to oversee the upload operation. Because they proved
unreliable, the direct transfer procedures were dropped in November 1988 and
replaced with a safer, but more manually intensive, method of transferring TRI
data to the mainframe. Presently, reel-to-reel tapes are made of the data,
uploaded at the TRC, and sent to the WIC. The tapes are loaded onto a computer
at the WIC and transmitted via a dedicated cable to the RTP mainframe.
PROCESS FLOWS
The diagrams that follow illustrate, at a high level, the work flow involved
in producing the TRI database. These process flows start
with receipt of the Form Rs at the Title III Reporting Center and continue
through entry of TTS and TRIS data into the system and reconciliation.
Finally, the process ends with transfer of the data to TOXNET.
[Process flow diagrams. Steps recoverable from the original figures, in order;
connectors and decision branches are not recoverable:]
RECEIVE, DATE, AND SORT MAIL (trade secret petitions, additional information,
modified submissions, new submissions, NON responses)
BATCH
VALIDATE 100% OF TTS DATA
ENTER TRI DATA INTO LAN DATABASE
VALIDATE 25% OF EACH BATCH
UPLOAD 1 OR 2 TIMES PER MONTH
PRODUCE TRI RECONCILIATION REPORTS
EPA REVIEWS 9 REPORTS
CBSI REVIEWS ALL REPORTS
PEI REVIEWS CHEMICAL REPORTS
CMA REVIEWS ALL CMA FACILITY DATA REPORTS
DOWNLOAD TRACKING DATA OCCASIONALLY
STORE 2 SUBMISSION YEARS OF DOCUMENTS ON-SITE
TRANSFER DATA TO TOXNET
ARCHIVE DOCUMENTS OLDER THAN 2 SUBMISSION YEARS OFF-SITE
DATA FLOW DIAGRAMS
The following data flow diagrams illustrate a detailed account of the data
flow in TRI operations. The first diagram is at the context level; subsequent
diagrams are decompositions.
[Data flow diagram, context level. Central process: Contractor Operations.
External entities: Reporting Party, Congress, Public, EPA (OTS) Data Review and
Reconciliation, TOXNET. Labeled flows recoverable from the original figure
(directions not recoverable): Annual Reports, Trade Secret Petitions,
Additional Information, Notice of Noncompliance, Notice of Technical Error,
Modification Notices, Misc. Mail, FOIA Requests, FOIA Responses, Transferred
TRI Data, Data Clarification Requests, Data Clarification Responses, Approved
NON, Trade Secret Decisions, Data Correction Requests, Recommended FOIA
Responses, Congressional Requests, Congressional Responses, TRI Reports.]
-------
CONTRACTOR OPERATIONS
[Data flow diagram. Processes, numbered as in the decomposition diagrams that
follow: 1.1 Prepare Documents Received, 1.2 LAN TRI Database Processes,
1.3 Mainframe TRI Database Processes, 1.4 Store Documents. External entities:
Reporting Party, Public, TOXNET, EPA (OTS) Data Review and Reconciliation.
Labeled flows recoverable from the original figure (directions not
recoverable): Trade Secret Petitions, FOIA Requests, Misc. Mail, Additional
Information, Notice of Technical Error, Modification Notices, Transferred TRI
Data, FOIA Responses, TRI Reports, Congressional Responses, Data Correction
Responses, Batches of Additional Information, Batches of Modified Submissions,
Batches of New Submissions, Chem Data For Deletion, Chem Data Deletion
Confirmation, Downloaded Tracking Data, New Submissions, Documents, Data
Clarification Requests, Data Clarification Responses, Approved NON, Trade
Secret Decisions, Recommended FOIA Responses, Congressional Requests, Data
Correction Requests.]
-------
PREPARE DOCUMENTS RECEIVED
[Data flow diagram. Processes: 1.1.1 Receive, Date, and Sort Mail; 1.1.2 Batch;
1.1.3 Batch Submissions To Be Re-keyed. Connected entities and processes:
Reporting Party, Public, EPA (OTS) Data Review and Reconciliation, LAN TRI
Database Processes, Mainframe TRI Database Processes. Labeled flows recoverable
from the original figure (directions not recoverable): Annual Reports,
Additional Information, Trade Secret Petitions, Misc. Mail, FOIA Requests, New
Submissions, Modified Submissions, Batches of Additional Information, Batches
of Modified Submissions, Batches of New Submissions, Chem Data For Deletion,
Chem Data Deletion Confirmation.]
-------
LAN TRI DATABASE PROCESSES
[Data flow diagram. Processes recoverable from the original figure: TRI Data
Entry and 25% Review; LAN Searching and Reporting; 1.2.3 Delete Complete
Record. Data store: 1.2/D1 (LAN). Connected entities and processes: Reporting
Party, EPA (OTS) Data Review and Reconciliation, Prepare Documents Received,
Store Documents, Mainframe TRI Database Processes. Labeled flows (directions
not recoverable): Batches of New Submissions, New Submissions, Data
Clarification Requests, Data Clarification Responses, Trade Secret Decisions,
NON for Review, Notice of Noncompliance, Tracking Data, TRIS Data, Chem Data
For Deletion, Chem Data Deletion Confirmation, Record to be Deleted,
Downloaded Tracking Data.]
-------
MAINFRAME TRI DATABASE PROCESSES
[Data flow diagram. Processes recoverable from the original figure: 1.3.3
Delete Complete Record; 1.3.4 Data Review and Reconciliation; 1.3.5 Search and
Report; 1.3.6 Transfer to TOXNET. Data store: Mainframe TRI Database.
Connected entities and processes: Reporting Party, Public, TOXNET, EPA (OTS)
Data Review and Reconciliation, Store Documents. Labeled flows (directions not
recoverable): Modification Notices, Modified Submissions, Additional
Information, Tracking and TRI Data, Tracking Data, TRIS Data, Record to be
Deleted, Notice of Technical Error, Pulled Documents, Document Requests,
Returned Documents, Congressional Requests, Congressional Responses,
Recommended FOIA Responses, FOIA Responses, TRI Reports, Transferred TRI Data.]
-------
STORE DOCUMENTS
[Data flow diagram. Processes: 1.4.1 Store On-Site; 1.4.2 Box Documents and
Store Off-Site; 1.4.3 Document Control. Connected processes: LAN TRI Database
Processes, Mainframe TRI Database Processes. Labeled flows recoverable from the
original figure (directions not recoverable): New Submissions, Modified
Submissions, Additional Information, Archivable Documents, Pulled Documents,
Logged Requests, Document Requests, Returned Documents, Box Request, Pulled
Box, Returned Box.]
-------
APPENDIX IV. DOCUMENTS REVIEWED
The following documents were reviewed for this study:
CBSI's Statement of Work
SYCOM's Statement of Work
CBSI's TRC 1989 Annual Report
Regulatory Impact Analysis in Support of Proposed Rulemaking
Under Section 313 of SARA (ETD, May 1987)
1987 Data Quality Report
TRI Reporting Form R and Instructions (March 1988)
TRI Reporting Package for 1989
Alternatives for EPA Form R Records Management:
A Feasibility Study (Mathtech, January 1990)
TRC Operating Procedures (Draft, May 1988)
TRIS Data Entry User's Guide (June 1988)
TRIS Supervisory Operations Manual (September 1988)
TRIS Logical Design (June 1988)
TRIS Physical Design (June 1988)
OTS Operating Plans FY89 and FY90
FY89 and FY90 TRI Budgets
Public Access: Two Case Studies of Federal Electronic
Dissemination (GAO, May 1990)
-------
APPENDIX V. DATA COLLECTION PARTICIPANTS
Thirteen staff members in the following organizations within EPA's Office of
Toxic Substances (OTS) were interviewed as part of this study.
OTS
Information Management Division
Non-Confidential Information Services Section
Public Information Section
Non-Confidential Systems Section
Economics and Technology Division
Regulatory Impacts Branch
Chemical Engineering Branch
TRIM Staff
Functional areas and organizations interviewed for the two main contractors
included:
CBSI
Operations Manager
Program Manager
LAN Administrator
Data Processing
Document Preparation and Storage
Data Reconciliation
Training Administrator
SYCOM
Program Director
Project Manager
Mainframe Coordinator
PC/LAN Coordinator