EPA Operations & Maintenance Manual

TABLE OF CONTENTS

1. INTRODUCTION
   1.1 BACKGROUND
   1.2 OBJECTIVES
   1.3 AUTHORITY
   1.4 APPLICABILITY OF THE GUIDANCE
   1.5 ASSISTANCE AND SUPPORT AVAILABLE
   1.6 ORGANIZATION
   1.7 SUMMARY

2. SYSTEM OPERATION
   2.1 ADMINISTRATIVE CONTROL PROCEDURES
       2.1.1 Control of System Access
       2.1.2 Parameter and Table Specification
       2.1.3 User Support
       2.1.4 Supervisory Control of the Production Process
       2.1.5 Archiving
   2.2 COMPUTER CENTER SUPPORT
       2.2.1 Computer Operations
       2.2.2 Production Control
       2.2.3 Backup and Recovery
       2.2.4 Disaster Planning and Recovery
   2.3 USER INTERACTION
       2.3.1 System Access Techniques
       2.3.2 Data Entry and Update Procedures
       2.3.3 Analysis/Reporting Options
       2.3.4 Training
   2.4 APPLICATION SOFTWARE OPERATION
       2.4.1 System Initialization/Re-initialization
       2.4.2 Error Detection and Recovery
       2.4.3 System Interfacing
   2.5 DOCUMENTATION
       2.5.1 Software Operations Document
       2.5.2 Software User's Reference Guide
   2.6 OPERATIONAL BASELINE
   2.7 SUMMARY

3. CONFIGURATION MANAGEMENT
   3.1 DEFINITIONS AND RESPONSIBILITIES
       3.1.1 System Maintenance
             3.1.1.1 Corrective Maintenance
             3.1.1.2 Functional Maintenance
             3.1.1.3 Adaptive Maintenance
       3.1.2 Configuration Management Responsibilities
             3.1.2.1 Review of System Functionality
             3.1.2.2 Coding Standards Enforcement and System Testing
   3.2 CONFIGURATION MANAGEMENT PROCESS
       3.2.1 System Change Requests/Problem Reports
       3.2.2 Analysis of Problem Reports and Change Requests
       3.2.3 Benefit-Cost Analysis
       3.2.4 Change Approval and Action Plan Development
       3.2.5 Software Improvement Increment
       3.2.6 Maintenance Cycles
       3.2.7 Return to a Prior Software Version
       3.2.8 Documentation
   3.3 SUMMARY

4. SOFTWARE MAINTENANCE
   4.1 MAINTENANCE PROCEDURES
       4.1.1 Documentation Update
       4.1.2 Source Code Standards
       4.1.3 Coding and Review Process
       4.1.4 Testing Standards and Procedures
   4.2 MAINTENANCE TOOLS
   4.3 DOCUMENTATION
       4.3.1 Software Maintenance Document
       4.3.2 Data Dictionary
       4.3.3 Source Code
   4.4 SUMMARY

Appendix A  Essential Elements of Information
Appendix B  Glossary

LIST OF EXHIBITS

1-1  Complete Software Life Cycle
1-2  EPA System Development Life Cycle and Decision Process
1-3  Life Cycle Costs
2-1  Administrative Responsibilities
2-2  Disaster Recovery Plan
3-1  Configuration Management Responsibilities
3-2  Software Change Request/Problem Report

Chapter One
INTRODUCTION

There is a recognition throughout the Environmental Protection Agency (EPA) that a sizable amount of money is being dedicated to support its automated systems. This guidance was developed in order to help control expenditures for operations and maintenance activities and to ensure that the resources expended on these activities are used in the most effective and efficient manner. As a companion to the System Design and Development Guidance, Volumes A, B and C, EPA has prepared this document addressing the maintenance and operation phase of the software life cycle.

1.1 BACKGROUND

Maintenance is playing an increasingly important role in systems operations. The Agency is currently spending millions of dollars annually on the design, development, implementation, operation and maintenance of information systems to support its mission. The Agency's System Design and Development Guidance was developed by the Office of Information Resources Management (OIRM) to assist managers in developing the most productive and efficient system possible to satisfy an information need.
That guidance addresses the following phases of the software life cycle:

   Volume A - Mission Needs Analysis
   Volume B - Preliminary Design and Options Analysis
   Volume C - System Design, Development and Implementation

This Operations and Maintenance Manual completes the guidance for managing the software life cycle. This document was developed in response to the recognized need for formal guidance that addresses the day-to-day performance of system operations and maintenance activities. Exhibit 1-1 graphically highlights the portion of the system life cycle that is addressed by this guidance.

This guidance was designed to apply to all types of automated systems, including those using standalone and networked (LAN) personal computers, minicomputers and mainframe systems. Exhibit 1-2 displays the relationship between the Operations and Maintenance Manual and the three volumes of the EPA System Design and Development Guidance.

EXHIBIT 1-1
COMPLETE SOFTWARE LIFE CYCLE

[Figure not reproduced. Recoverable labels: Volume A, Mission Needs Analysis; Volume B, Preliminary Design & Options Analysis; Volume C; Operations and Maintenance Manual; Improvement Increment.]

EXHIBIT 1-2
EPA SYSTEM DEVELOPMENT LIFE CYCLE AND DECISION PROCESS

Starting point: REAL WORLD MISSION NEED

DEVELOPMENT STAGE                               DECISION/RESULT
----------------------------------------------  ----------------------------------------------
VOLUME A: MISSION NEEDS ANALYSIS                SYSTEM REQUIREMENT AND OPERATIONAL CONCEPT
                                                DEFINITION
VOLUME B: PRELIMINARY DESIGN &                  SYSTEM OPTION DESIGN, BENEFIT/COST ANALYSIS,
OPTIONS ANALYSIS                                AND OPTION SELECTION
VOLUME C: SYSTEM DESIGN, DEVELOPMENT &          FULLY IMPLEMENTED SYSTEM
IMPLEMENTATION
O&M MANUAL: OPERATIONS AND MAINTENANCE          OPERATIONAL SYSTEM

1.2 OBJECTIVES

The three phases of the software life cycle that are discussed in Volumes A, B and C of the System Design and Development Guidance represent approximately 55% of the total life cycle cost for a software application. Exhibit 1-3 illustrates industry average percentages of the total cost for each phase.
It should be noted that the operations and maintenance phase alone consumes 45% of the resources, which is by far the highest percentage consumed by any one phase. This guidance was prepared to assist managers in expending their resources in the most efficient manner possible in support of operational systems.

EXHIBIT 1-3
LIFE CYCLE COSTS

Phase                                                              Share of total cost
-----------------------------------------------------------------  -------------------
Mission Needs Analysis / Preliminary Design & Options Analysis       5.0%
Detailed Design                                                     15.0%
[label illegible in source]                                          5.0%
Software Testing, Integration, Verification, & Validation           30.0%
Software Operations & Maintenance                                   45.0%

These figures are a graphic illustration from Controlling Software Projects by Tom DeMarco, 1982, based on industry averages.

In some instances, the concept of performing system maintenance carries a negative connotation. Maintenance has been viewed as a response to a system deficiency; something is wrong and therefore needs fixing. Fortunately, the efforts required in responding to poorly designed systems are only a part of the overall maintenance function. Maintenance efforts for any system application can be initiated by several conditions. They include:

• A new set of mission functions has been mandated by Congress or Senior Officials, causing a modification to an existing system.
• Management has decided to perform a new function on the current system.
• An existing application has been evaluated as being inefficient or ineffective.
• Users have requested enhancements to a current system by requesting new functions or outputs.
• Operational or data problems have been reported by system users.
• Enhancements to the operating system have necessitated modifications to applications software.

This manual addresses the process that should be undertaken in response to an occurrence of any of the above instances.

1.3 AUTHORITY

This document is part of OIRM's Software Management Guidance series.
This series derives its authority from Chapter 4 of the IRM Policy Manual, entitled "Software Management." This document will serve as the primary guidance for directing Agency systems operations and maintenance efforts.

1.4 APPLICABILITY OF THE GUIDANCE

This document is intended to assist managers in developing operating procedures, defining staff responsibilities, documenting system requirements, establishing user support and performing configuration management. While this document does address system considerations of operations and maintenance, it does not address the task management issues of planning, budgeting and staffing, which are subject to the style and constraints of individual managers.

Utilization of this guidance for the many systems throughout the Agency will promote a uniform approach to the development of operating and maintenance procedures. Agency managers, responsible staff and contractor support staff should reference this guidance both during and after system development and implementation. They are responsible for developing procedures that control operations and maintenance activities as well as for ensuring the preparation of adequate system manuals to support these activities.

1.5 ASSISTANCE AND SUPPORT AVAILABLE

Agency managers responsible for oversight of system operations and maintenance should be aware that there are at least two sources available for assistance and support during this phase of the software life cycle:

• OIRM, with its general IRM support functions, has staff readily available to assist managers responsible for maintaining Agency systems.
• Staff from the National Data Processing Division (NDPD) are also available for assistance, support and guidance relative to operations and maintenance activities.

OIRM and NDPD officials, working with System Managers, will assure a complete understanding of operations and maintenance concerns.

1.6 ORGANIZATION

This guidance consists of three chapters which address the various aspects of operations and maintenance functions:

• Chapter 2, System Operation, addresses the day-to-day management and control requirements. These include: 1) administrative controls which detail activities associated with the control and administration of a system application, 2) computer center support which discusses the various tasks and activities provided by the computer center, 3) user interaction which defines access techniques and the operating procedures utilized by the user, and 4) applications software operation which describes the activities associated with maintaining the stability of applications software.
• Chapter 3, Configuration Management, addresses procedures for analyzing and implementing system changes following system implementation. These include: 1) maintenance definitions and responsibilities of individuals for configuration management activities, and 2) the configuration management process which includes the activities concerned with controlling the change process.
• Chapter 4, Software Maintenance, refers to the actual modification of software and its associated system manuals. This includes: 1) maintenance procedures which discuss the procedures and tools to facilitate any software modifications or enhancements, 2) maintenance tools which describe the special automated tools available for making source code changes, and 3) documentation which describes the necessary system manuals which should be reviewed or modified during the maintenance phase of the software life cycle.

1.7 SUMMARY

This chapter introduces the Operations and Maintenance Manual with a presentation of the need for and objectives of the manual. This manual is intended to complete the OIRM guidance addressing software management through a presentation of operations and maintenance activities.
The chapter also defines the target audience and the organization of the document.

Chapter Two
SYSTEM OPERATION

This chapter addresses system operations requiring management and control on a day-to-day basis in four functional areas:

• Administrative Control - details the responsibilities and activities associated with the overall control and administration of a system application.
• Computer Center Support - describes the various tasks and activities which are provided by the computer center to ensure the daily operation of a system.
• User Interaction - defines the access techniques and operating procedures that should be followed by a user of a system.
• Application Software Operation - describes the activities associated with maintaining the stability of the application software for the user community.

Each of these areas will be presented and discussed in this chapter.

2.1 ADMINISTRATIVE CONTROL PROCEDURES

Administrative control procedures are necessary to ensure correct operation of the system from a managerial standpoint. These procedures include:

• Designation of authorized system users
• Specification of system parameters and tables
• User support
• Supervisory control of system operation and report production
• Archiving.

Responsibility for administrative control procedures is typically shared by the System Manager and Database Administrator as shown in Exhibit 2-1. This division of responsibilities is not absolute. For small systems, a single person may be charged with all of these responsibilities.
For very large systems, supplemental staff may be utilized to handle specific responsibilities, such as responding to user inquiries or controlling system access.

EXHIBIT 2-1
ADMINISTRATIVE RESPONSIBILITIES

SYSTEM MANAGER RESPONSIBILITIES

• Provide an interface between the system and its users
• Control system access through the continual evaluation and determination of current and prospective users and the assignment or revocation of user IDs accordingly
• Respond to user inquiries
• Update system tables and parameter files
• Establish training
• Promote open communications with the users through newsletters and user groups
• Evaluate functional system performance.

DATABASE ADMINISTRATOR RESPONSIBILITIES

• Ensure the proper operation of the database management system (DBMS)
• Supervise data entry, update, and deletion procedures
• Maintain data quality
• Maintain the system data dictionary
• Evaluate technical system performance
• Archive historical data.

These responsibilities are generally delegated to staff based upon the size and utilization of a particular system. The responsibilities for the administrative control procedures for smaller or less active systems may be assigned to a single individual. A large or very active system may have two staff members involved, one to carry out the system management functions and one to support the database management requirements. In the case of larger systems with split responsibilities, the individuals involved are typically referred to as the System Manager and Database Administrator respectively.

2.1.1 Control of System Access

A primary system management responsibility is the control of procedures for system access in order to maintain data integrity, availability and confidentiality. The person in charge of system access should assign access capabilities based on an individual user's authority to perform specific application functions.
System access should be granted or revoked based on job requirements and security clearances. Some users may have authority only to enter or look at data while others may need to update and delete data. Access to system outputs, such as reports or statistical information, may also need to be controlled. A more detailed discussion concerning system access controls is contained in the EPA Information Security Manual.

2.1.2 Parameter and Table Specification

An important area of system control involves parameter specification and table maintenance. A parameter is defined as an individual variable or constant stored in a file, while a table is a file containing multiple parameters having similar characteristics. Parameters and tables that are external to individual programs are often used to vary a system's operation. These may also be used to control a user's functional access authority. Without the use of these parameters and tables, program source code would require modification every time a variable needed to be changed.

The EPA Payroll System is an example of a system that uses external parameters to control system operating characteristics. Periodically, information is loaded into the parameters, specifying the fiscal year, pay period, accounting period and accrual increment to be used for payroll processing. These parameters are read by the computer program at the time of execution. Thus, the software can be aligned with real world events without the necessity of programming changes.

The EPA Combined Payroll Redistribution and Reporting System (CPARS) is an example of a system that uses an external table file to establish a user profile for control of functional access. As one of the CPARS administration functions, individual users are granted or denied access to functional portions of the system, such as report generation or database modification.
This is done by establishing a table file containing functional access authorities associated with individual user IDs. CPARS uses this table file to supplement the computer and file access security provided by the Resource Access Control Facility (RACF) on the mainframe computer. In this way, functional access to the system is controlled in addition to system and file access.

There are four benefits to using parameters and table files to control system performance:

• The authority to specify and control these flexible system attributes can be a tightly controlled system administration function.
• Changes can be made universally for all users by the modification of a single item of information.
• Programs can be developed independent of the data and do not need to be modified to change system operational characteristics.
• A host of user-specific information does not need to be entered by the users each time the software is accessed.

2.1.3 User Support

The growth of user computing and the resulting high investment in computing resources have made it necessary to ensure proper user support in order to maximize productivity and the overall return from this investment. The primary purpose of any support service is to raise the users' productivity by making them more comfortable with available technology and improving their skill at using the software to update, access and manipulate information.

In this regard, the person tasked with the administrative system management responsibilities, the System Manager, should generally become the first point of contact concerning user-support issues. For very large systems where the number of calls is great, the first point of contact may be a member of a user-support team. For most smaller systems, the System Manager or a designee will become the user's interface with a specific system or application. This will promote consistency and coordination of resources.
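
As a brief illustration of the external parameter and table files described in Section 2.1.2, the following is a minimal sketch in Python of how a program might read such files at the time of execution. It is not drawn from the Payroll System or CPARS: the file layouts, parameter names, user IDs, function names and helper functions are all hypothetical, and the sketch shows only the general technique.

```python
import csv
import io

# Hypothetical parameter file: one NAME=VALUE pair per line.
# The names and values are illustrative only.
PARAMETER_FILE = """\
FISCAL_YEAR=1989
PAY_PERIOD=14
"""

# Hypothetical table file: a user ID followed by the functions
# that user may perform, separated by semicolons.
ACCESS_TABLE_FILE = """\
user_id,functions
JSMITH,REPORT
MJONES,REPORT;UPDATE
"""


def load_parameters(text):
    """Parse NAME=VALUE lines into a dictionary, skipping blank lines."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if line:
            name, _, value = line.partition("=")
            params[name.strip()] = value.strip()
    return params


def load_access_table(text):
    """Map each user ID to the set of functions that user may perform."""
    table = {}
    for row in csv.DictReader(io.StringIO(text)):
        table[row["user_id"]] = set(row["functions"].split(";"))
    return table


def is_authorized(table, user_id, function):
    """Return True only if the table grants this function to this user."""
    return function in table.get(user_id, set())


params = load_parameters(PARAMETER_FILE)
access = load_access_table(ACCESS_TABLE_FILE)
```

Under these assumptions, changing a pay period or revoking a user's update authority means editing one line of a file rather than modifying program source code, which is the benefit the list in Section 2.1.2 describes.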

In order to provide effective user support, the System Manager should be responsible for:

• Responding to and tracking user inquiries about system operation (via mail, E-mail, phone)
• Dealing with system access procedures
• Addressing software and hardware problems
• Hosting user conferences
• Providing appropriate training.

In the case of national systems, the supporting OIRM organization should be cognizant of the mechanics of the system since in many instances the System Manager may not be able to accommodate the large number of inquiries from users. OIRM may be able to assist with the overflow of inquiries.

The System Manager may also assist users by providing necessary system manuals, publishing a newsletter or using a bulletin board to disseminate system information, and moderating user groups. The System Manager should also assist the user in acquiring appropriate training in such areas as use of the reporting functions of the system. This eliminates the need for users to contact the System Manager or OIRM organization for assistance each time an ad hoc report is needed. For many systems at EPA, hot lines are available to provide additional support to the users.

2.1.4 Supervisory Control of the Production Process

Supervisory control of the production process refers to a system of administrative checks which ensure proper utilization and operation of the system. Supervisory controls are established to assist in monitoring the day-to-day operation of a system. They assure valid, proper entry and maintenance of data, accurate performance of input/output procedures, authorized user interaction and special processing requests. They are constructed to ensure that maintenance transactions are handled properly, that verification and approval procedures are in place and, if necessary, that disaster recovery is monitored. These controls help prevent the occurrence of a "garbage in, garbage out" problem.
They act as the checks and balances for smoothly operating systems.

Supervisory controls can either be built-in functions of the software itself or documents designed for this use. Examples of these include job request forms, system manuals, procedural checklists, audit trails and system performance/exception reports.

2.1.5 Archiving

Archiving refers to the creation of copies of data for historical reference purposes. Archiving differs in purpose from backing up files. A backup copy of a master, transaction, or table file is made so that a current copy is available from day to day if anything should happen to the original; an archive copy is created for long-term storage of data. Archiving procedures should be designed for systems of all sizes.

The storage medium for the long-term storage of a specific application is an important decision made during the system design phase. The most popular types of storage media are magnetic tape and disk. The archival medium decision should be reevaluated periodically in light of changing technology.

The decision must also be made whether to store the data on-site or off-site. The need to readily access the archived data, the size of the storage capabilities on-site and the security needs will determine if a dedicated storage location must be established off-site.

2.2 COMPUTER CENTER SUPPORT

Since individual system operation is often impacted by computer center operations, it is important to understand the activities and controls provided by the computer center. In the context of this guidance, the concept of a "computer center" relates to the hardware and software necessary to support a system application. Computer center support refers to the activities performed by the technical support staff within the computer center environment which help ensure each system will operate properly on a daily basis.
This concept is normally associated with the operations of a major computer center, such as the National Computer Center (NCC) or the center located in Cincinnati, but is also valid when discussing mini-computers, such as those operated by the EPA research and development laboratories or local end-user computing workstations. The essential elements of computer center support include:

• Computer Operations
• Production Control
• Backup and Recovery
• Disaster Planning and Recovery.

2.2.1 Computer Operations

Computer operations encompass those activities that are carried out to maintain a viable system environment. The system environment consists of computer hardware, system software and communication devices. Computer hardware consists of the computer and an array of input and output devices including printers, plotters, scanners, tape drives, terminals and disk drives. System software includes the operating system and other required software, whether a traditional third-generation language (COBOL, FORTRAN, PASCAL), a package (LOTUS, SAS) or a database management system (ADABAS, dBASE). Communication devices include the modems and communication lines that allow access between computers for sharing data and processing capabilities.

The user's responsibility for maintenance of the system environment is dependent on the size of the system used. Personal Computer (PC) users are directly responsible for maintaining a viable computer environment. All PC-based system components generally reside on or near the user's workstation and are under the user's direct control. The individual user is usually responsible for supporting the system environment, which may include the following tasks:

• Starting the system each day
• Loading appropriate software
• Initiating the required application software
• Storing floppy disks properly
• Maintaining all appropriate security measures
• Ordering and controlling supplies.

In a Local Area Network (LAN) environment, a LAN Administrator is assigned the responsibility of operating and maintaining the LAN. The LAN Administrator's responsibilities include overseeing or performing the following:

• Installing the LAN
• Training users
• Maintaining the hardware and software
• Performing system backups
• Managing network security.

Mainframe and most mini-computer operations are generally geographically removed from the user and centralized within a computer center. These large-scale computer environments require both a means for the user to communicate with the computer resources and a staff dedicated to making these resources available to each user as needed. Typical system environment support performed by the computer center staff includes:

• Initializing the system each day
• Mounting tapes and disks
• Re-allocating processing resources to correspond to job requirements
• Adjusting the proportion of shared computer resources allocated to each user
• Distributing reports
• Maintaining a computer tape library
• Ordering and controlling supplies
• Ensuring proper system software licensing.

2.2.2 Production Control

Production control operations refer to the activities that support the implementation and periodic processing of various systems. Typical production control activities include:

• Job scheduling
• Backup and recovery
• Disaster planning and recovery
• Job Control Language maintenance
• Routine submission of batch jobs
• Maintenance of all appropriate security measures.

Responsibilities for the individual production control activities vary with the size of the computer center. Individual users are directly responsible for the processing activities associated with their PC-based applications. Mini and mainframe-based applications, such as payroll, accounting and national program systems, are generally serviced by computer center staff.

Two production control activities that stand out in terms of their significance to the protection of an application system are backup and recovery, and disaster planning and recovery. These are discussed in Sections 2.2.3 and 2.2.4 respectively.

2.2.3 Backup and Recovery

Backup procedures should be designed to protect against any possible loss of data. Duplicate copies of data are made periodically to ensure that a copy of the generated work will always exist even if the master copy of the data is damaged or destroyed. There is always a chance data can be lost by human error, by hardware or software failure, or by catastrophic disaster (discussed in Section 2.2.4, Disaster Planning and Recovery).

For smaller PC-based systems residing on individual workstations, data backup is the individual's responsibility. In most instances, simply creating a backup floppy disk for data and storing the backup in a location removed from the workstation is sufficient.

For larger systems (mini and mainframe), system developers should always assume the worst-case scenario when designing any backup system. This backup system will consist of procedures which are established to maintain and store recent versions of the information residing within a system. A more detailed discussion of specific data backup and recovery methods for the various types of systems can be found in the EPA Information Security Manual.

2.2.4 Disaster Planning and Recovery

Disaster planning and recovery refers to the plan or set of procedures that is designed to counter any physical destruction or damage of the hardware resources or to the building in which the computer center is located.

The subject of disaster planning and recovery is primarily concerned with mainframe or mini-computers. Individual PC-based systems should consider standard data and system backup and recovery procedures sufficient for disaster planning.
Data that have been properly backed up onto another floppy disk and stored in a location removed from the PC would be readily available for use on another PC.

The computer center has the primary responsibility for developing any disaster recovery plan to cover mini and mainframe systems. These plans should contain procedures for dealing with various aspects of emergency response, immediate backup and long-term backup. The procedures should be as specific as possible and all appropriate responsibilities clearly defined. Exhibit 2-2 identifies the areas which should be addressed in designing a disaster recovery plan.

As with the recovery plans which address data recovery, each computer center should have a separate restoration plan which will assist in restoring the physical computer center site. This plan should establish responsibilities for the following activities:

• Assessing the time required to restore the damaged facility or to locate, obtain and set up a new processing facility
• Physically restoring the damaged site
• Reestablishing the original data processing environment by moving from the backup facility to the restored site or a new processing facility.
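
The practice that Sections 2.2.3 and 2.2.4 rely on, copying data to a second location and confirming that the copy is usable, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not an EPA procedure: the function names are hypothetical, and the destination directory merely stands in for removable media or an off-site storage location.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source, backup_dir):
    """Copy source into backup_dir and verify the copy against the original.

    Raises IOError if the digest of the copy differs from the original's.
    """
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copies file contents and metadata
    if file_digest(source) != file_digest(target):
        raise IOError(f"backup of {source} failed verification")
    return target
```

Verifying the copy at the time it is made is a design choice in this sketch: a backup that cannot be read back offers no protection when the master copy is lost.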

EXHIBIT 2-2
DISASTER RECOVERY PLAN

• Conduct a risk analysis to identify specific potential dangers to the computer center and computer-related operations
• Determine which systems/applications are mission critical and must be restored immediately in the case of failure
• Develop a detailed, systematic plan, based on the above, to guide disaster response activities:
   - Order and install any necessary additional hardware at the backup facility to establish full processing capacity for long-term backup
   - Assess the possible damage to the computer center and salvage useable equipment and data
   - Recover critical programs and data
   - Restore communications networks
   - Resume critical application processing in the immediate backup and the long-term backup phases of the recovery process
   - Resume processing of noncritical applications as additional resources become available.

Once the risks are identified and the recovery procedures are detailed in the disaster recovery plan, the plan must be tested and periodically reviewed to ensure that time and events do not change the situation and that all procedures are still applicable. A further discussion of disaster planning and recovery is located in the EPA Information Security Manual, Appendix D, Contingency Planning.

2.3 USER INTERACTION

The third area of daily system operation involves user interaction, which consists of access techniques and operating procedures to be followed by a user of the system. The specific topics addressed in this section are:

• System access techniques
• Data entry and update procedures
• Analysis/reporting options
• Training.

2.3.1 System Access Techniques

System access techniques are the means by which a user interacts with an automated system. These techniques vary depending on the type of system, the type of terminal and the input and output required.
In order to ensure that users are able to utilize systems with ease, system access techniques must be developed and clearly and simply documented for users.

Step-by-step procedures for gaining access to a system must be developed and made available to the user. If several communications or machine access methods are available, the user must be given instruction in their use. For example, if a number of different types of terminals, workstations or personal computers may access a system, procedures for system access using each machine must be delineated since each machine may have a unique function for each key. In particular, specific instructions for using programmable function (PF) keys, the escape key, the enter key and the return key are necessary to ensure that users of each machine type have a complete understanding of the operation of the system.

System access techniques may vary among systems due to differing levels of sophistication of the intended system users. A system designed for infrequent or less technically oriented users often relies on extensive use of menus. Where the user is more technically oriented, a more detailed, and therefore more complicated, system interface utilizing direct system commands allows the user more control over system operation.

The procedures for system access are defined during the system design phase of the software life cycle and must be described in the Software User's Reference Guide (EEI-11) for use during normal system operations. Security concerns relating to access techniques for mainframe, mini-computer and personal computer systems are described in the EPA Information Security Manual.

2.3.2 Data Entry and Update Procedures

Interaction with an automated system for data entry and update may vary depending on the type and function of the system.
Data input and update schedules, procedures and security requirements are governed by the volatility and sensitivity of the data being processed. The timing of data input is tied directly to the nature of the system. For example, payroll system updates are driven by the bi-weekly payroll processing cycle. In this case, the timing of the update is critical. On the other hand, maintenance of archival and reference information may not be tied to a specific processing cycle. This allows the designated System Manager to establish an appropriate archival schedule.

Systems can have several levels of data entry or update approval. Some users may have authority only to enter or look at data; other users may be able to update and delete data. The System Manager controls procedures for each type of access in order to maintain data integrity, availability and confidentiality. These procedures are described in Section 2.1.1, Control of System Access.

2.3.3 Analysis/Reporting Options

A system's analytical and report products are developed during the system design phase to support the user in performing his/her job and to meet organizational record keeping requirements. The Software User's Reference Guide should describe these capabilities in such a way that the user can easily see the utility in the provided capabilities as well as the direct correlation between the system's tools and the user's task requirements. Failure to make this link readily apparent will result in the underutilization or non-utilization of the system.

The timing, options, media and delivery process for the system outputs are important issues which should be addressed in the Guide and fine-tuned periodically to ensure that they meet the needs of users.

The timing of system outputs is relevant to the nature of the output formats. For example, event-driven systems usually generate periodic reports.
Systems supporting less time-critical functions may have a more flexible reporting schedule.

Systems often provide a number of options for obtaining information upon request. These ad hoc requests may allow the user to specify fixed or variable report formats and the characteristics of selected data. The more flexibility a system has in this regard, the more utility the system generally has for the informed user. The flexibility allows the user to create a focused analysis rather than having to glean desired information from a voluminous report.

The utility of a system is enhanced by having system output available on a variety of media, including paper, magnetic tape or floppy disk. Flexibility in providing output enables a system to support multiple organizations, management levels or equipment types. The additional availability of machine-readable output will enable users to incorporate information electronically into affiliated systems for further detailed analysis or aggregation.

The delivery process by which users receive outputs is important for all but single-user PC applications. The opportunity to specify the output location, such as a local or remote printer or terminal screen, is a useful system function. Many systems have a wide range of options of this type, including transmission of system output either within EPA Headquarters or nationwide, which is helpful for transmitting administrative data between Headquarters program offices or between Headquarters program offices and field locations. Procedures for performing this type of operation should be documented in the Software User's Reference Guide (EEI-11).

2.3.4 Training

Training is a critical element for the effective operation of any application. A variety of training options are available, from traditional classroom instruction to multimedia courses and computer-based training.
The System Manager must determine which methods are most applicable to meet different user requirements. The varying skill and experience of users, as well as the frequency of staff turnover, may determine the extent to which the organization can provide effective initial and follow-up training. Acknowledging these varying skill levels will allow the training program to meet user needs more accurately.

To develop an effective training program, the person or group responsible for user training will need to establish guidelines and determine who needs to be trained, what applications or systems will be taught and what training methods will be used. The following procedures will help maximize the benefits of any user training program:

• Survey managers and supervisors to identify the organization's goals, objectives, programs and projects. This uncovers potential needs and provides a strong direction for and commitment to the training.
• Identify and analyze the primary user group. This information will strongly influence training content.
• Survey the potential participants to determine existing skills.
• Conduct a task analysis. This crucial step will ensure the trainee's newly acquired skill matches his/her job requirement.
• Define training results. By documenting the results of training in terms of trainee performance, a baseline is established which will influence all future training decisions.

There are a variety of training options available. These include classroom or student-paced training; Agency or vendor-developed programs; workshops; or individualized training using multimedia courses and computer-based approaches. The best training method depends on users' needs and available resources.

Users are primarily concerned with what they will need to know in order to perform their jobs.
They are often unwilling to invest long study hours, preferring to focus on training that directly relates to their day-to-day experience. Any training program should address this perspective in order to provide effective user training.

One efficient way to design a user training program is to categorize users by organizational or skill level. The four organizational levels of users are executive, managerial, technical and administrative. Each level of user requires a different focus and approach to training.

Executives and managers usually have little time available for training. Courses for these users should emphasize payback and results from a management perspective.

Technical users typically adapt to new technology readily and welcome the introduction of new tools and methods. This category of user is usually more receptive to experimentation and is willing to explore new techniques during a training program.

Administrative users are interested primarily in developing skills for their specific work applications. This may involve training on specific applications, such as word processing, presentation graphics, spreadsheets and the applicable computer equipment.

Another important aspect of training is the need for staff retraining precipitated by extensive changes made to the system software during a maintenance cycle.

Follow-up after training is essential. Success is measured by whether users continue to employ and expand the concepts and skills acquired during the training sessions. An analysis should be made to determine which users are benefiting most from the training, which are not benefiting and, most importantly, why not. This information can then be factored into restructuring the future training curriculum.

2.4 APPLICATION SOFTWARE OPERATION

The operation of the application software is the responsibility of the System Operator or operations personnel, either within the computer center or in a program office.
These personnel are responsible for systems operation activities, including operating the hardware and maintaining the stability of the application software for the user community.

The following specific activities of application software operation are described in this section:

• System initialization/re-initialization
• Error detection and recovery
• System interfacing.

The activities likely to be associated with system operations should be fully described in the Software Operations Document (EEI-10).

2.4.1 System Initialization/Re-initialization

During system implementation, internal files and tables are initialized with a baseline of data and operational parameters. (Definition of parameters and description of table specifications are contained in Section 2.1.2 of this document.) In some cases, this initial implementation is sufficient to establish and support operations for the life of the system. In other cases, it may be necessary to re-initialize the system files and tables at the start of a new operating interval (e.g., day, pay period, month, quarter, fiscal year). A step-by-step process for initialization, including a list of the required input parameters and a description of the update procedures, must be developed and documented in the Software Operations Document (EEI-10).

The System Manager will have the responsibility for defining and specifying system parameters and re-initialization requirements to the System Operator. Procedures and standard practices for initialization/re-initialization should be documented in detail in the System Manager's guide. The System Manager's guide is that portion of the Software Operations Document (EEI-10) which delineates the System Manager's function. It is often developed as a separate document for ease of use and to minimize confusion of responsibilities.
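The re-initialization step described above can be illustrated with a short sketch. It is not taken from any EPA system: the table names, the baseline values and the helper functions are hypothetical, and only three of the operating intervals mentioned above (day, month, quarter) are covered. Pay periods and fiscal years depend on local calendars and are left out.

```python
import copy
from datetime import date

# Hypothetical baseline of operational parameters set at implementation.
BASELINE = {"records_processed": 0, "error_count": 0}


def interval_key(day: date, interval: str) -> tuple:
    """Return a value that changes whenever a new operating interval starts."""
    if interval == "day":
        return (day.year, day.month, day.day)
    if interval == "month":
        return (day.year, day.month)
    if interval == "quarter":
        return (day.year, (day.month - 1) // 3 + 1)
    raise ValueError(f"unsupported interval: {interval}")


def reinitialize_if_due(tables: dict, last_run: date, today: date, interval: str) -> dict:
    """Reset the tables to the baseline when today falls in a new interval."""
    if interval_key(today, interval) != interval_key(last_run, interval):
        return copy.deepcopy(BASELINE)
    return tables


# Illustrative values only.
tables = {"records_processed": 412, "error_count": 3}
same_month = reinitialize_if_due(tables, date(2024, 3, 4), date(2024, 3, 29), "month")
new_month = reinitialize_if_due(tables, date(2024, 3, 29), date(2024, 4, 1), "month")
```

In this sketch the tables are left alone while the operating interval is unchanged and are replaced by a fresh copy of the baseline once a new interval begins. A real procedure would also record the input parameters supplied by the System Manager, as the Software Operations Document (EEI-10) requires.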
2.4.2 Error Detection and Recovery

Among the responsibilities of the software operations personnel are error detection and recovery procedures. A variety of system failures can result in system malfunction. Such failures occur as a result of data errors, program errors or equipment malfunctions. The System Operator can determine the type and seriousness of errors, if any, that have occurred as a result of a system failure by reference to system error messages and other diagnostics. Each part of the system which can fail should have specific, documented error messages designed to explain the error and its proper resolution. For example:

• Program or data errors: program error messages are displayed on the operator's terminal screen or user's screen; these are developed as part of the detailed system design phase of the software life cycle and are documented in the Software Operations Document (EEI-10).
• Equipment errors/failures: hardware malfunctions produce error messages either displayed by the hardware itself or by the mainframe operating system, if applicable. These are described in the operating guide supplied with the hardware at time of purchase and are often reiterated in the Software Operations Document (EEI-10) for reference purposes.

Effective system restart procedures in response to application software or hardware failures should be documented in the Software Operations Document (EEI-10). To the extent that data are lost or damaged as the result of a software and/or hardware failure, the provisions outlined in Section 2.2.3, Backup and Recovery, should be utilized.

2.4.3 System Interfacing

System interfacing procedures become more critical in system operations as increasing numbers of systems interface for data exchange and storage purposes. Procedures for performing or maintaining system interfaces should be documented in the Software Operations Document (EEI-10) and the System Manager's Guide.
Where system interfaces impact on or provide additional capability to system users, these should be described in the Software User's Reference Guide (EEI-11).

2.5 DOCUMENTATION

The specifics of the system operation activities described above must be thoroughly documented in the Software Operations Document (EEI-10) and the Software User's Reference Guide (EEI-11). These documents are extremely important references for system users and operators, regardless of whether the system runs on a mainframe, mini-computer, or microcomputer. The formats of these documents may vary between systems and organizations, due to the size and complexity of the system, but information requirements are, on the whole, fairly similar for most systems.

The major topics that need to be considered during the preparation of operating manuals of any system are presented as essential elements of information (EEIs) in Appendix A. These documents are prepared during System Design, Development and Implementation and updated during System Operations and Maintenance, as needed. Suggested outlines of these EEIs are presented in Appendix A for reference during the System Operations and Maintenance phase. Managers may use their professional judgement in substituting, combining or reducing the content of the EEIs to meet the unique requirements of a particular system. Additionally, the EEIs are not meant to conflict with or add more burden to documentation requirements set out in other manuals, such as the EPA/NCC ADABAS Application Development Procedures Manual. Documentation produced according to such other detailed procedures will invariably satisfy, either partially or fully, most EEI requirements.
2.5.1 Software Operations Document

EEI-10 presents the basic elements considered for inclusion in a Software Operations Document. This document should provide the System Operator or operations staff with the detailed procedures required to maintain a stable and viable system.

2.5.2 Software User's Reference Guide

EEI-11 presents the basic elements considered for inclusion in a Software User's Reference Guide. This guide is intended to provide users with the information necessary to effectively utilize the system.

2.6 OPERATIONAL BASELINE

The Operational Baseline represents the completely implemented and tested software system including system manuals and database file designs. It is the basis for future maintenance changes and enhancements and is established following a successful Operational Test and Evaluation Review and after it has been placed in production and/or turned over to the user.

2.7 SUMMARY

Chapter 2 presents four major functional operations that are necessary for the effective and efficient day-to-day management and control of a system. First, the administrative control procedures which are performed by a System Manager or a Database Administrator include authorizing users to access the system, specifying system variables through parameters and tables, supporting users by responding to inquiries and supervising the production process. Second, additional operational responsibilities which support the system are performed by technical personnel at the appropriate computer center. These functions should be understood and supplemented for each system, as needed. Third, the system must be responsive to user needs through proper user interaction allowing sufficient system access means, ensuring accurate data through controlled data entry and update procedures, providing useful analysis and reporting products and structuring effective training for users.
Finally, the software must be operated properly, ensuring initialization of the system, error detection and recovery and proper interfacing with other systems.

Procedures for operating the software and reference information for users are contained in system documents developed during system design, development and implementation. Representative outlines of these documents are presented in the appendix.

The next chapter discusses configuration management, which is the first step in ensuring proper maintenance of a system. Configuration management ensures that changes to the system are controlled through established procedures for evaluating system maintenance and enhancements.

Chapter Three
CONFIGURATION MANAGEMENT

Configuration management involves system maintenance or enhancement performed following initial system implementation. Configuration management includes the evaluation methodology and approach to be followed when considering a partial system redesign or determining software obsolescence. Generally speaking, configuration management must apply the same evaluation criteria to the system maintenance function as were applied to the original system design (see the EPA System Design and Development Guidance, Volume A).

This chapter contains configuration management definitions and responsibilities and describes the activities which comprise the configuration management process. It includes information on the management structure, decisions and tools to support the evaluation and the determination and implementation of changes to an operational software application. The first section defines types of system maintenance and their respective responsibilities. The second section addresses the procedures necessary for proper configuration management.

3.1 DEFINITIONS AND RESPONSIBILITIES

This section includes configuration management definitions, as well as the responsibilities of those individuals who implement system maintenance.
3.1.1 System Maintenance

System maintenance can be divided into three categories of software changes: corrective, functional and adaptive maintenance. Corrective and functional maintenance result primarily from the experience of users with the system and relate to the functionality and ease of use of the system. Adaptive maintenance is a result of changes outside the system over which the System Manager has no control. The categories are useful in that they indicate the reason for maintenance, which can have direct bearing on the timing, necessity or urgency of changes.

3.1.1.1 Corrective Maintenance

Corrective maintenance is performed to correct abnormal and/or debilitating system performance that was not detected during system testing. This can happen because the system testing process is generally designed to verify system performance under "normal" operating conditions. Once a system has been put into production, it undergoes the stress of both expected and unexpected user interaction and activity which can precipitate previously undetected system problems.

The users may be able to adapt to inconveniences, such as report labeling inaccuracies, peculiarities of report sequencing or unsatisfactory data display formats. Other problems, such as incorrect calculations and acceptance of invalid data inputs, are of a more critical nature as they affect the accuracy of the information processed and result in incorrect system operation. The user community or managers may be the first to notice inaccurate data or functions, and they should be encouraged to submit problem reports. These types of problems tend to receive intensive attention, resulting in management pressure for rapid implementation of changes.
Nevertheless, corrective maintenance tactics need to be evaluated, planned and controlled to ensure that the ramifications of changes are known and accepted, the changes perform as required and users are informed of the changed implementation schedule.

3.1.1.2 Functional Maintenance

Functional maintenance addresses proposed system changes that will provide users with enhanced system performance and capabilities that were not specified in the system design phase of the software life cycle. Enhancements may be proposed through submission of change requests by any of the following:

• Active users who want changes to increase efficiency or to expand the scope of system function
• EPA managers who recognize a change in the mission needs of the Agency or a sub-organization
• System designers and managers who recognize operating inefficiencies based on actual system utilization.

Unlike changes that are forced by corrective and adaptive conditions, proposed functional maintenance is discretionary and should undergo thorough benefit-cost analysis prior to implementation. This will ensure that the most cost-effective course of action is undertaken. Due to the discretionary nature of functional changes, the speed of implementation is subordinate to the cost and functionality factors involved.

3.1.1.3 Adaptive Maintenance

Adaptive maintenance refers to system modifications imposed upon the system by external forces. For example, the computer center may install an upgraded computer or operating system, or an affiliated system may be re-written with accompanying changes in system interfaces. These modifications could easily apply to personal computers as well, such as an upgraded PC operating system or a new version of commercial software that requires modification of in-house systems or data input procedures. In these cases, there is no prerogative to accept or reject the required changes.
Instead, the pending operating environment should be examined to determine if there are opportunities for upgrading the functionality and efficiency of the system at the time the system is being adapted. The implementation schedule for adaptive maintenance is often defined by the system or installation that is initiating the change in processing environment.

3.1.2 Configuration Management Responsibilities

For smaller or less active systems, a person should be designated to establish and execute system change control procedures. This person is referred to as the Change Control Administrator and is generally recognized to be the System Manager. For large integrated systems, the implementation of system changes may be the responsibility of the Change Control Administrator while change evaluation and final decision-making authority is embodied in a Change Control Board composed of representatives from the system's functional areas or oversight organizations.

The responsibilities of configuration management are typically shared by the Change Control Administrator and Change Control Board as shown in Exhibit 3-1. This division of responsibilities is not absolute and can vary depending on system size, organizations involved and management preference.
EXHIBIT 3-1
CONFIGURATION MANAGEMENT RESPONSIBILITIES

CHANGE CONTROL BOARD RESPONSIBILITIES

• Periodically evaluating the adequacy of the system in meeting its support role with regard to evolving user/Agency needs
• Establishing procedures for documenting, evaluating, and controlling proposed system changes
• Organizing change request reports according to the three maintenance categories described in Section 3.1.1
• Performing benefit-cost analyses for proposed modifications
• Approving and ranking all proposed system modifications
• Determining whether identified problems and approved enhancement requests justify a new version of the system or indicate software obsolescence
• Enforcing documentation and coding standards through reviews and audits of modifications
• Evaluating the results of system maintenance and determining when a return to a prior software version is indicated
• Notifying system users of any system changes.

CHANGE CONTROL ADMINISTRATOR RESPONSIBILITIES

• Assuring compliance by the System Manager and technical support personnel with established maintenance documentation described in Section 3.1.1
• Obtaining technical assistance as needed to determine the level of effort, interdependencies and ramifications of the proposed system changes
• Assembling the proposed system changes into a software improvement increment
• Overseeing the performance of software quality assurance.

3.1.2.1 Review of System Functionality

Periodically, the Change Control Administrator, in consultation with the System Manager and program management, should review system functionality to ensure that the system is meeting current user/Agency needs. EPA policy and procedures may change over time to reflect a change in overall Agency mission, shift in programmatic emphasis or modification of job tasks. Because of this, the alignment of the system with the goals and job tasks it was designed to support may deteriorate.
The Change Control Administrator must determine when the system is not providing adequate support or is obsolete and initiate a study to define options which include:

• No change to the system
• Partial system redesign (new version of the system)
• Complete system redesign (new system)
• System termination.

The study could include:

• Evaluation of the continuing need for the functions provided by the system
• Assessment of successful execution of system functions
• Analysis of workload and utilization of the system for comparison to estimates made at the time the system was designed.

Depending on the size of the system and the user community, this type of study could be accomplished through an individual's observation, a user survey or interviews with managers.

3.1.2.2 Coding Standards Enforcement and System Testing

The Change Control Administrator has oversight responsibility for ensuring that the coding standards and system testing procedures established during the initial system design phase are observed by the System Manager, Database Administrator and technical support staff during subsequent maintenance cycles. The coding standards and testing procedures were established to promote software quality and maintainability and overall system integrity. The applicability of these standards and procedures to the software maintenance process is addressed in Chapter 4. Circumventing the established procedures could lead to degradation of the system due to undetected errors, undocumented coding changes or inconsistent operating or processing procedures.
3.2 CONFIGURATION MANAGEMENT PROCESS

This section addresses the following activities which comprise configuration management:

• System change requests and problem reports
• Analysis of problem reports and change requests
• Benefit-cost analysis
• Change approval and action plan development
• Software improvement increment
• Maintenance cycles
• Return to prior software version
• Documentation.

The Change Control Administrator (or Change Control Board) has the responsibility to evaluate, plan and control the required system changes through this process to ensure that they perform as required and to assess the ramifications of such changes. The user community must also be notified of the implementation schedule, any procedural changes and actual implementation. Throughout the system maintenance process, the System Manager must be aware of the impact that proposed software changes may have on the entire system and their eventual effects on the user community.

Provisions should be made in the configuration management process to accommodate the rapid system changes required to respond to detected errors. These provisions may include lessening the degree and formality of the benefit-cost analysis or reducing the change request review and approval requirements, if appropriate. However, error corrections should still be verified, tested and documented. Abbreviated procedures should not be used to circumvent the normal functioning of the formal change process, though they may be used to speed it up.

3.2.1 System Change Requests/Problem Reports

Requests for changes to the software may originate from any of the users of the system. A formal procedure should be established to process and document requests for system changes and reports of system problems.
The procedure should include submission and tracking of problem reports and change requests to ensure that all reports/requests are addressed in a standard manner, and that none are overlooked by mistake. Problem reports and change requests should consist of at least the following information:

• For a problem or change, a functional description of the problem or the requested change
• For a problem, a description of the conditions under which it occurs
• For a change request, the benefit(s) to the user/organization/Agency of the change.

Problem reports and change requests should be logged by the Change Control Administrator upon receipt for tracking purposes and categorized as corrective or functional maintenance. (Although adaptive maintenance is usually not initiated through a problem report or a change request, it is important to manage and track the adaptive maintenance process in the same manner that system problems and changes are handled.) One purpose of tracking problem reports and change requests is to ensure that the defined action plans are followed and scheduled maintenance is correctly performed as planned and on time.

The actual content and format of the problem report/change request form should be determined by management in order to correspond to standard local procedures. An example of a suggested format for problem reports/change requests is shown in Exhibit 3-2.

3.2.2 Analysis of Problem Reports and Change Requests

Problem reports and change requests must be analyzed to determine the action to be taken in response. Problem reports must be analyzed to determine the severity of the problem and to prioritize problems if several are awaiting correction.
Problem reports usually indicate system corrections which must be made as soon as possible in order to ensure proper system operation. System change requests do not always require immediate resolution. Instead, they must be subjected to additional discussion and analysis, including benefit-cost analysis, before specific action is taken.

When a change request is received, the Change Control Administrator consults with the System Manager, Database Administrator and technical support staff, as appropriate, to refine the definition, necessity and consequences of each proposed functional change. The outcome of these discussions may be several software options for achieving the desired performance goals. Each option is then assessed to determine:

• The expected level-of-effort required to implement the change
• The relationships between the programs, modules and interfaces affected by the change
• The impact on the user community.

For both functional and corrective maintenance, system changes must be fully evaluated. The impact of the changes must be viewed in terms of the total cost of a proposed change and any adverse effects on overall system quality. As noted above, the decision to perform adaptive maintenance does not usually reside with a local system administrator or manager, so this evaluation is not necessary in such cases. However, adaptive maintenance may require evaluation of various options for performing the maintenance.

EXHIBIT 3-2
SOFTWARE CHANGE REQUEST/PROBLEM REPORT

ORIGINATOR
ORGANIZATION
PHONE
MAIL CODE
LOG NUMBER
DATE
TIME
CATEGORY

DESCRIPTION OF THE PROBLEM, CONDITIONS UNDER WHICH IT OCCURRED, AND ITS IMPACT ON THE USER, ORGANIZATION, OR AGENCY

STATUS                          DATE
REVIEW
ANALYSIS
BENEFIT/COST
APPROVAL
IMPROVEMENT INCREMENT
MAINTENANCE CYCLE TESTING
IMPLEMENTATION

REVIEWER'S NOTES
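The logging, categorizing and prioritizing of reports described in Sections 3.2.1 and 3.2.2 can be sketched as a small tracking log. This is an illustration only: the class names, the numeric severity scale and the sample entries are assumptions, and the field names are borrowed loosely from the suggested form in Exhibit 3-2. The actual content and format are left to local management.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Category(Enum):
    CORRECTIVE = "corrective"
    FUNCTIONAL = "functional"
    ADAPTIVE = "adaptive"


@dataclass
class Report:
    log_number: int
    originator: str
    received: date
    category: Category
    description: str
    severity: int = 0        # hypothetical scale: higher means more urgent
    status: str = "REVIEW"   # first status row of the suggested form


class ChangeLog:
    """Minimal log kept by the Change Control Administrator (sketch)."""

    def __init__(self):
        self._reports = []

    def log(self, originator, received, category, description, severity=0):
        """Assign the next log number on receipt and store the report."""
        report = Report(len(self._reports) + 1, originator, received,
                        category, description, severity)
        self._reports.append(report)
        return report

    def corrective_queue(self):
        """Problem reports awaiting correction, most severe first."""
        problems = [r for r in self._reports if r.category is Category.CORRECTIVE]
        return sorted(problems, key=lambda r: (-r.severity, r.log_number))

    def open_items(self):
        """Reports not yet implemented (assumes IMPLEMENTATION is the final status)."""
        return [r for r in self._reports if r.status != "IMPLEMENTATION"]


# Illustrative entries only.
log = ChangeLog()
log.log("J. Doe", date(2024, 1, 8), Category.CORRECTIVE, "Report label is wrong", severity=1)
log.log("A. Roe", date(2024, 1, 9), Category.FUNCTIONAL, "Add export option")
log.log("B. Poe", date(2024, 1, 9), Category.CORRECTIVE, "Totals calculated incorrectly", severity=3)
queue = [r.log_number for r in log.corrective_queue()]
```

Numbering each report on receipt and listing every item that has not reached implementation is one simple way to ensure that no report or request is overlooked. Ordering corrective items by severity reflects the prioritization called for when several problems are awaiting correction.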
3.2.3 Benefit-Cost Analysis

Proposed system modifications are subject to the life cycle benefit-cost analysis techniques described in the EPA System Design and Development Guidance, Volume B. Functional maintenance changes in particular must be thoroughly analyzed because they are optional, in the sense that failure to implement them will not adversely affect system performance, as failure to implement corrective and adaptive maintenance changes would. Attention should be paid to assessing the benefits of functional changes since these benefits may be either small or large in relation to the cost of implementation. Because corrective and adaptive maintenance is not optional, benefit-cost analysis is most appropriately used to determine the best option for applying required changes. The depth and formality of the benefit-cost analysis should be determined by the size of the system and the complexity of the proposed modifications.

3.2.4 Change Approval and Action Plan Development

Changes in all maintenance categories must be approved by the Change Control Administrator in consultation with the System Manager and program management where appropriate. Even though changes in the corrective and adaptive maintenance categories are usually not elective, the Change Control Administrator should determine the approach and timing of changes.

Some requested changes will be rejected because they are trivial or not worth the cost and system disruption caused by their implementation. Other changes which are more fundamental may be rejected on the principle that it would be more cost effective and functional to totally re-design the system than to adapt the desired functions to the current software.
The Change Control Administrator is responsible for evaluating the proposed software changes with regard to the following:

• The total cost of the proposed modifications
• The maintenance burden of the current and proposed systems
• The functional requirements of the organization/Agency
• The overall effectiveness of the system
• The efficiency and productivity gained from a re-designed system
• The effects on system security.

The result of this analysis is a determination to complete the proposed functional maintenance, to develop a new version of the software or to declare software obsolescence.

Once the approval decision has been made, the Change Control Administrator or Change Control Board must develop an action plan for effecting the change. The action plan is based on the information developed in the decision-making process and includes a schedule for implementation, design documents and staff assignments. Throughout the maintenance process, the action plan should be monitored for timeliness and accuracy.

If it is determined that a new version of the system is warranted, a software improvement increment and an accompanying maintenance schedule are defined. Declaration of system obsolescence will begin a new software life cycle for the total system redesign. Procedures for initiating this system redesign are similar to those utilized during the system design phase and can be found in the EPA System Design and Development Guidance, Volume C.

Although adaptive maintenance is not initiated through use of either a problem report or a change request, the procedures for analyzing such maintenance are quite similar. An action plan must be developed, scheduled and implemented, and maintenance progress must be monitored, as for corrective and functional maintenance.
3.2.5 Software Improvement Increment

The software improvement increment groups a finite number of enhancements and modifications to be incorporated into the system software. This refers to a group of proposed system changes that has successfully passed through the formal change request review and approval process. Definition of the scope of the software improvement increment is based on the projected level of effort, expected utility gains, budgetary constraints and organizational pressures to improve the system. The documented and communicated release of a new version of the system concludes the software improvement increment.

3.2.6 Maintenance Cycles

One goal of configuration management is to provide a stable system for the user community. Systems that are in a continual state of flux, due to a constant flow of changes, will precipitate user frustration, anger and, ultimately, rejection of the system. In order to confine system changes to orderly schedules, a formal maintenance cycle should be established. The maintenance cycle does not necessarily refer to routinely scheduled maintenance but rather to controlled maintenance. For example, a cycle could be established in which maintenance is performed only after a threshold of demand for system modification, such as a specific number of problems or requests for one change or a specific number of changes, has been reached. Another alternative is to establish a maintenance cycle in which changes are made to only one module or program of a system at a time. In any case, pending system changes should be grouped together and accomplished at one time as part of a software improvement increment. The intention is to provide users with periods of stable operation and known performance characteristics. This also allows the Change Control Administrator time to inform the users of pending changes and instruct them in new operating procedures and/or functional capabilities.
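The threshold-based cycle described above can be pictured with a short Python sketch. It is a minimal illustration under stated assumptions: the class name, the two threshold values and the identifiers used for changes are hypothetical, and an actual system would set thresholds to suit its own user community.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class MaintenanceQueue:
    """Holds approved changes until a demand threshold opens a cycle."""
    max_pending_changes: int = 10      # hypothetical: distinct changes waiting
    max_requests_per_change: int = 3   # hypothetical: requests for one change
    requests: Counter = field(default_factory=Counter)

    def add_request(self, change_id: str) -> None:
        """Count one more problem report or request for the given change."""
        self.requests[change_id] += 1

    def cycle_due(self) -> bool:
        """True when either threshold of demand has been reached."""
        if not self.requests:
            return False
        many_changes = len(self.requests) >= self.max_pending_changes
        one_in_demand = max(self.requests.values()) >= self.max_requests_per_change
        return many_changes or one_in_demand

    def start_increment(self) -> List[str]:
        """Release all pending changes together as one improvement increment."""
        batch = sorted(self.requests)
        self.requests.clear()
        return batch
```

Between calls to start_increment the system is left unchanged, which corresponds to the periods of stable operation the maintenance cycle is meant to provide.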
3.2.7 Return to a Prior Software Version

Problems with newly implemented versions of the system occur occasionally. In most cases, additional problem reports are submitted which begin the software maintenance process again. However, in rare cases, a change that has been implemented has an extremely adverse operational effect on the users and the system output. In such a case, the new version may need to be removed and replaced with the prior version of the system. This return procedure should be performed as soon as possible after a problem has been identified. The Change Control Administrator should make the final decision concerning the need for and timing of the procedure.

3.2.8 Documentation

The analysis and decision-making process that precedes system software modification must be supported by adequate documentation which includes the following items:

• Software Change Request/Problem Report
• Notification of adaptive maintenance, if applicable
• History of analysis and relevant decisions
• Level-of-effort estimates
• Action plan and maintenance schedule
• Evaluation of maintenance effects on system security
• Approval signature
• Statement of Strategy for Software Improvement Increment, which may include the following items:
  - A summary of the analysis undertaken to determine that a new version of the system is warranted
  - An evaluation and benefit-cost analysis of alternative implementations
  - An overview of the required hardware, software and file changes for the increment.

This documentation should be maintained by the Change Control Administrator and made available to the System Manager, Database Administrator and computer center manager or staff. Documentation of system maintenance is an important part of the configuration management process since it ensures that correct information on use and operation of the system is available to system users and operators at all times.
The documents which are prepared during system maintenance are the tools for ensuring an orderly software maintenance process. This maintenance documentation, as described above, also becomes part of the total accumulation of system documentation since details of changes must be appended to (and in some cases, must replace) the existing documentation to form a complete documentation package for the system. In addition, this documentation is an important means for justifying maintenance costs to internal and external auditors.

3.3 SUMMARY

Chapter 3 presents the important elements of configuration management and defines the three types of system maintenance: corrective, functional and adaptive. The main responsibilities of a Change Control Administrator/Change Control Board are reviewing the system functionality, ensuring that coding standards are enforced and system testing procedures are followed, and performing configuration management procedures. These procedures involve systematic evaluation of change requests and problem reports, approval of changes, definition of the software changes to be made, establishment of maintenance cycles and documentation of the decision process. The next chapter presents the important considerations in implementing the changes that have been evaluated, approved and scheduled through configuration management procedures.

Chapter Four
SOFTWARE MAINTENANCE

Software maintenance refers to the actual modification of software and related system manuals. This occurs during the final phases of the system life cycle and is often precipitated by:

• Identification of program "bugs"
• Demand for additional capabilities/features
• Changing functional requirements
• Increase/decrease in scope of a system.

System modifications are often especially difficult to implement because of the constraints imposed by the operational characteristics of the existing system and the need for continuity of system operations.
Orderly modification of a system is necessary to maintain a stable operational environment for its users. When changes are made to a system, they must undergo testing and acceptance in a non-production environment to determine whether they do in fact perform as desired. Once thorough testing is completed, a pre-production quality assurance step is required to ensure that the results of the changes do not adversely affect other parts of the system and that the changes correctly address the original problem. Strict adherence to the procedures described in Section 4.1 will ensure that modifications are implemented in a correct, predictable and orderly manner with minimal adverse effects on the users of the system.

This chapter will discuss the following topics:

• Maintenance Procedures - discusses the relationship of the various procedures and standards established during initial system development and their use during system enhancement
• Maintenance Tools - describes the special automated tools available to systems analysts and programmers that can assist in working with source code
• Documentation - highlights the different system documents that either must be changed with a system modification or at least must be reviewed for accuracy.

4.1 MAINTENANCE PROCEDURES

This section discusses the maintenance procedures and tools established at the time of system development and installation which also facilitate implementation of any new software modifications or enhancements. Management of the maintenance process will be carried out by the Change Control Administrator/Board and will be concerned with the following areas:

• Documentation update
• Source code standards
• Coding and review process
• Testing standards and procedures.

4.1.1 Documentation Update

Maintenance should focus on modification of the entire application, including system manuals, and not on the source code modification alone.
System documentation problems arise when changes to source code are not reflected in the design documents or user-oriented manuals. Whenever a change to data flow, software structure, module procedure or any other related characteristic is made, supporting technical manuals, including the security manual, must be updated. System documentation that does not accurately reflect the current state of the software is probably worse than no documentation at all. Major problems occur when innocent use of unchanged system manuals leads to incorrect assessments of software characteristics. The side effects from documentation shortfalls can be reduced substantially if all applicable system manuals are reviewed prior to any further release of the software.

In some instances, maintenance requests may require no change in software design or source code but indicate a lack of clarity in user-oriented manuals. In such cases, the maintenance effort should focus on redefining and clarifying existing system manuals.

4.1.2 Source Code Standards

The same set of coding standards used during the initial design and development of the system application should be applied to maintenance activities. As discussed in the EPA System Design and Development Guidance, Volume C, EPA has a general set of minimum program design and coding standards which should be used either when designing a new system application or modifying an existing one. These standards promote productivity and maintainability as well as software sharing and reuse.
The important characteristics of the standards are:

• Use of structured programming constructs to control the flow of execution
• Elimination or significant reduction in the use of "branching" statements
• Modularity in source program design and coding
• Good coding practices such as:
  - Naming conventions
  - Symbolic parameters
  - Paragraphing
  - Blocking
  - Indentation of source code
  - Single statement per line
  - Intelligent use of comments
  - Annotation of author and date of any program modifications
  - Error messages.

Source code standards should be reviewed prior to beginning programming of any modifications.

4.1.3 Coding and Review Process

Before the actual coding can begin, a detailed design of what the modified application will look like must be completed. If a major system overhaul is being undertaken, then the detailed design should be formulated using the procedures outlined in Volume C of the EPA System Design and Development Guidance. If the planned changes are smaller in scope, such as adding a new function or correcting a "bug" in the system, then the changes can occur without a large amount of analysis and design documentation. Managers should be aware that ANY change might have a rippling effect through the entire application. It is particularly important to review the effects these changes may have on system security.

Once the detailed design of the modification is completed, then the production and programming functions can be accomplished. At the completion of this task, all of the changes in coding, controls, databases, user procedures and operations procedures will have been developed. Some of the major activities in developing a software modification include:

• Code new software units
• Review software unit codes
• Unit test new software
• Produce unit test reports
• Perform subsystem integration testing
• Prepare subsystem test reports.
Preliminary reviews are accomplished as each piece of new software is added to the original application. These reviews will confirm that the new software product performs according to all requirements and specifications.

4.1.4 Testing Standards and Procedures

The use of testing standards and procedures during the maintenance phase of the system life cycle should be consistent with the standards and procedures set forth during the initial system development. A test plan should be developed for each system and used as a guide to ensure that any modification made to the application is tested thoroughly and the results properly documented. An example of a detailed outline of a Software Test and Acceptance Plan (EEI-7), reproduced from EPA's System Design and Development Guidance, is included in Appendix A.

The testing team must be aware of any "rippling effects" which newly developed software will have on related applications. Testing should not be limited only to the new piece of software; ALL software even remotely related to the modified software should be identified and included in the testing.

4.2 MAINTENANCE TOOLS

If the system was originally designed using special automated tools, such as Computer-Aided Software Engineering (CASE), then it is recommended that these tools be used by the systems analysts and programmers for identifying critical elements of code, making source code changes and identifying and correcting problems. The EPA has approved a number of standard specialized software tools to be used during a system's development and maintenance effort. Further information on approved tools can be obtained by contacting OIRM or the National Data Processing Division (NDPD).
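One kind of analysis that automated tools can support is tracing the "rippling effects" that Section 4.1.4 asks the testing team to identify. The Python sketch below is offered only as an illustration and does not describe any EPA-approved tool. The cross-reference table, the unit names and the function name are hypothetical. Given a table recording which software units use which others, it lists every unit that should be considered for testing when one unit is modified.

```python
from collections import deque
from typing import Dict, Set


def affected_units(used_by: Dict[str, Set[str]], modified: Set[str]) -> Set[str]:
    """Return the modified units plus every unit that depends on them,
    directly or indirectly, using a breadth-first search."""
    seen = set(modified)
    queue = deque(modified)
    while queue:
        unit = queue.popleft()
        for dependent in used_by.get(unit, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


# Hypothetical cross-reference table: unit -> units that call or read it.
xref = {
    "date_utils": {"report_writer", "data_entry"},
    "report_writer": {"monthly_summary"},
    "data_entry": set(),
    "monthly_summary": set(),
    "archive_job": set(),
}

print(sorted(affected_units(xref, {"date_utils"})))
# prints ['data_entry', 'date_utils', 'monthly_summary', 'report_writer']
```

In this example archive_job is not listed because nothing links it to the modified unit. The result is only as complete as the cross-reference table, so units related through shared files or manual procedures would still need to be identified by the testing team.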
4.3 DOCUMENTATION

This section discusses the system manuals and documents prepared during initial system development. Several documents should be reviewed periodically or when software is modified in order to determine if they need to be changed and updated during the maintenance phase of the system life cycle. At the very minimum, the following should be reviewed for accuracy:

• Software Maintenance Manual
• Data Dictionary
• Source Code.

Other examples of system documents which may require update, such as a Software Operations Document (EEI-10) and a User's Reference Guide (EEI-11), were discussed in Section 2.5 of this manual.

4.3.1 Software Maintenance Document

The object of the Software Maintenance Document (EEI-9) is to provide program maintenance staff with both general and specific information on the system configuration and application software. This manual should present guidelines and procedures for performing maintenance. Some areas which should be addressed include:

• Source code standards
• System manual update
• Change control process
• Testing standards and procedures
• Maintenance tools
• Security.

A detailed outline of a Software Maintenance Document (EEI-9) is presented in Appendix A.

4.3.2 Data Dictionary

A data dictionary is a collection of information about the data used in a system. Although in some cases a data dictionary must be developed manually, the term itself usually refers to a dictionary maintained by special data dictionary software. It is very important during the maintenance phase of a system life cycle that the data dictionary be updated if changes are made to the structure or definition of data in the system. If updated properly, the data dictionary will provide a consistent official description of data as well as maintain consistent data names required for programming and retrieval.
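As a rough picture of how dictionary software can keep names and descriptions consistent, the Python sketch below stores one record per data item and rejects entries that break a naming rule. Everything in it is an assumption made for illustration: the class names, the particular fields kept and the naming rule itself are hypothetical and are not taken from any EPA standard.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical naming standard: 3 to 30 characters, upper-case letters,
# digits and underscores, starting with a letter.
NAME_RULE = re.compile(r"^[A-Z][A-Z0-9_]{2,29}$")


@dataclass
class DictionaryEntry:
    name: str
    description: str
    source: str
    data_format: str
    default: str = ""
    used_by: List[str] = field(default_factory=list)  # screens, reports, programs


class DataDictionary:
    """Keeps one official entry per data name."""

    def __init__(self) -> None:
        self._entries: Dict[str, DictionaryEntry] = {}

    def define(self, entry: DictionaryEntry) -> None:
        """Add or replace an entry after checking the assumed standards."""
        if not NAME_RULE.match(entry.name):
            raise ValueError(f"name does not follow the standard: {entry.name!r}")
        if not entry.description.strip():
            raise ValueError(f"description is required for {entry.name!r}")
        self._entries[entry.name] = entry

    def lookup(self, name: str) -> DictionaryEntry:
        """Return the official entry for a data name."""
        return self._entries[name]
```

Updating an entry through define whenever the structure or definition of a data item changes keeps the dictionary in step with the system, which is the maintenance concern raised in this section.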
The dictionary should contain such information as the following:

• Name
• Description
• Source
• Users of the data, including screens, reports, programs and organizations that access and use the data
• Key words used for categorizing and searching for data item descriptions
• Format
• Quality
• Precision
• Defaults
• Edit criteria
• Security requirements.

Data dictionaries may be used by the Database Administrator to enforce standards for names and descriptions, ensuring that those who create data follow the standards.

4.3.3 Source Code

The final element of system documentation which should be revised at the conclusion of any maintenance effort is the complete listing of the application system source code. For large systems, storing voluminous source code listings could prove to be unwieldy. In this case, the Software Maintenance Document (EEI-9) should provide instructions for obtaining source code printouts on an as-needed basis.

4.4 SUMMARY

This chapter defines the software modification process and the associated change analysis and system manual documentation. The software maintenance procedures are concerned with the important activities of updating system manuals during system maintenance and ensuring that standards for source code development and systematic procedures for developing, reviewing and testing source code are followed. Important system documents required to ensure proper maintenance of a system include the Software Test and Acceptance Plan (EEI-7), Software Maintenance Document (EEI-9), Software Operations Document (EEI-10) and User's Reference Guide (EEI-11).

APPENDIX A
ESSENTIAL ELEMENTS OF INFORMATION

The EEIs associated with the design, development, operations, and maintenance of a computer system are listed below and are outlined within Volumes A, B and C of the EPA System Design and Development Guidance. Those EEIs which are useful as guidelines during system operation and maintenance are preceded by an asterisk (*) and are outlined in this appendix.

  EEI-1   Mission Needs Statement
  EEI-2   Preliminary Design and Options Analysis
  EEI-3   Project Management Plan
  EEI-4   System Implementation Plan
  EEI-5   System Detailed Requirements Document
  EEI-6   Software Management Plan
* EEI-7   Software Test and Acceptance Plan
  EEI-8   Software Design Document
* EEI-9   Software Maintenance Document
* EEI-10  Software Operations Document
* EEI-11  Software User's Reference Guide
  EEI-12  System Integration Test Reports

EEI-7 SOFTWARE TEST AND ACCEPTANCE PLAN

1. INTRODUCTION
   1.1 Purpose
   1.2 Background
   1.3 Scope
   1.4 System References
   1.5 Terms and Abbreviations
   1.6 Organization of This Document
2. REFERENCED DOCUMENTS
   2.1 Government Documents
   2.2 Non-government Documents
3. LIMITATIONS/TRACEABILITY
   3.1 Limitations
   3.2 Traceability
4. TEST PLANS
   4.1 Software Unit Testing (includes Manual Procedures)
       4.1.1 Test Requirements
       4.1.2 Test Management
             4.1.2.1 Integration Test Team Organization and Responsibility
             4.1.2.2 Responsibilities of Other Organizations
             4.1.2.3 Product Control
             4.1.2.4 Test Control
             4.1.2.5 Evaluation and Retest Criteria
             4.1.2.6 Test Reporting
             4.1.2.7 Test Review
             4.1.2.8 Test Identification
             4.1.2.9 Test Data Environment
       4.1.3 Test Schedule
       4.1.4 Test Results
   4.2 Integration Testing of Software Units, Modules and Software Functions/Risk Management
       4.2.1 Integration Test Requirements
       4.2.2 Integration Test Management
       4.2.3 Integration Test Categories
       4.2.4 Integration Test Methods
       4.2.5 Integration Test Schedules
       4.2.6 Integration Test Results
             4.2.6.* (Insert Name) Integration Test
   4.3 Required Resources
       4.3.1 Facilities
       4.3.2 Hardware
       4.3.3 Interface/Support Software
       4.3.4 Personnel
   4.4 System Test
       4.4.1 System Test Requirements
       4.4.2 System Test Management
       4.4.3 System Test Categories
       4.4.4 System Test Methods
       4.4.5 System Test Schedules
       4.4.6 System Test Results
5. USER ACCEPTANCE
   5.1 Test Team
   5.2 Pretest Preparations
       5.2.1 Development of Test Scenarios and Test Data
       5.2.2 Development of Predicted Results
       5.2.3 Development of Acceptance Procedures
   5.3 Test Execution
       5.3.1 Data Analysis
       5.3.2 Test Evaluation
       5.3.3 Problem Report and Problem Resolution Process
   5.4 Formal Acceptance
       5.4.1 Test Report
             5.4.1.1 Detailed Test History
             5.4.1.2 Detailed Test Results
                     5.4.1.2.* (Insert Test Name) Test Results
6. NOTES
7. APPENDICES
8. GLOSSARY

EEI-9 SOFTWARE MAINTENANCE DOCUMENT

1. INTRODUCTION
   1.1 Purpose
   1.2 Background
   1.3 Scope
   1.4 System References
   1.5 Terms and Abbreviations
   1.6 Organization of This Document
2. REFERENCED DOCUMENTS
   2.1 Government Documents
   2.2 Non-government Documents
3. MAINTENANCE PROCEDURES
   3.1 Source Code Standards
   3.2 Documentation Update (including non-software elements)
   3.3 Coding and Review Process
       3.3.1 Top Down Approach
       3.3.2 Peer Review
       3.3.3 Walkthrough
       3.3.4 Team Leader Review
   3.4 Change Control Process
       3.4.1 Change Request
       3.4.2 Code Review
       3.4.3 Review and Approval
             3.4.3.1 Maintainer
             3.4.3.2 User
   3.5 Testing Standards and Procedures
       3.5.1 Test Plans
       3.5.2 Test Data
       3.5.3 Test Scenarios
       3.5.4 Test Results
   3.6 Change Implementation Methods
       3.6.1 Test to Production Method
4. MAINTENANCE TOOLS
   4.1 Technical Tools
       4.1.1 Processing Tools
             4.1.1.1 Compilers
             4.1.1.2 Cross Reference
             4.1.1.3 File Comparator
             4.1.1.4 Traces/Dumps
             4.1.1.5 Test Data Generator
             4.1.1.6 Test Coverage Analyzer
             4.1.1.7 Preprocessor
             4.1.1.8 Verification/Validation
       4.1.2 Clerical Tools
             4.1.2.1 On-line Editor
             4.1.2.2 Documentation Library
             4.1.2.3 Archival Processor
             4.1.2.4 Source Code Reformatter
             4.1.2.5 Data Dictionary
   4.2 Management Tools
       4.2.1 Problem Reporting
       4.2.2 Status Reporting
       4.2.3 Scheduling
       4.2.4 Configuration Management
5. SOURCE CODE
   5.* (Insert Software Unit Name) Source Listing
6. NOTES
7. APPENDICES
8. GLOSSARY

EEI-10 SOFTWARE OPERATIONS DOCUMENT

1. INTRODUCTION
   1.1 Purpose
   1.2 Background
   1.3 Scope
   1.4 System References
   1.5 Terms and Abbreviations
   1.6 Organization of This Document
2. REFERENCED DOCUMENTS
   2.1 Government Documents
   2.2 Non-government Documents
3. OPERATIONS
   3.1 System Initialization
   3.2 System Restart
       3.2.* (Insert Name) Function
             3.2.*.1 Execution
             3.2.*.2 Inputs
                     3.2.*.2.1 User Inputs
                     3.2.*.2.2 System Inputs
             3.2.*.3 Outputs
             3.2.*.4 Termination
             3.2.*.5 Error Messages
   3.3 System Manager
       3.3.1 Manager's Functions/Menu
             3.3.1.* (Insert Name) Function
   3.4 System Backup/Recovery Provisions
   3.5 System Security
4. NOTES
5. APPENDICES
6. GLOSSARY

EEI-11 SOFTWARE USER'S REFERENCE GUIDE

1. INTRODUCTION
   1.1 Purpose
   1.2 Background
   1.3 Scope
   1.4 System References
   1.5 Terms and Abbreviations
   1.6 Organization of This Document
2. REFERENCED DOCUMENTS
   2.1 Government Documents
   2.2 Non-government Documents
3. DESCRIPTION OF THE SYSTEM
   3.1 System Overview and Mission Based Activities
   3.2 System Flow and Data Descriptions
   3.3 System and Program Manager
   3.4 Data Dictionary
4. SYSTEM ACCESS TECHNIQUE(S)
   4.1 Hardware/Software Interface(s)
   4.2 Menus and Other Methods of Access
   4.3 Manual Procedures
5. USER ANALYSIS/REPORTING OPTIONS
   5.1 Standard Reports
   5.2 Ad-hoc Capabilities
   5.3 Specialized Capabilities
       5.3.1 Models, Algorithms, Etc.
       5.3.2 Graphics
       5.3.3 Expert Systems
       5.3.4 Laser and Other Output Media
6. DATA ENTRY AND UPDATE PROCESSES
   6.1 Methods and Descriptions of Processes
   6.2 Data Responsibilities
7. USER SUPPORT AND TRAINING PROGRAM/SOURCES
   7.1 User Support
   7.2 Training Sources/Schedules
8. NOTES
9. APPENDICES
10. GLOSSARY

APPENDIX B
GLOSSARY

Adaptive Maintenance - Software changes that are made in response to forces from outside the system environment.

Application Software - Computer program(s) designed to perform the automated data processing operations associated with specific application requirements.

Archiving - Permanently storing system data as an historical record of system activity for the purpose of performing time series comparisons and projections.

Change Control Administrator - The person tasked with the responsibility of evaluating, planning and controlling required system changes as part of the configuration management process. For smaller systems, one person is often sufficient to accomplish all of these tasks.
Change Control Board - For larger systems, that group of managers appointed the responsibility for evaluating, planning and controlling required system changes as part of the configuration management process.

Computer Center - The concept of a "computer center" relates to the hardware and software environment necessary to support a system application, usually associated with the operations of a major computer center, but is also just as valid when discussing minicomputers or local end-user computing workstations.

Configuration Management - Management and implementation methodologies associated with increasing or correcting system capabilities, a partial system redesign, or determining software obsolescence.

Corrective Maintenance - System changes made in response to abnormal and/or debilitating system performance.

Data Dictionary - Collection of information about the data used in a system. This includes names and descriptions of data elements, and data source and format.

Database Administrator - That person primarily responsible for managing the data within a system. This includes supervising data entry, update, and deletion procedures.

Functional Maintenance - System changes performed in response to user requests for enhanced system performance.

Maintenance Cycle - The periodic identification, evaluation, selection, implementation and testing of system changes.

Modularity - The separating of system functions and program source code into independent but related groups to facilitate system development and maintenance.

Parameters - Individual variables used by software developers as a mechanism for altering the performance characteristics of an application system by varying baseline criteria rather than reprogramming.

Software Improvement Increment - A finite grouping of enhancements and modifications to be incorporated into the system software at one time.
Software Maintenance Manual - A programmer's technical reference guide used as a tool for implementing software changes.

Software Operations Manual - A manual providing the system operations staff with a procedural reference for maintaining a stable and viable system.

Software User's Reference Guide - A guide to provide users with the information necessary to effectively utilize all system functions.

Source Code - Internally documented computer-generated program listings.

Supervisory Controls - Controls established to assist in monitoring the daily operation of a system.

System Access Techniques - The means by which a user interacts with an automated system.

System Life Cycle - The complete evolution of an application system through its various phases: initial needs analysis, design, development, implementation, operations and maintenance, and eventual obsolescence.

System Manager - The person tasked with the responsibility of ensuring a good interface between the system and the user community.

System Manager's Guide - A detailed procedural guide to the performance of administrative activities associated with a system.

System Operator - The person(s) responsible for system operation activities, including operating the hardware and maintaining the stability of the application software for the user community.

Table File - A group of variables or constants with like attributes, used by software developers as a mechanism for altering the performance characteristics of an application system by varying baseline criteria rather than reprogramming.