DCN No. 81-240-016-07-13
Guidance on Data Handling
And Analyses In An Inspection/
Maintenance Program
FINAL REPORT
1 December 1981
Prepared for:
U.S. Environmental Protection Agency
I/M Staff
2565 Plymouth Road
Ann Arbor, Michigan 48105
Radian Corporation
-------
EPA Report No. EPA 460/3-82-007
DCN 81-240-016-07-13
GUIDANCE ON DATA HANDLING AND
ANALYSES IN AN
INSPECTION/MAINTENANCE PROGRAM
Final Report
Prepared by:
Radian Corporation
Prepared for:
U.S. Environmental Protection Agency
I/M Staff
2565 Plymouth Road
Ann Arbor, Michigan 48105
1 December 1981
8501 Mo-Pac Blvd./P.O. Box 9948/Austin, Texas 78766 / (512)454-4797
-------
ABSTRACT
Under contract with the Inspection/Maintenance (I/M) Staff of the
U.S. Environmental Protection Agency, Radian provided guidance on
the establishment of a data handling and analyses system for an
I/M program. The report first discusses various uses of data in
an I/M program. Details are then presented on statistical analy-
sis and sampling techniques along with ways of presenting the
data. The report also contains a discussion of data collection
and handling techniques, including provisions for quality
control.
NOTICE
This report was prepared for the Environmental
Protection Agency by Radian Corporation,
Austin, Texas, in fulfillment of EPA Contract
No. 68-02-3513, Task Order No. 7 for the
purpose of assisting state and local agencies
in implementing effective vehicle emissions
inspection and maintenance programs. The
contents of this report are reproduced herein
as received from Radian Corporation. The
opinions, findings, and conclusions expressed
are those of the author and are not necessarily
those of the Environmental Protection Agency.
Mention of company or product names is not to
be considered as an endorsement by the
Environmental Protection Agency.
-------
TABLE OF CONTENTS
Page
ABSTRACT i
LIST OF FIGURES v
LIST OF TABLES v
1.0 INTRODUCTION 1-1
2.0 THE USE OF DATA IN AN I/M PROGRAM 2-1
2.1 Evaluating the Performance of the
Inspection Facilities 2-2
2.1.1 Data Required 2-2
2.1.2 Analysis of Failure Rates 2-4
2.1.3 Analysis of Repair Data in
Decentralized Programs 2-6
2.1.4 Analysis of Emission Levels 2-6
2.1.5 Analysis of Waiver Rates 2-7
2.1.6 Analysis of Data From Facility
Audits 2-8
2.1.7 Other Analyses to Evaluate the Per-
formance of Inspection Facilities.... 2-9
2.2 Revising Cutpoints 2-10
2.3 Evaluating and Enhancing the Effectiveness
of Sticker Enforcement 2-12
2.3.1 Data Required 2-12
2.3.2 Calculating the Compliance
Coefficient 2-13
2.3.3 Identifying Non-Complying Vehicles... 2-14
2.4 Determine Sticker Accountability 2-15
2.5 Determine Fee Accountability (for
Contractor Operated Programs) 2-16
2.5.1 Basic Accounting Methodology 2-17
2.5.2 Validating the Number of
Inspections 2-18
2.6 Evaluate the Vehicle Waiver System 2-19
ii
-------
TABLE OF CONTENTS (Continued)
Page
2.6.1 Determining the Basic Waiver Rate.... 2-20
2.6.2 Determining Legitimacy of Waivers.... 2-20
2.6.3 Characterizing Vehicles Receiving
Waivers 2-21
2.7 Quantify Idle Emission Reductions 2-21
2.7.1 Idle Emission Reductions from
Repairs 2-22
2.7.2 Comparing I/M Areas with Non-I/M
Areas 2-23
2.8 Reporting 2-24
2.8.1 Reports to the Legislature 2-24
2.8.2 Reports to the Public 2-25
2.8.3 Reports to Other Groups and
Agencies 2-26
2.9 Additional Objectives of I/M Data Analysis.. 2-26
2.9.1 Evaluating the Repair Process 2-26
2.9.2 Evaluating Inspector Performance 2-29
2.9.3 Evaluate the Performance of Test
Equipment 2-30
2.9.4 Determine the Effectiveness of the
Public Awareness Programs 2-31
2.10 Summary of Data Requirements 2-32
3.0 DATA ANALYSIS TECHNIQUES 3-1
3.1 Basic Analysis Techniques 3-1
3.1.1 Tabulations of Data 3-1
3.1.2 Averages and Standard Deviations 3-2
3.1.3 Statistical Analysis Packages 3-4
3.2 Limits to Data Analysis 3-4
3.2.1 Manual Data Analysis 3-6
3.2.2 Computer Assisted Analysis 3-9
3.2.3 Use of a Data Base Management
System 3-9
3.3 Presentation of Data 3-10
3.3.1 Tables 3-10
iii
-------
TABLE OF CONTENTS (Continued)
Page
3.3.2 Graphs 3-10
3.3.3 Bar Charts 3-14
3.3.4 Reports Generated by Manipulating
a Data Base 3-14
3.4 Sampling 3-14
3.4.1 Determining Sample Sizes 3-15
3.4.2 Determining Sample Size for
Selection of Cutpoints 3-20
4.0 DATA COLLECTION AND HANDLING 4-1
4.1 Data Collection 4-1
4.1.1 Data Collection in Decentralized
Programs 4-1
4.1.2 Data Collection in Centralized
Programs 4-7
4.2 Transcription of Data 4-10
4.3 Quality Assurance of the Data Base 4-11
4.4 Cost of Data Processing 4-12
4.4.1 Manual Data Processing 4-13
4.4.2 Computer Assisted Data Processing.... 4-13
4.4.3 Data Processing With an On-Line
Data Base Management System 4-15
APPENDICES
A Development of A Formula to Determine Sample
Size for Cutpoints Selection A-1
B Derivation of Expression for Sample Size
Required for Mean Estimation B-1
C Comprehensive List of Inspection Data C-1
iv
-------
LIST OF FIGURES
1 Data Requirements to Meet Objectives 2-36
2 Sample Tally Sheet 3-7
3 Monthly Vehicle Inspection Report 3-13
4 Example Machine Readable Vehicle Inspection Record... 4-4
5 Sample Data Form 4-6
6 Oregon--Light-Duty Vehicle Testing Summary 4-8
LIST OF TABLES
1 Data Required to Evaluate the Performance of the
Inspection Facilities 2-3
2 Sample Calculation of A Standardized Failure Rate.... 2-5
3 Data Required to Revise Cutpoints 2-11
4 Data Required to Evaluate Sticker Enforcement 2-13
5 Data Required to Account for Fees 2-17
6 Data Required to Review Waivers 2-19
7 Data Required to Evaluate Idle Emission Reductions... 2-22
8 Data Required for Reporting 2-25
9 Data Required to Review Repairs 2-27
10 A Tally of Phone Calls Received By California's
I/M Program 2-33
11 Summary of Data Required 2-34
12 Items That Could Be Tallied in I/M Program 3-3
13 Statistical Analysis Packages 3-5
14 California - Selected Tables From the MVIP
Annual Report 3-11
15 Statistical Report of Colorado's I/M Program 3-12
16 Standard Normal Distributions 3-16
17 The Values of χ² 3-17
18 Preliminary Sample of HC Emissions 3-19
v
-------
1.0 INTRODUCTION
An inspection/maintenance program is a system and like many other
systems it depends on feedback. In an I/M program, data provide
the feedback. Although many I/M options exist, the proper opera-
tion of any program depends upon knowledge of how the program is
actually operating. This is accomplished through the timely and
accurate collection and reporting of data to measure the effec-
tiveness of the program. This document is intended to serve as a
reference on the use of data in an I/M program.
There are two main purposes for this document. One is to de-
scribe specific techniques to analyze I/M data in order to gener-
ate the maximum amount of useful information. The other purpose
is to provide guidance on the establishment of a data collection
and handling system. Since an I/M program could generate massive
quantities of data, it is important to identify the key data
elements. This document recommends data items which should be
collected and offers suggestions on specific techniques to do so.
These techniques include data collection forms that could be used
and the method of data transcription and storage.
The following section discusses how data can be used to perform a
variety of tasks in an I/M program. These tasks vary from revising
the cutpoints to evaluating the performance of the inspection
facilities. Analysis techniques and a list of data elements
are provided for each task or objective, and it is noted which
tasks require computers. Section 3 provides details on statisti-
cal analysis and sampling techniques along with ways of present-
ing the data. A detailed discussion of data collection and
handling techniques, including provision for quality control, is
provided in Section 4. The Appendix contains comprehensive lists
of data that can be recorded in an I/M program.
1-1
-------
2.0 THE USE OF DATA IN AN I/M PROGRAM
Many objectives can be met by analyzing data from an I/M program.
Some of these objectives are basic to the day-to-day operation of
the program. For example, data analyses can be used to evaluate
the performance of inspection facilities in order to identify the
facilities which are performing improper inspections or are
recommending too many waivers. Data analyses can also be used to
revise the program cutpoints (the emissions standards used to
fail the vehicles). The effectiveness of the enforcement activi-
ties can be determined by data analyses, and an accurate account-
ing can be made on the issuance of the inspection stickers or
certificates. Data analyses can also be used to evaluate the
vehicle waiver system to determine if waiver policies need to be
revised. Finally, it can serve the basic function of providing
reports for the public, legislative agencies, service industries
and special interest groups.
Aside from the basic objectives of data analyses there are addi-
tional objectives that could enhance operation of the program.
For example, it may be useful to evaluate the effectiveness of
different types of repairs in order to enhance the mechanics
training program. It may also be useful to evaluate the perfor-
mance of the individual inspectors as well as the inspection
facilities.
Consumer protection would be highlighted if data analyses were
used to evaluate the performance of the individual repair facil-
ities. Facilities charging excessively or performing inadequate
repairs could be identified. The performance of the individual
emissions analyzers can also be determined by data analysis, as
could the quality of the calibration gases that are used in these
analyzers. Finally, data analyses could help determine the
effectiveness of a public awareness program.
2-1
-------
2.1 Evaluating the Performance of the Inspection
Facilities
It is important to determine if accurate emission inspections are
being performed. This is particularly true for decentralized
programs, since there are many possibilities for error in the
inspection. In these programs, the training of the inspectors
varies considerably, as does the quality of the emission ana-
lyzers that are used. In centralized programs, high quality
emissions analyzers generally are used, and there is much more
consistency in the training of the inspectors. Consequently,
data analyses play a greater role in the quality control of
decentralized programs. However, the techniques used to evaluate
the performance of the decentralized inspection facilities are
also applicable to centralized programs, particularly the super-
vision of fleet inspections.
Considerable data analysis is required to determine if correct
inspections are being performed. There is no single parameter
that will identify stations that are performing inaccurate or
fraudulent inspections. Furthermore, data analysis in itself
will not be sufficient to suspend a station's license to inspect.
However, it should be sufficient to identify likely candidates
for frequent audits, spot checks, or training. Essentially, the
I/M program manager must observe trends to determine stations
with possible quality control problems.
2.1.1 Data Required
The I/M manager needs to analyze several different pieces of data
in order to identify stations that may have quality control
problems. Table 1 presents the specific data that are needed for
such an analysis. The basic data items shown on this table will
2-2
-------
TABLE 1. DATA REQUIRED TO EVALUATE THE PERFORMANCE OF THE
INSPECTION FACILITIES

BASIC DATA

Vehicle Inspection Records
- Facility ID (2.1.2)*
- Test sequence (initial, after garage repair, after outside
  repair) (2.1.2)
- Pass/Fail status (including waivers issued) (2.1.2)

Facility Audit Data (2.1.6)
- Date of check
- Results of analyzer calibration checks
- Results of record checks
- Make and model of analyzer
- Results of inspector proficiency checks

SUGGESTED ADDITIONAL DATA

Vehicle Inspection Records
- Odometer (2.1.2)
- Type of repair (items replaced or repaired) (2.1.3)
- Vehicle type (LDV, LDT, HDG, etc.) or gross vehicle weight
  (2.1.2)
- Date of inspection (2.1.7)
- Repair cost (2.1.3)
- HC reading (before and after repairs) (2.1.4)
- CO reading (before and after repairs) (2.1.4)
- License or VIN (2.1.7)
- Model year (or cutpoint category) (2.1.2)
- Make (2.1.2)

Vehicle Registration Records (2.1.2)
- License or VIN
- Model year (and/or make)
- Vehicle type or gvw

Roadside or Challenge Check Data (2.1.7)
- Date of check
- License or VIN
- HC reading
- CO reading
- P/F
- Inspection facility (if indicated on sticker) or sticker
  serial number to trace back to the facility

Complaints (2.1.7)
- Type
- Facility

*Numbers in parentheses refer to sections that describe how the
data are used.
2-3
-------
be sufficient for most analyses, although the suggested addi-
tional items will enhance the results. (The Appendix contains
detailed lists of data.)
Important data for determining the quality of the inspections are
contained on the inspection records; consequently, it is
extremely important that these data be accurately recorded.
Another very important source of data is the results of facility
audit checks. Data from roadside or challenge checks may also
give an indication of the quality of the inspection, as could a
tally of complaints. The following sections discuss the analyses
that may be performed to identify stations that may have quality
control problems.
2.1.2 Analysis of Failure Rates
A key indication of the performance of any inspection facility is
the reported failure rate. A simple method to identify stations
with questionable failure rates is to first determine the overall
failure rate by facility. The stations could then be sorted by
failure rate, and those with extremely high or low failure rates
could be more frequently audited by the administrating agency.
Some of the outliers may be correctly inspecting vehicles; there-
fore additional analysis will be useful. Failure rates may vary
by model year and cutpoint category; thus, if these data are
collected, a breakdown by such categories would be very useful.
These breakdowns can be performed manually if each station pro-
vides tabulated results of the inspection (see Section 3.2.1).
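The sorting and screening step described above can be sketched in a few lines of Python. This is only an illustration: the record layout (facility ID plus a pass/fail flag for the initial test), the function names, and the 5% and 50% screening thresholds are assumptions made for the example, not values recommended by this report.

```python
from collections import defaultdict

def failure_rates(records):
    """Return {facility_id: failure rate} from (facility_id, passed) pairs."""
    tested = defaultdict(int)
    failed = defaultdict(int)
    for facility_id, passed in records:
        tested[facility_id] += 1
        if not passed:
            failed[facility_id] += 1
    return {f: failed[f] / tested[f] for f in tested}

def flag_outliers(rates, low, high):
    """Sort facilities by failure rate and keep those outside [low, high]."""
    ranked = sorted(rates.items(), key=lambda item: item[1])
    return [(f, r) for f, r in ranked if r < low or r > high]

# Hypothetical initial-test records: (facility ID, passed?)
records = (
    [("A", False)] * 2 + [("A", True)] * 8      # 2 of 10 failed
    + [("B", False)] * 6 + [("B", True)] * 4    # 6 of 10 failed
    + [("C", True)] * 10                        # none failed
)
rates = failure_rates(records)
# Thresholds are illustrative only; an agency would choose its own.
outliers = flag_outliers(rates, low=0.05, high=0.50)
```

Facilities returned in `outliers` are candidates for more frequent audits, not proof of improper inspections.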
2-4
-------
If computers and automated data collection equipment are
available, the following analyses can be performed.
o It is advantageous to be able to isolate
vehicles that are known to have low failure
rates. These vehicles include the 1979 and
later General Motors light duty vehicles and
the 1981 and later (three-way catalysts)
vehicles. Therefore, the failure rates could
be broken down further by make and possibly
gross vehicle weight or vehicle type. (This
latter breakdown is not necessary if the
program only applies to light duty vehicles.)
o Calculation of the average odometer reading
for each group of vehicles may provide addi-
tional insight, since the higher mileage
vehicles stand a greater chance of failing
the test.
o The administrating agency may find it useful
to standardize the failure rate. In order to
arrive at a standardized failure rate for
each facility, it is best to determine the
failure rates for the different groups of
vehicles (e.g., similar model years) and then
use a standard weight factor for each group
to determine standardized failure rate (see
Table 2.) The standard weighting factors
could be determined by tabulating registra-
tion data.
TABLE 2. SAMPLE CALCULATION OF A STANDARDIZED FAILURE RATE

                         FAILURE    WEIGHTING    STANDARDIZED
CATEGORY   MODEL YEAR    RATE       FACTOR¹      FAILURE RATE
  (1)         (2)        (3)          (4)         (3) x (4)

   1        Pre 68       30%          .10             3
   2        68-69        25%          .20             5
   3        70-74        35%          .20             7
   4        75-79        20%          .40             8
   5        80 +         10%          .10             1
                                                  = 24%

¹Derived from vehicle registration records.
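The arithmetic in Table 2 can be reproduced with a short Python sketch. The failure rates and weighting factors are taken from the table; the function name and the check that the weights sum to 1 are additions made for illustration.

```python
# Failure rates and weighting factors by model-year category (Table 2).
failure_rate = {"Pre 68": 0.30, "68-69": 0.25, "70-74": 0.35,
                "75-79": 0.20, "80+": 0.10}
weight = {"Pre 68": 0.10, "68-69": 0.20, "70-74": 0.20,
          "75-79": 0.40, "80+": 0.10}

def standardized_failure_rate(failure_rate, weight):
    """Weighted sum of group failure rates using standard weights."""
    # Assumed sanity check: the weighting factors should sum to 1.
    if abs(sum(weight.values()) - 1.0) > 1e-9:
        raise ValueError("weighting factors must sum to 1")
    return sum(failure_rate[c] * weight[c] for c in failure_rate)

rate = standardized_failure_rate(failure_rate, weight)
print(f"{rate:.0%}")  # 24%
```

Because every facility is scored with the same weights, two stations with different mixes of model years can be compared on an equal footing.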
2-5
-------
2.1.3 Analysis of Repair Data in Decentralized Programs
In decentralized programs, an analysis of repair data will
greatly aid surveillance efforts. This is particularly true of
stations that have higher than average failure rates, since they
may actually be performing accurate inspections. Computers play
an important role in the collection and analysis of repair data.
Without computers, the analysis of repair data is largely
limited to spot checking and qualitative review of repair costs
or invoices.
If repair cost data are collected automatically, the average
costs for each station can be calculated. Those stations with
extremely high or low costs could be investigated.
Information on the type of repair can help determine if facil-
ities are charging excessive costs for simple repairs. In addi-
tion, the type of repair data will help to determine whether or
not adequate repairs are being performed. For example, if the
repair facility only adjusts the idle mixture and never performs
more extensive repairs, then that facility may not be achieving
adequate HC emission reductions.
2.1.4 Analysis of Emission Levels
A further screening technique for those facilities identified as
outliers is to analyze the emission levels that are recorded dur-
ing the inspection. Although the available analysis is limited,
the following techniques can be used to determine if the station
is falsifying the emission levels recorded on forms. Like the
preceding analysis, this analysis would require some form of
automated data collection.
2-6
-------
o One method to check on the quality of the
recorded levels would be to calculate the
means and standard deviations of the emission
levels (by model year and possibly by make
and model year groupings). These calcula-
tions should be made for each test facility
and for the entire population. Facilities
that vary greatly from the entire population
could be suspect.
o Another approach to determine expected means
and standard deviations would be to identify
certain high volume stations that appear to
be performing accurate inspections as indi-
cated by failure rate, facility audits, dis-
cussions with owners, the inspection equip-
ment, and so forth. Emission levels from
these stations could be a more accurate
representation of the vehicle population.
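As a rough illustration of the first technique, the sketch below compares each facility's mean reading with the mean of all readings, for a single model-year group. The screening rule (difference of more than `max_z` standard errors), the function name, and the sample HC readings are assumptions chosen for the example; the report does not prescribe a specific test.

```python
from math import sqrt
from statistics import mean, stdev

def suspect_facilities(readings_by_facility, max_z):
    """Flag facilities whose mean reading is far from the population mean."""
    population = [x for r in readings_by_facility.values() for x in r]
    pop_mean = mean(population)
    pop_sd = stdev(population)
    flagged = []
    for facility, readings in readings_by_facility.items():
        # Standard error of a facility mean based on its number of tests.
        standard_error = pop_sd / sqrt(len(readings))
        z = (mean(readings) - pop_mean) / standard_error
        if abs(z) > max_z:
            flagged.append(facility)
    return flagged

# Hypothetical idle HC readings for one model-year group.
hc_readings = {
    "A": [200, 220, 240, 260],
    "B": [210, 230, 250, 270],
    "C": [20, 25, 30, 25],   # unusually low and uniform
}
flagged = suspect_facilities(hc_readings, max_z=2.0)
```

Under the second technique, the population mean and standard deviation could instead be computed only from stations believed to be inspecting accurately.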
2.1.5 Analysis of Waiver Rates
Another basic means of measuring facility performance is to tally
the number of waivers that are issued by each facility. Those
stations with extremely high waiver rates (as a percentage of the
failed vehicles) may be performing inadequate repairs and those
with low waiver rates may be performing inadequate inspections.
Additional analysis can help to determine if the outliers (with
respect to waivers) may actually be performing adequate inspec-
tions and repairs. Waiver rates may vary for different model
year vehicles or cutpoint categories. Facilities which inspect
primarily newer vehicles, such as new car dealerships, may have
significantly different waiver rates than the average facility.
Thus, if model year data were collected along with the waiver
data, the differences (if any) in waiver rates for different
groups of vehicles could be determined. These differences could
be considered when waiver rates for different inspection facil-
ities (especially the outliers) are being analyzed.
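A minimal sketch of this tally follows, computing the waiver rate as a fraction of failed vehicles for each facility and model-year group. The record layout and the sample data are hypothetical.

```python
from collections import Counter

def waiver_rates(failed_vehicles):
    """Waiver rate per (facility, model-year group) among failed vehicles.

    failed_vehicles: iterable of (facility_id, group, was_waived) tuples,
    one per vehicle that failed its initial test.
    """
    failed = Counter()
    waived = Counter()
    for facility_id, group, was_waived in failed_vehicles:
        failed[(facility_id, group)] += 1
        if was_waived:
            waived[(facility_id, group)] += 1
    return {key: waived[key] / failed[key] for key in failed}

# Hypothetical failed-vehicle records.
failed_vehicles = (
    [("A", "70-74", True)] + [("A", "70-74", False)] * 3
    + [("A", "75-79", False)] * 4
    + [("B", "75-79", True)] * 2 + [("B", "75-79", False)] * 2
)
rates = waiver_rates(failed_vehicles)
```

Comparing facilities within the same model-year group keeps a station that sees mostly newer vehicles from being flagged only because of its vehicle mix.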
2-7
-------
2.1.6 Analysis of Data From Facility Audits
Facility audits are excellent sources of data on the performance
of a test facility. In fact, these checks plus unannounced spot
checks probably are the only solid legal basis for station sus-
pension. These checks mainly address two facets of the emission
inspection:
o The accuracy of the emissions analyzer; and
o The proficiency of the inspectors to perform
emission tests and other duties, such as
recording data and analyzer maintenance.
A considerable amount of data can be generated from audits;
however, the bottom line is the overall accuracy of the inspec-
tion facility. This is determined by reviewing both the results
of the analyzer accuracy check and the proficiency check. The
raw audit data are specific to each facility; therefore, it is
not appropriate to statistically analyze these data.
Radian is preparing guidelines to perform these audits and review
the data. Consequently, the handling of the raw data generated
during the audits will not be addressed in this report. However,
there are many uses for the audit results after they are tech-
nically evaluated. For example, they can be used to establish
performance trends for different analyzers. And as previously
mentioned, the audit checks can be used to identify stations that
appear to be performing accurate inspections and thus, can be
used to establish expected trends in emission levels.
In particular, the following data are very useful:
o The results of the analyzer calibration and
leak checks (including make of analyzer);
2-8
-------
o The results of the inspector proficiency
checks; and
o The results of the record checks (to check
for proper completion).
2.1.7 Other Analyses to Evaluate the Performance of Inspection
Facilities
The previous analyses provide the major indications of the per-
formance of an inspection facility. However, there are other
data that can be analyzed. If roadside and challenge checks are
performed, then it may be possible to cross-reference these
checks with the I/M inspection (via the license number or VIN) to
provide a gross indication of the accuracy of the inspection and
repair. (The serial number of the inspection sticker may also be
used to identify the original inspection facility.) Because there
would be many reasons for discrepancy between roadside checks and
the initial inspection, it would be difficult to use these checks
to determine the precision of the initial inspection. However,
if gross emitters are found in the field and these vehicles were
recently inspected and passed, there is a high probability that
the inspection facility improperly passed the vehicle.
Another possible way to check on the inspection facility would be
to review complaints that are filed by motorists. In fact, pro-
gram officials may want to encourage motorists to provide feed-
back on the inspection through some form of complaint handling
process (or consumer hot line). All complaints should be
investigated if possible; however, if resources are limited,
facilities with repetitive complaints should receive a high
priority for investigation.
2-9
-------
2.2 Revising Cutpoints
Data can be used to identify and support the need for changes in
program cutpoints. Cutpoints may need to be revised for some of
the following reasons:
o To adjust the average failure rate within
cutpoint categories (e.g. to reduce exces-
sively high failure rates for certain vehicle
categories);
o To control the overall failure rate at the
level desired for emission reductions (if the
cutpoints are not changed, the failure rate,
and accordingly, the program effectiveness,
could drop); and
o To revise waiver rates.
Generally, at least one year should pass before cutpoints are
revised.
The general analysis techniques to revise cutpoints are similar
for each of the above reasons. They essentially involve collecting
data from the vehicle inspection records (see Table 3) in
order to tighten (increasing stringency) or loosen the emissions
standards. Several approaches can be used to adjust the stan-
dards. One approach is trial and error. The CO and/or HC stan-
dards could be adjusted up or down until the desired failure rate
is observed. A FORTRAN program for efficient trial and error
selection of cutpoints is available from EPA.
A more straightforward approach may be to construct cumulative
distributions of HC and CO idle emission levels. These distribu-
tions would show the percent of vehicles falling below certain
emission levels and could be developed for different groups of
model years and/or vehicle types. By observing these distribu-
tions, the analyst could pick cutpoints that will fail a greater
2-10
-------
or lesser number of vehicles. However, these distributions do
not give a definitive indication of the number of vehicles that
would fail both the HC and CO standard. Consequently, there is
also some trial and error with this approach.
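The limitation noted above can be seen in a small Python sketch. The idle readings, the units (HC in ppm, CO in percent), and the candidate cutpoints are hypothetical values chosen for illustration; this is not the FORTRAN program available from EPA.

```python
def percent_below(levels, cutpoint):
    """Cumulative distribution: percent of vehicles at or below a level."""
    return 100.0 * sum(1 for x in levels if x <= cutpoint) / len(levels)

def combined_failure_rate(vehicles, hc_cutpoint, co_cutpoint):
    """Percent of vehicles exceeding the HC cutpoint, CO cutpoint, or both."""
    failures = sum(1 for hc, co in vehicles
                   if hc > hc_cutpoint or co > co_cutpoint)
    return 100.0 * failures / len(vehicles)

# Hypothetical idle readings for one model-year group: (HC ppm, CO %).
vehicles = [(100, 1.0), (150, 2.0), (250, 1.5), (300, 6.0), (500, 3.0),
            (120, 7.0), (180, 2.5), (220, 0.5), (90, 1.0), (400, 8.0)]
hc_levels = [hc for hc, _ in vehicles]
co_levels = [co for _, co in vehicles]

hc_pass = percent_below(hc_levels, 300)             # 80% at or below 300 ppm
co_pass = percent_below(co_levels, 5.0)             # 70% at or below 5.0%
overall = combined_failure_rate(vehicles, 300, 5.0) # 40% fail HC, CO, or both
```

Here 20% of vehicles fail HC and 30% fail CO, yet the overall failure rate is 40% rather than 50% because one vehicle fails both. The combined rate must therefore be computed from paired readings and the cutpoints adjusted until the desired rate is reached.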
If the program handles and processes data manually, then a sample
of the population should provide enough data to revise cutpoints
(sampling techniques to revise cutpoints are discussed in Section
3.4). And as previously discussed, the program could also depend
upon data collected from high volume stations that appear to be
performing accurate inspections (see Section 2.1.4).
Procedures and recommendations to revise cutpoints are presented
in a recent EPA report (Recommendations Regarding the Selection
of Idle Emission Inspection Cutpoints for Inspection and Mainte-
nance Programs, EPA-AA-IMS/81-1, January 1981.) This report
addresses idle hydrocarbon (HC) and carbon monoxide (CO) cut-
points and expected resulting failure rates in an I/M program.
TABLE 3. DATA REQUIRED TO REVISE CUTPOINTS

BASIC DATA

Vehicle Inspection Records
- Test sequence (initial or retest)
- P/F status (including if waiver issued)
- HC reading (before and after repair)
- CO reading (before and after repair)
- Model year

SUGGESTED ADDITIONAL DATA

Vehicle Inspection Records
- Odometer
- Make
- Vehicle type (LDV, LDT, HDG, etc.) or gross vehicle weight
2-11
-------
Recommended cutpoints are included for various failure rates both
in the first year of an I/M program and in its second year. The
analysis applies to both centralized and decentralized programs.
2.3 Evaluating and Enhancing the Effectiveness of Sticker
Enforcement
With a sticker enforcement system, there is a greater possibility
of non-compliance than with a registration enforcement system.
In the latter case, the inspection is a prerequisite to vehicle
registration; and thus the enforcement is almost automatic
(unless motorists register their vehicles outside the program
area). A sticker system depends upon local or state police to
stop and ticket vehicles which lack or have expired stickers.
Consequently the effectiveness of sticker enforcement depends
upon adequate police staffing, the vigilance of police in stopping
and questioning apparent violators, the severity of penalties
imposed by local courts and the visibility and publicity
associated with the enforcement efforts. Operating I/M programs
that have used stickers for enforcement have reported problems
with program circumvention, especially when the program was
implemented on a regional rather than a statewide basis. Conse-
quently, there is a need for the program to use data analysis to
evaluate the effectiveness of sticker enforcement. In addition,
data analysis can be used to enhance the effectiveness of the
enforcement program.
2.3.1 Data Required
As shown in Table 4, the data required to evaluate enforcement
can be found on the vehicle inspection records, as well as the
registration records. In addition, the maintenance of a data
base that indicates the I/M compliance status of each registered
vehicle would greatly aid the enforcement efforts.
2-12
-------
TABLE 4. DATA REQUIRED TO EVALUATE STICKER ENFORCEMENT

BASIC DATA

Vehicle Inspection Records
- P/F status (including waiver issued) (2.3.2)

Vehicle Registration Records
- Vehicle type (LDV, LDT, HDG, etc.) or gross vehicle weight
  (2.3.2)

SUGGESTED ADDITIONAL DATA

Vehicle Registration Data Base¹ (2.3.3)
- License or VIN (from both vehicle inspection and vehicle
  registration records)
- Registration date (from vehicle registration records)
- I/M compliance status (yes/no) (from vehicle inspection
  record)
- I/M test date (from vehicle inspection record)
- Sticker number (from vehicle inspection record)

Data on Citations Issued for Non-compliance (2.3.2)

¹Developed by combining the Vehicle Inspection Record with the
Vehicle Registration Records
2.3.2 Calculating the Compliance Coefficient
A simple indication of the effectiveness of the enforcement
efforts can be determined by calculating the compliance coeffi-
cient. The compliance coefficient is the percentage of
registered vehicles that have complied with the inspection
requirements. It can be determined by the following formula
where the number of stickers equals the number of passed (or
waived) vehicles:
$$\text{Compliance Coefficient} = \frac{\text{Number of stickers (including waivers)}}{\text{Number of vehicles registered} - \text{Number of exempt vehicles}}$$
2-13
-------
In a well enforced program the compliance coefficient should be
close to 1.0. Therefore, if the calculated compliance coeffi-
cient is much below 1.0, then enforcement efforts may need to be
enhanced. Program officials may want to consider adding addi-
tional police or stepping up the public awareness efforts.
Another useful piece of data would be the number of citations for
noncompliance. This number needs to be evaluated in light of the
compliance coefficient. If both numbers were low, then the
program could have serious problems.
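The compliance coefficient formula in this section can be written as a short Python function. The function name, the guard against an empty vehicle population, and the sample counts are assumptions made for the example.

```python
def compliance_coefficient(stickers_issued, registered, exempt):
    """Stickers (including waivers) divided by vehicles subject to I/M."""
    subject_to_inspection = registered - exempt
    if subject_to_inspection <= 0:
        raise ValueError("no registered vehicles are subject to inspection")
    return stickers_issued / subject_to_inspection

# Hypothetical counts for one program year.
coefficient = compliance_coefficient(
    stickers_issued=85_500, registered=100_000, exempt=5_000
)
# 85,500 / (100,000 - 5,000) = 0.90
```

A value of 0.90 would mean that about one subject vehicle in ten has no sticker, which should be weighed against the number of citations issued.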
2.3.3 Identifying Non-Complying Vehicles
If enforcement continues to be a problem despite increased public
awareness or policing, then it may be advantageous to set up a
new system to identify non-complying vehicles. Essentially, this
would involve maintaining a data base on the I/M status of the
registered vehicles so that currently registered vehicles that
have not been inspected can be identified.
In a centralized program this form of data base could be easily
maintained by entering compliance data directly at the inspection
lanes. These data could then be correlated with the vehicle
registration data via the license plate or the vehicle identifi-
cation number (VIN), and vehicles needing inspection could be
identified. If the registration data are in a real time data
base management system (DBMS), registration files could be up-
dated continuously as the inspection is performed. Otherwise the
files could be updated in batches via data tapes.
There are several options available to identify non-complying
vehicles in a decentralized I/M program. If machine readable
forms or data tapes are used for the inspection, then it should
be relatively easy to input data on the I/M status (see Section
2-14
-------
4.1.1). The inspection data could be correlated with the
registration data via license number or VIN, and printouts could
be generated of non-complying vehicles. However, if the forms are
not machine readable, then it would be necessary to keypunch the
I/M status data along with the license or VIN.
With all of these approaches, except for possibly entering data
into a real time DBMS, there could be accuracy problems with
entering the license number or VIN. This could result in some
vehicles being falsely identified as not complying with the I/M
requirements. However, the presence of the sticker on the vehi-
cle should quickly alleviate these problems.
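The correlation of inspection and registration records described in this section amounts to a lookup on the license number or VIN. The sketch below assumes a simple in-memory layout and hypothetical plate numbers; the trimming and upper-casing of keys is an added precaution against some transcription differences, not a step taken from this report.

```python
def non_complying(registered, inspected):
    """List registered vehicles with no matching passed or waived inspection.

    registered: {license_or_vin: registration date}
    inspected:  set of license_or_vin values from inspection records
    """
    # Normalize keys so that spacing or letter case does not break a match.
    normalized = {key.strip().upper() for key in inspected}
    return sorted(
        key for key in registered if key.strip().upper() not in normalized
    )

# Hypothetical registration and inspection data.
registered = {"ABC123": "1981-03", "XYZ789": "1981-05", "JKL456": "1981-07"}
inspected = {"abc123 ", "JKL456"}
printout = non_complying(registered, inspected)
```

Keying errors that normalization cannot repair, such as transposed digits, would still cause false non-compliance listings, as noted above.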
2.4 Determine Sticker Accountability
In programs which use inspection stickers for compliance enforce-
ment, it is necessary to account for them to ensure that large
numbers of unauthorized stickers are not getting into the hands
of the public. In order to account for stickers, the following
data need to be collected.
1. The serial numbers of stickers issued to each
inspection facility;
2. Data from audits on the serial number of
stickers on hand; and
3. Vehicle inspection data, specifically the
number of passed or waived vehicles.
Sticker accounting should be performed for each inspection facil-
ity (instead of all the facilities collectively) since this will
allow for discrepancies to be immediately tracked down.
Stickers could be accounted for as part of the periodic facility
quality control audits. The agency responsible for issuing
2-15
-------
stickers could supply auditors with lists of facilities and
serial numbers. When the auditors inspect the facilities, they
could check the serial numbers of the stickers on hand. Then,
from the inspection data, the number of passed or waived vehicles
could be tallied. The number of stickers on hand plus the number
of passed or waived vehicles should equal the total number
issued. In addition, the auditors could encourage stations to
return expired stickers for credit, and thus help prevent "hot
stickers."
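The reconciliation performed during an audit can be expressed as a simple check for one facility. The function name, the serial number ranges, and the counts below are hypothetical.

```python
def sticker_discrepancy(issued_serials, on_hand_serials, passed_or_waived):
    """Stickers issued minus (stickers on hand + passed or waived vehicles).

    A result of zero means the facility's stickers are fully accounted for.
    """
    unknown = set(on_hand_serials) - set(issued_serials)
    if unknown:
        raise ValueError(f"stickers on hand were never issued: {sorted(unknown)}")
    return len(issued_serials) - (len(on_hand_serials) + passed_or_waived)

# Hypothetical audit of one facility.
issued = range(1000, 1100)    # 100 stickers issued to the facility
on_hand = range(1060, 1100)   # 40 stickers found during the audit
missing = sticker_discrepancy(issued, on_hand, passed_or_waived=57)
# 100 - (40 + 57) = 3 stickers unaccounted for
```

Running the check per facility, rather than for all facilities together, points directly to the station where a discrepancy arose.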
2.5 Determine Fee Accountability (for Contractor Operated
Programs)
The administrating agency for the I/M program should be respon-
sible for the accountability of the inspection fees. In most
centralized contractor run I/M programs, the inspection fee is
collected by the contractor, who then transmits a portion to the
administrating agency to cover its costs. It may be possible for
a contractor to understate the number of inspections and thus
withhold funds from the State.
Conversely, the administrating agency could pay the contractor a
specific amount for each test or retest. (The inspection fee
could be part of the registration fee.) In this case, it may be
possible for the contractor to overstate the number of inspec-
tions, thereby causing the administrating agency to pay for a
greater number of inspections than actually occurred. (This has
potential for serious cash flow problems or fraud; therefore, I/M
programs may opt to have the contractor collect the fee at the
inspection lanes.)
It should be noted that agencies in Arizona and California do
account for fees in their contractor-operated I/M programs, and
they have not reported problems with fee accountability or fraud.
The data required to account for fees are shown in Table 5.
TABLE 5. DATA REQUIRED TO ACCOUNT FOR FEES
BASIC DATA (2.5.1)
Vehicle Inspection Records      Contractor's Financial Data
- Date of inspection            Money collected:
- Test sequence                 - Amount due to contractor
- P/F status                    - Amount due to
- Facility ID                     administrating agency
SUGGESTED ADDITIONAL DATA (2.5.2)
Vehicle Registration Data
- License or VIN
- Registration date
Vehicle Inspection Records
- License or VIN
2.5.1 Basic Accounting Methodology
When accounting for the fees, it is necessary to first determine
the number of tests (including retests). This number can be
determined by tallying the vehicle inspection records. The
amount of money collected by the contractor depends upon the
billing mechanism. If the motorist only pays for a certificate
of compliance or a sticker, then the total funds collected should
equal the number of stickers or certificates times the inspection
fee. However, in some programs motorists pay for each test,
whether or not it is an initial test or retest. In this
instance, it is necessary to keep track of each inspection to
determine whether the correct amount of money was collected.
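The two billing mechanisms described above can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout (a `status` field of 'P', 'F' or 'W'), the use of whole cents, and the function names are all hypothetical, and one sticker or certificate is assumed per passed or waived vehicle.

```python
def expected_collections_cents(records, fee_cents, pay_per_test):
    """Amount the contractor should have collected, in cents.

    records      -- list of dicts with a 'status' key:
                    'P' (pass), 'F' (fail) or 'W' (waived)
    pay_per_test -- True if motorists pay for every test, including
                    retests; False if they pay only when a sticker or
                    certificate is issued
    """
    if pay_per_test:
        billable = len(records)
    else:
        # Assumption: one sticker or certificate per passed or waived vehicle.
        billable = sum(1 for r in records if r["status"] in ("P", "W"))
    return billable * fee_cents


def collection_shortfall_cents(records, fee_cents, pay_per_test, reported_cents):
    """Expected collections minus the amount the contractor reported."""
    expected = expected_collections_cents(records, fee_cents, pay_per_test)
    return expected - reported_cents
```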
2.5.2 Validating the Number of Inspections
If the contractor is suspected of understating or overstating the
number of inspections for the reasons previously discussed, it
may be necessary to validate the number of inspections. Where
the number of inspections may have been understated and
registration is the enforcement mechanism, registration data
should provide an accurate count of the certificates issued.
In a contractor operated program using a sticker enforcement sys-
tem, checking for an understatement of the number of inspections
can be accomplished by holding the contractor accountable for the
number of unused stickers. A contractor would have to show that
he possesses the appropriate number of stickers to account for
the difference between the total number of stickers initially
issued to him, and the number issued to motorists after being
inspected.
Where the number of inspections is overstated (for purposes of
obtaining additional funds from the administrating agency),
similar checks can be made. If registration is the enforcement
mechanism, the number of vehicles registered should again
correspond with the number of certificates issued. If sticker
enforcement is used, a check could be made that a valid license
number or vehicle identification number (VIN) was entered onto
the inspection form. This can be done by cross-referencing the
inspection records with registration data (via the license number
or VIN). Although this latter method is not foolproof, it will
help to identify fraudulent records.
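The cross-reference against registration data might look like the sketch below. The field name `vehicle_id` and both function names are hypothetical, and the normalization step (ignoring case, spaces and punctuation) is an assumption about how keying differences could be tolerated.

```python
def normalize_id(vehicle_id):
    """Upper-case a license number or VIN and drop spaces and punctuation."""
    return "".join(ch for ch in vehicle_id.upper() if ch.isalnum())


def unmatched_inspections(inspection_records, registered_ids):
    """Inspection records whose license number or VIN is not registered.

    inspection_records -- list of dicts with a 'vehicle_id' key
    registered_ids     -- iterable of license numbers or VINs on file
    """
    registered = {normalize_id(v) for v in registered_ids}
    return [r for r in inspection_records
            if normalize_id(r["vehicle_id"]) not in registered]
```

Unmatched records are candidates for review rather than proof of fraud, since an identifier may simply have been entered incorrectly.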
2.6 Evaluate the Vehicle Waiver System
Another important function of data analysis is to evaluate the
vehicle waiver system. The effectiveness of an I/M program in
achieving emission reductions would be significantly reduced if a
large number or high percentage of the failed vehicles received
waivers. Generally, an excessive waiver rate indicates that
there are problems within the mechanic sector. Since most vehi-
cles should be able to pass an I/M test, a high waiver rate may
also indicate that there are a large number of illegitimate
waivers. It also suggests that people are obtaining certificates
of compliance or inspection stickers without really attempting to
have their vehicles repaired. The data required to review
waivers are shown in Table 6.
TABLE 6. DATA REQUIRED TO REVIEW WAIVERS
BASIC DATA
Vehicle Inspection Records (2.6.1)
- Test sequence (initial or retest)
- Facility ID
- P/F status (including waiver issued)
SUGGESTED ADDITIONAL DATA
Vehicle Inspection Records
- Type of repair (parts required or replaced) (2.6.2)
- Mechanics ID (2.6.2)
- Repair facility ID (2.6.2)
- HC reading (before and after repair) (2.6.2)
- CO reading (before and after repair) (2.6.2)
- Repair cost (2.6.2)
- Model year (2.6.3)
- Make (2.6.3)
- Odometer (2.6.3)
2.6.1 Determining the Basic Waiver Rate
The basic approach to analyzing the waiver system is to tally the
overall number of waivers and determine the percentage of failed
vehicles that are being waived from compliance. In addition, as
mentioned earlier, the administrating agency can also determine
if individual repair facilities or mechanics are abusing waiver
policies. By tallying the number and percent of waivers issued
by the decentralized inspection facilities, the program could
identify facilities suspected of performing inadequate repairs or
inspections. (In centralized I/M programs this analysis will
require the use of licensed repair facilities or mechanics.)
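As a rough sketch of the tally described above, the waiver rate can be computed per facility as the number of waived vehicles divided by the number of failed vehicles. The record fields (`facility_id`, `failed_initial`, `waived`) and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def waiver_rates(vehicles):
    """Fraction of initially failed vehicles that were waived, by facility.

    vehicles -- list of dicts with 'facility_id', 'failed_initial' (bool)
                and 'waived' (bool)
    """
    failed = defaultdict(int)
    waived = defaultdict(int)
    for v in vehicles:
        if v["failed_initial"]:
            failed[v["facility_id"]] += 1
            if v["waived"]:
                waived[v["facility_id"]] += 1
    # Facilities with no failed vehicles are omitted (rate is undefined).
    return {f: waived[f] / failed[f] for f in failed}
```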
2.6.2 Determining Legitimacy of Waivers
The following data analyses, aided by automated data processing,
can help determine the legitimacy of the waivers issued.
o A review of the inspection records can be
made to determine if repair costs of waived
vehicles coincide with the repair cost limits
established by the waiver policy.
o In addition, if the program requires that a
licensed mechanic or garage perform the
necessary repairs prior to obtaining a waiver,
a check could be made to ensure that a valid
mechanic's ID number was entered onto the
vehicle inspection records.
o Also, if the vehicle inspection record con-
tains data on the type of repair, i.e., the
parts repaired or replaced, verification
could be made that the necessary repairs were
made on the vehicle to qualify for a waiver.
In most cases these repairs are the equiva-
lent of a low emission tune-up.
o Another way of determining if waiver policies
are being abused would be to calculate the
idle emission reductions of the waived
vehicles. Facilities that show very low idle
emission reductions should be suspected of
performing incorrect or inadequate repairs.
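Several of the checks listed above could be automated along the following lines. This is a hedged sketch: the field names, the function name, and the interpretation that a waiver requires repair spending of at least a minimum amount are assumptions; an actual waiver policy may define its cost limit differently.

```python
def waiver_flags(record, min_repair_cost, licensed_mechanics):
    """Return the reasons a waived-vehicle record deserves a closer look.

    record             -- dict with 'repair_cost', 'mechanic_id',
                          'hc_before', 'hc_after', 'co_before', 'co_after'
    min_repair_cost    -- assumed minimum spending required for a waiver
    licensed_mechanics -- set of valid mechanic ID numbers
    """
    flags = []
    if record["repair_cost"] < min_repair_cost:
        flags.append("repair cost below waiver limit")
    if record["mechanic_id"] not in licensed_mechanics:
        flags.append("mechanic ID not on licensed list")
    # Neither HC nor CO improved after the reported repair.
    if (record["hc_after"] >= record["hc_before"]
            and record["co_after"] >= record["co_before"]):
        flags.append("no idle emission reduction")
    return flags
```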
2.6.3 Characterizing Vehicles Receiving Waivers
It may also be useful to characterize the vehicles that are
waived. This would involve tallying the number of waivers by
make, model year and possibly odometer reading. If a relatively
large number of waivers are issued for a certain make and model
year vehicle, then the manufacturer may need to be consulted to
determine why these vehicles are having difficulty complying with
the emissions inspection. A revision of cutpoints may be re-
quired for such a case.
2.7 Quantify Idle Emission Reductions
Since the purpose of inspection/maintenance is to reduce emis-
sions from motor vehicles, program officials may want to calcu-
late the idle emission reductions that have resulted from the
repairs. It should be noted, however, that idle emission
reductions are indicative, but not necessarily definitive, of
actual emission reductions as measured by the Federal Test
Procedure (FTP). The vehicle inspection record provides the
necessary data to calculate the reductions (see Table 7). In
addition, random surveys of idle emission levels in non-I/M areas
can provide further input on the effectiveness of the I/M program.
Of course, if the I/M program encompasses the entire state, then it
would be impossible for state officials to conduct these surveys
within the state.
TABLE 7. DATA REQUIRED TO EVALUATE IDLE EMISSION REDUCTIONS
BASIC DATA
Vehicle Inspection Record (2.7.1)
- Test sequence (initial or retest)
- P/F status (including waiver issued)
- HC reading (before and after repair)
- CO reading (before and after repair)
SUGGESTED ADDITIONAL DATA
Vehicle Inspection Record (2.7.2)
- Odometer
- Model year
- Make
Random Surveys in Non-I/M Areas (2.7.2)
- HC reading
- CO reading
- Odometer
- Make
- Model year
2.7.1 Idle Emission Reductions from Repairs
A simple indication of the effectiveness of the program would be
the idle emission reductions from repairs. Essentially, this is
determined by calculating the average emission levels before and
after the repairs. The percent difference is calculated by di-
viding the reduction in emission levels by the before repair
levels. If the program handles data manually, then the above
calculations may be performed on a sample of the vehicle popula-
tion (see Section 3.4).
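The calculation described above amounts to (average before repair minus average after repair) divided by the average before repair. A minimal sketch follows; the function name and the choice of HC readings in ppm for the example are assumptions.

```python
from statistics import mean


def percent_reduction(before, after):
    """Percent reduction in average idle emissions from repairs.

    before -- idle readings (e.g. HC in ppm) before repair
    after  -- readings for the same vehicles after repair
    """
    avg_before = mean(before)
    avg_after = mean(after)
    return 100.0 * (avg_before - avg_after) / avg_before
```

For example, before-repair HC readings of 400 and 600 ppm and after-repair readings of 200 and 300 ppm give averages of 500 and 250 ppm, a 50 percent reduction.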
2.7.2 Comparing I/M Areas with Non-I/M Areas
If data handling is computer assisted, a comparison can also be
made of the idle emission levels within the program area with a
non-I/M population. One method to do this would be to collect
data at the start of the program before repairs are performed.
The data obtained from the initial emission test could be charac-
terized by make, model year, and odometer reading.
The overall trends in average idle emissions, as a function of
odometer reading, should vary for different technology cate-
gories. For instance, the increase in emissions from catalyst
and non-catalyst vehicles as related to mileage would differ due
to the particular deterioration rate associated with the type of
technology. Thus model year groupings can be developed for the
different technologies (i.e., non-catalyst, pre-1975; oxidation
catalyst, 1975-1980; three-way catalyst, 1981 and later).
As the program proceeds, the same characterizations can be per-
formed with the I/M population. That is, the average idle emis-
sions (both before and after repair) could be calculated and the
overall trends in emissions as a function of odometer and model
year group could be determined. These trends could then be com-
pared to the extrapolations of data collected at the start of the
program.
The data from the I/M population can also be compared with data
from non-I/M populations. As previously discussed, this would
involve conducting random surveys of idle emission levels in bor-
dering areas that do not have I/M. The emission levels could be
standardized by model year group and odometer reading and compar-
isons could be noted.
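One way to standardize emission levels by model year group and odometer reading is sketched below. The model year boundaries follow the groupings given above; the 25,000-mile odometer interval, the field names, and the function names are assumptions chosen only for illustration.

```python
from collections import defaultdict


def technology_group(model_year):
    """Map a model year to the technology groupings described in the text."""
    if model_year < 1975:
        return "non-catalyst"
    if model_year <= 1980:
        return "oxidation catalyst"
    return "three-way catalyst"


def average_hc_by_group(records, bin_miles=25000):
    """Average idle HC by (technology group, odometer interval start).

    records -- list of dicts with 'model_year', 'odometer' (whole miles)
               and 'hc_ppm'
    """
    totals = defaultdict(lambda: [0.0, 0])
    for r in records:
        interval_start = (r["odometer"] // bin_miles) * bin_miles
        key = (technology_group(r["model_year"]), interval_start)
        totals[key][0] += r["hc_ppm"]
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}
```

Running the same function on the I/M population and on a non-I/M survey sample yields averages keyed identically, so the two sets can be compared cell by cell.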
2.8 Reporting
Aside from using data analyses to help run the I/M program, they
can also be used to generate reports about the operation and
effectiveness of the program. Essentially, these reports would
be compiled from the results of the previously described analy-
ses. They should be prepared for different groups, such as the
legislature, the public, special interest groups, agencies, and
the service industry.
2.8.1 Reports to the Legislature
Different types of reports should be prepared for the legisla-
ture. Probably the most useful report would be a summary of the
program data. These reports would give the legislature infor-
mation such as:
o The number of vehicles inspected,
o Failure rates,
o Emission reductions, and
o Repair costs.
Certain members of the legislature will probably want to see a
comprehensive compilation of I/M program data. Consequently,
data should be analyzed to present specific details on the pro-
gram. Essentially all available data should be analyzed to pre-
pare such a report. As a minimum, these reports should include
tabulations and averages of the basic items shown in Table 8,
i.e., the inspection records, the results of field investigation
and enforcement activities. In addition, a summary should be
provided of the results of the following analyses:
o The performance of the inspection facility,
o Revision to the cutpoints,
o Effectiveness of the enforcement activities,
o Accountability of the fee and stickers,
o Effectiveness of the waiver system,
o Idle emission reductions, and
o The public's perception of the program.
2.8.2 Reports to the Public
Like the legislature, the public is involved with the I/M pro-
gram. Various public sectors may want to know information such
as failure rate and average repair costs, along with reductions
in idle emission levels. In addition, information on the use of
warranty repairs would be informative to the public. Conse-
quently, program officials should prepare a report similar to the
summary report to the legislature. This report can be compiled
from the data presented in Table 8. The public should also be
kept informed of the availability of more comprehensive reports
on the program.
TABLE 8. DATA REQUIRED FOR REPORTING
Summary data of the following:
BASIC DATA (2.8)
Vehicle inspection records
Facility audit data
Vehicle registration records
Contractor financial data
Field enforcement activities
SUGGESTED ADDITIONAL DATA (2.8)
Roadside or challenge check data
Vehicle Registration Data Base (indicating I/M status to
determine non-complying vehicles)
Random surveys in non-I/M areas
2.8.3 Reports to Other Groups and Agencies
Other groups and agencies may be interested in the operation of
the program. Consequently, program officials should be ready to
generate reports as needed. For example, the Better Business
Bureau may want information on emission related repairs, and in
particular, repair costs. Another example would be requests by
the automobile manufacturers for information on the failure rates
of their vehicles. As discussed, these requests may be easily
met if the inspection records are compiled into a well managed
data handling system.
2.9 Additional Objectives of I/M Data Analysis
The previous analysis techniques addressed functions that are
necessary to operate an inspection/maintenance program. However,
as mentioned earlier, data analysis can perform additional
functions that would enhance the operation of the program,
although these functions are not crucial to the day-to-day
activities.
2.9.1 Evaluating the Repair Process
Obviously the quality of the repairs plays an important role in
the success of an I/M program. Although mere compliance with the
emission standards is likely to achieve significant emission
reductions, much greater reductions are possible if repairs are
performed carefully and in accordance with the manufacturer's
specifications. Therefore, it is useful for program officials to
use data analysis to evaluate repair performance. This evalua-
tion should address both the individual repair facilities (or
mechanics) and the overall effectiveness of repairs. In both
cases, the results are useful in providing direction for mechanic
training programs operated by the administrating agency. Data
required to evaluate repairs are shown in Table 9.
TABLE 9. DATA REQUIRED TO REVIEW REPAIRS
BASIC DATA
Vehicle Inspection Record (2.9.1)
- Test sequence (initial or retest)
- P/F status (including waivers issued)
- Repair cost
- HC reading (before and after repair)
- CO reading (before and after repair)
SUGGESTED ADDITIONAL DATA
Vehicle Inspection Record
- Type of repair (2.9.1.2)
- Repair facility ID (2.9.1.1)
- Model year (2.9.1.2)
- Make (2.9.1.2)
Repair Invoices from Waived Vehicles (2.9.1.1)
2.9.1.1 Evaluating the Repair Performance in Centralized
Programs
For centralized programs, considerable analysis can be performed
if some form of repair garage licensing or registration is used.
The ID of the repair facilities could be included on the inspec-
tion records, thus allowing the administrating agency to deter-
mine the emission test performance. Specifically the agency
could determine:
o The percent passing on retest,
o The average cost,
o The average emission reductions, and
o The number of waivers issued.
Without licensing or registration of the repair facilities in
centralized programs, there are limitations to the analyses that
can be performed. One possible method is to audit records and
repair invoices on waived vehicles. Although this is a time con-
suming method, it can identify candidates for mechanic training
or further surveillance activities. Program officials might look
at the repair records and examine vehicles to determine if the
repairs were correctly performed. In some cases, program offi-
cials could actually contact the repair facility. Arizona has
successfully used such a program (which they term the "waiver sur-
veillance program") to reduce the number of waivers. However,
they do report that it is manpower intensive.
In decentralized programs the repairs are often performed at the
inspection facility. Thus, analysis techniques that were previ-
ously discussed may be used to evaluate the repair performance
(see Sections 2.1.3 and 2.6).
2.9.1.2 Evaluating the Effectiveness of Different Types of
Repairs
If the type of repair is included on the inspection records, then
program officials could tally the repairs that are performed for
different types of failures (i.e., HC, CO, tampering). The results
of the tally should be compared with the repairs that would be
expected as a result of engineering judgment or studies such as
EPA's FTP Testing Programs. For example, if a vehicle fails only
for CO, then it is unlikely that a complete engine tune-up should
have been performed. Discrepancies found would provide feedback
into mechanic training programs.
It may also be useful to calculate the idle emission reductions
by type of repair to determine if certain types of repairs appear
to be more effective. It should be noted, however, that although
the idle emission reductions are indicative of the repair perfor-
mance, they do not necessarily indicate that the vehicle will be
achieving low emissions in actual use.
To provide additional input into mechanic training programs and
public awareness programs, calculations could be made on the cost
of repairs by the following parameters:
o Type of vehicle,
o Age and mileage of vehicle,
o Make of vehicle, and
o Geographic area.
Again, most of these calculations would probably require computer
assistance.
2.9.2 Evaluating Inspector Performance
In decentralized programs, it may be advantageous to review
vehicle inspection records in order to evaluate the performance
of the inspectors. The items to investigate would be errors in
following the test procedures or errors in filling out the forms.
Procedural errors such as the use of the wrong emission standards
could be picked up by a computer. In order to do this, the in-
spector ID needs to be included in the vehicle inspection record.
A check could then be made to determine if the emission standards
were appropriate for the model year reported on the inspection
record.
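A computer check of this kind could be sketched as follows. The cutpoint values in `HYPOTHETICAL_CUTPOINTS` are placeholders chosen for illustration and are not the standards of any particular program; the field names and function names are likewise assumptions.

```python
from collections import Counter

# (first model year, last model year, HC cutpoint in ppm, CO cutpoint in %)
# Placeholder values for illustration only.
HYPOTHETICAL_CUTPOINTS = [
    (1968, 1974, 800, 7.0),
    (1975, 1980, 300, 3.0),
    (1981, 9999, 220, 1.2),
]


def expected_cutpoints(model_year):
    """Return the (HC, CO) cutpoints for a model year, or None if not covered."""
    for first, last, hc_ppm, co_pct in HYPOTHETICAL_CUTPOINTS:
        if first <= model_year <= last:
            return hc_ppm, co_pct
    return None


def wrong_standard_counts(records):
    """Count, per inspector, the records on which the wrong standards were used.

    records -- list of dicts with 'inspector_id', 'model_year',
               'hc_standard_ppm' and 'co_standard_pct' as entered on the form
    """
    errors = Counter()
    for r in records:
        expected = expected_cutpoints(r["model_year"])
        applied = (r["hc_standard_ppm"], r["co_standard_pct"])
        if expected is not None and applied != expected:
            errors[r["inspector_id"]] += 1
    return errors
```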
It is more difficult, however, to determine if the forms are
properly filled out. If computer readable forms are used, then the
computer should be able to flag forms that have missing or absurd
information on them. Techniques to flag these forms will be dis-
cussed in Section 4. However, in some cases it will be neces-
sary to manually scan the forms to see if they were filled out
correctly. If the forms that are not correctly filled out were
separated by the inspector ID, then it should be easy to identify
some of the inspectors who have problems filling out the forms.
In centralized programs, several inspectors may be involved with
each vehicle that comes through the lanes. Consequently, it is
not appropriate to use data analyses to evaluate inspector per-
formance since it is difficult to actually pinpoint which
inspector was responsible for which data.
Even with decentralized programs, it may not be appropriate for
the administrating agency to concern itself with the performance
of each inspector unless they are trying to track down problems
in outlying inspection facilities (those with questionable data).
It will take considerable manpower for the agencies to adequately
monitor the performance of each test facility, regardless of the
inspectors themselves.
2.9.3 Evaluate the Performance of Test Equipment
The results of the facility audits can be used to determine per-
formance trends for different emissions analyzers. Therefore,
auditors should collect data on the type and model of the ana-
lyzers that the stations use. These data can then be correlated
with the results of gas calibration checks made by the auditors.
Analyzers that consistently show good performance could be iden-
tified by the administrating agency, while the agency may want to
restrict the sale of analyzers that show poor performance. In a
centralized program, an analysis of the data could identify ana-
lyzers with frequent and/or major problems. These analyzers
could be singled out for major repairs or replacement.
Data from the facility audits can also be used to evaluate the
accuracy of the calibration gases that are used by the inspection
stations. When the auditors are at the stations, they could per-
form a gas calibration with their gas and with the station's gas.
If discrepancies are found, then the auditors could track down
the source of error in the calibration gas. The vendor of the
calibration gas could also be noted on the audit data. This
would allow trends to be investigated for the different specialty
gas manufacturers.
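As an illustration of correlating audit results with analyzer type, the sketch below averages the calibration error by analyzer model. The field names (`analyzer_model`, `audit_gas_ppm`, `reading_ppm`) and the use of absolute percent error as the performance measure are assumptions; a program may prefer a pass/fail tally against its own tolerance.

```python
from collections import defaultdict
from statistics import mean


def calibration_error_by_model(audits):
    """Average absolute percent error of analyzer readings, by analyzer model.

    audits -- list of dicts with 'analyzer_model', 'audit_gas_ppm' (the
              auditor's known gas concentration) and 'reading_ppm'
    """
    errors = defaultdict(list)
    for a in audits:
        pct = 100.0 * abs(a["reading_ppm"] - a["audit_gas_ppm"]) / a["audit_gas_ppm"]
        errors[a["analyzer_model"]].append(pct)
    return {model: mean(values) for model, values in errors.items()}
```

The same grouping could be keyed on the calibration gas vendor instead of the analyzer model to look for trends among gas suppliers.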
2.9.4 Determine the Effectiveness of the Public Awareness
Programs
Public awareness programs are different for various phases of the
implementation of an I/M program. Early in the program, general
air pollution and automobile-related air pollution would be em-
phasized; later, just prior to implementation, specific details
of the program would be presented to the public; and finally,
after implementation, current operating information would be pro-
vided to the public on an on-going basis. An evaluation of pub-
lic awareness should deal with each phase separately.
There are two general sources of data to assess the effectiveness
of public awareness programs. Data are available from inside the
program such as the number and subject of complaints, and from
outside the program through surveys and/or consumer "hotlines."
The program could encourage the use of consumer hotlines to
report complaints or comments about the program. The operators
of the hotline could record each phone call and periodically
tally them. The I/M program in the South Coast Air Basin of
California reports that it receives an average of 300 calls per
day. Officials there categorize each call and prepare a summary
report that quantifies the motorist feedback on the program (see
Table 10). This form of feedback can then be used to direct the
public awareness activities and program modifications.
As mentioned, a major source of public awareness data is public
opinion surveys. To ensure valid results from these surveys it
is extremely important that they are properly designed. In the
near future, EPA expects to release a technical report entitled,
"Public Opinion Polls for Inspection and Maintenance Programs:
Some Technical Considerations." This report will provide details
on the design and uses of public opinion polls.
If the program uses sticker enforcement, then the previous
analysis on the effectiveness of enforcement (see Section 2.3) will
provide much input into the public awareness program.
2.10 Summary of Data Requirements
A listing of the data elements that are required to support the
different objectives of I/M data analysis is presented in Table
11. The basic data elements are the data required to meet most
of the major objectives, whereas the suggested additional ele-
ments provide additional analysis capabilities.
Figure 1 shows which data are required to meet each objective.
Note that many objectives may be met with the basic data on the
vehicle inspection records.
TABLE 10. A TALLY OF PHONE CALLS RECEIVED BY
CALIFORNIA'S I/M PROGRAM
CATEGORIES                              REPORT PERIOD   ACCUMULATIVE
                                          (1 Month)         TOTAL
HTS Problems                                  294           2,536
Department of Motor Vehicles                   31             374
Qualified Mechanic's List                      24             876
Waiver Information                            678           4,191
Data Logs - Fleets                              4             255
Qualified Mechanic's Procedures                27             376
ECS Application (OEM & Retro)                 143           2,417
Engine Changes                                134           1,063
General Information                         3,003          28,964
Fleet Information & SVIS Calls                 62             649
Certifying Heavy Duty Trucks                  115             751
Non-Compliance Questions - New Cars            19             202
Non-Jurisdictional                              6             145
Seminar Information                           159             950
MVPC Information                               33             314
Idle Speed, Standards, $35 NOx Price           38             521
Reference Materials                            11              59
A.R.B.                                         24             122
Calls from Politicians                          0               3
Cost of Inspection                            401           1,268
TABLE 11. SUMMARY OF DATA REQUIRED
BASIC DATA
Vehicle Inspection Records
- Date of inspection
- Facility ID
- Test sequence (initial or retest)
- P/F status (including waivers issued)
- HC reading (before and after)
- CO reading (before and after)
- Model year
Sticker Records
- Serial number of stickers issued
- Facility ID
Contractor's Financial Data
Money collected:
- Amount due to contractor
- Amount due to administrating agency
Facility Audit Data
- Date of check
- Results of analyzer calibration checks
- Serial numbers of stickers on hand
- Results of record checks
- Make and model of analyzer
- Results of inspection proficiency checks
Vehicle Registration Records
- Vehicle type (LDV, LDT, HDG, etc.) or gross vehicle weight
- Model year
- Registration date
- License number or VIN
SUGGESTED ADDITIONAL DATA
Vehicle Inspection Records
- Odometer
- Type of repair (items replaced or repaired)
- Vehicle type (LDV, LDT, HDG, etc.)
- Mechanics ID
- Repair facility ID
- Inspector ID
- Repair cost
- License number or VIN
- Make
Random Surveys in Non-I/M Areas
- HC reading
- CO reading
- Odometer
- Make
- Model
Public Opinion Surveys
Facility Audit Data
- Source of calibration gas
Roadside or Challenge Check Data
- Date of check
- License or VIN
- HC reading
- CO reading
- Results of tampering check
- P/F
- Inspection facility (if indicated on sticker)
Complaints
- Type
- Facility
Figure 1. Data Requirements to Meet Objectives
[Matrix of objectives (rows) against data elements (columns, grouped
under VEHICLE INSPECTION RECORDS and SUGGESTED ADDITIONAL DATA); the
cell marks are not reproduced here. The objectives are listed below.]
*Refer to Table 11 for lists of the basic data elements
BASIC OBJECTIVES
1. Evaluating the Performance of the Inspection Facilities
Determine Failure Rates
- By facility
- By facility and model year
- By facility and model year and make
- By facility and odometer
- Standardized
Determine Repair Costs (decentralized)
- By facility
Determine Type of Repair (decentralized)
- By facility
Determine Waiver Rates
- By facility
Determine Analyzer Accuracy
Determine Inspector Proficiency
Determine Gross Accuracy of Inspection
Determine Complaints by Facility
Determine Idle Emissions
2. Revise Cut Points
Determine Idle Emissions
Determine Waiver Rates
3. Evaluate Sticker Enforcement
Calculate Compliance Coefficient
Identify Non-Complying Vehicles
4. Determine Sticker Accountability
5. Determine Fee Accountability (Contractor Operated Prog.)
Account for Fees
Validate the Inspections
6. Evaluate Vehicle Waiver System
Determine Legitimacy of Waivers
Determine Waiver Abuse
Characterize Vehicles Waived
7. Quantify Idle Emission Reductions
Calculate Emission Reductions
Compare Emissions with Non-I/M Case
- Using program data
- Using samples from non-I/M areas
8. Reporting (use above results as needed)
ADDITIONAL OBJECTIVES
9. Evaluate Repairs
Evaluate Facility Performance
- Decentralized programs
- Centralized programs
Evaluating Effectiveness of Repairs
- Type repair vs. failure
- Idle emission reductions by type of repair
- Cost by vehicle parameters
10. Evaluating Inspector Performance
11. Evaluating Equipment Performance
Analyzers
Calibration Gas
12. Evaluate Public Awareness
3.0 DATA ANALYSIS TECHNIQUES
The goal of data analysis is to determine the performance of
various aspects of the I/M program. Program officials need to
carefully consider what summaries and types of analyses are
pertinent to directly address day-to-day operations and policy
making issues. (This is equally true regardless of whether com-
puters or manual techniques are used for data analysis.) The
results of the analyses should be presented as graphical or
tabular summaries to enhance the reviewer's ability to understand
the overall operation of the I/M program. The previous section
showed how data could be used to perform several functions in an
I/M program. This section describes in greater detail the analy-
sis techniques which may be used to perform these functions.
3.1 Basic Analysis Techniques
Generally the review of I/M data does not involve complicated
data analysis techniques. A majority of the analyses can be
performed by simply tabulating or averaging the data. In only a
few instances is it necessary to calculate standard deviations.
However, if samples must be taken to reduce the amount of data
handled, then the data analyst will need to employ more sophis-
ticated measures. Sampling techniques are straightforward and
are discussed in Section 3.4 of this document.
3.1.1 Tabulations of Data
The vast majority of I/M data analyses can be performed
by simply tabulating the data that are collected. Simple
tabulations include determining the failure rate for each facil-
ity. Essentially, this involves counting the number of passed
and failed vehicles as indicated on the inspection records. If
the records are sorted by the facility, then it is easy to deter-
mine the failure rate for each facility. More complex tabula-
tions include determining the failure rate for each facility as a
function of vehicle make and/or model year. Other examples of
tabulations are listed in Table 12.
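The tabulations described above can be sketched in a few lines. The record fields (`facility_id`, `model_year`, `test_sequence`, `status`) and the function name are assumptions; only initial tests are counted here so that retests do not distort the failure rate.

```python
from collections import Counter


def failure_rates(records, keys=("facility_id",)):
    """Initial-test failure rate for each combination of the given keys.

    records -- list of dicts with 'test_sequence' ('initial' or 'retest'),
               'status' ('P' or 'F') and the fields named in keys
    keys    -- fields to tabulate by, e.g. ('facility_id', 'model_year')
    """
    inspected = Counter()
    failed = Counter()
    for r in records:
        if r["test_sequence"] != "initial":
            continue
        group = tuple(r[k] for k in keys)
        inspected[group] += 1
        if r["status"] == "F":
            failed[group] += 1
    return {group: failed[group] / inspected[group] for group in inspected}
```

Passing `keys=("facility_id", "model_year")` gives the more complex tabulation by facility and model year without any change to the counting logic.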
3.1.2 Averages and Standard Deviations
For some analyses, simple tabulations will not provide an ade-
quate interpretation of the result. For example, a tabulation of
the idle emission levels would be meaningless. Consequently, it
is sometimes necessary to calculate an average of the values
reported for different pieces of data. The following items are
best expressed as averages.
o Idle emission levels (both before and after
repair),
o Overall failure rate of the program (deter-
mined by averaging the individual failure
rates tabulated by facility), and
o Repair costs.
Although the average may provide an excellent measure of the
central tendency of a set of numbers, it gives no indication of
the variability of those numbers. Consequently, the analyst may
want to calculate standard deviations. As mentioned in Section
2.1.1, for example, standard deviations of the idle emission
levels may give an indication of whether or not an individual
inspection station was falsifying the test results. The use of
the standard deviation may also be useful in determining the
degree of variability of repair costs.
TABLE 12. ITEMS THAT COULD BE TALLIED IN I/M PROGRAM

• Failure rates (1) by model year
• Failure rates (1) by make and model year
• Failure rates (1) by facility
• Waiver rates for each facility
• Type of repairs
• Serial numbers of stickers or certificates
  issued to each facility
• Fees collected by the facility
• Performance of different test equipment
• Complaints or public opinion surveys
• Reason for failure by type of repair

(1) Failure rates should be tallied as follows:
    - number of vehicles inspected
    - number of vehicles initially failed
    - number of vehicles failed upon reinspection

Standard deviations of failure rates could also be calculated in
order to set limits of acceptable failure rates for an inspection
station. Concerning this latter point, the limits are just as
easily established by observing a certain percentage of the
outlying stations. That is, the I/M program manager may want to
observe the top or bottom five percent of the stations, instead
of observing those stations which fall outside the limits of plus
or minus two standard deviations.
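Both ways of screening stations by failure rate (plus or minus two standard deviations, or the top and bottom five percent) can be sketched as follows. The station IDs and failure rates are hypothetical, and the use of the sample standard deviation is an assumption made for this illustration.

```python
import math
import statistics

# Hypothetical failure rates (percent) for ten inspection stations.
RATES = {
    "1001": 17.0, "1002": 18.5, "1003": 16.0, "1004": 19.0, "1005": 17.5,
    "1006": 18.0, "1007": 16.5, "1008": 19.5, "1009": 18.0, "1010": 2.0,
}


def outside_two_sigma(rates):
    """Stations whose failure rate is more than 2 standard deviations from the mean."""
    values = list(rates.values())
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumption)
    return sorted(s for s, r in rates.items() if abs(r - mean) > 2 * sd)


def top_and_bottom(rates, fraction=0.05):
    """Stations in the lowest and highest `fraction` of failure rates."""
    ranked = sorted(rates, key=rates.get)
    k = max(1, math.ceil(fraction * len(ranked)))
    return ranked[:k], ranked[-k:]


FLAGGED = outside_two_sigma(RATES)
LOWEST, HIGHEST = top_and_bottom(RATES)
```

The percentile rule always selects a fixed number of stations for review, whereas the two-standard-deviation rule may select none if the rates are tightly grouped.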
3.1.3 Statistical Analysis Packages
Many companies and institutions have prepared standard statis-
tical analysis packages. These packages are available to govern-
ment and private users and are extremely useful to the analyst.
Many of the packages allow for data to be quickly tabulated as a
function of 2 or 3 indicators. For example, failure rates can be
determined by facility, by make, and by model year. In addition,
averages and standard deviations are quickly computed by many of
these packages. Table 13 lists many of the available statistical
analysis packages.
3.2 Limits to Data Analysis
The limits to the data analysis that can be performed depend
upon the method that will be used. There are three basic methods
of data analysis: manual, computer assisted, and computer as-
sisted with a data base management system. Obviously computer
assisted analyses provide much more flexibility; however, many
objectives of data analysis can be met by manually tabulating the
data collected in an I/M program. If samples are selected,
manual data analysis may also be used to calculate some averages
and standard deviations.
TABLE 13. STATISTICAL ANALYSIS PACKAGES

NAME            ADDRESS

SAS             P.O. Box 10066, Raleigh, NC 27609

BMDP            University of California, 2223 Fulton Street,
                Berkeley, CA 94720

SPSS            Norman H. Nie, Department of Political Science,
                University of Chicago, 5801 S. Ellis,
                Chicago, IL 60637

Minitab         Prof. Thomas F. Ryan, Jr., Statistical Department,
                215 Pond Lab., Pennsylvania State University,
                University Park, PA 16802

Midas           University of Michigan,
                Institute for Social Research,
                Ann Arbor, MI 48104

P. Stat, Inc.   P.O. Box 285, Princeton, NJ 08540
3.2.1 Manual Data Analysis
Manual data analysis is sufficient to perform many of the tabu-
lations previously discussed. In fact some of the operating I/M
programs depend strictly upon manual data analysis for feedback
of information. Following are some of the tabulations that can
be performed by manually analyzing all of the available data:
o Failure rates by facility,
o Waiver rates by facility,
o The compliance coefficient (see Section 2.3),
o The accountability of stickers or certificates
(It is recommended that the lists of
serial numbers issued be generated by a computer.
However, it is possible to maintain a
log book of the serial numbers issued for
each facility),
o Accountability of fees,
o The performance of different test equipment,
and
o Complaints or public opinion surveys.
The format of the forms is very important if data are to be tabu-
lated manually. Key data elements such as the pass-fail status,
the test sequence (initial or retest) and the facility ID should
be easy to find on the form. In addition, it would be helpful,
particularly in decentralized programs, to collect data periodi-
cally instead of continuously. This would prevent forms from
different facilities from being mixed together. (Forms are dis-
cussed in greater detail in Section 4.)
Manual data analysis is greatly aided if the inspection facilities
tally the results (using tally sheets) immediately after the
inspection. Figure 2 is a sample tally sheet, and as shown, it
will be very easy to calculate the overall failure rate for each
facility as a function of model year groupings. Stations could
also provide summaries of the number of waivers issued along with
the total number of stickers or certificates issued.

STATION ____________________

Period:  From ____________  To ____________

Model Year        Pass            Fail
---------------------------------------------
Pre 70
   Total
70-74
   Total
75-79
   Total
80+
   Total
---------------------------------------------
Grand Total

FIGURE 2. SAMPLE TALLY SHEET
With manual data analysis, more sophisticated cross-tabulations,
averages, and standard deviations are very difficult to determine
for all of the data. It is possible to calculate the overall or
average failure rate by averaging the failure rates of each test
facility. However, for most of the complicated analyses, it will
be necessary to take vehicle samples if manual data analysis is
used. Information from samples can be used to determine:
o Cutpoints,
o Average repair costs,
o Types of repairs performed (by type of
failure),
o Average emission levels (overall and by model
year), and
o Emission reductions (overall, by model year,
by type of repair).
Unfortunately, sampling will not alleviate all the problems of
manual data analysis — there will still be some objectives that
cannot be met. This is particularly true of decentralized
programs, because of the large number of inspection stations.
For example, with manual data analysis, it would be very diffi-
cult to arrive at the failure rate for each facility by make.
Similarly, it would be very difficult to arrive at the average
idle emission levels for each facility, either overall or as a
function of make and model year.
3.2.2 Computer Assisted Analysis
Unlike manual data analysis, with computer assisted analysis the
collection and entry of data are the main bottleneck.
Consequently, it may be necessary to collect samples of the data,
even if sampling is not necessary for data analysis. Once the data
are entered into the computer most of the previously discussed
objectives may be met. Section 4 describes methods to enter the
data into the computer.
In the absence of a data base management system, it is recom-
mended that an I/M program use a standard statistical package to
perform the more sophisticated analyses. The use of these pack-
ages essentially affords the program most of the flexibility that
is available with a data base management system (DBMS). If these
packages are not used, then it is likely that considerable time
will be spent in software development. These packages will enable
the I/M analyst to perform most of the data analysis techniques
that were discussed in Section 2. Of course, some data
such as the data from the quality control audits will still need
to be analyzed manually, despite the use of computer assistance
for many of the other functions.
3.2.3 Use of a Data Base Management System
An online data base management system is ideal for I/M data
analyses. Although the use of standard statistical packages will
allow the analyst to perform many of the functions that could be
performed with a data base management system, the ease of operat-
ing the latter system is of great value. For example, there
are many times when the I/M program manager needs specific in-
formation about the program. This information can be quickly re-
trieved with a DBMS, and most data base management systems can be
used by people that do not have a strong background in computers.
Consequently, there is not a continuous requirement for the data
processing department to handle information requests since many
analyses can be handled directly by the I/M program manager.
However, it is probably not cost effective for the I/M program to
purchase a DBMS strictly for its own use. This option should be
investigated if a DBMS is already available to the administrating
agency.
3.3 Presentation of Data
In many cases, the presentation of the data is almost as impor-
tant as the actual analysis that is performed. The results of
the data analysis should be presented in a manner that enhances
the administrator's ability to spot trends and problem areas.
The following is a discussion of the different types of data pre-
sentation formats.
3.3.1 Tables
Tables are best used to present large quantities of summary
information. For example, the failure rates of different
combinations of make and model year vehicles are probably best
shown in a table (see Tables 14 and 15). In some cases, however,
it may be
difficult for the observer to pick out the salient aspects of the
analysis if a detailed table is used.
3.3.2 Graphs
Graphs are recommended to point out trends in areas such as
failure rates, monitored reductions, emission levels, or waiver
rates. Figure 3 is an example of a graph that was prepared by
the New Jersey Department of Environmental Protection. The graph
clearly illustrates both the initial failure rate and the failure
rate upon reexamination. It is interesting to note that the
sharp rise in failure rate in 1976 was due to the implementation
of more stringent cutpoints.

TABLE 14. [Reproduced table: "Table A-2 - Idle Emission Test
Standards and Failure Rates for Each Vehicle Category for the
First 22 Weeks of the Program." Columns include Category and
Vehicles Inspected.]

TABLE 15. STATISTICAL REPORT OF COLORADO'S I/M PROGRAM

STATISTICAL REPORT OF AIR PROGRAM
EIGHT-COUNTY PROGRAM AREA
Results of Idle Testing, November 12, 1981

VEHICLE       FIRST                 FAIL RATE    % CO          % HC
MODEL YEAR    TESTS      RETESTS    (%)          REDUCTION*    REDUCTION**
68             1810        361      19.9          47.62         42.05
69             2646        473      17.9          50.00         46.83
70             3118        504      16.2          49.52         44.19
71             3772        617      16.4          54.95         46.62
72             4972        953      19.2          55.20         49.74
73             5880       1080      18.4          57.19         47.53
74             6409       1114      17.4          56.79         47.96
75             4708        775      16.5          59.70         49.23
76             6163       1023      16.6          58.55         50.23
77             6209       1387      22.3          64.43         51.08
78             7808       1515      19.4          65.29         50.89
79             8441       1680      19.9          67.41         50.85
80             6775        661       9.8          75.29         56.39
81             3884        177       4.6          67.70         57.99
82               91          1       1.1         100.00        100.00
TOTAL:       72,686     12,321      17.0          59.72         43.86

*  carbon monoxide reduction
** hydrocarbon reduction

These figures were obtained from report forms of vehicles
inspected under the AIR Program from July 1, 1981 to the above
date. They do not represent the entire population of vehicles
inspected in the program to date, only that portion which has
been processed by the computer.

FIGURE 3. [Graph: New Jersey Department of Environmental
Protection and Division of Motor Vehicles, Monthly Vehicle
Inspection Report, November 1979. Initial rate and reexamination
rate plotted against calendar year, 1974 through 1980.]
3.3.3 Bar Charts
Bar charts are well suited to illustrate a comparison between two
or three populations. For example, the idle emission levels of
an I/M population could be graphically compared with a non-I/M
population on a bar chart. Bar charts are also useful in illus-
trating the reductions from repairs.
3.3.4 Reports Generated by Manipulating a Data Base
As mentioned earlier, an online data base management system
affords the program manager considerable flexibility in the anal-
ysis of data. By manipulating certain key words, reports can be
instantly shown on a screen or they can be quickly printed on the
teletype. These types of displays are excellent for investiga-
tions on program operations or the effectiveness of repairs. For
example, the operator can easily determine which stations have
questionable failure rates by manipulating a data base. The
actual hard copy results can be shown by any of the previous
three methods.
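As a rough illustration of the kind of request described above, the following sketch uses Python's built-in sqlite3 module as a stand-in for a data base management system. The table name, column names, records, and the 5 percent screening threshold are all hypothetical.

```python
import sqlite3

# In-memory data base standing in for the program's DBMS (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inspections ("
    "station_id TEXT, license TEXT, test_type TEXT, result TEXT)"
)
conn.executemany(
    "INSERT INTO inspections VALUES (?, ?, ?, ?)",
    [
        ("1001", "AAA111", "initial", "fail"),
        ("1001", "AAA222", "initial", "pass"),
        ("1001", "AAA333", "initial", "pass"),
        ("1001", "AAA444", "initial", "pass"),
        ("1001", "AAA111", "retest", "pass"),
        ("1002", "BBB111", "initial", "pass"),
        ("1002", "BBB222", "initial", "pass"),
        ("1002", "BBB333", "initial", "pass"),
        ("1002", "BBB444", "initial", "pass"),
    ],
)

# Initial-test failure rate per station, keeping only stations below a threshold.
QUERY = """
SELECT station_id,
       COUNT(*) AS inspected,
       100.0 * SUM(CASE WHEN result = 'fail' THEN 1 ELSE 0 END) / COUNT(*)
           AS fail_pct
FROM inspections
WHERE test_type = 'initial'
GROUP BY station_id
HAVING 100.0 * SUM(CASE WHEN result = 'fail' THEN 1 ELSE 0 END) / COUNT(*) < ?
ORDER BY station_id
"""


def stations_below(threshold_pct):
    """Return (station, vehicles inspected, failure rate) rows under the threshold."""
    return conn.execute(QUERY, (threshold_pct,)).fetchall()


FLAGGED = stations_below(5.0)
```

Changing the threshold or the grouping column is a one-line change, which is the flexibility the text attributes to a data base approach.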
3.4 Sampling
In manual (and in some instances, automated) systems it is not
always practical to use all the units of a population because of
its size. Here the analyst would want to draw a random sample in
order to characterize the population. In most semi- and fully-
automatic systems, analyses done for reports normally utilize
observations from the entire population. In this case sampling
is not necessary.
Random sampling is the process of choosing "n" units from a popu-
lation of "N" units in a manner such that each of the "N" units
has an equal probability of being chosen. When it is necessary
to sample, it is extremely important to obtain a random sample.
The consequences of not obtaining a random sample are biased
results.
As discussed in Section 2.1.2, there are methods of overcoming
sample biases. Essentially, this involves breaking down the sam-
ple into groups that are likely to influence the overall results.
For example, newer model vehicles are likely to have lower fail-
ure rates. Therefore, if the average failure rate is being cal-
culated from a sample, the analyst must break the sample down
into groups of similar model years. It is important, however,
that an adequate number of events exist for each group of model
years.
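A minimal sketch of drawing a random sample and then breaking it down by model-year group is given below. The text does not say how the group results should be recombined; weighting each group's failure rate by its share of the vehicle population is an assumption made here, and the group names and counts are hypothetical.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Choose n units so that each unit has an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def stratified_failure_rate(sample, population_counts):
    """Combine group failure rates, weighted by each group's population share.

    sample: list of (group, failed) pairs, where failed is True or False.
    population_counts: {group: number of vehicles in the whole population}.
    """
    total = sum(population_counts.values())
    estimate = 0.0
    for group, count in population_counts.items():
        results = [failed for g, failed in sample if g == group]
        if not results:
            raise ValueError(f"no sampled vehicles in group {group!r}")
        estimate += (count / total) * (sum(results) / len(results))
    return estimate


# Hypothetical sample that over-represents older vehicles.
SAMPLE = [("pre-75", True), ("pre-75", False), ("75+", False), ("75+", False)]
POPULATION = {"pre-75": 300, "75+": 700}
RATE = stratified_failure_rate(SAMPLE, POPULATION)
```

The function raises an error when a group has no sampled vehicles, which reflects the requirement that an adequate number of events exist for each group.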
3.4.1 Determining Sample Sizes
The following procedure describes a technique to determine the
sample size necessary for statistically valid results. It should
be noted that a sample must be taken for each group that is
analyzed. For example, if the emission levels are being calculated
for groups of makes and model years, then samples must be taken
for each group.
CALCULATING SAMPLE SIZES

1. Define degree of accuracy.

   o Select tolerance or difference (d) that the sample
     average is allowed to vary from the population
     average (e.g., 10%).

   o Select confidence (p) that the error will be less
     than d.
     - p usually = 95%

2. Take preliminary sample of size n' or use similar data
   previously collected.

3. Select the parameter $Z_p$ from Table 16.

   TABLE 16. STANDARD NORMAL DISTRIBUTIONS

   p (%)     Z_p
   95        1.64
   97.5      1.96
   99        2.33

4. Determine $\chi^2_{p,n'}$ as a function of n' and p (see
   Table 17).

5. Calculate sample size (n) per formula below where $X_i$ is
   the observed data from the preliminary sample.

   $$ n = \frac{Z_p^2 \left( \sum_{i=1}^{n'} X_i^2 - \frac{1}{n'} \left( \sum_{i=1}^{n'} X_i \right)^2 \right)}{d^2 \, \chi^2_{p,n'}} $$

A statistical derivation of this expression is given in Appendix
B.
TABLE 17. THE VALUES OF $\chi^2_{p,n'}$

                    p (%)
n'        99         97.5       95
2         0.00+      0.00+      0.00+
3         0.02       0.05       0.10
4         0.11       0.22       0.35
5         0.30       0.48       0.71
6         0.55       0.83       1.15
7         0.87       1.24       1.64
8         1.24       1.69       2.17
9         1.65       2.18       2.73
10        2.09       2.70       3.33
11        2.56       3.25       3.94
12        3.05       3.82       4.57
13        3.57       4.40       5.23
14        4.11       5.01       5.89
15        4.66       5.63       6.57
16        5.23       6.27       7.26
17        5.81       6.91       7.96
18        6.41       7.56       8.67
19        7.01       8.23       9.39
20        7.63       8.91      10.12
21        8.26       9.59      10.85
25       11.52      13.12      14.61
30       14.95      16.79      18.49
40       22.16      24.43      26.51
50       29.71      32.36      34.76
60       37.48      40.48      43.19
70       45.44      48.76      51.75
80       53.54      57.15      60.39
90       61.75      65.65      69.13
100      70.06      74.22      77.93
3.4.1.1 Example - Calculating a Sample Size to Determine Mean
HC Emissions
Suppose that it is required to determine the sample size in order
to get a close estimate of pre-repair HC emissions. Suppose
further that a small sample of 40 vehicles was taken for this
purpose, and that it is required to determine the emissions to
within ±10 ppm with probability 95%. With data from this
preliminary sample, the analyst may then proceed to calculate the
required sample size.
From Table 17:

   $\chi^2_{95,40} = 26.51$

From the sample the following items were calculated (see Table 18):

   $\sum_{i=1}^{40} X_i = 9583$

   $\sum_{i=1}^{40} X_i^2 = 2,587,039$

From Table 16:

   $Z_{95} = 1.64$

Therefore, the number of vehicles necessary for an accurate
prediction of HC emissions is

   $$ n = \frac{(1.64)^2 \left( 2587039 - \frac{(9583)^2}{40} \right)}{(10)^2 \, (26.51)} $$

   n = 295 vehicles
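The arithmetic of this example can be checked with a short Python sketch. The function and argument names are hypothetical; the inputs are the sums from Table 18 and the values read from Tables 16 and 17, and the expression is the one the example's numbers imply.

```python
def required_sample_size(sum_x, sum_x2, n_prelim, z_p, chi2, d):
    """Sample size from preliminary-sample sums.

    sum_x    : sum of the preliminary observations
    sum_x2   : sum of the squared preliminary observations
    n_prelim : preliminary sample size (n')
    z_p      : value from Table 16 for the chosen confidence p
    chi2     : value from Table 17 for the chosen p and n'
    d        : tolerance, in the same units as the observations
    """
    # Sum of squared deviations about the preliminary sample mean.
    corrected_ss = sum_x2 - sum_x ** 2 / n_prelim
    return z_p ** 2 * corrected_ss / (d ** 2 * chi2)


# Values from the worked example for pre-repair HC emissions.
N_REQUIRED = required_sample_size(
    sum_x=9583, sum_x2=2587039, n_prelim=40, z_p=1.64, chi2=26.51, d=10
)
```

The unrounded result is about 295.4, which the example reports as 295 vehicles; an analyst who prefers a conservative figure could round up instead.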
TABLE 18. PRELIMINARY SAMPLE OF HC EMISSIONS

Observation No.    HC = X (ppm)       X^2
 1                    239              57,121
 2                    348             121,104
 3                    299              89,401
 4                    263              69,169
 5                    257              66,049
 6                    313              97,969
 7                    255              65,025
 8                    207              42,849
 9                    263              69,169
10                    144              20,736
11                    263              69,169
12                    208              43,264
13                    114              12,996
14                    211              44,521
15                    273              74,529
16                    359             128,881
17                    360             129,600
18                    148              21,904
19                    229              52,441
20                     78               6,084
21                    367             134,689
22                     71               5,041
23                    177              31,329
24                     73               5,329
25                    233              54,289
26                    265              70,225
27                    291              84,681
28                    245              60,025
29                    268              71,824
30                    245              60,025
31                     72               5,184
32                    368             135,424
33                    141              19,881
34                    234              54,756
35                    222              49,284
36                    391             152,881
37                    315              99,225
38                    181              32,761
39                    331             109,561
40                    262              68,644

TOTALS              9,583           2,587,039
The analyst may find it useful to plot n versus d in order to
study the trade-off in sample size versus accuracy. Since n is
proportional to $1/d^2$, it may be possible to get a large
reduction in sample size while giving up very little accuracy.
This reduction in sample size may be unimportant for calculations
with a computer. But if it is desired to calculate how many random
emission checks must be performed to independently assess
emissions, substantial savings may be involved.
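The trade-off between n and d can be tabulated rather than plotted, as in the sketch below. It reuses the figures from the HC example in Section 3.4.1.1 as default arguments; the function name and the choice of tolerances are hypothetical.

```python
def sample_size(d, sum_x=9583, sum_x2=2587039, n_prelim=40,
                z_p=1.64, chi2=26.51):
    """Required sample size for tolerance d, using the HC example's inputs."""
    corrected_ss = sum_x2 - sum_x ** 2 / n_prelim
    return z_p ** 2 * corrected_ss / (d ** 2 * chi2)


# Required sample size for several tolerances (ppm HC).
TRADE_OFF = {d: sample_size(d) for d in (5, 10, 15, 20)}

for d, n in TRADE_OFF.items():
    print(f"d = +/-{d:>2} ppm  ->  n = {n:7.1f}")
```

Because n varies with the inverse square of d, doubling the tolerance from 10 ppm to 20 ppm cuts the required sample to one quarter.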
3.4.2 Determining Sample Size for Selection of Cutpoints
The previous procedure was for the general case where samples are
selected to calculate a parameter describing the population.
However, for the specific case of selecting a sample to determine
cut points the procedure is much easier. Again, samples must be
selected for each category of vehicles for which the cut points
apply (e.g., model year groupings). And the calculation for the
sample size must be done prior to sampling. This is to assure
that proper random sampling techniques are used.
The following formula can be used to calculate the required
sample size to determine the cutpoints for a group of vehicles.

   $$ n = \frac{4 \, p \, (1 - p)}{A^2} $$

where
   p = desired failure rate, and
   A = allowable error (in same units as p).

The statistical theory behind this formula is shown in Appendix
A.
3.4.2.1 Example - Sample Size for Cut Point Selection
Suppose it is desired to calculate the sample size necessary to
determine a 30 percent stringency level within an accuracy of 5
percent. It follows that:
   p = 0.3
   A = 0.05

   $$ n = \frac{4 \, (0.3) \, (1 - 0.3)}{(0.05)^2} = 336 \text{ vehicles} $$
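The result of 336 vehicles is consistent with the expression n = 4p(1 - p)/A^2, which the following sketch evaluates. The function name is hypothetical, and the expression is inferred from the numbers in this example.

```python
def cutpoint_sample_size(p, a):
    """Sample size for selecting cutpoints.

    p : desired failure rate, as a fraction (0.3 for 30 percent)
    a : allowable error, in the same units as p
    """
    return 4 * p * (1 - p) / a ** 2


# The example above: 30 percent stringency within 5 percent.
N_CUTPOINT = round(cutpoint_sample_size(p=0.3, a=0.05))
```

The product p(1 - p) is largest at p = 0.5, so a sample sized for a 50 percent failure rate would be adequate for any other stringency at the same allowable error.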
4.0 DATA COLLECTION AND HANDLING
The previous sections discussed what data are pertinent to an I/M
program. Once the data needs are identified, it is very impor-
tant to assure that the data are accurately collected and stored.
Consequently, a key aspect in the handling of data is to investi-
gate measures designed to reduce error. This section discusses
the collection, transfer, and maintenance of data.
4.1 Data Collection
There are three basic ways that data may be collected from an I/M
program:
1. Data can be manually recorded onto forms or
manually entered onto a data tape,
2. Data can be directly entered into a data
base, and
3. The results of the inspection can be automat-
ically recorded onto a data tape.
In an I/M program, it is likely that combinations of the above
methods will be used.
4.1.1 Data Collection in Decentralized Programs
Decentralized programs present unique problems for data collec-
tion. Because of the large number of stations and inspectors,
there are difficulties associated with even the most sophisti-
cated data collection methods. It is recommended that data be
collected in a machine readable format. This can be accomplished
by entering data directly onto a data tape or by filling out
"read forms."
4.1.1.1 Data Tapes
An approach taken by some of the upcoming I/M programs is to
record data from each inspection onto a data tape that is con-
tained within the analyzer. This method requires a combination
of manual and automatic data recording. For example, items such
as the license number and the facility identification number must
be keyed in manually; whereas, the emissions levels may be auto-
matically recorded.
The advantages of the internal data tape are three-fold:
fraudulent inspections are less likely to occur, manual error is
reduced because compliance is automatically determined, and the
storage of information is simplified. However, it is still
possible to incorrectly key in some data items such as license
number and model year.
4.1.1.2 Read Forms
The previously mentioned analyzers that have integral data
recording capabilities are more expensive than the average
analyzer. Consequently, the I/M program may want to opt for
strictly manual recording of data onto a form. In this case it is
recommended that "read forms" coded with a number two pencil be
used. These forms can be read by a machine and the data will be
automatically stored.
When designing read forms, it is important that they be:
o Easy to read,
o Easy to understand, and
o Contain no ambiguous information.
These forms should also have a fixed number of columns and only
certain allowable entries for fields such as model year. In
addition, it aids recordkeeping if the same number of digits is
used for certain key data items such as station number (e.g., all
station ID numbers should fall between 1000 and 9999). Simi-
larly, the boxes on the forms should have the right number of
spaces, and not just blank lines. Decimal points should also be
on the forms. Finally, methods such as shading could be used to
indicate which items must be filled at the initial versus retest.
Figure 4 is an example of a read form.
The I/M program that Colorado is in the process of implementing
(currently in the change of ownership phase) uses read forms.
Initial results have been promising and considerable analysis has
been performed (see Table 15). Colorado reports that there have
been some data collection problems, primarily from dirty or
stapled forms. Colorado stressed the completion of forms during
its inspector training program, and thus there have been few
problems with errors from improper completion.
4.1.1.3 Basic Forms
Although machine readable forms (or direct data entry) have
obvious advantages, an I/M program can be successfully operated
even if manually tabulated forms are used. The same considera-
tions that apply to machine readable forms apply to basic forms;
they should be easy to fill out and designed to minimize errors
(see 4.1.1.2 for design considerations). In addition, it is
advantageous to place in a prominent position certain key infor-
mation, such as: the vehicle license number, the pass/fail
results (including whether or not the vehicle was waived), the
emission levels, and the model year of the vehicle. A clear
layout of the form will simplify the manual tabulation of data. In
addition it will be easier to keypunch data from such a form.
Figure 5 is an example of a basic inspection form.

FIGURE 4. Example Machine Readable Vehicle Inspection Record
[Form with fill-in circles for: make (four-letter abbreviations
such as AMC, AUDI, BMW, CHEV, JEEP, SAAB), license, model year,
date of test, station number, first test emissions levels (% CO,
ppm HC, pass/fail), retest emissions levels (% CO, ppm HC,
pass/fail, waiver), odometer, and required repair (labor, parts).]
4.1.1.4 Data Recording Device
Radian has conceptually designed a device to aid in the recording
of data from an I/M program. This device is well suited for
decentralized programs and can be used with machine-readable or
basic forms. It is similar to a sales receipt handler, where
each form has two copies: one for the customer and one for
records. The device could be used as follows:
o A roll of inspection forms is inserted into
the device,
o The inspector fills out the form, turns the
crank, the copy for the customer separates
and the record copy goes into a box.
A template could be used with this device to aid in filling out
the form and to prevent it from getting dirty. This latter point
is important for machine-readable forms. In addition, the admin-
istrating agency could sell the forms (or the rolls of forms) to
the inspection station. This would simplify the tracking of the
serial numbers issued.
4.1.1.5 The Collection of Forms
Regardless of the method used to record the data, it is recom-
mended that the data be collected when program officials audit
the station. Thus, as a minimum, the stations must keep the data
until the next audit, although some data, particularly analyzer
quality control data should be kept for much longer periods. The
administrating agency should establish a schedule for the mainte-
nance of I/M data.
                          SAMPLE
                     MANUAL DATA FORM

STATION NUMBER ____________     LICENSE NUMBER ____________

DATE (Mo/Day/Yr) ___/___/___    YEAR ______

INSPECTOR'S INITIALS ______

FIRST TEST EMISSIONS LEVELS
   HC ________ ppm      CO ________
   COMPLIANCE STATUS:   Pass [ ]   Fail [ ]

RETEST INFORMATION
   REPAIR PERFORMED BY:
      Inspection station [ ]   Owner [ ]   Other [ ]
   REPAIR COST $ ________

RETEST EMISSIONS LEVELS
   HC ________ ppm      CO ________
   COMPLIANCE STATUS:   Pass [ ]   Fail [ ]   Waiver Issued [ ]

Figure 5. Sample Data Form
When the program officials are at the stations, they could review
the forms to make sure that they are properly filled out. In
addition, if manual data analysis were used it would be rela-
tively easy for the auditors to tally the results of the inspec-
tion and prepare periodic summary reports. The auditors could
organize the inspection records and other data by facility,
thereby preventing a mix-up of data from different facilities.
As mentioned earlier, the stations themselves could tally some of
the data, especially if tally sheets were provided by the
administrating agency (see Figure 2). These sheets could be
collected by the auditors (with or without the inspection
records).
4.1.2 Data Collection in Centralized Programs
Both manual and automated data collection are much easier in
centralized programs, mainly because of the smaller number of
stations. Furthermore, there is an additional data collection
option available to centralized programs - entering data directly
into a data base management system.
If data are collected manually, then the same considerations for
decentralized programs apply to centralized programs. However,
because there are fewer inspection facilities, manual data
analyses are greatly aided by the preparation of detailed summary
tally sheets (see Figure 6). This type of sheet will enable the
analyst to manually perform cross-tabulations such as reason for
and frequency of failure (by model year) that would otherwise be
very difficult to perform if each inspection record had to be re-
viewed. Therefore, in centralized programs considerable data
analyses can be performed, even if the data are recorded
manually.
FIGURE 6. [Form: OREGON -- LIGHT-DUTY VEHICLE TESTING SUMMARY.
Department of Environmental Quality, Vehicle Inspection Program,
testing summary for light duty vehicles at one location. Rows
give model-year groupings (through 70-71, 72-74, and 75 plus),
each with a total; columns give the number passed and the reason
for noncompliance (including HC, both, and smoke), with a total.
The form also records certificates issued, money collected, and
deposit slip numbers, and has spaces for "Summary Prepared By"
and "Approved By."]
Entry into a computer terminal is easier in centralized programs.
More sophisticated data entry techniques may be used, and error
checking is possible for some of the fields such as license num-
ber (particularly if the data are being entered directly into a
data base management system). However, the same general consid-
erations that apply to forms also apply to entry into a computer
terminal, that is, the entry format must be easy to read, easy
to understand, and contain no ambiguous information. A program
that calls for the entry of complex numeric codes without any
mnemonic meaning will have a much higher error rate. For exam-
ple, an operator would be more likely to know he had incorrectly
entered GM versus a code like 040.
In addition to operator error there are other sources of error
associated with the entry of data into a computer terminal.
Terminal malfunction such as key bounce in which a double charac-
ter is entered can create erroneous information. High quality
terminals will help minimize this type of error. Although the
cable that connects the emission test equipment and terminals
with the data base is usually considered to be an insignificant
portion of the system, unshielded or improperly matched cables
can introduce substantial error into the data base. Probably the
most catastrophic errors occur in the data base itself where in
one moment, thousands of records may be lost. Regarding this lat-
ter point, the I/M program may want to have hard copy backup data
on certain key items such as license number and the pass/fail
status.
As in decentralized programs, the administrating agency must set
up a schedule to collect data, although in many cases this will
be daily or weekly (or even continuously).
4.2 Transcription of Data
Most of the data collected needs to be transcribed or entered
into the data base. Most of the time this is a separate step
from the collection of data, although as mentioned above, some
data are immediately transcribed as they are collected (particu-
larly data entered into a real time data base management system).
Errors that occur during this step are called transcription
errors. These errors may be human, for example typos, or mechan-
ical. Major types of transcription include:
1. Read Forms Coded By Number Two Pencil - These
forms are easy to fill out, and by having a
fixed number of data entries they have inher-
ent self-checking mechanisms. Furthermore,
these forms can be fed directly into a compu-
ter; therefore, they have low processing
costs. Figure 4 is an example of one of
these forms.
2. Keypunching Field Records - In this method
handwritten records from the field are keypunched
onto computer cards and then fed into
the machine. This method has a higher error
rate than the previous method because of the
additional step in the handling of the data.
The error rate may be reduced by verifica-
tion, which consists of keypunching the
record twice and then comparing the two
cards. However, verification obviously
increases the time to enter a record by a
factor of two.
3. Written Records Entered into a Data Entry
Terminal - This method has a slightly lower
error rate than keypunching and it is easier
and quicker to correct the errors.
4. Direct Entry of Data into a Data Entry
Terminal - This method is similar to the
previous method except that it eliminates the
hardcopy (written) record. Consequently, it
could have a slightly lower error rate than
(3) but at the risk of losing the hardcopy
backup. If the data are directly entered
into an on-line data base system, error
checking, such as flagging an invalid license
number, is possible.
5. Fully Automatic - Obviously, in a fully auto-
matic system the data transcription occurs
automatically. The only errors that are
going to occur in this case are hardware
errors which can be minimized by having
backup storage systems. However, some data
items such as license numbers cannot be
automatically entered.
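The verification step described for keypunching (entering each record twice and comparing the two copies) can be sketched as follows. The field names and records are hypothetical; the same comparison would apply to any two independently keyed copies of a batch.

```python
def verify_double_keying(first_pass, second_pass):
    """Compare two independently keyed copies of the same batch of records.

    Returns a list of (record_index, field, first_value, second_value)
    for every field that differs between the two passes.
    """
    if len(first_pass) != len(second_pass):
        raise ValueError("both passes must contain the same number of records")
    mismatches = []
    for index, (a, b) in enumerate(zip(first_pass, second_pass)):
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                mismatches.append((index, field, a.get(field), b.get(field)))
    return mismatches


# Hypothetical batch keyed twice; the second operator transposed two digits.
FIRST = [
    {"license": "ABC123", "model_year": "76", "result": "F"},
    {"license": "XYZ789", "model_year": "79", "result": "P"},
]
SECOND = [
    {"license": "ABC123", "model_year": "76", "result": "F"},
    {"license": "XYZ789", "model_year": "97", "result": "P"},
]
MISMATCHES = verify_double_keying(FIRST, SECOND)
```

Only the mismatched fields need to be checked against the handwritten record, which limits the extra effort to the records where the two passes disagree.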
4.3 Quality Assurance of the Data Base
This section discusses quality control for the inspection/mainte-
nance data base. A number of checks for invalid data are pos-
sible. A printout should be produced of any invalid data
identified through automatic checks by the computer. Records
that have discrepancies should either be corrected or removed
from the data base. These checks fall in several categories, as
indicated below.
o Does the entry have a feasible and reasonable
value?
A check of this sort will determine if the
values entered for the different parameters
are within an expected range. For example, a
CO value of 150 percent would be unreason-
able. As discussed, a large number of
unfeasible or unreasonable values may be
eliminated by using forms with a fixed number
of columns and only certain allowable entries
per column. For example, it is not possible
to enter the impossible emission value, 150
percent CO, on such a form. An automatic or
manual check would be beneficial with these
forms, to determine whether no circles or
more than one circle were filled in under the
same column.
o Are the different entries on a single form
consistent?
Any consistency check which might reveal an
invalid data entry should be made. It should
be possible to automate these checks. The
following are examples of consistency checks:
- Is the model year consistent with the year of
the test (i.e., no more than one year greater
than the test year)?
- The pass/fail results should be consistent
with the emission levels. For example, if
the initial test results indicate compli-
ance, then the emission levels should be
within the standard.
- If a tampering inspection was performed and
the catalytic converter visual inspection
was failed, then the year and model of car
should actually have a catalytic converter
(1975 and later domestic vehicles).
o Are the entries on different forms for the
same vehicle consistent?
If the results of subsequent tests for the
same vehicle (identified by the license
number) are reported, then several entries on
different forms for the same vehicle should
be the same, including auto make and model
year.
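The three categories of checks above lend themselves to automation. The following is a minimal sketch, not a prescribed implementation: the record layout, field names, and the 0 to 100 percent CO range are assumptions made for illustration, and the catalytic converter check ignores the domestic/import distinction mentioned above.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class InspectionRecord:
    # Hypothetical record layout; a real program would use its own form fields.
    license_number: str
    make: str
    model_year: int
    test_year: int
    co_percent: float
    hc_ppm: float
    co_cutpoint: float
    hc_cutpoint: float
    passed: bool
    catalyst_inspection_failed: bool = False


def single_form_errors(rec: InspectionRecord) -> List[str]:
    """Range and consistency checks on the entries of one form."""
    errors = []
    # Feasibility: a CO reading such as 150 percent is impossible.
    if not 0.0 <= rec.co_percent <= 100.0:
        errors.append("CO reading outside feasible range")
    if rec.hc_ppm < 0:
        errors.append("HC reading is negative")
    # Model year may be at most one year greater than the test year.
    if rec.model_year > rec.test_year + 1:
        errors.append("model year more than one year after test year")
    # A passing result must agree with the emission levels.
    within = (rec.co_percent <= rec.co_cutpoint
              and rec.hc_ppm <= rec.hc_cutpoint)
    if rec.passed and not within:
        errors.append("passed but emissions exceed cutpoints")
    # Catalytic converters are expected on 1975 and later vehicles.
    if rec.catalyst_inspection_failed and rec.model_year < 1975:
        errors.append("catalyst failure reported for pre-1975 vehicle")
    return errors


def cross_form_errors(records: List[InspectionRecord]) -> List[str]:
    """Flag license numbers whose forms disagree on make or model year."""
    seen: Dict[str, Set[Tuple[str, int]]] = defaultdict(set)
    for rec in records:
        seen[rec.license_number].add((rec.make, rec.model_year))
    return [f"{lic}: conflicting make/model year across forms"
            for lic, values in sorted(seen.items()) if len(values) > 1]
```

Records for which either function returns a non-empty list would go on the printout of invalid data, to be corrected or removed from the data base.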
4.4 Cost of Data Processing
This section will discuss the cost elements associated with dif-
ferent methods of collecting and transcribing data. Since basic
hardware costs are dependent upon what is already available to
the administrating agency, these costs will not be directly
addressed. The cost for data collection and processing depends on
the method that is used to analyze data from an I/M program:
(1) manual, (2) computer assisted, and (3) computer assisted with
a data base management system (DBMS).
4-12
-------
4.4.1 Manual Data Processing
The primary costs of manual data processing are for the forms and
the data analyst. The cost for the data forms can be signifi-
cant; therefore, there needs to be a budget for forms. A gross
estimate for the cost of forms would be ten cents each.
It is very difficult to estimate the cost of a data analyst.
Without the assistance of a computer it would take an analyst
considerably more time to perform the previously mentioned analy-
ses. However, it is likely that programs using manual data
analysis will perform less analysis. Most programs budget for
one full-time data analyst regardless of the type of processing
that is used.
4.4.2 Computer Assisted Data Processing
4.4.2.1 Keypunched Forms
The costs for computer assisted data processing depend upon the
type of data entry. If forms are to be keypunched for entry into
the data base, then the following costs would be incurred:
o Forms
o Keypunching
o Programming
o Data analyst
o Computer time
As previously mentioned, a gross estimate of the cost of forms
would be ten cents. In addition, it is estimated that it would
take an operator approximately 30 seconds (including verifica-
tion) to keypunch the data items on a typical vehicle inspection
record. This would result in an additional five cents per form
(assuming $6 per hour wage).
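The arithmetic behind the five-cent keypunching figure can be checked with a short calculation. The function and constant names below are illustrative only; the inputs (30 seconds per record, $6 per hour, ten cents per form) are the gross estimates given in the text.

```python
def keypunch_cost_per_form(seconds_per_form: float, hourly_wage: float) -> float:
    """Operator labor cost, in dollars, to keypunch one form."""
    return hourly_wage * seconds_per_form / 3600.0


FORM_COST = 0.10  # gross estimate for one form, dollars

labor_cost = keypunch_cost_per_form(seconds_per_form=30, hourly_wage=6.0)
total_cost = FORM_COST + labor_cost
print(f"labor ${labor_cost:.2f} + form ${FORM_COST:.2f} = ${total_cost:.2f} per record")
```

A program can substitute its own wage rate and keying time to see how sensitive the per-record cost is to each.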
The cost for programming, data analysts, and computer time
depends upon the approach taken to analyze the data. If the pro-
gram purchases a standard statistical package (or leases one)
there will be lower costs for programming than if in-house pro-
gramming were used, although the computer costs are likely to be
roughly the same. One full-time data analyst should be budgeted
for most programs.
4.4.2.2 Machine Readable Forms
If the forms are machine readable, the program incurs a cost for
the forms reader instead of incurring costs for a keypuncher.
Forms readers can cost in excess of $100,000. Therefore, the
administrating agency should investigate the option of using
another agency's forms reader. For example, Colorado leases a
forms reader from a school district to read its data forms. The
cost of the forms themselves is similar to that of basic forms (i.e.,
10 cents each). Again, as with keypunched forms, the costs for
programming, data analysts, and computer time depend upon the
method used to analyze the data.
4.4.2.3 Data Tapes
Some analyzers have built-in data recording capabilities. Infor-
mation normally recorded on hard copy (e.g., vehicle information
records) is stored instead on a data tape. These data tapes are
periodically collected and processed by the administrating
agency.
I/M programs that use these analyzers (e.g., the New York I/M
program) obviously do not incur costs for keypunching or forms
reading. However, the initial cost for these analyzers can be
more than $1,000 higher than similar models without automated
data recording. The additional analyzer costs can be significant
in a decentralized program. Otherwise, the costs for this
approach are similar to the previous methods.
4.4.2.4 Entry into a Computer Terminal
The main difference between the above approaches and data entry
using a terminal is the cost of the terminal. An additional
cost would be the telephone lease lines used to connect the ter-
minal with the central data base. The costs for terminals are
highly variable and can be provided by manufacturers. Costs for
telephone lease lines are also variable, depending on the length.
(These costs can be provided by the phone company.) The other
considerations for this type of data entry are the same as for
machine-readable or keypunched forms (programming, analysts,
etc.). Because of the large number of terminals required, this
option would be much more expensive than the other options for
a decentralized program.
4.4.3 Data Processing With an On-Line Data Base Management
System
Like computer assisted data processing, the costs for data pro-
cessing with a data base management system depend upon the type
of data entry. The costs for the different data entry methods
are roughly the same as shown above (i.e., the costs for forms,
keypuncher, etc.). The costs for programming, data analysts, and
computer time depend greatly upon the existing set-up. If an
existing data base management system is available, then there are
fairly low software and start-up costs. The expense is probably
less than with a computer assisted data processing system, unless
the latter system uses a standard statistical package. However,
as previously mentioned, the cost to produce and set up an on-line
DBMS strictly for the purpose of handling I/M data is
probably considerably greater than the cost for the other data
processing systems that were previously discussed.
4-16
-------
APPENDIX A
DEVELOPMENT OF A FORMULA TO DETERMINE SAMPLE SIZE FOR
CUTPOINTS SELECTION
The following procedure can be used to determine the sample size needed to
estimate the HC and CO emission cutpoints. A failure probability is selected
a priori, and HC and CO cut points are selected which yield this failure
probability. Since one condition has been imposed on two cutpoints, they
are not determined uniquely; some freedom of choice remains.
Define:
n = sample size,
$\hat{p}$ = fraction failed as determined from the sample; the
cutpoints are chosen so that $\hat{p}$ equals the failure
probability selected beforehand,
p = true but unknown failure probability given the
selected cutpoints, and
$s_{\hat{p}}$ = standard error of $\hat{p}$.
The standard error of $\hat{p}$ is given by
$$s_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$$
Then, assuming n p > 5 and n (1-p) > 5, the error in $\hat{p}$ is approximately
normally distributed. Assuming n is reasonably large ($n \ge 30$), the following
is an approximate 95% confidence interval for p:
$$P\left(\hat{p} - 2 s_{\hat{p}} < p < \hat{p} + 2 s_{\hat{p}}\right) \approx .95$$
Thus, the half-width of the confidence interval is $2 s_{\hat{p}}$. Now, suppose it is
necessary to estimate p to within a specified accuracy, $\pm A$. To achieve this,
we require:
$$2 s_{\hat{p}} \le A$$
or
$$n \ge \frac{4 p (1-p)}{A^2} \approx \frac{4 \hat{p} (1-\hat{p})}{A^2}$$
It is clear that the required sample size n is very sensitive to the speci-
fied accuracy A. Improving the accuracy by a factor of two, for example,
increases the sample size by a factor of four.
A-2
-------
APPENDIX B
DERIVATION OF EXPRESSION FOR SAMPLE SIZE REQUIRED FOR MEAN ESTIMATION
This appendix provides a statistical derivation of the expression given in
Section 3.4.1 for the sample size required to compute an average with
specified accuracy. The expression applies in the general case; the
special case of estimating a failure probability is covered in Appendix A.
Suppose a preliminary sample of values, $X_i$, i = 1 to n', is available. Then
the following is an estimate of the population variance:
$$S^2 = \frac{\sum_{i=1}^{n'} X_i^2 - \left(\sum_{i=1}^{n'} X_i\right)^2 / n'}{n'-1} = \frac{SS_x}{n'-1}$$
where $SS_x$ is the numerator of the fraction above.
Being derived from a finite sample, $S^2$ is subject to sampling error. If $S^2$
happens to be lower than the true (but unknown) variance, an estimate of the
required sample size based on $S^2$ will be too low, and the desired accuracy
may not be achieved. It is possible, however, to use the $\chi^2$ statistic to
derive an upper confidence limit for the variance. Using the upper confidence
limit rather than $S^2$ provides a higher probability that the accuracy
requirements will be met. The following is a derivation of an upper confidence
limit for the true variance, $\sigma^2$:
$$P\left(\frac{SS_x}{\sigma^2} > \chi^2_{p,n'}\right) = p$$
where $\chi^2_{p,n'}$ is the value, from Table 16, which is exceeded by a $\chi^2$ random
variable for sample size n' (in statistical terms, with n'-1 degrees of
freedom) with probability p. Then
$$P\left(\frac{\sigma^2}{SS_x} < \frac{1}{\chi^2_{p,n'}}\right) = p$$
(note reversal of inequality)
and
$$P\left(\sigma^2 < \frac{SS_x}{\chi^2_{p,n'}}\right) = p$$
Thus, $SS_x / \chi^2_{p,n'}$ is an upper confidence limit for the true variance $\sigma^2$.
If p is chosen to be 95%, for example, then there is only a 5% chance that
the upper confidence limit will not exceed the true variance.
Now, if the sample size n to be determined is larger than 30, it is
reasonable to use the Z-statistic rather than the t-statistic to express
a confidence interval for the mean $\mu$, which is to be estimated. In the
case of interest in this report, n will be at least this large. The
confidence interval for the mean is
$$\bar{x} \pm Z_p \frac{S}{\sqrt{n}}$$
where $Z_p$ is a value obtained from Table 15. If p is 95%, for example, this
means that the true mean value $\mu$ falls in the confidence interval with
probability .95. Now, suppose we require a result accurate within ± d;
then n should be large enough so that
$$Z_p \frac{S}{\sqrt{n}} < d$$
or
$$\frac{Z_p^2 S^2}{d^2} < n$$
Substituting the upper confidence limit for $S^2$, we obtain the expression for
n given in Section 3.4.1:
$$\frac{Z_p^2}{d^2} \cdot \frac{SS_x}{\chi^2_{p,n'}} < n$$
Although the same symbol, p, has been used for the confidence levels for the
mean and the variance, different levels could theoretically be used.
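The calculation derived in this appendix can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: the function names are hypothetical, and the Z value and chi-square value are passed in as arguments because they are read from Tables 15 and 16 for the chosen confidence level and preliminary sample size. The numbers in the example are placeholders, not table values.

```python
import math
from typing import Sequence


def sum_of_squares(sample: Sequence[float]) -> float:
    """SS_x: sum of X_i**2 minus (sum of X_i)**2 / n' for the preliminary sample."""
    n_prelim = len(sample)
    total = sum(sample)
    return sum(x * x for x in sample) - total * total / n_prelim


def sample_size_for_mean(sample: Sequence[float], d: float,
                         z_p: float, chi2_p: float) -> int:
    """Smallest whole n with (Z_p**2 / d**2) * (SS_x / chi2_p) < n.

    z_p and chi2_p are assumed to come from the report's Tables 15 and 16.
    """
    upper_variance = sum_of_squares(sample) / chi2_p  # upper confidence limit
    bound = z_p ** 2 * upper_variance / d ** 2
    return math.floor(bound) + 1  # strict inequality: n must exceed the bound


# Placeholder inputs for illustration only (not values from Tables 15 or 16).
preliminary = [1.0, 2.0, 3.0, 4.0, 5.0]
print(sample_size_for_mean(preliminary, d=1.0, z_p=2.0, chi2_p=2.0))
```

Because the chi-square value divides SS_x, a small preliminary sample (and hence a small tabulated value at high confidence) inflates the upper variance limit and the required n, which is the conservative behavior the derivation intends.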
B-3
-------
APPENDIX C
COMPREHENSIVE LIST OF INSPECTION DATA
Vehicle - Facility identification
Test lane or analyzer #
Inspector-identification
Date of test
Test # (for reference)
License number or VIN
Model Year
Odometer
HC reading (idle, 2500 and/or loaded)
CO reading (idle, 2500 and/or loaded)
CO2 reading
CO + CO2 reading
Idle speed
Pass/fail
Test code (initial test or retest)
Test # of initial test (or retest) if
this is a retest record
Cutpoints
Engine type (Gas, Diesel, Rotary)
Fuel type (gasoline, diesel, LPG,
CNG, etc.)
Engine CID
Number of cylinders
Fuel system (FI, 4V, 2V, etc.)
Number of exhaust pipes
Make
Types of control devices
Vehicle type (LDV, LDT, HDG, etc.)
GVW class (less than 6,000,
6,000-8,500, greater than 8,500)
Compliance certificate issued (yes/no)
Compliance certificate serial number
Waiver certificate issued (yes/no)
Waiver certificate serial number
Inspection fee
Cost of repair (Identify parts and
labor)
Items replaced or repaired
Tire pressure (before and after)
Tampering (yes/no)
Items missing and/or inoperable
Propane gain specification
Propane gain reading
C-l
-------
Quality Control Data
Analyzer (weekly calibration) - Date of check
Facility ID
Inspector ID
Make, model and serial # of analyzer
Hexane factor
HC concentration (calibration gas)
CO concentration (calibration gas)
CO2 concentration (calibration gas)
Gas cylinder serial numbers
HC reading (analyzer response)
CO reading (analyzer response)
CO2 reading (analyzer response)
Compliance with gas (HC, CO, CO2; yes/no)
HC hang up check
Leakage rate
Compliance with leak check (yes/no)
Repairs/maintenance performed (yes/no)
Types of repairs/maintenance
Audit Data
Analyzer (monthly audit) - Date of check
Facility identification
Auditor identification
Make, model and serial # of analyzer
Hexane factor
HC concentration (calibration gas)
CO concentration (calibration gas)
CO2 concentration (calibration gas)
Gas cylinder serial numbers
HC reading (analyzer response)
CO reading (analyzer response)
CO2 reading (analyzer response)
Compliance with gas (HC, CO, CO2; yes/no)
HC hang up check
Leakage rate
Compliance with leak check (yes/no)
Repairs/maintenance performed (yes/no)
Types of repairs/maintenance
Calibration gas verification
compliance (yes/no)
Compliance with filter check (yes/no)
Acceptable location (yes/no)
V;ei tag issued (yes/no)
C-2
-------
Facility — Number of stickers/certificates on
hand
Serial number of stickers/certifi-
cates on hand
Number of stickers/certificates
issued to vehicles
Inspection and/or repair, calibration
and maintenance of analyzer required
records (yes/no)
Possess required equipment (yes/no)
Have certified inspector on payroll
(yes/no)
Possess required documents (yes/no) -
regulations, manuals, etc.
Inspector Audit - Inspector identification
Certified (yes/no)
Perform inspection properly (yes/no)
Perform calibration properly (yes/no)
Complete records properly (yes/no)
Enforcement Data
Number of citations issued for non-
compliance and number penalized
Number of inspections (from inspec-
tion records)
Number of vehicles registered
a. number waived
b. number exempt
c. number subject to I/M
I/M status of registered vehicles
a. data base maintained on
registered vehicles that
indicates I/M status.
Registration data on areas outside
the program area.
Number of penalties imposed on
inspection facilities
Other Data
Roadside checks (emphasize limited
value)
Challenge checks (provide additional
emphasis)
Other independent emission checks
Public complaints and other public
input
C-3
-------