United States Environmental Protection Agency
Office of Transportation and Air Quality
EPA420-B-04-010
July 2004

Guidance on Use of Remote Sensing for Evaluation of I/M Program Performance

Certification and Compliance Division
Office of Transportation and Air Quality
U.S. Environmental Protection Agency

NOTICE

This Technical Report does not necessarily represent final EPA decisions or positions. It is intended to present technical analysis of issues using data that are currently available. The purpose in the release of such reports is to facilitate an exchange of technical information and to inform the public of technical developments.

1. Introduction
2. Background
  2.1. History of I/M
3. General Approaches to I/M Program Evaluation
  3.1. Defining Program Evaluation
  3.2. On-Road Data Analysis
  3.4. Three RSD Program Evaluation Methods
4. Equipment Specifications and Measurement Procedures
  4.1. The Remote Sensing System
  4.2. Theory of Operation
  4.3. Operation
  4.4. Operational Difficulties
    4.4.1. Signal/Noise Considerations
    4.4.2. Weather
    4.4.3. Interference
    4.4.4. Optical Alignment
    4.4.5. Emissions Variability
  4.5. Instruments
    4.5.1. Calibration Checks
    4.5.2. Other Instrument Parameters
  4.6. Site Description
  4.7. Measurements
    4.7.1. Data Collection
    4.7.2. Multiple Measurements
    4.7.3. Operators
  4.8. Database Format
  4.9. Department of Motor Vehicle Data
  4.10. Note Any Changes that Could Affect the Analysis
5. Design Parameters and Quality Assurance/Quality Control Protocols
  5.1. Overview
  5.2. Vehicle Population
  5.3. Vehicle Loads
  5.4. Vehicle Identification
  5.5. Instrument Calibration
  5.6. Measurement Methods
  5.7. Socioeconomics
  5.8. Seasonal Effects
  5.9. Program Avoidance
  5.10. Regional Differences (policies, environment, fuel composition, etc.)
  5.11. Program Details
  5.12. Emissions Distributions
6. Evaluation Methods
  6.1. Step Change Method
    6.1.1. Description
    6.1.2. Application Examples
    6.1.3. Potential Systematic Errors
  6.2. Comprehensive Method
    6.2.1. Description
    6.2.2. Application Examples
    6.2.3. Steps
    6.2.4. Advantages/Disadvantages
    6.2.5. Potential Systematic Errors
    6.3.1. Description
    6.3.2. Application Examples
    6.3.3. Applying the Method
7. Summary
8. References
Appendix A: On-Road Evaluation of a Remote Sensing Unit

1. Introduction

This document is intended to provide guidance for performing I/M program evaluations using a Remote Sensing Device (RSD). The next section gives background on EPA regulation of state I/M programs and a history of the methods used to evaluate these programs. Section 3 describes different approaches to evaluating I/M programs, using roadside pullover data or independent remote sensing measurements. Section 4 summarizes equipment specifications and measurement procedures, while Section 5 outlines important design parameters for the collection and analysis of RSD program evaluation data. Section 5 also discusses quality control issues that should be considered in an evaluation. Section 6 describes in detail three alternative methods for performing short-term evaluations of I/M programs using remote sensing data and discusses the advantages and disadvantages of each. The three methods are the Step Change, the Comprehensive, and the Reference analysis methods. How in-program data can be used to evaluate the long-term, cumulative effect of I/M programs is covered in a separate document. Appendix A contains some simple troubleshooting methods that can be applied in the field as a first check to determine whether an RSD unit is functioning properly.
It is strongly recommended that any state considering the use of RSD for program evaluation purposes work closely with its regional EPA office and the Office of Transportation and Air Quality to ensure that the most up-to-date practices are incorporated into the evaluation. Furthermore, states interested in using RSD for program evaluation must recognize the need within their own agencies to develop a minimum level of expertise with the technology and procedures, so that reliable data are collected and sound analyses are performed. It should also be recognized that, given the difficulties associated with I/M program evaluations, an evaluation based on both out-of-program data (e.g., RSD) and in-program data will provide a more accurate estimate of overall program performance than relying on one method alone.

2. Background

2.1. History of I/M

The Environmental Protection Agency (EPA) has had oversight and policy development responsibility for vehicle inspection and maintenance (I/M) programs since the passage of the Clean Air Act (CAA) in 1970 (1), which included I/M as an option for improving air quality. The first I/M program was implemented in New Jersey in 1974 and consisted of an annual idle test of 1968 and newer light-duty gasoline-powered vehicles conducted at a centralized facility. No tampering checks were performed and no repair waivers were allowed.

I/M was first mandated for areas with long-term air quality problems beginning with the Clean Air Act Amendments of 1977 (2). EPA issued its first guidance for such programs in 1978 (3); this guidance addressed State Implementation Plan (SIP) elements such as minimum emission reduction requirements, administrative requirements, and implementation schedules. This original I/M guidance was quite broad and difficult to enforce, given EPA's lack of legal authority to establish minimum, Federal, I/M implementation requirements.
This lack of regulatory authority and the resulting state-to-state inconsistency in I/M program design were cited in audits of EPA's oversight of the I/M requirement conducted by both the Agency's own Inspector General and the General Accounting Office.

In response to the above-cited deficiencies, the 1990 Amendments to the Clean Air Act (CAAA) (4) were much more prescriptive with regard to I/M requirements while also expanding I/M's role as an attainment strategy. The CAAA required EPA to develop Federally enforceable guidance for two levels of I/M program: "basic" I/M for areas designated as moderate non-attainment, and "enhanced" I/M for serious and worse non-attainment areas, as well as for areas within an Ozone Transport Region (OTR), regardless of attainment status. This guidance was to include minimum performance standards for basic and enhanced I/M programs and was also to address a range of program implementation issues, such as network design, test procedures, oversight and enforcement requirements, waivers, funding, etc. The CAAA further mandated that enhanced I/M programs were to be: annual (unless biennial was proven to be equally effective), centralized (unless decentralized was shown to be equally effective), and enforced through registration denial (unless a pre-existing enforcement mechanism was shown to be more effective).

In response to the CAAA, EPA published its I/M rule on November 5, 1992 (5), which established the minimum procedural and administrative requirements to be met by basic and enhanced I/M programs. This rule also included a performance standard for basic I/M based upon the original New Jersey I/M program and a separate performance standard for enhanced I/M, based on the following program elements:

* Centralized, annual testing of MY 1968 and newer light-duty vehicles (LDVs) and light-duty trucks (LDTs) rated up to 8,500 pounds GVWR.
* Tailpipe test: MY1968-1980 - idle; MY1981-1985 - two-speed idle; MY1986 and newer - IM240.
* Evaporative system test: MY1983 and newer - pressure; MY1986 and newer - purge test.
* Visual inspection: MY1984 and newer - catalyst and fuel inlet restrictor.

Footnote: References are denoted by underlined italic numerals in parentheses and are listed in Section 8.

Note that the phrase "performance standard" used above was initially used in the CAA and is misleading in that it more accurately describes program design. Adhering to the "performance standard" does not guarantee that an I/M program will meet a specific level of emissions reductions. The performance standard is therefore not what is required to be implemented; it is the bar against which a program is to be compared.

At the time the I/M rule was published in 1992, the enhanced I/M performance standard was projected to achieve a 28% reduction in volatile organic compounds (VOCs), a 31% reduction in carbon monoxide (CO), and a 9% reduction in oxides of nitrogen (NOx) by the year 2000 from a No-I/M fleet. The basic I/M performance standard, in turn, was projected to yield a 5% reduction in VOCs and a 16% reduction in CO. These projections were based upon computer simulations run using 1992 national default assumptions for vehicle age distributions, mileage accumulation, fuel composition, etc., and were performed using the most current emission factor model then available for mobile sources, MOBILE 4.1. That version of the MOBILE model was the first to include a roughly 50% credit discount for decentralized I/M programs, based upon EPA's experience with the high degree of improper testing found in such programs. This discount was incorporated into the 1992 rule, and served to address the CAAA's implicit requirement that EPA distinguish between the relative effectiveness of centralized versus decentralized programs.
The CAAA also required that enhanced I/M programs include the use of on-road testing and that they conduct evaluations of program effectiveness biennially (though no explicit connection was made between these two requirements). In establishing guidelines for the program evaluation requirement, the 1992 I/M rule specified that enhanced I/M programs were to perform separate, state-administered or observed IM240s on a random sample of 0.1% of the subject fleet in support of the biennial evaluation. Unfortunately, the program evaluation procedure for analyzing the 0.1% sample was never developed in sufficient detail to actually be used by the states. In defining the on-road testing requirement, the 1992 rule required that an additional 0.5% of the fleet be tested using either remote sensing devices (RSD) or roadside pullovers. Furthermore, the role that this additional testing was to play (i.e., whether it was to be used to achieve emission reductions over and above those ordinarily achieved by the program, or whether it could be used to aid in program evaluation) was never adequately addressed.

At the time the 1992 I/M rule was being promulgated, EPA was criticized for not considering alternatives to the IM240. California in particular argued in favor of the Acceleration Simulation Mode (ASM) test, a steady-state, dynamometer-based test developed by California, Sierra Research, and Southwest Research Institute. In fact, this test had been considered by EPA while the I/M rule was under development, but the combination of IM240, purge, and pressure testing was deemed sufficiently superior to the ASM that EPA dismissed ASM as an option for enhanced I/M programs. Nevertheless, EPA continued to evaluate the ASM test in conjunction with the State of California, and by early 1995 sufficient data had been generated to support EPA's recognizing ASM as an acceptable program element for meeting the enhanced performance standard.
In early 1995, when the ASM test was first deemed an acceptable alternative to IM240, the presumptive 50% discount for decentralized programs was still in place. Even at that time, however, the practical importance of the discount was waning, in large part due to program flexibility introduced by EPA aimed at allowing enhanced I/M areas to use their preferred decentralized program designs. This flexibility was created by replacing the single enhanced I/M performance standard with a total of three enhanced performance standards:

* High Enhanced: Essentially the same as the enhanced I/M performance standard originally promulgated in 1992.
* Low Enhanced: Essentially the basic I/M performance standard, but with light trucks and visual inspections added. This standard was intended to apply to those areas that could meet their other clean air requirements (i.e., 15%, post-1996 ROP, attainment) without needing all the emission reduction credit generated by a high enhanced I/M program.
* OTR Low Enhanced: Sub-basic. Intended to provide relief to those areas located inside the OTR which, if located anywhere else in the country, would not have to do I/M at all.

Despite the additional flexibility afforded enhanced I/M areas by the new standards outlined above, in November 1995 Congress passed and the President signed the National Highway System Designation Act (NHSDA) (6), which included a provision that allowed decentralized I/M programs to claim 100% of the SIP credit that would be allowed for an otherwise comparable centralized I/M program. These credit claims were to be based upon a "good faith estimate" of program effectiveness, and were to be substantiated with actual program data 18 months after approval. The evaluation methodology to be used for this 18-month demonstration was developed by the Environmental Council of the States (ECOS), though the criteria used were primarily qualitative, as opposed to quantitative.
As a result, the ECOS criteria developed for the 18-month NHSDA evaluations were not deemed an adequate replacement for the biennial program effectiveness evaluation required by the CAAA and the I/M rule.

In January 1998, EPA revised the I/M rule's original provisions for program evaluation by removing the requirement that the evaluation be based on IM240 or some equivalent mass-emission transient test (METT) and replacing it with the more flexible requirement that the program evaluation methodology simply be "sound" (7). In October 1998, EPA published a guidance memorandum that outlined what the Agency considered to be acceptable, "sound," alternative program evaluation methods (8). All the methods approved in the October 1998 guidance were based on tailpipe testing and required comparison to Arizona's enhanced I/M program as a benchmark, using a methodology developed by Sierra Research under contract to EPA. Even though EPA recognized that an RSD-based program evaluation method might be possible, a court-ordered deadline of October 30, 1998 for release of the guidance prevented EPA from approving an RSD-based approach at that time.

The focus of this document is to address EPA's concerns about RSD-based program evaluation methods with regard to equipment specifications, site selection, and data collection, and to outline and explain the advantages and limitations of each RSD analysis methodology.

As its operating premise, EPA recognizes that every program evaluation method will have its limitations, regardless of whether it is based upon an RSD approach or on more traditional, tailpipe-based measurements. Therefore, no particular program evaluation methodology is viewed as a "golden standard." Ideally, each evaluation method would yield similar conclusions regarding program effectiveness, provided they were performed correctly.
Unfortunately, such agreement among methods is unlikely in actual practice, because different evaluation procedures are likely to be biased toward different segments of the in-use fleet. Therefore, it is conceivable that the most accurate assessment of I/M program effectiveness will result from evaluations that combine multiple program evaluation methods.

3. General Approaches to I/M Program Evaluation

3.1. Defining Program Evaluation

Aside from the technical challenges involved in gathering I/M program evaluation data, there are also subtleties regarding which data are necessary that must be understood. The evaluation of Basic I/M programs is strictly qualitative, as per the standard SIP policy protocols used to evaluate stationary source emission reductions. Historically, these types of qualitative evaluations have included verification of such parameters as waiver rates, compliance rates, and quality assurance/quality control procedures, but they have not involved quantitative estimates of emission reductions using in-program or out-of-program data.

The evaluation of Enhanced I/M programs is not as clearly defined and is left to the discretion of the Regional EPA office based on the data available. In some instances, it may be possible to estimate the cumulative emission reductions; that is, the current fleet emissions are compared to what that same fleet's emissions would be if no I/M program were in existence. However, directly measuring the fleet's emissions to determine the No-I/M baseline is not possible in an area that has implemented an I/M program. Therefore, in order to determine quantitatively whether the level of SIP credit being claimed is being achieved in practice, it becomes necessary either to rely on modeling projections to estimate the No-I/M fleet emissions or to measure the emissions of a surrogate fleet that is representative of the I/M fleet.
The RSD procedures outlined in this guidance provide methods for estimating a fleet's No-I/M emissions using a surrogate fleet.

Two other analyses are also possible that can provide useful information regarding program performance. The first method may be thought of as "one-cycle," since it compares the current I/M fleet emissions to the same I/M fleet's emissions from a previous year or cycle. An analysis such as this would yield information on how the program is improving or declining from year to year. The other method should be considered "incremental," in that it compares the current I/M fleet's emissions to that same fleet's emissions while being subjected to a different I/M program; for instance, comparing a fleet's emissions in an area that has just implemented an IM240 program to that same fleet's emissions the previous year, when a Basic Program was in operation.

It should be noted that there is a small window of opportunity prior to and during the start-up of any I/M program, or program change, to actually measure the fleet emissions that would provide empirical data on the No-I/M fleet emissions. If resources and time permit, it is recommended that these baseline data be gathered in order to reduce the dependency of I/M program evaluation on modeling projections and to provide the most accurate measure of I/M program performance.

3.2. On-Road Data Analysis

Remote sensing measurements can be used as a tool to help achieve the main goal of all I/M programs, namely the reduction of on-road emissions. The general advantages of remote sensing data are the following:

i) The testing is unscheduled and measures on-road emissions.
ii) A sample of all vehicles driving in an area can be tested.
iii) A very large sample of vehicles can be tested for a fraction of the cost of I/M lane testing.
iv) Vehicles can be tested over a range of driving conditions, rather than merely the conditions specified in the I/M test*.
v) Vehicles that are often not tested due to condition, size or special dynamometer requirements (heavy duty vehicles, vehicles considered unsafe to test, vehicles requiring four-wheel-drive dynamometers) can be measured.
vi) The on-road data can evaluate the extent to which owners are repairing their vehicles prior to emission testing. This is a program benefit that cannot be easily measured by means of in-program data without the use of surveys.
vii) RSD measurements can be directly converted to mass emissions per volume or mass of fuel burned and may be used to develop emission inventories independent of models.

* By measuring vehicles on-road, RSD has the ability to measure vehicle performance at high power, "off-cycle" conditions that cannot be readily measured on a dynamometer because of tire slip, tire damage, safety concerns, vehicle owner concern and damage claims. Although off-cycle emissions are not regulated by the vehicle certification process, and their measurement may not be desired for I/M evaluation, they may be an important component of estimating the mobile source inventory. Therefore, on-road measurement of high power, "off-cycle" performance may be used to develop a complete emissions inventory and to assess the effectiveness of repairs under "off-cycle" conditions.

In a well-designed remote sensing program, roadway grade and environmental conditions at the measurement site, as well as vehicle speed and acceleration, will be measured and used to calculate the vehicle load for each individual emissions measurement. Analyses can then be performed on a subset of measurements with a distribution of loads similar to that encountered by a single vehicle on the program's I/M test. In addition, by employing careful site selection criteria, remote sensing has the potential to measure emissions under driving modes not currently incorporated into I/M tests.

Emissions measured by remote sensing instruments, and in idle and ASM tests, are reported in terms of concentration of total exhaust. Remote sensing data, then, can be directly compared with emissions results from I/M programs utilizing idle or ASM testing. However, some enhanced I/M programs measure mass emissions and report emission results in grams per mile. Remote sensing concentration measurements can be converted to grams per gallon, using combustion chemistry equations, and then to grams per mile, using an estimate of the instantaneous fuel economy (miles per gallon) of the vehicle at the time of measurement. The accuracy of the conversion from emissions concentration to grams per mile depends on the accuracy of the estimate of instantaneous fuel economy. Fuel economy varies by vehicle type, technology and age, as well as by vehicle load, thus complicating the conversion.

Areas conducting IM240 or ASM testing should plot mean RSD emissions against mean initial IM240 or ASM emissions by vehicle type and model year. These plots typically show a linear relationship with high correlation coefficients and can be used to establish a direct relationship between the RSD measurements and the I/M test results. IM240 program data also include CO2 emissions and thus can be directly converted to emissions per gallon and compared to on-road data. These comparisons have been published and show R² generally greater than 0.95, although the slopes and intercepts are not 1.0 and 0.0 (10).

In particular, remote sensing data can be used in several ways to evaluate the effectiveness of an I/M program:

i) Remote sensing programs measure vehicles at different times relative to their last I/M test. Therefore, remote sensing data can be used to estimate how quickly repair effectiveness diminishes over time and how much repair is made just prior to the I/M test, as well as to track changes in fleet emissions due to changes in test procedures.
ii) Remote sensing programs measure almost every vehicle that drives by the instrument, regardless of whether it is participating in the I/M program. Remote sensing data therefore can be used to estimate the number and emissions of vehicles legally exempted from, or illegally avoiding, the I/M program. In addition, remote sensing data can identify individual vehicles that never complete the current I/M cycle, or that do not report for testing in a subsequent test cycle, but are still being driven in the I/M area.

However, as with in-program data, there are inherent limitations to RSD data.

i) The primary objection raised by opponents of RSD is that it must be assumed that a one-second snapshot of the vehicle's emissions is characteristic of that vehicle's emission profile.
ii) Fleet coverage is also a realistic concern, as it is often difficult to obtain readings on more than 50% of the fleet, which means that there may not be any emission readings for half of the vehicle population.
iii) The quality control and quality assurance aspects of RSD data collection and analysis have not been as well documented as those for traditional tailpipe testing.

Random roadside pullover testing has similar advantages to remote sensing: the test is unscheduled, and vehicles can be tested at different times relative to their last I/M test. However, roadside testing programs may be more expensive and time-consuming than some remote sensing programs, and so many fewer vehicles can be tested. California has operated a roadside pullover testing program for several years. An advantage of roadside testing is that the vehicles can be tested using the same test methods as those employed in the I/M program. They can also be inspected for visual or functional failures.
However, the sample of vehicles participating in the California roadside testing program may not reflect the on-road fleet, since participation in the program is not mandatory, and it is also difficult to verify that vehicle selection is unbiased. Furthermore, roadside pullovers are politically unacceptable in many areas.

3.4. Three RSD Program Evaluation Methods

In this document three methods, not necessarily exclusive, of using remote sensing data to analyze I/M program effectiveness are discussed. These are the Step Change, the Comprehensive, and the Reference Methods.

The Step Change and Comprehensive evaluation methods are quite similar. Remote sensing measurements are made on a fleet of vehicles in an I/M area. The fleet is then divided into two sub-fleets, based on whether or not individual vehicles have been tested under the current I/M program. The emissions of the two sub-fleets are then compared, after accounting for differences in vehicle type and age. The difference in the emissions of the tested fleet and the untested fleet is the apparent benefit of the I/M program in reducing emissions.

The primary difference between the two methods is the number of remote sensing measurements required. The Step Change Method can be performed using a relatively small number of measurements, on the order of 20,000 to 50,000. The Comprehensive Method requires many more remote sensing measurements (several million in the Phoenix example) in order to perform the detailed analyses of program effectiveness. Collecting this much remote sensing data can be relatively expensive; however, if such data are already being collected as part of another program (such as a Clean Screen program), the additional cost of analyzing the data is minimal. The drawback of the Step Change and Comprehensive Methods (aside from the general concerns with regard to RSD mentioned above) is that they only measure the effect of incremental changes in I/M programs unless repeated year after year.
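The sub-fleet comparison described above can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not a procedure prescribed by this guidance: the record layout, the function name stratified_mean_difference, the use of model year alone to account for vehicle age, and the weighting of each model year by its number of measurements are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def stratified_mean_difference(records):
    """Estimate the apparent I/M benefit from RSD readings.

    records: iterable of (model_year, tested, emission) tuples, where
    `tested` is True if the vehicle has been tested under the current
    I/M program. This layout is an assumption made for illustration.

    Within each model year the mean emission of the tested sub-fleet is
    subtracted from that of the untested sub-fleet. The per-year
    differences are then averaged, weighted by the number of readings
    in that model year (an assumed weighting). Model years that lack
    either sub-fleet are skipped.
    """
    by_year = defaultdict(lambda: {True: [], False: []})
    for model_year, tested, emission in records:
        by_year[model_year][bool(tested)].append(emission)

    weighted_sum = 0.0
    total_weight = 0
    for groups in by_year.values():
        tested, untested = groups[True], groups[False]
        if not tested or not untested:
            continue  # no comparison possible for this model year
        diff = sum(untested) / len(untested) - sum(tested) / len(tested)
        weight = len(tested) + len(untested)
        weighted_sum += diff * weight
        total_weight += weight

    if total_weight == 0:
        raise ValueError("no model year contains both sub-fleets")
    return weighted_sum / total_weight


# Invented example readings (e.g. % CO); not measured data.
sample = [
    (1990, True, 1.0), (1990, False, 2.0),
    (1995, True, 0.5), (1995, False, 0.5),
    (1995, True, 0.5), (1995, False, 0.5),
    (2000, True, 0.1),  # skipped: no untested 2000 vehicles
]
benefit = stratified_mean_difference(sample)
```

Comparing within model year before averaging keeps a difference in fleet age between the two sub-fleets from being mistaken for a program effect. A real analysis would also stratify by vehicle type and would report the uncertainty of the estimate.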
The Reference Method is designed to measure the full effect of an I/M program on a vehicle fleet, by comparing the emissions of a fleet subject to I/M with estimated fleet emissions if no I/M program were in place. The accuracy of the Reference Method hinges on the ability to find a fleet in a non-I/M area as similar to the I/M area fleet as possible. Because vehicle emissions are quite variable, both between vehicles and within an individual vehicle, and because many differences between vehicle fleets and their environment can affect vehicle emissions, finding a suitable reference area can be challenging. One way to determine the degree of bias in the reference fleet is to obtain data from a second reference fleet; if there are few biases, the two reference fleets should look the same. The Reference Method can also be used to compare the impact of two I/M programs in different locales. Although this will provide a relative comparison between two programs, it will not provide any data to compare an I/M program to a No-I/M fleet.

Figure 3.1 below illustrates some of these differences.

Figure 3.1. I/M Program Evaluation Methods Using Remote Sensing Data
[Figure: program timeline spanning Basic I/M and Enhanced I/M, annotated as follows.]
Reference Method: compares emissions of vehicles in an I/M program [tested fleet] with those of vehicles not in an I/M program [reference fleet].
Step Method: compares emissions in one cycle [tested fleet] with those in previous cycle [untested fleet]. Measures effect of incremental changes to program. Because untested fleet measured later in cycle than tested fleet, may overstate incremental program effect.
Comprehensive Method: compares emissions of fleet at different points in I/M cycle. Measures effect of pre-test repair, delay in post-test repair, and emissions deterioration over time.

4. Equipment Specifications and Measurement Procedures

4.1. The Remote Sensing System

Figure 4.1 shows a generic diagram of an RSD system which measures CO, CO2, HC, NO, and smoke opacity, set up along a single lane of road. The make and model year of the vehicle are identified from the video picture.

Figure 4.1: RSD Operational Diagram
[Figure: labeled components are weather station, emissions detector, IR/UV source, calibration gas, and license plate video.]

4.2. Theory of Operation

Remote Sensing Devices have been designed to emulate the results one would obtain using a conventional exhaust gas analyzer. Because the effective plume path length and amount of plume seen depend on turbulence and wind, one can only determine ratios of CO, HC, or NO to CO2. Assuming complete and instantaneous mixing, these ratios, Q for CO/CO2, Q' for HC/CO2, and Q" for NO/CO2, are constant for a given exhaust plume. By themselves, Q and Q' are useful parameters with which to describe the combustion system. When the corresponding combustion equations are solved, many components of the vehicle operating characteristics can be determined, including the instantaneous air/fuel ratio and the % CO, % HC, and % NO which would be read by a tailpipe probe. The equations given below are based upon a carbon mass balance and make use of the fact that the IR HC analysis method only measures about one half of the carbon which would be measured by means of an FID, for instance.
    % CO2 = 42/(2.79 + 2Q + 0.84Q')
    % CO = Q * (% CO2)
    % HC = Q' * (% CO2)
    % NO = Q" * (% CO2)

To derive mass emissions in g/gal of fuel from Q and Q', a fuel density of 0.75 g/mL and a carbon-hydrogen ratio of 1:2 are assumed, to yield:

    CO2 mass emission (g/gal) = 8922/(1 + Q + 6Q')
    CO mass emission (g/gal) = 5678*Q/(1 + Q + 6Q')
    HC mass emission (g/gal) = 8922*2*Q'/(1 + Q + 6Q')
    NO mass emission (g/gal) = 6083*Q"/(1 + Q + 6Q')

The vehicle's instantaneous air to fuel ratio is

    A/F by mass = 4.93(3 + 2Q)/(1 + Q + 6Q')

All diesel and most gasoline powered vehicles show a Q and Q' near zero since they emit little to no CO or HC. To observe a Q greater than zero, the engine must have a fuel-rich air/fuel ratio and the emission control system, if present, must not be fully operational (11). In the case of diesel combustion, misfire causes high HC readings. Since the overall air/fuel ratio is very lean, even when over-fueling and sooting are taking place, CO emissions only arise from pockets of incomplete combustion, and are limited to about 3% CO, compared to a broken gasoline-powered vehicle which can exceed 12% CO.

Recently, the ability to measure nitric oxide (NO) has been added to the existing IR capabilities. The light source, across the road, now contains a deuterium or xenon arc lamp and an IR/UV beam-splitter, mounted in such a manner that the net result from the source is a collimated beam of UV and IR light. As with CO and HC measurements, the NO measurements are possible by ratioing to the CO2 measured in the plume.

Each pollutant except HC is a specific gas which can be measured and calibrated unambiguously. Exhaust HC is a very complex mixture of oxygenated and unoxygenated hydrocarbons. The filter chosen measures carbon-hydrogen stretching vibrations, which are present, but not equally, in all HC compounds.
This system can easily distinguish gross polluters from low emitters, but the results on an individual vehicle cannot be expected to correlate perfectly with a flame ionization detector, with ozone-forming reactivity, or with air toxicity, since the three are not correlated with one another. For large sample sizes, the fleet average emissions correlate well with IM240 g/mi measurements (12). Newer technologies, such as tunable diode lasers, may also be used in place of the UV/IR detectors described above.

4.3. Operation

When a motor vehicle passes through the beam of a calibrated instrument on the road, the computer detects the blocked intensity of the reference beam. This causes the previous 200 ms of data (20 points) to be stored in memory as the "before car" buffer. The blocked voltages are continuously interrogated both to remember the lowest values (zero offset) and to look for a beam unblock signal. When an unblock signal is recognized, the video picture is frozen into the video screen memory and thus goes to the image recorder, and the next 50 data points (1/2 sec of exhaust) are placed in a data table. The zero offsets are subtracted from all data. The data stream is interrogated for the highest CO2 voltage. This is the least polluted 10 ms average seen during the 0.7 sec of data devoted to this vehicle. This set of data (often, but not always, in the before car buffer) then becomes the "clean air reference" (CAR) against which all other data are compared. After all signals have been ratioed to the reference channel, and the results ratioed to the CAR result for that channel, one has a set of 50 post-car, corrected, fractional transmissions, which are converted to the gas concentrations that would have been observed in a gas analyzer. These concentrations are then correlated to CO2, and the slope and the error of the slope are determined.
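The correlation step described above (each pollutant's concentration regressed against CO2 to give a slope and its error) might be sketched as follows. This is only an illustration under stated assumptions, not any RSD vendor's algorithm: it assumes an ordinary least-squares fit with a free intercept, and the function name `slope_with_error` and the sample readings are hypothetical.

```python
import math

def slope_with_error(co2, pollutant):
    """Least-squares slope of pollutant vs. CO2 and the standard error of that slope.

    Assumption: ordinary least squares with a free intercept; the guidance
    does not say which regression form the instrument software uses.
    """
    n = len(co2)
    if n != len(pollutant) or n < 3:
        raise ValueError("need at least 3 paired data points")
    mean_x = sum(co2) / n
    mean_y = sum(pollutant) / n
    sxx = sum((x - mean_x) ** 2 for x in co2)
    if sxx == 0:
        raise ValueError("no CO2 variation, so no measurable plume")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(co2, pollutant))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(co2, pollutant))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    return slope, std_err

# Hypothetical post-car readings: CO is exactly 5% of CO2 at every point
co2_readings = [0.2, 0.5, 1.0, 1.5, 2.0]
co_readings = [0.01, 0.025, 0.05, 0.075, 0.10]
q, q_err = slope_with_error(co2_readings, co_readings)
```

A software trap of the kind mentioned in this section could then compare `q_err` to a preset limit and flag the reading as invalid when the limit is exceeded.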
These slopes (the ratios of the pollutants to CO2) are corrected by the correction factors determined for that time by means of roadside calibration. The corrected slopes are the Q, Q' and Q" described earlier. The data obtained for each vehicle thus provide three pollutant ratios. The RSD software then solves the combustion equation for the measured pollutant ratios, compares the errors to preset error limits, and, if they are acceptable, reports the measurements as % CO, % CO2, % HC, and % NO such as would be measured by a tailpipe probe, with the results corrected for water and for any excess air that may not have participated in combustion. Because the instrument is calibrated with propane, percent HC is reported as propane; however, other HC species such as hexane or 1,3-butadiene could be used for this purpose as well.

The four derived concentrations (% CO, % HC, % NO, and % CO2) are placed on the video output together with the vehicle image (which has been waiting without results for about 0.7 sec). This image stays on the screen until the next vehicle comes by to repeat the process.

If these results are to be compared to vehicles of known emissions, or to gas cylinders puffed into the beam, it is important to compare the three ratios and not the four derived concentrations, since there are not actually four independent pieces of information. For example, if a person blocks the beam and exhales into it during the 1/2 sec after they have unblocked the beam, the computer sees the exhaled CO2, finds no CO, HC, or NO, and reports zeros for those pollutants and about 15% CO2. Exhaled breath rarely contains even 2% CO2, but the system measures only the ratios and assumes (incorrectly in this case) that the emissions are from a fully stoichiometric automobile using gasoline as fuel. A puff from a cylinder that contains 50% CO and 50% CO2 would be read as 8.6% CO and 8.6% CO2, because the ratio is what is measured, not the absolute concentrations.
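The conversion from the three measured ratios to reported concentrations can be illustrated with a short sketch of the Section 4.2 equations. The function names are hypothetical, the sign of the 0.84Q' term is assumed to be positive, and the water and excess-air corrections mentioned above are not modeled.

```python
def tailpipe_equivalents(q, q_hc, q_no):
    """Convert ratios (CO/CO2, HC/CO2, NO/CO2) to tailpipe-equivalent percentages.

    Assumption: %CO2 = 42 / (2.79 + 2Q + 0.84Q'), with a positive HC term.
    """
    pct_co2 = 42.0 / (2.79 + 2.0 * q + 0.84 * q_hc)
    return {
        "pct_co2": pct_co2,
        "pct_co": q * pct_co2,
        "pct_hc": q_hc * pct_co2,
        "pct_no": q_no * pct_co2,
    }

def mass_emissions_g_per_gal(q, q_hc, q_no):
    """Mass emissions per gallon of fuel (0.75 g/mL fuel, C:H ratio of 1:2)."""
    denom = 1.0 + q + 6.0 * q_hc
    return {
        "co2": 8922.0 / denom,
        "co": 5678.0 * q / denom,
        "hc": 8922.0 * 2.0 * q_hc / denom,
        "no": 6083.0 * q_no / denom,
    }

def air_fuel_ratio(q, q_hc):
    """Instantaneous air-to-fuel ratio by mass."""
    return 4.93 * (3.0 + 2.0 * q) / (1.0 + q + 6.0 * q_hc)

# Exhaled breath: CO2 only, so all three ratios are zero
breath = tailpipe_equivalents(0.0, 0.0, 0.0)
# Cylinder puff with equal parts CO and CO2: Q = 1
puff = tailpipe_equivalents(1.0, 0.0, 0.0)
```

With all ratios at zero the sketch reports about 15% CO2, matching the breath example. For Q = 1 it reports equal CO and CO2 readings of roughly 8.8% with these coefficients, close to the 8.6% quoted in the text; either way the reading reflects the ratio, not the 50% absolute concentrations.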
Special software traps should be employed to deal with two cars traveling very close together. In this case, the before car buffer from in front of the first car is used as a potential source of clean air reference for the exhaust of the second. The video picture of the first car is replaced by that of the second before any data are overwritten. High pickup trucks thus often get two pictures, only the last of which has emissions data. Other software traps reject data when the slope errors are too large, and when there is no sign of any significant exhaust plume (such as behind 18-wheel trailers whose tractors have elevated exhausts).

For the interested reader, Appendix A contains a brief description of some trouble-shooting procedures that can be performed quickly in the field as a first step to verify whether an RSD unit is operating properly or is in need of service.

4.4. Operational Difficulties

4.4.1 Signal/Noise Considerations

Remote emissions measurements would all be very straightforward if one were able to measure directly behind the tailpipe of each passing car. Absorptions would be large, and the system signal/noise (S/N) would not be limiting. In fact, vehicle tailpipes are not in standardized configurations, vehicle engine sizes are not uniform, and there is very rapid turbulent dilution of the exhaust behind vehicles moving faster than about 5 mph. Thus, one is forced to make engineering tradeoffs between the desire to measure all vehicles and the necessity of having an adequate S/N so as not to report incorrect exhaust emissions values. The detection of gas absorption is based upon the reduction of signal on one detector versus the reference detector. The average car measured at an uphill freeway ramp in Denver shows an exhaust plume already diluted by a factor of about 10. This situation gives rise to an easily measurable 14% reduction in the CO2 voltage.
Because the average CO content is about 1/20th of the CO2 and the HC 1/10th of the CO, the average total changes in CO and HC voltages are only 3 and 1 part in 1000, respectively. The NO channel shows a response similar to that of HC. Thus, the instrument builder's challenge is to build a system in which part-per-thousand changes in IR and UV intensity are accurately measured in all weather conditions beside a normal road at a measurement frequency of 100 Hz. At other locations, the plume dilution factor is 100, and a decision must be made whether the individual instrument's S/N is adequate for readings to be reported or whether the data should be reported as invalid. This bleak outlook is somewhat mitigated by the fact that the source need only maintain a stable intensity for about two seconds for a complete measurement series, and by the fact that the data reduction process intrinsically "averages" all the 1/2 sec of data to only three ratios. Newer technologies having improved S/N ratios may be available and may be used over greater distances.

4.4.2. Weather

Measuring light intensities over a 10 m path to better than a few parts per thousand can be inhibited by bad weather. Ambient temperature and humidity variations are not a problem, but snowflakes and heavy rain add too much noise to all data channels. Wet or very dusty roadways cause a plume of spray or dust behind vehicles moving faster than about 10 mph. These plumes also add noise to the system and generally increase the data rejection rate to an unacceptable level. At the most productive sites, the remote sensor can gather data on 10,000 vehicles in a working day; thus, it often generates data faster than the operator can handle. In such cases, it may be beneficial to take the day off to analyze data when the weather conditions are not appropriate. Gross polluting vehicles are thought to be the same vehicles on dry days as on wet days.

4.4.3. Interference

The HC wavelength suffers from some interference from gas-phase water, and certainly from particulate-phase water (so-called "steam" plumes from colder vehicles operating at low ambient temperatures). When steam plumes are so thick that one cannot see through them (Fairbanks, AK, at forty below zero), the system no longer operates, since all wavelengths are absorbed or scattered too much for useful data to be acquired.

4.4.4. Optical Alignment

If the instrument is not perfectly optically aligned, the voltages are likely to be very sensitive to equipment vibration. Since moving vehicles both shake the roadway and generate wind pulses, rigid instrument mounting is as important as perfect internal and external optical alignment. Software is written so that these noise sources generate "invalid" flags. Proper alignment at a well-characterized RSD site can yield 95% valid RSD readings on passing vehicles using UV/IR detector technology.

The system is designed to operate on a single-lane road. Freeway ramps, turn lanes, and the inevitable road closures for sewer, gas, water, telephone, and road maintenance are often good candidates for RSD emission measurement sites. Multiple-lane operation has been reported but is not recommended.

4.4.5 Emissions Variability

Emissions of motor vehicles are not constant from second to second or from day to day. Broken vehicles in particular often seem to have a large random component to their emissions, irrespective of what test is used to make the measurement (13). Some vehicle emission variability has known causes, such as the initial operation of cold vehicles before the engine control system stabilizes and the catalyst begins operation, or acceleration of the vehicle at full throttle. Both situations give rise to large CO and HC emissions from even well-maintained vehicles, but can be minimized through careful site selection.

4.5 Instruments

4.5.1. Calibration Checks

Two separate calibration procedures should be performed on every remote sensing unit. The first is conducted in a laboratory and should be performed by the equipment manufacturer. It may consist of exposure in the laboratory, at a path length of about 22 ft, to known absolute concentrations of NO, CO, CO2, and propane in an 8 cm IR flow cell with CaF2 or other IR-transmitting windows. The calibration curves are used to establish the fundamental sensitivity of each detector/filter combination to the gas of interest. The results of this calibration should be provided to the state or contracting party upon request.

The second calibration should be performed every hour (14) during operation until the stability of the individual system is quantified and characterized using statistical process control methods. Once control charts have been established, the calibration frequency may be reduced appropriately. Several puffs of gas designed to simulate all measured components of the exhaust are released into the optical beam path from a cylinder containing certified amounts of NO, CO, CO2, and propane. The ratio readings from the instrument are compared to those certified by the cylinder manufacturer. In this way the system never actually measures exhaust emissions; it basically compares the pollutant ratios in a known standard gas cylinder with those measured in the vehicle exhaust.

The gases used for the second calibration shall be certified to +/- 2% of a known NIST standard and be in the following ranges:

CO          1-9%
HC as C3    300-4100 ppm
NO          1500-3600 ppm
CO2         5-14%
(with the balance oxygen-free nitrogen)

Additionally, some quick checks are provided in Appendix A that may be useful in trouble-shooting equipment in the field.

4.5.2. Other Instrument Parameters

At a minimum, the following parameters shall also be recorded in a station log for each RSD site in all RSD program evaluation studies. The log may be kept electronically or in hardcopy format.
i) A description of the RSD equipment, including light source, make/model of instrument, and detector type.
ii) The name of the operator and the van. If more than one operator or van is used, key and record which operator and/or van was used for each measurement.
iii) Complete description of the calibration procedure.
iv) Audit check results.
v) Calibration check results.
vi) Any equipment changes.
vii) Verification of speed and acceleration measurement devices.

4.6. Site Description

A site description shall be generated for each RSD data collection site and shall include the following information.

i) Road map with features affecting traffic flow.
ii) Note any change in the position of the light source, detector, etc. from previous RSD studies.
iii) Note any change in traffic patterns from previous RSD studies.
iv) Note the altitude of the site and the road grade. Include a field in the database showing the road grade in percent for all measurements.
v) Digital picture of the site, including all cones, etc., that would influence motorist driving patterns.
vi) Global Positioning System (GPS) coordinates based on the NAD83 reference standard.

4.7. Measurements

4.7.1. Data Collection

The following measurements shall be recorded at each site where RSD program evaluation data are collected.

i) %CO2, %CO, %NO, %HC, maximum CO2, all error terms, restarts, and negative emission numbers. Include a field showing whether HC is reported as propane or hexane.
ii) Speed and acceleration. Vehicle Specific Power (VSP) shall be calculated as described below. Valid VSP values shall be between 0 and 20 kW/ton.

\[ \mathrm{VSP}_{\mathrm{kW/t}} = 4.39\,\sin(\mathrm{slope})\,v + 0.22\,v\,a + 0.0954\,v + 0.0000272\,v^{3} \]

where "a" is vehicle acceleration in mph/s, "v" is vehicle speed in mph, and slope is the road grade in degrees.*

iii) Location of speed measurement relative to emission measurement. It is recommended that vehicle speed be measured 5-10 m prior to the emissions measurements.
iv) Time and date of measurement.
v) License plate. Record all plates, including in-state, out-of-state (OS), dealer (D), paper plate (PP), obscured plate (OP), and no-plate-visible (NPV).
vi) Hourly temperature, barometric pressure, and relative humidity.
vii) Describe how plume strength is determined and flagged, as well as the criteria for rejecting measurement attempts.
viii) Site reference label.
ix) RSD unit number or unique identifier.

4.7.2. Multiple Measurements

Multiple measurements made on the same vehicle shall be treated in one of the following ways; however, the program evaluation report shall clearly state which method has been chosen and the rationale behind this choice. A multiple measurement is not restricted by the timeframe over which it is collected; the measurements may be hours, days, weeks or months apart. Option (iv) below is recommended, although there may be circumstances in which another option is more appropriate.

i) Multiple measurements are treated as independent readings.
ii) Multiple measurements are averaged and treated as a single reading.
iii) Multiple measurements are discarded and only the first reading is used.
iv) The maximum, minimum and average values are reported to provide as comprehensive a snapshot of a vehicle's emission profile as possible.

4.7.3. Operators

Care must be taken to ensure operators are properly trained in the routine operation of the equipment and fully understand and implement the required QA/QC procedures. Furthermore, it is imperative that daily vehicle quotas do not compromise the operators' judgments or actions with regard to QA/QC and the data collection process.

* The VSP equation in Section 4.7.1 should be considered generic in that it may be applied to all types of vehicles. More accurate equations dependent on MY and/or vehicle type may be developed in the future.

4.8. Database Format

The RSD data collected shall be made available in an ASCII text file that may be easily ported into a standard, commercially available database software package such as Access, Oracle, SAS, etc. If special procedures are required to port the data into such a software package, the software code or procedures shall be provided upon request.

4.9. Department of Motor Vehicle Data

Department of Motor Vehicle (DMV) data shall be reported as follows.

i) Date the DMV data were received from the DMV.
ii) Information indicating how current the most recent DMV data in the file are.
iii) VIN, Model Year, Make, Model, Fuel Type, Vehicle Type, Zip Code.
iv) I/M test date.
v) I/M test results in g/mi, ppm or percent.

4.10. Note Any Changes that Could Affect the Analysis

Any changes to the I/M program that would impact the analysis shall be recorded and reported in the program evaluation report. Such changes may include, but are not limited to, changes in the operational details of the I/M program itself, or the use of a seasonal fuel program to reduce mobile source emissions.

5. Design Parameters and Quality Assurance/Quality Control Protocols

5.1. Overview

This section outlines a number of critical issues that must be addressed to perform a program evaluation using RSD technology. These issues include data collection design parameters, equipment specifications, calibration procedures, quality control, and several known sources of bias in vehicle emissions measurements that can affect any evaluation of an I/M program. Some of these are unique to remote sensing data, while others apply to evaluations based on in-program data as well.
The issues or types of bias that must be considered in a remote sensing program evaluation have been broadly grouped into the following categories and are discussed under the appropriate headings below: vehicle population, vehicle load, vehicle identification, instrument calibration, measurement method, socioeconomics, seasonal effects, program avoidance, regional differences, program details and emissions distributions.

The importance of five issues (vehicle load, program avoidance, vehicle identification, program details and emissions distributions) is roughly similar for each of the three evaluation methods. Because the Reference Method relies on measurement in two different geographic regions, it is the most sensitive to all of the remaining types of bias. The likelihood of bias can be minimized if multiple reference sites are chosen and the sites are well characterized, with common load characteristics. Because the Comprehensive Method requires large numbers of measurements, the use of multiple vans and sites can increase bias due to instrument calibration, socioeconomics, and seasonal effects. By collecting data at a single site over a short time period, the Step Method eliminates the potential for socioeconomic and seasonal bias between the two measured subfleets; however, the estimate of program effectiveness may be biased if the site chosen or the time of testing does not capture the distribution of driver socioeconomics or environmental variables representative of the I/M area. This potential source of bias can be tested by comparing the measured fleet numbers by model year to other data*, bearing in mind that on-road measurements are expected to capture newer, higher annual mileage vehicles more often than older, lower annual mileage vehicles.

5.2. Vehicle Population

Goal: Account for differences in vehicle fleet distributions in the program evaluation analysis.
Perhaps the most common source of bias when comparing emissions of two fleets of vehicles is the difference in vehicle distribution between the two fleets. Older, higher mileage vehicles tend to have higher emissions than newer, lower mileage vehicles. Light duty trucks were built to less strict emissions standards than passenger cars and are observed to have higher in-use emissions. In addition, there is a wide range in average emissions by vehicle model, even for vehicles of the same age (15).

Differences in vehicle fleets can be determined by comparing the vehicle distributions of the two fleets by type and model year. (Note: The Step, Comprehensive and Reference Methods all compare fleet averages; however, the composition of these sub-fleets is different for each method.) Average emissions by type and model year should be calculated for each fleet and compared to determine any emissions differences between the two fleets. The average emissions for each fleet should then be weighted by a single distribution of vehicles by type and model year (preferably that of the I/M program area) to determine the overall fleet emissions and the percent difference between the two fleets.

Table 5.1 displays example CO emissions by model year from samples of vehicles measured in a reference area and an I/M area. The composite fleet averages of 0.86% CO for the reference area and 0.58% CO for the I/M area suggest the I/M area vehicles are 32% cleaner. This is not a fair comparison, however, because it is evident from the fleet fraction percentages (Columns D and G) that the I/M area sample contains a greater proportion of newer vehicles. To overcome this, the I/M area model year CO contributions are re-weighted according to the reference area fleet fraction percentages. This is shown in Column H. The adjusted composite emissions level for the I/M area is now 0.76% CO, resulting in an apparent 12% (1-0.76/0.86) benefit from the program.
It should be noted that this 12% apparent benefit should be converted to a mass basis to be more meaningful and to allow more direct comparisons with other I/M program evaluation results as well as results from other air pollution control programs.

* These data may be obtained from Department of Motor Vehicle (DMV) records or modeling defaults, although empirical DMV data would be preferred.

Of course, this raises the question of the extent to which the greater proportion of newer vehicles in the I/M fleet is the result of the I/M program. Addressing this question is difficult. No current analyses of in-program or out-of-program data provide information in this regard. At this time, further studies are needed to address this issue.

Because the Step Change and Comprehensive Methods compare fleets of vehicles from the same I/M area, there is likely to be little difference between the two fleets with respect to fleet distribution. However, when using the Reference Method, vehicle populations can be significantly different between different geographical areas, as can fuel composition, environmental factors, and motorist socioeconomic status (discussed below).
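The re-weighting described for Table 5.1 can be sketched in a few lines. The three-group fleets below are hypothetical numbers chosen only to keep the example short; the function name `weighted_average` is likewise an illustration. The last line applies the same benefit arithmetic to the composite values quoted in the text (0.76 and 0.86).

```python
def weighted_average(avg_by_group, fraction_by_group):
    """Composite emissions from per-group averages and fleet fractions."""
    total = sum(fraction_by_group.values())  # normalize in case fractions do not sum to 1
    return sum(avg_by_group[g] * fraction_by_group[g] for g in fraction_by_group) / total

# Hypothetical fleets: average %CO and fleet fraction by model-year group
ref_avg_co = {"old": 2.0, "mid": 1.0, "new": 0.2}
ref_fraction = {"old": 0.2, "mid": 0.4, "new": 0.4}
im_avg_co = {"old": 1.8, "mid": 0.9, "new": 0.2}
im_fraction = {"old": 0.1, "mid": 0.3, "new": 0.6}  # I/M sample skews newer

ref_composite = weighted_average(ref_avg_co, ref_fraction)
im_raw = weighted_average(im_avg_co, im_fraction)        # each fleet on its own mix
im_adjusted = weighted_average(im_avg_co, ref_fraction)  # I/M averages on the reference mix

raw_benefit = 1 - im_raw / ref_composite
adjusted_benefit = 1 - im_adjusted / ref_composite

# Same arithmetic applied to the Table 5.1 composites quoted in the text
table_benefit = 1 - 0.76 / 0.86
```

In this toy case the raw comparison overstates the benefit (about 35%) relative to the comparison on a common fleet distribution (about 9%), which is the same effect the text describes for the 32% versus 12% figures.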
Table 5.1: Average RSD Readings by Model Year

              Reference Area                 I/M Area
A             B        C        D            E        F        G           H
Model Year    Avg CO   Count    Fleet %      Avg CO   Count    Fleet %     E x D
Pre-60        3.45     70       0.04%        1.60     16       0.03%       0.00
Y60-65        4.12     390      0.20%        3.61     39       0.08%       0.01
Y66-70        3.50     1333     0.69%        3.24     137      0.28%       0.02
Y71-75        2.74     2661     1.37%        2.50     310      0.64%       0.03
Y76-80        2.42     10259    5.27%        2.19     1173     2.44%       0.12
Y81           2.24     2818     1.45%        1.64     373      0.78%       0.02
Y82           1.94     3430     1.76%        1.34     470      0.98%       0.02
Y83           1.71     5440     2.80%        1.36     707      1.47%       0.04
Y84           1.64     8424     4.33%        1.23     1203     2.50%       0.05
Y85           1.39     10322    5.31%        1.18     1654     3.44%       0.06
Y86           0.99     12067    6.20%        0.83     2172     4.51%       0.05
Y87           0.83     12532    6.44%        0.77     2497     5.19%       0.05
Y88           0.72     14410    7.41%        0.70     2853     5.93%       0.05
Y89           0.68     14803    7.61%        0.61     3059     6.36%       0.05
Y90           0.56     14479    7.44%        0.53     3366     7.00%       0.04
Y91           0.50     14666    7.54%        0.50     3717     7.72%       0.04
Y92           0.43     12977    6.67%        0.42     3645     7.57%       0.03
Y93           0.37     14617    7.51%        0.36     4350     9.04%       0.03
Y94           0.28     13222    6.80%        0.30     4507     9.37%       0.02
Y95           0.23     15055    7.74%        0.26     5435     11.29%      0.02
Y96           0.17     9668     4.97%        0.20     4320     8.98%       0.01
Y97           0.11     876      0.45%        0.21     2116     4.40%       0.00
Avg/Tot       0.86     194519   100.00%      0.58     48119    100.00%     0.76

5.3. Vehicle Loads

Goal: Ensure that RSD measurements are made under known vehicle operating conditions.

Another important source of potential bias is the load under which the vehicle is operating when the emissions measurement is made. Emissions per gallon are much less dependent on speed and load than emissions per mile; nevertheless, load is an important variable. Researchers use vehicle specific power (VSP; equation given earlier), which is a function of vehicle speed, acceleration, drag coefficient, tire rolling resistance, and roadway grade, to characterize the load the vehicle is operating under at the time the measurement is made (16,17).

On-road remote sensing units measure tailpipe exhaust plumes for a fraction of a second as vehicles pass by the unit.
HC, CO and NOx pollutant emissions are estimated by comparing the ratio of their concentrations to the concentration of CO2 seen in the vehicle exhaust plume. Although the remote sensing unit does not measure the volume of exhaust gases produced, a number of vehicle load conditions can elevate the emission levels observed by remote sensing:

i) When a motorist lifts his or her foot off the gas pedal, the volume of air and fuel flowing through the vehicle engine and exhaust system is suddenly reduced. Under these circumstances, the ratio of HC and CO to the now reduced level of CO2 often increases. Although the volume and mass of emissions are substantially reduced when a driver lifts off the gas, the ratios of the concentrations of HC and CO to CO2 seen by the remote sensing unit are actually higher, and a higher emissions value is recorded. This effect is greatest for HC.

ii) When a motorist presses sharply on the accelerator, the vehicle may go into what is termed an "off-cycle" condition. The current generation of vehicles has been certified using the Federal Test Procedure; however, this test does not cover the full power range of the vehicle. Consequently, vehicles were designed to minimize emissions only over the power range tested in the certification cycle. At higher powers, so-called "off-cycle" or power enrichment emissions often increase dramatically although the vehicle is functioning as designed. Under these circumstances, a vehicle can have high emissions when measured by remote sensing but may still meet the I/M inspection requirements. This effect is greatest for CO and NOx.

For these reasons, multiple remote sensing measurements for the same vehicle can vary considerably if the site is such that the operating mode of the vehicle at the time of the measurement is not consistent.
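Where a vehicle has been measured more than once, option (iv) of Section 4.7.2 calls for reporting the maximum, minimum and average values. One possible way to produce that summary is sketched below; the record layout (plate, %CO) and the function name `summarize_by_plate` are hypothetical.

```python
from collections import defaultdict

def summarize_by_plate(readings):
    """Group (plate, value) readings and report n, min, max and mean per vehicle."""
    grouped = defaultdict(list)
    for plate, value in readings:
        grouped[plate].append(value)
    return {
        plate: {
            "n": len(values),
            "min": min(values),
            "max": max(values),
            "mean": sum(values) / len(values),
        }
        for plate, values in grouped.items()
    }

# Hypothetical %CO readings; plate ABC123 was measured three times
readings = [("ABC123", 0.10), ("ABC123", 0.30), ("XYZ789", 2.50), ("ABC123", 0.20)]
summary = summarize_by_plate(readings)
```

Keeping the count alongside the three statistics makes it possible to see which vehicles have only a single reading, for which the minimum, maximum and mean coincide.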
As stated earlier (Section 4.7.2), it is recommended that, in the case of multiple measurements, either all data be retained or the maximum, minimum and average values be reported, to provide as comprehensive a snapshot of a vehicle's emission profile as possible. For broken vehicles, the variability and the likelihood of high readings are extreme. For low-emitting, new or well-maintained vehicles, variability caused by driving mode changes under normal operating circumstances is very small.

The load under which each individual vehicle is driving, or VSP, should be calculated based on vehicle speed, acceleration, and roadway grade, as described earlier. The distribution of VSP should then be compared between different remote sensing sites to determine whether vehicles are being driven differently at different sites. If there are enough remote sensing measurements, average emissions by vehicle type and model year can be weighted by a common VSP distribution to remove any bias introduced by different vehicle loads at different remote sensing sites.

With regard to repair effectiveness, it is important to recognize that not only are absolute emission levels sensitive to vehicle load; the percent change in emissions from vehicle repair is as well. An analysis of repair effectiveness on a sample of vehicles given a full IM240 test before and after repair indicates that the percent reduction in emissions over the moderately loaded portion of the IM240 was only half that of the reduction over the entire IM240 (18).

Therefore, it is critical that any analysis of remote sensing data used to characterize fleet emissions in general, or to estimate repair effectiveness, include the calculation of vehicle load. To minimize the possibility of a driver making sudden throttle changes, it is recommended that remote sensing units be sited in locations such as highway on or off ramps.
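The VSP calculation referred to above uses the equation given in Section 4.7.1. A minimal sketch follows, assuming speed in mph, acceleration in mph/s and slope in degrees as that section specifies; the function names and the helper that converts a percent road grade to degrees are illustrative additions.

```python
import math

def grade_percent_to_degrees(grade_percent):
    """Convert a road grade in percent (rise over run) to an angle in degrees."""
    return math.degrees(math.atan(grade_percent / 100.0))

def vsp_kw_per_ton(speed_mph, accel_mph_per_s, slope_degrees):
    """Vehicle specific power (kW/ton) from the generic equation in Section 4.7.1."""
    v = speed_mph
    return (4.39 * math.sin(math.radians(slope_degrees)) * v
            + 0.22 * v * accel_mph_per_s
            + 0.0954 * v
            + 0.0000272 * v ** 3)

def vsp_is_valid(vsp, low=0.0, high=20.0):
    """Apply the 0-20 kW/ton validity window stated in Section 4.7.1."""
    return low <= vsp <= high

# Example: 30 mph, accelerating at 1 mph/s on a level road
cruise = vsp_kw_per_ton(30.0, 1.0, 0.0)
# Example: same speed, decelerating at 3 mph/s (driver off the throttle)
decel = vsp_kw_per_ton(30.0, -3.0, 0.0)
```

In this sketch the accelerating case falls inside the 0-20 kW/ton window, while the decelerating case yields a negative VSP and would be excluded. Since the site database stores grade in percent, `grade_percent_to_degrees` shows one way to obtain the angle the equation expects.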
In addition, analyses that rely on data from more than one remote sensing site should re-weight average emissions at different sites by a similar distribution of vehicle loads, to allow proper comparison of emissions data collected at each site. There is some evidence that older vehicles behave differently than newer vehicles with respect to VSP. In the future, vehicles designed to meet supplemental FTP certification requirements can be expected to behave differently than today's vehicles. Consequently, any adjustment calculations, if required, should probably divide the populations into several ranges of model years. Table 5.2 illustrates the various loads vehicles are subject to during emission tests or accelerations.

Table 5.2: Examples of VSP Values

Activity               VSP (kW/metric ton)
Maximum Rated Power    44-120
0-60 in 15 seconds     33
60 mph up 4% grade     23
FTP or IM240 max       23
Typical RSD site       10-15
Average IM240          8
ASM5015                6
ASM2525                5

Figures 5.1, 5.2 and 5.3 illustrate the relationships between emissions and VSP for various vehicle MY groupings. Maintaining as narrow a VSP window as possible will help minimize variability between site measurements, although there may be practical limitations on how tightly the VSP operating window can be held. The data presented in the following three figures indicate relatively constant CO and HC emissions for VSP values between 5 and 20 kW/metric ton, while NO emissions are more variable even if the VSP window is reduced to 10 to 20 kW/metric ton. Therefore, for this data set it would appear that a VSP range of 15 +/- 5 kW/metric ton would be the recommended target to minimize site-to-site load variability.

Figure 5.1: RSD %CO vs VSP (Denver Remote Sensing Clean Screen Pilot 12/99). Panel title: RSD CO vs. Specific Power; x-axis: Specific Power (kW/t).

Figure 5.2: RSD %HC (C6) vs VSP (Denver Remote Sensing Clean Screen Pilot 12/99). Panel title: RSD HC vs. Specific Power; x-axis: Specific Power (kW/t).

Figure 5.3: RSD %NO vs VSP (Denver Remote Sensing Clean Screen Pilot 12/99). Panel title: RSD NO vs. Specific Power; x-axis: Specific Power (kW/t).

5.4. Vehicle Identification

Goal: Identify the vehicle license plate so that RSD emissions may be linked to the specific vehicle and its I/M test result, if available.

Optical character recognition is commonly used to read license plates in RSD studies; however, care must be taken to ensure these data are accurate. The license plate's design or color scheme may adversely affect the accuracy of the data, which would result in errors in linking the RSD reading with the correct I/M test result. If manual entry is used to enter license plate data into a database, procedures should be developed to identify and correct transcription errors. It must also be understood that, depending on a state's infrastructure for vehicle registration tracking and the ease of access to the I/M test database, matching the RSD data with the appropriate I/M test result can be more difficult than anticipated.

5.5. Instrument Calibration

Goal: Ensure RSD units are calibrated using standardized procedures.

More detailed calibration specifications are provided in Section 4.5; however, it should be noted that the accuracy specifications on instruments may span a greater range than the differences between fleets, so the instruments may meet specifications but still give significantly different results. For example, if the CO specification is +/- 0.25%, at a typical fleet average of 1% CO, one system could be centered at 1.05% and another at 0.95%. Both are well within specification but would report a 10% difference between two identical fleets. Several approaches are possible for identifying and correcting this problem. Not all may be feasible:

i) Examine unit certification and audit data to determine offsets.
ii) Run the units side by side to obtain comparative results.
iii) Compare emission distributions for new model years of vehicles whose emissions profiles are expected to be the same in both fleets.

5.6. Measurement Methods

Goal: Convert concentration-based RSD measurements on individual vehicles to mass-based fleet estimates.

Remote sensing measures emissions in terms of concentration ratios in the total exhaust, while I/M programs that use idle or ASM testing measure emissions concentrations. However, programs that use IM240 or IM240 derivatives use concentration readings, air flow and miles driven on a dynamometer to calculate mass emissions. Therefore, fuel consumption data for an area may be used with fleet average RSD or ASM measurements taken in units of g/kg fuel to determine the fleet average emissions, or the fleet average emissions could be converted to g/mi values by using instantaneous vehicle fuel economy estimates. Also, as mentioned earlier (Section 3.3), areas conducting IM240 or ASM testing should plot mean RSD emissions against mean initial IM240/ASM emissions by vehicle type and model year. These plots typically show a linear relationship with high correlation coefficients and can be used to establish a direct relationship between the RSD measurements and the I/M test results.

5.7. Socioeconomics

Goal: Minimize the socioeconomic influence on data collection so that the I/M program benefits are quantified, and not the socioeconomic differences that exist between fleets due to income.

It is believed that the vehicles owned by relatively low-income drivers tend to have higher emissions, owing to a combination of vehicle age and mileage, model, and historical maintenance practices. Researchers have found that vehicle owner socioeconomics can affect vehicle emissions independent even of vehicle type, age, and model (19). Specifically, in one study CO and HC emissions were found to be roughly 25% higher in Lynwood, CA than in El Monte, CA (20).
The socioeconomic background of the drivers of vehicles measured by remote sensing can be quite different depending on where the instrument is located. The effect of driver socioeconomics on remote sensing emissions can be identified by graphing average emissions by vehicle type and age for each measurement site, after correcting for different load conditions at each site. Driver socioeconomics must be considered when selecting sites for remote sensing measurement. If measurements from different sites are to be compared, such as under the Reference Method, sites with similar driver socioeconomics should be used. One method to determine whether a true cross section of vehicles is being sampled is to plot the percentage of RSD measurements vs. ZIP code.* If it is discovered that the differences in fleet emissions between two sites are due primarily to socioeconomic factors, there is no easy way to deconvolute the existing data. Therefore, this issue should be addressed in the planning phase, before any data are collected.

* Other parameters may be used to segregate the data, such as I/M area, previous I/M test result, or MY. A specific example in which I/M area was used may be found in the June 19, 2000 Inspection & Maintenance Review Committee Report, "Evaluation of the Enhanced Smog Check Program", Appendix F.

5.8. Seasonal Effects

Goal: Minimize the influence of seasonal variables on data collection.

Since no existing I/M programs vary their cutpoints by season, seasonal effects may influence a vehicle's measured emissions and therefore whether it passes its I/M test. However, seasonal effects impact vehicle operation independently of whether emissions are measured by in-program analyzers or RSD. Therefore, a seasonal effect may introduce a bias when comparing, for instance, remote sensing measurements taken during two distinct time periods.

Vehicle emissions as measured by the Arizona program vary by season, as depicted in Figures 5.4 through 5.6. Figure 5.4 shows the daily average CO of initial IM240 tests of Arizona passenger cars over a three-year period (filled circles, left scale). Emissions of cars that are fast-passed or fast-failed are extrapolated to their full IM240 equivalents. The trend in the maximum daily temperature is also shown (gray lines, right scale). The solid vertical lines denote the calendar years, whereas the dashed vertical lines denote the changes in fuel. CO and HC are higher in the warmer summer months, while NOx shows the opposite seasonal trend and is higher in winter months. Colorado IM240 data show similar seasonal patterns. It is unclear whether the seasonal variation is due to a combination of seasonal temperatures and changes in fuel composition, or to inadequate conditioning of vehicles prior to testing. The seasonal variation in Arizona remote sensing (Figure 5.5) and loaded idle (Figure 5.6) data appears to mirror that of the Arizona IM240 emissions, suggesting that vehicle conditioning is not the cause of the variation. However, the seasonal variation in CO and HC in the Wisconsin IM240 program (Figure 5.7) and the Minnesota idle program (Figure 5.8) is in the opposite direction; that is, CO and HC are higher in winter months. (The trend in Wisconsin NOx follows that of Arizona and Colorado.) More analysis is needed to better understand these seasonal trends and why they differ by area; however, these trends can be identified using RSD and should be discussed as a component of an I/M program evaluation.

Average emissions can be plotted by time period (preferably weeks or days) and compared with average temperatures and fuel seasons to determine whether there is a seasonal variation in remote sensing and/or I/M emissions. To reduce any seasonal effect on emissions, remote sensing measurements for the Reference Method should be made during roughly the same time period.

Figure 5.4. Daily Average CO, Arizona IM240. [Plot: daily average CO (adjusted), initial tests of passenger cars, 1995-97, Arizona IM240; x-axis: Day.]

Figure 5.5. Daily Average CO, Arizona Remote Sensing. [Plot: average remote sensing CO by day, 1996-1997, Arizona; x-axis: Day.]

Figure 5.6. Daily Average CO, Arizona Loaded Idle (Pima County). [Plot: average loaded idle CO by day, 1995-97, Arizona; x-axis: Day.]

Figure 5.7. Daily Average CO, Wisconsin IM240. [Plot: daily average CO, initial tests of passenger cars, 1996-97, Wisconsin IM240; x-axis: Day.]

Figure 5.8. Daily Average CO, Minnesota Idle. [Plot: daily average CO, initial tests of passenger cars, 1991-95, Minnesota idle; x-axis: Day.]

5.9. Program Avoidance

Goal: Account for emissions from motorists who are avoiding the I/M program.

There is evidence that I/M programs are inducing owners to re-register their vehicles outside of I/M areas (10, 21). If these re-registrations are legitimate, i.e., drivers relocating their residences or selling their vehicles to new owners outside of the I/M area, then the program has helped to reduce emissions in the I/M area. However, there is evidence that a portion of these re-registrations are attempts to avoid I/M testing and that many of these vehicles continue to be driven in the I/M area (12, 22). Studies have estimated that program avoidance can lower the apparent CO reductions by on the order of 2% (10, 12). Program avoidance complicates any evaluation of an I/M program, in that analysis of I/M data would indicate emissions reductions (vehicles leaving the area) that are not occurring on the road. As discussed above, remote sensing data can include such vehicles in the estimate of fleet emissions. In addition, remote sensing data can be used to identify the subset of vehicles that are no longer registered in the I/M area but continue to be driven in the area.
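As a minimal sketch of the last point, assuming plate-matched RSD records and a registration lookup are available, the snippet below flags vehicles registered outside the I/M area that are nonetheless seen repeatedly by remote sensors inside it. The record layout, the function name, the placeholder county names, and the `min_sightings` threshold are illustrative assumptions, not part of this guidance.

```python
from collections import Counter


def flag_possible_avoiders(rsd_plates, registration_county, im_counties, min_sightings=3):
    """Return plates registered outside the I/M area but seen repeatedly inside it.

    rsd_plates: iterable of plate strings, one per valid RSD reading taken in the I/M area.
    registration_county: dict mapping plate -> county of registration (hypothetical layout).
    im_counties: set of counties subject to I/M testing.
    min_sightings: assumed cutoff for "continues to be driven in the area".
    """
    sightings = Counter(rsd_plates)
    flagged = {}
    for plate, count in sightings.items():
        county = registration_county.get(plate)
        if county is None:
            continue  # unmatched plate: cannot be classified
        if county not in im_counties and count >= min_sightings:
            flagged[plate] = count
    return flagged


# Made-up plates and placeholder county names, for illustration only.
example_readings = ["AAA111", "BBB222", "BBB222", "BBB222",
                    "CCC333", "CCC333", "CCC333", "DDD444"]
example_counties = {"AAA111": "County A", "BBB222": "County B",
                    "CCC333": "County A", "DDD444": "County B"}
print(flag_possible_avoiders(example_readings, example_counties, im_counties={"County A"}))
```

A flag of this kind is only a screening step. As noted above, some re-registrations are legitimate, so flagged vehicles would still need to be checked against registration and I/M histories.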
The design of a remote sensing program itself can influence which vehicles are measured under the program. A program that provides a negative incentive for driving past a remote sensor, such as additional I/M testing, may encourage drivers to avoid having their vehicles measured. On the other hand, a program that is intended for research purposes only, or that provides only a positive incentive (the possibility of being exempted from the next I/M test), will result in a more representative sample of vehicles measured.

The distribution of vehicles (by type and model year) measured by remote sensing should be compared with the distribution of vehicles registered in the area, or reporting for I/M testing. Any differences between the two distributions may indicate a bias in one of the samples and suggest a possible program avoidance issue that needs to be addressed. However, care must be taken when performing this comparison, because RSD measurements reflect on-road driving distributions while traditional I/M testing is registration based. Therefore, it is possible that RSD will over-sample newer vehicles relative to a registration-based I/M program.

5.10. Regional Differences (policies, environment, fuel composition, etc.)

Goal: Account for differences in fleet emissions not attributable to I/M across geographic regions.

A number of additional variables, such as environmental conditions, fuel composition, vehicle registration, safety inspection, public attitudes, and tax policies, can result in biases in emissions measurements made in different regions. These biases would have the biggest impact on an evaluation using the Reference Method. Some of these potential biases, and methods for minimizing their impact, are discussed in more detail in the Reference Method section below.

5.11. Program Details

Goal: Identify and account for I/M program operation details in the program evaluation analysis.
Biennial I/M programs use a simple technique to determine which vehicles are to be tested in which year. For example, Colorado requires vehicles of even model years to be tested in even calendar years, and vehicles of odd model years to be tested in odd calendar years. Arizona bases a vehicle's test year on whether the last digit in the vehicle identification number is odd or even. Different states have different policies regarding whether I/M tests are required when a vehicle changes ownership, and the circumstances under which a vehicle's registration date and year change. Additionally, vehicles that are newly registered in AZ or CO must be tested when they are first registered, regardless of their model year or the last digit in their VIN. These factors complicate the determination of whether a particular vehicle has been tested under the current I/M program. Therefore, it is essential that the date of each vehicle's last I/M test be used to determine whether the vehicle has been tested under the current I/M program.

States may also have different policies regarding vehicle license plates. The license plate of a car sold in Colorado stays with the original owner, whereas the license plate is transferred to the new owner in Arizona. These policies may complicate the matching of license plates observed by remote sensing units with the correct vehicle and I/M test result information. These and similar details of registration and I/M programs should be understood in order to minimize the misidentification of the tested and untested vehicle fleets.

5.12. Emissions Distributions

Goal: Identify possible sources of bias in the measured emissions.

One way of determining whether emissions measurements are biased is to compare average emissions by vehicle type and model year, as described in Section 5.2. If average emissions by model year are consistently higher for one group of vehicles than another, then the emissions of that group of vehicles may be biased by some of the factors discussed above.

Another approach is to compare the distribution of emissions of a subset of similar vehicles in the I/M-tested and untested fleets. Because vehicle emissions are highly skewed, with a majority of vehicles having low emissions and a small number of vehicles having very high emissions, differences between groups of vehicles will be more readily apparent if the distribution is plotted on a log-normal scale. Three ways to compare emissions distributions are outlined below.

i) One way of looking for changes in the shape of emissions distributions is to look at the contribution of the dirtiest 10% of vehicles, which contribute a large percentage of the total emissions. Table 5.3 illustrates the contribution of the dirtiest 10% of vehicles in each model year. The vehicles' CO emissions were measured at multiple RSD sites that have been divided into three groups based on the mean vehicle specific power of vehicles measured at the site. Table 5.3 shows that the share of emissions concentrated in the dirtiest 10% of vehicles is greatest among the newest model years, which have fewer high emitters.

ii) Another method is to divide vehicles into equal groups (quintiles or deciles) and plot the average emissions of each group. Decile plots focus attention on the majority of vehicles that have relatively low emissions. Figure 5.9 is a decile plot using the same data as Table 5.3 for 2000 model vehicles; in this form it becomes easier to distinguish between the low and high emitting vehicles.

iii) The third method is to plot the full distribution of vehicles, rather than quintiles or deciles; the full distribution allows closer examination of the differences in the small number of high emitters in two samples of vehicles. Figure 5.10 is a full distribution plot of the data shown in Table 5.3 and Figure 5.9 for 2000 model vehicles. The use of a logarithmic scale highlights the difference among the few high emitters in each data set.

Table 5.3. CO Emissions, 10% Dirtiest by MY (share of CO emissions contributed by the dirtiest 10% of vehicles in each model year)

             Site VSP (kW/t)
Model Year   <15     15-17.5   >17.5
1976-80      46%     43%       39%
1981         47%     44%       42%
1982         50%     44%       44%
1983         51%     48%       48%
1984         54%     48%       48%
1985         54%     50%       48%
1986         60%     56%       53%
1987         61%     57%       55%
1988         62%     60%       59%
1989         62%     60%       59%
1990         60%     60%       61%
1991         61%     62%       63%
1992         61%     61%       64%
1993         60%     60%       63%
1994         59%     60%       65%
1995         61%     63%       67%
1996         61%     61%       67%
1997         63%     63%       68%
1998         65%     66%       72%
1999         68%     68%       74%
2000         71%     73%       77%
2001         72%     75%       77%

Figure 5.9. CO Emissions Decile Plots. [Plot: 2000 model RSD CO; x-axis: % of Vehicles; y-axis: CO.]

Figure 5.10. CO Emissions Full Emissions Distributions. [Plot: 2000 model RSD CO for the three site VSP groups (<15, 15-17.5, and >17.5 kW/t); x-axis: % of Vehicles; y-axis: CO%.]

6. Evaluation Methods

This section outlines three methods for using remote sensing data to analyze I/M program effectiveness over the short term. The first two methods, the Step Change and Comprehensive methods, involve remote sensing measurements collected in an I/M area; the final method, the Reference Method, compares remote sensing measurements collected in an I/M area with measurements collected in an external, or reference, area. Each of these methods is described in more detail below.

6.1. Step Change Method

6.1.1. Description

There are several reasons why on-road emissions may decline independent of an I/M program. New technology vehicles are lower emitting for a given fleet age than older technology vehicles.
Depending on local and national economic factors, the fleet age itself may be changing (newer vehicles are lower emitting). It is also possible that public education or willingness to carry out required maintenance falls short of what was anticipated, and that auto repair industry capability is improving, irrespective of the presence or absence of an I/M program. All these factors make it important not only to measure the on-road emission reductions of the I/M fleet, but also to measure the emissions of a well-matched control fleet, preferably differing only in I/M status.

The Step Change Method is an on-road evaluation of new or changed I/M programs using a built-in representative control group. On-road emissions are the parameter which I/M programs are intended to control;* however, most I/M programs emphasize testing of fully warmed-up exhaust emissions. If an I/M exhaust emissions failure is followed by successful repair, by scrapping the vehicle, or by relocating it to a region from which it is rarely driven in the program area, then the program should show on-road exhaust emission reductions.

When a new I/M program starts, or when there is a major program change, there is a window of opportunity to evaluate the effectiveness of that change in reducing on-road emissions. That window arises when the new (or changed) program has impacted about 50% of the local fleet. For a new annual program, the window is after about six months. For a biennial program, the window is after the first year. The concept behind this evaluation is that the untested fleet serves as the representative control group for the tested fleet. Ideally, data collection should be carried out at a sufficient number of sites in the area to ensure appropriate representation, and sampling should include surface streets as well as highway on/off ramps; however, a single well-traveled site can be representative of an I/M area. As mentioned in Section 5.7, one method to determine whether a true cross section of vehicles is being sampled is to plot the percentage of RSD measurements vs. ZIP code.

* It is tacitly assumed that on-road emissions are controlled by linking the I/M standards to certification standards, as vehicle emissions should not be expected to be reduced below their certification levels. Whether this strategy is appropriate or valid with respect to reducing on-road emissions and improving air quality is a discussion beyond the scope of this document.

6.1.2. Application Examples

Colorado had various versions of decentralized idle/2500 tests from the early 1980s and switched in the Denver metro area to a biennial, centralized, IM240-based program on January 1, 1995. Because the program is biennial, by January of 1996 roughly half the measured fleet (odd MY) had been through the new I/M program and the other half (even MY) had missed a year of their old annual program. On-road monitoring was carried out for five days in January of 1996 at a single heavily trafficked site. Approximately 26,000 valid, plate-matched records were obtained. Data were collected at a freeway off-ramp to eliminate cold-start vehicle emissions. Vehicle load was not measured, as it was assumed that tightly curved uphill ramps produce little off-cycle power enrichment, and that the tested and untested MY are randomly interspersed and subject to the same loads, thus making for a valid comparison independent of load. Additionally, the VSP concept was at best in the developmental stage at the time. However, EPA strongly recommends that vehicle load be characterized using VSP, as described earlier, for all program evaluation studies. DMV records provided county of registration, I/M eligibility, and most recent I/M status (pass, fail, or waiver).
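The odd/even model-year comparison described above can be sketched as follows. This is a minimal illustration, assuming plate-matched records that carry a model year and a CO reading, and assuming that odd model years form the tested fleet as in the January 1996 Denver example. The function name and record layout are hypothetical, and the sketch ignores confounders such as waivers and change-of-ownership tests that a real analysis would need to handle.

```python
from statistics import mean


def apparent_benefit_percent(records, tested_parity=1):
    """Percent by which mean CO of the tested fleet is below that of the untested fleet.

    records: iterable of (model_year, co_percent) tuples from plate-matched RSD readings.
    tested_parity: 1 if odd model years have been tested (assumed, as in the 1996 example).
    """
    records = list(records)
    tested = [co for my, co in records if my % 2 == tested_parity]
    untested = [co for my, co in records if my % 2 != tested_parity]
    if not tested or not untested:
        raise ValueError("both tested and untested groups need at least one reading")
    return 100.0 * (mean(untested) - mean(tested)) / mean(untested)


# Made-up readings, for illustration only.
example_records = [(1989, 0.9), (1991, 0.7), (1990, 1.0), (1992, 0.8)]
print(round(apparent_benefit_percent(example_records), 1))
```

Splitting on the recorded date of the last I/M test, rather than on model-year parity alone, would follow the recommendation in Section 5.11 more closely; parity is used here only to keep the sketch short.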
Individual emissions databases are not normally distributed; however, if the means from each measurement day are treated as independent samples, these sub-samples can be analyzed using normal statistics. This resulted in 5 means (1 for each day) per fleet. For a fleet of about 26,000 vehicles it was found that the uncertainty in the apparent emissions benefit is ±2%. This error would be reduced with a larger fleet size, provided that approximate equality between tested and untested vehicles could be maintained. The first analysis was "eligible and certainly tested" versus "eligible in the future but not tested", giving an apparent 7 ± 2% CO benefit. During this first analysis it was recognized that many vehicles should have been tested but were not, so a second analysis was "should have been tested" versus "not tested". This reduced the apparent benefit to 6 ± 2%.

Approximately 1300 vehicles registered in locations not required to take the I/M test were also measured at one site, and these vehicles showed higher average on-road emissions. However, they also showed an alternation of emissions by MY, as if the I/M program had caused failing vehicles to be re-registered to outlying counties but yet continue to be driven in Denver. A follow-up study a year later confirmed that this effect is indeed happening and that, when these vehicles are included for that site, the apparent benefit is reduced by 2%. The contribution of these "repair avoidance" cheaters to the basin-wide fleet emissions cannot be determined from one freeway interchange site, but their emissions were large enough that at the measurement site the 6 ± 2% apparent I/M CO benefit was reduced to 4 ± 2% (10, 12).

The same database allowed for two other I/M benefit tests of lower precision. Using only the even MY vehicles, the on-road emissions of those tested versus those untested were evaluated. This resulted in a 5 ± 3% apparent I/M CO benefit. Evaluation of the difference in on-road emissions between vehicles of all MY tested within four months before the measurement time and two months after indicated an apparent 8 ± 6% benefit for CO. On-road benefits for HC and NO were insignificant. The analyses discussed above were published in the literature (10).

Several factors obscured the clean 50/50 split between untested even MY and tested odd MY in these studies. For instance, many 1994 MY vehicles were tested in 1995; 1995 and later MY new vehicles obtained a four-year I/M waiver; and all vehicles had to take the I/M test upon change of ownership regardless of MY. However, many of these potentially confounding factors can be corrected.

6.1.3. Potential Systematic Errors

A major advantage of a single-site, single-time I/M evaluation study is that instrument calibration and vehicle load/speed are irrelevant, since both fleets are subject to the same measurement system. A second advantage is that the measured and the control fleets are perfectly matched socioeconomically. A third advantage is that the evaluation can be carried out with only a single week of work to within 2% accuracy levels, and the fleet average remote sensing data have been shown to correlate very well with fleet average IM240 data (10). However, three disadvantages are apparent: first, the window of opportunity exists only when a new program starts up or when a program change that is predicted to have a measurable effect is initiated; second, the reference group of untested vehicles may not be a correct reference; third, added diligence is needed to ensure that a representative sample is obtained.

There is some evidence that change-of-ownership vehicles have higher emissions than the average of the same MY. This effect would cause the average of the untested even MY vehicles (the control group) to be biased low and thus cause an underestimate of the apparent I/M benefit. It is possible to attempt to correct for this bias (25).
This study eliminated the large sample of 1994 MY vehicles which had been tested, because they were very numerous and certainly a few months older than the untested (last quarter) portion of the 1994 MY. These two effects both lower the apparent emissions of the untested fleet; correcting for them increases the apparent I/M benefit from the previous 4%-7% range to 8%-11%, with the same ±2% error. The last two analyses are not affected by these corrections and remain at 5 ± 3% and 8 ± 6% apparent I/M benefit for CO (12).

There had been an annual I/M program in place in Denver for more than ten years. The odd MY fleet took the old test in 1994 and the new test in 1995. The untested even MY fleet skipped testing in 1995 because their scheduled IM240 was in 1996. If the old program had no benefit, then this skip introduces no bias. If the old program had emissions benefits that last a long time (long repair lifetimes, as in the EPA Mobile model), then no bias is introduced, but the apparent benefit is that of the new program relative to the older one, not relative to a "no I/M" baseline. To the extent that repair lifetimes are not as long as modeled by EPA and the old program did lead to reduced emissions, the skipped annual test moves the control group back toward the no-I/M line, thus overestimating the I/M benefit relative to the previous program, with the upper limit being the benefit relative to no I/M. To correct for this bias, one needs to estimate both the emission reductions from the previous (idle/2500) program and the apparent repair lifetime, but this is not straightforward. If one can determine from the DMV records which tested odd MY vehicles were not changing ownership, then the even MY bias is removed and the study measures the apparent I/M benefit for the fleet that does not change ownership.

6.2. Comprehensive Method

6.2.1. Description

The Comprehensive Method involves comparing remote sensing emission measurements of a fleet of vehicles measured prior to initial I/M testing with those of a fleet of vehicles measured after final I/M testing. The difference in fleet average remote sensing emissions is the initial percent reduction due to the I/M program. Sufficient numbers of measurements are made so that emission reductions can be evaluated by vehicle type and model year, and by I/M result. Important observations about repair effectiveness and program avoidance can be made if enough vehicles are measured.

One of the main reasons for using remote sensing measurements to evaluate the effectiveness of I/M programs is that remote sensors measure emissions of vehicles that may not be participating in an I/M program. The Comprehensive Method differs from other remote sensing methods in that it explicitly compares emissions reductions of the I/M-tested fleet as measured by the program and as measured independently by remote sensing. Like the other methods, the Comprehensive Method can also be used to compare the emissions of the I/M-tested fleet with those of the non-I/M-tested fleet.

6.2.2. Application Examples

The Comprehensive Method concept was first applied by Doug Lawson, using unscheduled roadside idle testing of randomly selected vehicles from CARB's 1989, 1990, and 1991 random roadside surveys. Lawson found that average emissions levels of vehicles tested prior to their I/M test were about the same as those of vehicles tested after their I/M test. The emissions levels measured during the scheduled I/M tests were 60% less than the emissions measured during unscheduled testing either before or afterwards (24). The analysis was limited, in that fewer than 5,000 vehicles were analyzed in any given year.
Radian International was the first to apply this method using remote sensing data, in a 1997 evaluation of California's I/M program for the California Bureau of Automotive Repair (25). For their analysis Radian had access to over 3.5 million RSD measurements from the Statewide On-Road Emissions Measurement System. Because of concerns regarding the accuracy of some of the RSD instruments, roughly the first 6 months of RSD data were not included in the analysis (the report gives no indication of how many measurements, or vehicles, were involved in the analysis). Radian also excluded RSD measurements taken at sites that had a relatively high percentage of high emitting vehicles from the newest model years. Radian grouped the RSD measurements into two time periods: 30 to 90 days prior to I/M testing and 0 to 90 days after. Radian also grouped vehicles by model year group and I/M outcome (initial pass, initial fail/final pass, initial fail/no final pass). However, despite the large sample size, Radian did not have enough remote sensing measurements to compare pre- and post-I/M remote sensing emissions of the same vehicles (that is, a total of three emissions measurements on each vehicle), so the cost associated with such a study should not be underestimated.

More recently, the Comprehensive Method was used by Lawrence Berkeley Laboratory in analyzing 4 million remote sensing measurements on 1.2 million vehicles in the Phoenix I/M area (18, 22, 26). It was found that initial emissions reductions as measured by remote sensing were roughly half those measured by the initial and final IM240 tests; IM240 data indicated a 15% reduction in fleet-wide CO and HC emissions due to the program (g/mi units), while the remote sensing data indicated only a 7% reduction in CO and an 11% reduction in HC emissions (g/kg fuel units). Because there is a small gas mileage benefit to CO and HC emissions reductions, the per-mile emission reduction as measured by RSD would be slightly higher. For instance, assuming a 10% gas mileage improvement and a 10% emissions reduction after repair would increase the 7% CO and 11% HC g/kg fuel RSD measurements to 8% and 12%, respectively. However, these values are still below the 15% reductions determined using IM240 data. Part of this discrepancy may be due to the different loads vehicles are subjected to under IM240 testing and remote sensing measurement. As in the earlier Step Change Method study, the VSP concept was still under development and not available as a tool to reduce measurement bias due to vehicle load.

The Comprehensive analysis found that average remote sensing emissions increased as vehicles got further from their I/M test; the initial 12% reduction in fleet-wide CO emissions less than one month after I/M testing declined to only a 6% reduction in fleet-wide CO emissions one year after I/M testing. In other words, the repair benefits did not last nearly as long as they do in the I/M models. The Comprehensive analysis also found that average RSD emissions increase as vehicles get closer to their scheduled I/M test; this is especially true for vehicles that fail their initial I/M test. An analysis of emissions trends in the weeks prior to the initial I/M test indicates that the average emissions of these initial-fail vehicles do decline slightly immediately prior to I/M testing, suggesting that pre-test repairs and/or adjustments are being made.

6.2.3. Steps

Under the Comprehensive Method, a large number of remote sensing measurements are taken at suitable sites throughout an I/M area. License plates from the remote sensing measurements are then matched with license plates either in a registration database or in the I/M testing database. How remote sensing measurements are matched with vehicle information depends on how each state registers vehicles. For instance, some states (such as Arizona) assign license plates to vehicles; when a vehicle is sold, the license plate stays with the vehicle. In contrast, other states (e.g., Colorado) assign license plates to a driver; when a vehicle is sold, the license plate stays with the driver and can be affixed to a new vehicle. It is critical that license plates obtained from remote sensing programs be matched to the correct vehicle, and in some cases this will require tracking a vehicle's VIN to link it with the appropriate I/M test record and then matching it to the RSD measurement. It must be understood that, depending on a state's infrastructure for vehicle registration tracking and the ease of access to the I/M test database, this task can be very difficult.

The result is a large database of vehicles, some with multiple remote sensing measurements and multiple I/M tests (vehicles that fail their initial I/M test and return for subsequent testing). Vehicles are then classified into several groups, based on the results of their I/M test(s):

1) vehicles that pass their initial I/M test;
2) vehicles that fail their initial test but pass a subsequent test;
3) vehicles that fail their initial test and do not receive a subsequent I/M test; and
4) vehicles that fail their initial test and fail a subsequent I/M test.

Vehicles can be further categorized into more groups, based on the time between initial and final I/M test, or on the results of their emissions test vs. the results of visual or functional I/M tests.

Individual records are then categorized based on the time between the remote sensing measurement and the initial or final I/M test. For example, individual remote sensing measurements of vehicles can be grouped into 3-month time periods (0 to 3 months, 3 to 6 months, 6 to 9 months) prior to the vehicle's initial I/M test and after the vehicle's final I/M test. These time periods can be shortened to as little as one month or one week, depending on the number of remote sensing measurements. Remote sensing measurements of individual vehicles with multiple measurements in a given time period can be averaged to obtain a single measurement for that vehicle in that time period, or can be treated as independent observations (meaning that some vehicles are "double-counted" in some time periods). Given the size of the database collected in the Comprehensive Method, valuable insight into repairs and repair durability can also be gained.

Analyses should include calculating average emissions as a function of time period, I/M result, vehicle type, and model year, and plotting the results. To determine the initial effectiveness of the I/M program, only remote sensing measurements over a relatively short period should be used, to minimize the impact of changes to the vehicle on the results. For example, average emissions up to 3 months prior to the initial I/M test can be compared with average emissions up to 3 months after the final I/M test. The difference in average remote sensing emissions is the initial emissions reduction due to repair of many vehicles identified by the I/M program as high emitters (some of the emission reduction may also be due to vehicles passing a subsequent I/M test without any repairs being made). The emission reduction can be calculated for the entire tested fleet to determine the overall impact on the fleet, as well as for subsets of the fleet with different I/M results, to determine the impact of the program on, say, vehicles that fail initial I/M testing. The initial emissions reductions as measured by remote sensing can then be compared with the initial emissions reductions as measured by I/M testing. The analysis can also be extended to time periods further after I/M testing to analyze the short-term durability of any repairs made under the I/M program.
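The initial-effectiveness comparison described above can be sketched as follows. This is a minimal illustration, not the procedure used in any of the cited studies: the record layout, the function name, and the 90-day window are assumptions chosen to mirror the 3-month example in the text. Multiple readings of the same vehicle are treated here as independent observations, which is one of the two options mentioned above, and readings taken between a vehicle's initial and final tests are left out.

```python
from datetime import date
from statistics import mean


def initial_reduction_percent(readings, tests, window_days=90):
    """Percent drop in mean RSD emissions from before initial to after final I/M test.

    readings: iterable of (plate, reading_date, emission) tuples.
    tests: dict mapping plate -> (initial_test_date, final_test_date).
    window_days: assumed window length on each side of the I/M test cycle.
    """
    before, after = [], []
    for plate, when, emission in readings:
        if plate not in tests:
            continue  # no I/M record matched to this plate
        initial, final = tests[plate]
        days_before = (initial - when).days
        days_after = (when - final).days
        if 0 < days_before <= window_days:
            before.append(emission)
        elif 0 <= days_after <= window_days:
            after.append(emission)
        # readings between the initial and final tests fall in neither group
    if not before or not after:
        raise ValueError("need readings on both sides of the I/M test")
    return 100.0 * (mean(before) - mean(after)) / mean(before)


# Made-up plates, dates, and emission values, for illustration only.
example_tests = {
    "AAA111": (date(1997, 3, 1), date(1997, 3, 1)),   # passed initial test
    "BBB222": (date(1997, 4, 1), date(1997, 4, 15)),  # failed, then passed a retest
}
example_readings = [
    ("AAA111", date(1997, 2, 1), 1.0),
    ("AAA111", date(1997, 4, 1), 0.8),
    ("BBB222", date(1997, 3, 1), 3.0),
    ("BBB222", date(1997, 4, 10), 2.5),  # between tests: excluded
    ("BBB222", date(1997, 5, 1), 1.0),
    ("CCC333", date(1997, 3, 1), 0.5),   # no matching I/M record: excluded
]
print(round(initial_reduction_percent(example_readings, example_tests), 1))
```

The same function could be applied to a subset of plates (for example, only vehicles that failed their initial test) to estimate the reduction for that I/M result group.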
There is evidence that some vehicles are repaired or receive maintenance just before their scheduled I/M test; these pre-test repairs may result in the initial I/M test underestimating the average emissions prior to I/M testing. This underestimation of the baseline emissions may in turn result in an underestimation of the effectiveness of the I/M program. The data in Figure 6.1 provide evidence that owners do perform maintenance prior to an I/M test, and survey data indicated that 35% of vehicle owners brought their vehicle in for a tune-up prior to their initial test. The figure shows average weekly remote sensing CO emissions in different time periods before the initial, and after the final, I/M test of each vehicle. The figure indicates that emissions increase as vehicles get closer to their I/M test; however, emissions decrease substantially (12%) about three weeks prior to the initial I/M test. An evaluation based only on measurements taken immediately before and after I/M testing would estimate an 8% reduction in emissions. If the effect of pre-test repairs and adjustments is included, however, the reduction attributable to the program increases to 18%. To minimize the effect of pre-test repairs on baseline emissions, remote sensing measurements made within a month before a scheduled I/M test can be excluded from the analysis (i.e., remote sensing measurements from 1 to 3 months prior to the initial I/M test can be compared with remote sensing measurements from 0 to 3 months after the final I/M test; Radian used this approach in their analysis of California RSD data).

Figure 6.1. Average CO RSD Emissions by Time Period, 1996-97 Arizona Remote Sensing. [Plot of average CO RSD emissions by week, from 13 weeks to 1 week prior to the initial IM240 and from 1 to 7 weeks after the final IM240. Annotations: A, 12% reduction; B, 8% reduction; A + B, 18% reduction.]

6.2.4.
Advantages/Disadvantages

There are several advantages to using the Comprehensive Method:

i) The initial emissions reductions attributable to the program can be independently measured, and can be compared with those measured by the program itself.

ii) The repair effectiveness over the short-term (i.e., up to 2 years after final I/M testing) can be independently measured. Short-term repair effectiveness can be compared with long-term repair effectiveness as measured using multiple years of in-program data on the same vehicles.

iii) The effect of pre-test repairs on average emissions can be measured.

iv) Because large numbers of remote sensing measurements are made, the Comprehensive Method allows the identification of vehicles that do not report for, or do not complete, I/M testing, yet are still being driven in the I/M area. Video camera surveillance can also be used to identify non-compliant vehicles, at less expense than remote sensing measurement; however, video cameras will only provide information on registration avoidance without any air quality data on high emitting vehicles.

The primary disadvantage of the Comprehensive Method is that it requires a large number (on the order of millions) of remote sensing measurements. The method can be applied on smaller sample sizes (20,000 or more), but the error on the fleet average emissions estimate will increase. Since RSD measurements made up to roughly 3 months prior to and after I/M testing are most representative of the condition of the vehicles when they were tested under the I/M program, only these measurements can be used to estimate initial program effectiveness. In a biennial (24-month) I/M program, therefore, only about a quarter of the vehicles measured by RSD will have been measured within 3 months of their I/M test. However, the remaining RSD measurements can be used to estimate short-term repair effectiveness, and the effect of pre-test repairs on fleet emissions.

6.2.5.
Potential Systematic Errors

Because the Comprehensive Method relies on large numbers of remote sensing measurements, the remote sensing program will likely have to occur over several months or possibly a year. Vehicle emissions as measured by the Arizona and Colorado IM240 programs vary by season; HC and CO are higher in warmer summer months, while NOx is higher in winter months. It is unclear whether this variation is due to a combination of seasonal temperatures and changes in fuel composition, or to inadequate conditioning of vehicles prior to testing (the seasonal variation in the Wisconsin IM240 program data, Arizona remote sensing data, and the Minnesota idle program data are in the opposite direction of the variation in the Arizona and Colorado IM240 program data). No existing I/M programs vary their cutpoints by season to account for seasonal effects on emissions. There is a possibility that seasonal variation in emissions measured by remote sensing and the I/M program may introduce a systematic bias in the analysis.

The efficiency of remote sensing sites in identifying unique vehicles decreases over time; that is, many vehicles drive by the same sites every day. Concentrating the remote sensing program on a handful of sites, measured throughout the year, may therefore limit the total number of vehicles measured. More sites may be used to increase the number of vehicles measured; however, this may increase any effect of site bias (either due to the fleet of vehicles or the roadway configuration at individual sites) on the evaluation results. The vehicle specific power of individual remote sensing readings can be calculated, using roadway grade at the remote sensing site as well as speed and acceleration measurements, and used to minimize any site bias attributable to site characteristics, as discussed in Section 5.3.

6.3 Reference Method

6.3.1.
Description The Reference Method for evaluating I/M programs involves comparing remote sensing data from vehicles registered in an I/M program area to vehicles registered in a non-I/M program area. (The Reference Method may also be used to compare the fleet average emissions from one I/M program to the fleet average emissions of another I/M program; although this section focuses on the I/M to No-I/M comparison.) Obtaining an adequate sample size of non-I/M program vehicles will typically require conducting measurements in a separate geographic area, or the "reference" area. The reference area, by virtue of its absence of an I/M program, serves as a surrogate untested fleet. The difference in fleet emissions between the I/M program area being evaluated and its "reference" area represents the emission reductions attributable to I/M program effectiveness. Additionally, this difference can then be compared with that predicted by mobile models, such as MOBILE, to determine an overall effectiveness rating. The validity of this approach depends upon selecting a reference area without distinctive characteristics that will systematically bias the evaluation, as well as the accuracy of the model if such an approach is used. This section provides general guidance for conducting such an evaluation, including selection of a reference area, data needs, and data analysis approaches. 6.3.2. Application Examples The Air Quality Laboratory of Georgia Institute of Technology used the Reference Method to evaluate the effectiveness of the basic I/M program in place in Atlanta in 1994. At that time, I/M was required for vehicles registered in only four counties of the Atlanta 13-county metropolitan area: Fulton, DeKalb, Cobb and Gwinnett. The remaining nine counties, which were not tested until enhanced I/M was implemented, served as the reference fleet. 
The results of the evaluation indicated that Atlanta's basic I/M program was more effective for cars than predicted by the MOBILE model, but less effective than predicted for trucks. The Georgia Department of Natural Resources used this result to support the mobile source emission reduction credit claimed in the State of Georgia's 1996 State Implementation Plan. The Reference Method was also used to evaluate Atlanta's enhanced I/M program in October 2000.

6.3.3. Applying the Method

Using the Reference Method for I/M program evaluation involves three major tasks: selecting a reference area, gathering the necessary data, and analyzing that data.

6.3.3.1. Reference Area Selection

There are six key criteria to consider in selecting a reference area; they are presented below.

i) Distance

Perhaps the most critical criterion for selecting a reference area is suitable geographic distance from the I/M program area. Recent analyses of Denver and Ohio registrations suggest that I/M programs motivate vehicles to migrate out of an area to adjacent non-I/M counties (12, 27). Thus, if an agency were to select an adjacent area to evaluate its I/M program, higher-emitting vehicles may migrate to the reference area, making for an artificially dirtier untested fleet. Therefore, reference areas should be chosen at a significant distance from the I/M program area to lower the probability of vehicle migration.

ii) Fleet Age

The age of the fleet is another critical factor in selecting a reference area, because vehicle age is a well-documented contributor to automobile emissions. To illustrate, comparisons between an older fleet within an I/M area and a younger fleet in a reference area will underestimate I/M program effectiveness.
Isolating emissions by model year between the older and younger fleet will improve the comparison, but such controls will not account for the effects of higher annual vehicle miles traveled (VMT) or potentially higher maintenance rates of the older fleet that influence emissions. VMT data are not readily available in all jurisdictions, but may be inferred using traffic count data and vehicle population information from the state department of transportation. While VMT may be estimated from other data sources, maintenance rates are generally unobservable. Thus, the reference fleet should be roughly the same age as the I/M area fleet. Comparable fleet age can be determined most easily by a bar chart that plots the percentage distribution of vehicles within each model year for the I/M program area and its reference area.

iii) Climate

Climate is another key consideration in selecting a reference area. A variety of factors related to climate affect automobile emissions, and thus the selection of a reference area. For example, salt may be applied to roads in colder climates, potentially resulting in higher rates of catalytic converter rusting, which in turn influences vehicle emission control capacity. At the other extreme, high temperatures, such as those found in Arizona, may more rapidly dissolve the polymer used in emission components, adversely affecting their functioning. Altitude is another climatic factor which may result in differential emissions through potentially faster deterioration rates of emission control systems. A wealth of resources, including National Weather Service data, is available to assist policymakers in identifying areas within their region that provide comparable climatic conditions.

iv) I/M Program Policies

Differences in policy programs between an I/M evaluation area and its reference area may bias program evaluation.
For example, a safety inspection program that requires functional lights and brakes may speed fleet turnover by denying registration to poor-condition vehicles. While the emission profiles of these vehicles are uncertain, it is a reasonable hypothesis that they are higher than average emitters and that a safety inspection program will weed some of them out, thus shifting fleet emissions downward. Thus, the presence of a safety program in a reference area might lead to an underestimate of the effectiveness of the I/M program being evaluated by providing an artificially low baseline for comparison.

v) Motor Vehicle Tax System

The tax system for motor vehicles is another source of variance in the fleet distribution. To illustrate, an ad valorem tax that declines rapidly with vehicle age may have the effect of slowing fleet turnover by making ownership of older vehicles more affordable. Conversely, ad valorem taxes in a reference area that are onerous among all model years may shift the income level of older vehicle owners upward such that the socioeconomic characteristics of vehicle owners are not equivalent by model year between the comparison areas. State policies on antique vehicles can also influence fleet age and condition. For example, Georgia vehicles 25 years and older receive permanent tags with no further requirements for taxation, emissions testing or registration. This exemption may result in a concentration of very old vehicles compared with other areas that offer no such exemption. These are just a few examples of how public policies seemingly unrelated to air quality can nonetheless influence fleet emissions. Consequently, policymakers should research policy programs in candidate states to rule out the potential for systematic emission biases that could result from their presence.

vi) Socioeconomic Factors

Finally, socioeconomic conditions are the least studied of the influences on automobile emissions.
Most of the evidence regarding the influence of socioeconomics on fleet condition and emissions is anecdotal, relying on conventional wisdom that less affluent people will drive older vehicles (an assertion for which there is some evidence) and that they cannot afford to properly maintain their vehicles (for which there is little evidence). Another assumption is that older motorists drive their cars infrequently but maintain them well. While socioeconomic conditions have received relatively little scholarly attention in comparison with physical influences on automobile emissions, it is nonetheless wise to consider them in selecting a reference area because they may represent the unobserved influences of maintenance practices, driving behavior, and culture.

6.3.3.2. Data Needs

In addition to remote sensing data from the I/M evaluation area and its comparison fleet, the Reference Method requires registration data, I/M records, and, if a model is to be included in the analysis protocol, model outputs. Remote sensing data should be collected from the I/M program area and its reference area under similar physical conditions and within roughly the same timeframe. Simultaneous data collection prevents differences that may occur due to temperature effects on emissions or seasonal policy changes such as fuel changes. Registration data are needed to generate the characteristics of remotely sensed vehicles, such as registration address, model year and vehicle type. The registration address is particularly critical for identifying whether a vehicle is located in the I/M area or reference area. For example, if the I/M program area and reference area are located near one another, then it is possible to measure inspected vehicles in the reference area and reference area vehicles in the I/M program area. Registration address can also be used to generate demographic characteristics for the registration area.
This process, known as geocoding, locates the census block group of the registration address. The census block group, in turn, can be used to generate demographic data on its residents from the most recent national census. These demographic data include median household income, median family income, and the number of households receiving social security, retirement and public assistance. Given the inverse relationship between a census block group's median household income and the average age of its registered fleet, these data provide additional controls for fleet age, as well as safeguard controls for the unobservable influences of maintenance practices, driving habits, and cultural effects (28).

I/M records provide two optional pieces of information for the Reference Method. The first is odometer data. Odometer data can be used to extract annual vehicle miles traveled (VMT), which contribute to wear and tear and ultimate deterioration of a vehicle's emission control system. VMT is typically calculated by subtracting odometer readings for two consecutive years, dividing by the number of days between inspections, and multiplying that figure by 365. (The daily mileage must be multiplied by 730 for states with biennial testing.) I/M records can also be used to identify "invalid" reference area vehicles and non-compliant inspection area vehicles. If a significant number of reference area vehicles have recently migrated from the inspection area, the evaluation may be biased high or low depending on the average emission level of the migrating vehicles. I/M records can also be used to estimate noncompliance in the I/M program area by identifying vehicles whose emissions inspections have lapsed. This information prevents the I/M fleet from appearing artificially dirty, while contributing valuable information to the compliance aspect of program performance.
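The odometer-based VMT calculation described above can be illustrated with a short Python sketch. The function name, argument layout, and error handling are assumptions made for this example; the sketch annualizes the daily mileage by multiplying by 365, and a two-year figure for a biennial program would use 730 in its place.

```python
from datetime import date


def annualized_vmt(odometer_first, date_first, odometer_second, date_second,
                   days_per_period=365):
    """Estimate miles traveled per period from two I/M odometer readings.

    Hypothetical helper: daily mileage is the odometer difference divided by
    the days between inspections, then scaled by days_per_period.
    """
    days = (date_second - date_first).days
    if days <= 0:
        raise ValueError("second inspection must be later than the first")
    miles = odometer_second - odometer_first
    if miles < 0:
        # Likely an odometer rollover or a data entry error; do not guess.
        raise ValueError("odometer reading decreased between inspections")
    return miles / days * days_per_period
```

Records that raise an error here (decreasing odometer readings, reversed dates) would need manual review or exclusion rather than automatic correction.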
Finally, emission factor modeling output from MOBILE, or another model that predicts emissions of the inspected and non-inspected fleets, can be compared with the real-world differences between inspected and non-inspected fleets measured by the remote sensing data. RSD data can also be combined with exhaust emission factors for cars and light-duty trucks extracted from the model tailpipe emission factors. These emission factors project average grams/mile by model year and are the product of a range of inputs, including program design (testing technology, model-year coverage, and emissions standards), fleet characteristics (fleet VMT and age distribution), and operating modes (hot stabilized emissions to correlate with the condition of in-use vehicles). Inputs for the I/M-county fleet will include the design elements of the current program such as the emissions analyzer, range of model years required for inspection, and the testing mode, e.g., one-speed idle testing. I/M program elements for the non-I/M fleet are simply omitted. The modeling process will also require the model year distribution of the evaluation and reference fleets. It should be noted that use of MOBILE may introduce analytical complexity as well as increased technical uncertainty in the results, because the internal coding of the model inherently makes comparisons and computational assumptions that the user may not fully appreciate.

6.3.3.3. Data Analysis

The Reference Method can involve a variety of analytical approaches to assess the effectiveness of an I/M program. The raw emissions of an I/M program area and its reference area can be compared with histograms to determine any differences in the distribution of high emitters, low emitters, and median points. The significance of emissions differences by model year can be determined through error bar charts that plot the mean emissions plus the associated uncertainty.
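As an illustration of the error bar comparison just described, the following Python sketch computes a mean and an approximate 95% half-width for each model year and checks whether the intervals for the two areas overlap. The function names and the `(model_year, value)` input layout are hypothetical, and the normal-approximation interval (1.96 standard errors) is an assumption of this example; because remote sensing emissions are highly skewed, another uncertainty estimate may be more appropriate in practice.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def mean_with_error(values, z=1.96):
    """Return (mean, half_width) using a normal-approximation interval."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two readings to estimate uncertainty")
    return mean(values), z * stdev(values) / sqrt(n)


def by_model_year(readings):
    """Group (model_year, value) pairs; skip model years with fewer than 2 readings."""
    groups = defaultdict(list)
    for model_year, value in readings:
        groups[model_year].append(value)
    return {my: mean_with_error(vals)
            for my, vals in sorted(groups.items()) if len(vals) >= 2}


def intervals_separated(a, b):
    """True when two (mean, half_width) intervals do not overlap."""
    return abs(a[0] - b[0]) > a[1] + b[1]
```

Non-overlapping intervals are a conservative visual check rather than a formal significance test; the regression approach discussed in this section is one way to control for several fleet characteristics at once.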
Regression modeling can be used to determine the influence of registration in the I/M program area versus its reference area on emissions. RSD emission differences in inspected and reference fleets can be compared to the differences predicted by EPA mobile models to determine an I/M program effectiveness rating (29).

6.3.4. Advantages and Disadvantages

The Reference Method has strengths and weaknesses for the evaluation of I/M programs. Most importantly, it is a quantitative estimate of I/M effectiveness that is easy to calculate given adequate data, although incorporating modeling output into the analysis will certainly add a layer of complexity. As an external reference point for evaluating I/M programs, it provides ongoing opportunities for evaluation, regardless of whether a program is within a year of implementation or five years into operation. However, a significant amount of information is required beyond remote sensing data, including registration records, I/M records and model outputs. Furthermore, no reference area will completely match the I/M area profile; thus there is always the risk that some characteristic will systematically bias the I/M program evaluation higher or lower than it should be. Finally, the method will not work in some states (such as California), where there are no reference fleets because the entire state is included in the I/M program area. The Reference Method can also be used to compare on-road emissions in the region to be evaluated to those in another region, such as Arizona, where I/M effectiveness has been estimated by other methods.

7. Summary

Three methods for estimating I/M program effectiveness using RSD data were outlined in this guidance.
Every effort was made to provide as much detail as possible with regard to data collection procedures, QA/QC protocols, analysis methods, and sources of error or possible bias associated with a given method; however, it is recognized that the methods outlined in this document will continue to evolve and improve. Therefore, it is strongly recommended that any state considering the use of RSD for program evaluation purposes work closely with its respective regional EPA office and the Office of Transportation and Air Quality to ensure the most up-to-date practices are incorporated into the evaluation. Furthermore, states interested in using RSD for program evaluation must recognize the need within their own agencies to develop a minimum level of expertise with the technology and procedures to ensure reliable data are collected and analyses are performed properly. It should also be recognized, given the difficulties associated with I/M program evaluations, that an evaluation based on both out-of-program data (e.g., RSD) and in-program data will provide a more accurate estimate of overall program performance than relying on one method alone.

8. References

1. Clean Air Act, 1970.
2. Clean Air Act Amendments, 1977.
3. EPA Inspection/Maintenance Policy Guidance, 1978.
4. Clean Air Act Amendments, 1990.
5. 57 FR 52950 or 40 CFR Part 51, IM Program Requirements; Final Rule, November 5, 1992.
6. National Highway System Designation Act of 1995 (23 U.S.C. 101).
7. 62 FR 1362 or 40 CFR Parts 51 and 52, Minor Amendments to Inspection Maintenance Program Evaluation Requirements; Amendment to the Final Rule, January 9, 1998.
8. "Guidance on Alternative IM Program Evaluation Methods", EPA Memo, Office of Mobile Sources, Regional and State Programs Division, October 30, 1998.
9. Singer, Harley, Littlejohn, Ho and Vo, "Scaling of Infrared Remote Sensor Hydrocarbon Measurements for Motor Vehicle Emission Inventory Calculations", ES&T (32)21, p.3241, 1998.
10. Stedman, Bishop, Aldrete, Slott, "On-Road Evaluation of an Automobile Emission Test Program", ES&T, 31, p.927, 1997.
11. Stedman and Bishop, "Measuring the Emissions of Passing Cars", Accounts of Chemical Research, 29(10), p.489, 1996.
12. Stedman, Bishop, Slott, "Repair Avoidance and Evaluating Inspection and Maintenance Programs", ES&T, 32, p.1544, 1998.
13. Wenzel, Singer, and Slott, "Some Issues In the Statistical Analysis of Vehicle Emissions", J. Transportation and Statistics (3)2, p.1, September 2000.
14. Mann and Jones, CRC Report, "On-Road Remote Sensing of Automobile Emissions in the Research Triangle Park, North Carolina Area: 1997 and 1998", p.5, March 2000.
15. Wenzel and Gumerman, "In-Use Emissions by Vehicle Model", Presented at 8th CRC On-Road Vehicle Emissions Workshop, San Diego, CA, April 1998.
16. McClintock, "The Colorado enhanced I/M Program 0.5% Sample Annual Report", Remote Sensing Technologies Inc., Prepared for Colorado Department of Public Health and Environment, 1998.
17. Jimenez, McClintock, McRae, Nelson and Zahniser, "Vehicle Specific Power: A Useful Parameter for Remote Sensing and Emission Studies", Presented at 8th CRC On-Road Vehicle Emissions Workshop, San Diego, CA, April 1998.
18. Wenzel, "Reducing Emissions from In-Use Vehicles: An Evaluation of the Phoenix Inspection and Maintenance Program using Test Results and Independent Emissions Measurement", Environmental Science and Policy, (4), p.359, 2001.
19. Wenzel, "I/M Failure Rates by Vehicle Model", Presented at 7th CRC On-Road Vehicle Emissions Workshop, San Diego, CA, April 1997.
20. Stedman, Bishop, Beaton, Peterson, Guenther, McVey and Zhang, "On-Road Remote Sensing of CO and HC Emissions in CA", Final Report to Air Resources Board, AO32-093.
21. McClintock, "The Denver Remote Sensing Clean Screening Pilot", Prepared for the Colorado Department of Public Health and Environment, 1999.
22. Wenzel, "Reducing Emissions from In-Use Vehicles: An Evaluation of the Phoenix Inspection and Maintenance Program using Test Results and Independent Emissions Measurement", Environmental Science and Policy, (4), p.377, 2001.
23. Slott, "The Use of Remote Sensing Measurements to Evaluate Control Strategies: Measurements at the End of the First and Second Year of Colorado's Biennial Enhanced I/M Program", Presented at the 8th CRC On-Road Vehicle Emissions Workshop, San Diego, CA, April 1998.
24. Lawson, "'Passing the test': Human behavior and California's Smog Check program", J. Air Waste Manage. Assoc., 43, p.1567, 1993.
25. Klausmeier and Weyn, "Using Remote Sensing Devices (RSD) to Evaluate the California Smog Check Program", Report to the California Bureau of Automotive Repair, October 2, 1997.
26. Wenzel, "Human Behavior in I/M Programs", Presented at the 15th Annual Mobile Sources/Clean Air Conference, Snowmass, CO, September 1999.
27. McClintock, "I/M Program Avoidance and Enforcement", Presented at the 15th Annual Mobile Sources/Clean Air Conference, Snowmass, CO, September 1999.
28. Leisha DeHart-Davis, private communication with Jim Lindner.
29. Rodgers, Lorang, DeHart-Davis, "Measuring I/M Program Effectiveness Using Optical RSD: Results of the Continuous Atlanta Fleet Evaluation", Atmospheric Environment, submitted for publication.

Appendix A: On-Road Evaluation of a Remote Sensing Unit

All on-road remote sensors carry out at least a measurement of the CO/CO2 ratio in the exhaust of a passing vehicle. It is possible for an interested party to carry out a quantitative evaluation of the precision of this measurement. This evaluation can be done without going to the expense and complexity of an on-road audit using a vehicle of known emissions (wet gas audit), or a vehicle designed to puff surrogate compressed gas mixtures of known ratios (dry gas audit).
The measurement of exhaust CO/CO2 ratio is obtained by estimating the slope of a graph of CO versus CO2 (or more properly delta CO versus delta CO2). The evaluation is carried out by observing the quality of the individual data points which are used to derive this slope. Several on-road remote sensors operate for 0.5 seconds at 100 Hz, thus obtaining 50 data points for this correlation. Several on-road remote sensors use a puff of gas of known CO/CO2 ratio as a field calibration. For these sensors, the system operator can display the CO/CO2 graph from a calibration, whether the calibration was considered valid or not.

EVALUATION OF A CALIBRATION PUFF: Figure 1 shows a valid CO/CO2, HC/CO2 and NO/CO2 on-road calibration puff (FEAT 3002, Sept. 27, 2001, Casa Grande, AZ). When evaluating a remote sensor, the first parameter to note is the quality of the data and the fit. In the case shown, all 50 points are almost touching the straight line and r2 = 0.99. The next parameter to note is the extent of the data spread on the CO, HC, NO and CO2 axes. Different instruments use different units. These graphs show the gas concentrations %CO, %HC (propane), %NO and %CO2 in an 8 cm cell. These units are chosen to correspond approximately to what would be measured were one to directly probe a tailpipe. The units themselves do not matter, but the spread of both gases in a plot such as Figure 1 is important to note. Figure 2 shows a CO/CO2, HC/CO2 and NO/CO2 on-road calibration puff (FEAT 3002, August 29, 2001, Phoenix, AZ). This was not a valid calibration. In this case, the calibration gas appears to be mixed with exhaust from a vehicle which had recently passed through the optical beam. It is not important that occasional invalid calibrations look bad. It is important that the instrument is able to obtain valid calibrations, which look like Figure 1, and are carried out with a data spread comparable to a typical automobile at the same site.
This parameter also must be determined at the roadside in order to evaluate the instrument. It should be noted that in spectroscopy, gas optical absorption data often are given in strange units because what is measured is the product of concentration and path length. Thus, atm.cm or %.cm or ppm.cm, or even % in 8 cm are all units which may be used and all can be inter-converted. In fact the CO2 plume from a typical car as measured by an on-road sensor can be as large as 1 atm.cm, but more often is about 0.01 atm.cm, which could also be rendered as 1%.cm. CO is typically 1/10 of that and HC and NO 1/100.

Another noise evaluation which one should ask any instrument to be able to perform is a calibration but without any added calibration gas. The graphical evaluation is uninteresting, namely a cluster of points at the origin. However, the spread of these points along each of the axes is a direct measure of the noise which the instrument will see from all passing vehicles. Again, the spread should be compared to the spread expected from a typical motor vehicle in a realistic roadway situation using the same remote sensing unit.

Figure 1. Half-second puff calibration plots for CO, HC and NO. The straight lines are linear least squares regressions of the data.

Figure 2. Half-second calibration gas puff for CO, HC and NO which has been contaminated with exhaust from a passing vehicle.

EVALUATION OF INDIVIDUAL MOTOR VEHICLE EMISSIONS: At the roadside, when the instrument is operating and calibrated, call up and observe CO/CO2 ratio graphs from about three randomly chosen vehicles. The skewed distribution of emissions implies that these are all likely to be low emitting cars with very small CO/CO2 slopes. The parameter to observe on these graphs is the range (spread) of the CO2 data.
If the CO axis is auto scaling, the noise may look very bad but actually be very good. Note the CO2 spread. It should be comparable to the calibration, or at least not less than about 10x smaller. Figure 3 shows typical data from a passing vehicle. The CO2 readings are from about 0.3% to 1.3%, for a total spread of 1% CO2 in 8 cm. The spread for the calibration shown in Figure 1 is about 4.5% and in Figure 2 about 2.2%. In both cases the calibration spread is comparable to, although larger than, the spread of the on-road data. Now it is necessary to evaluate the CO/CO2 graph on a vehicle with higher than zero CO/CO2 ratio. If the raw data are stored and can be recalled and graphed from each vehicle, then wait for a vehicle with CO/CO2 > 0.25 (about 3.5% CO on the video screen). Now observe this CO/CO2 graph. The CO2 spread should be comparable to the three low CO emitters observed earlier. The CO spread should be comparable to the CO spread on the calibration puff, or at least not less than about 10x smaller. If these criteria are met and this graph looks "good", for instance, r2 > 0.9, then you have an instrument likely to provide precise and, if the calibration gas supplier is trustworthy, accurate data. Figure 4 shows on-road CO/CO2 data from a cold-start vehicle measured at the University of Denver. A similar evaluation analysis can be carried out for HC and NO; however, if the CO/CO2 data do not pass muster, then HC/CO2 and NO/CO2 are much less useful because the readings are missing a major component of the carbon balance. Note also that HC emissions are smaller and harder to measure than CO, so more (relative) noise is to be expected. If the data you see at roadside are of similar or better quality, then you are observing a good instrument. If they are not up to this quality, then you should think twice about accepting the data until the operator/vendor can convince you that the instrument is functioning properly.
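The graphical checks described above (slope, quality of fit, and data spread) can also be expressed numerically. The following Python sketch is illustrative only: the function names, the dictionary layout, and the way the thresholds are encoded are assumptions of this example, with the r2 > 0.9 and "not less than about 10x smaller" criteria taken from the text. It fits a least squares line to the CO versus CO2 points from one vehicle or calibration puff (typically about 50 points from a half-second measurement at 100 Hz) and compares the spreads with those of a valid calibration.

```python
def ratio_fit(co2, co):
    """Least squares fit of CO against CO2 for one plume or calibration puff.

    Returns the slope (the CO/CO2 ratio), r2, and the spread of each gas.
    """
    n = len(co2)
    if n != len(co) or n < 3:
        raise ValueError("need matching CO and CO2 series with at least 3 points")
    mean_x = sum(co2) / n
    mean_y = sum(co) / n
    sxx = sum((x - mean_x) ** 2 for x in co2)
    syy = sum((y - mean_y) ** 2 for y in co)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(co2, co))
    if sxx == 0:
        raise ValueError("no CO2 spread; slope is undefined")
    return {
        "slope": sxy / sxx,
        "r2": (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0,
        "co2_spread": max(co2) - min(co2),
        "co_spread": max(co) - min(co),
    }


def passes_screen(vehicle_fit, calibration_fit, min_r2=0.9, max_ratio=10.0):
    """Apply the roadside criteria to an elevated-CO vehicle reading.

    The vehicle's CO2 and CO spreads may be smaller than the calibration
    spreads, but not by more than a factor of max_ratio.
    """
    return (vehicle_fit["r2"] > min_r2
            and vehicle_fit["co2_spread"] * max_ratio >= calibration_fit["co2_spread"]
            and vehicle_fit["co_spread"] * max_ratio >= calibration_fit["co_spread"])
```

A numerical screen of this kind is a complement to, not a substitute for, viewing the raw graphs at the roadside as recommended above; the same fit could be applied to HC and NO against CO2, bearing in mind that more relative noise is to be expected for HC.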
The ability to read vehicle exhaust independently of vehicle type should also be verified. This may be done by making a note of the valid reading rate from normal sedans and from SUVs and pickups while observing roadside operations. In a perfect world all vehicles with ground level exhaust should be measured. In reality some are not, but this should be observed to be a random process or a systematic one caused by driving mode (noticeable decelerations), not one caused by vehicle type or body height.

Figure 3. In-use data for a low CO emitting vehicle.

Figure 4. In-use data from a cold-start vehicle with elevated levels of CO.

EVALUATION USING EXHALED BREATH: A non-smoking human exhales CO2 and negligible amounts of CO, HC and NO. The remote sensor should be able to read human breath as a passing car, as long as it is accompanied by a blocked and unblocked optical beam. Fifteen readings of breath with the FEAT instrument in the laboratory yielded a mean CO reading of 0.07% with a standard deviation of 0.04%. HC read a mean of 39 ppm propane with a standard deviation of 50 ppm, and NO a mean of -3 ppm with a standard deviation of 18 ppm.