United States Environmental Protection Agency
Air and Radiation
EPA420-R-95-004
September 1995

Development of a Methodology for Estimating Basic Emission Rates for Use in the MOBILE Emission Factor Model

SR95-09-03

prepared for:
U.S. Environmental Protection Agency
under Contract No. 68-C4-0056
Work Assignment No. 0-06

September 30, 1995

prepared by:
Phil Heirigs
Robert G. Dulla
Sierra Research, Inc.
1801 J Street
Sacramento, CA 95814
(916) 444-6666

Although the information described in this report has been funded wholly or in part by the United States Environmental Protection Agency under Contract No. 68-C4-0056, it has not been subjected to the Agency's peer and administrative review and is being released for information purposes only. It therefore may not necessarily reflect the views of the Agency and no official endorsement should be inferred.

Table of Contents

1. Introduction
   Background
   Organization of the Report
2. Overview of Emission Factors Development for MOBILE5
   Database Adjustments
   Fuel/Temperature Adjustments
   IM240-to-FTP Correlations
   Correlation Adjustments
   TECH5 Inputs
3. Alternative Methods for Using IM240 Data to Develop Basic Emission Rates
   Survey Summary
   IM240-to-FTP Conversion Procedure
   TECH5 Inputs
   Treatment of Light-Duty Trucks
4. Adjusting I/M Data to a Non-I/M Basis
5. Incorporating "Off-Cycle" Emissions into MOBILE
6. Use of State-Generated IM240 Data in MOBILE
   Data Collection and Development of Simulated FTP Scores
   Development of Basic Emission Rates from Simulated FTP Scores
7. References
Appendix A - EIRG's Responses to Questionnaire on Developing Basic Emission Rates from IM240 Data

List of Figures

2-1 Outline of Emission Factor Development for MOBILE5
2-2 Change to the HC Cold-Start Offset as a Function of Mileage for 1983+ MPFI Vehicles
3-1 Comparison of Very High + Super Emitter Fractions - TECH5 vs. Hammond Data for MPFI/CL Vehicles
4-1 Effect of an I/M Program on Emissions as a Function of Repair Cycle

List of Tables

2-1 Final Seasonal Fuel/Temperature Adjustments Used for MOBILE5a (Ratio of Lab/Indolene IM240 Scores to Lane/Tank Fuel IM240 Scores)
2-2 IM240-to-FTP Correlation Equations Developed for MOBILE5
3-1 Summary of IM240/Basic Emission Rate Survey Scores
3-2 Effect of Multiple Counting of Foreign Vehicles on the Distribution of Emitter Categories by Technology Type for 1983 and Later Model Years

1. INTRODUCTION

Background

With the release of MOBILE5, the U.S. Environmental Protection Agency (EPA) made a significant departure from the historical method of using its Emission Factors database to develop exhaust basic emission rate (BER) equations (i.e., the non-I/M emission rates in the model). In previous versions of MOBILE, data used for the BERs were collected through a process often referred to as "surveillance" testing, where vehicle owners are randomly contacted (usually by letter) and asked to give up their cars for a week of testing. Over the years, EPA has become concerned that the vehicles it receives for the Emission Factors testing are not representative of the in-use fleet, particularly with respect to the fraction of poorly maintained, high-emitting vehicles. This has been primarily attributed to a sample selection bias, e.g., if a vehicle owner knows that his or her car has been poorly maintained or has been tampered with, he or she will not voluntarily submit it for emissions testing.
To overcome sample bias concerns (and to provide a much larger sample for analysis), EPA used IM240 emissions data collected during the initial two years of an inspection and maintenance (I/M) program in Hammond, IN, to develop the exhaust basic emission rate equations for MOBILE5a.* It was felt that this approach would provide a less biased sample because all vehicle owners had to participate in the state-run portion of the program. EPA then recruited vehicles from the state-run testing lanes for the EPA tests.

* Vehicles were tested in their first I/M "cycle," and therefore the data represent emissions from a non-I/M fleet.

Because all of the exhaust emission relations contained in MOBILE (e.g., temperature corrections, speed corrections, etc.) are based on FTP testing with certification fuel (Indolene), a means to convert the IM240 data collected at the lane on tank fuel to an FTP/Indolene basis was needed. This conversion process was a multi-step procedure, consisting of the steps listed below.

- Factors that accounted for the differences in ambient temperatures and fuel characteristics between conditions experienced during IM240 testing at the I/M lane and IM240 testing in the laboratory were developed from a subset of Hammond lane vehicles.
- Those factors were used to convert all the Hammond lane IM240 data (tested with tank fuel) to a laboratory/Indolene IM240 basis.
- Correlation equations between IM240 emissions on Indolene measured in the lab and FTP values on Indolene in the lab were developed from a sample of vehicles.
- These correlation equations were then applied to all of the Hammond IM240 data (first adjusted for fuel and temperature differences) to put all data on an FTP/Indolene basis.

Once the IM240-to-FTP conversion process described above was completed, the TECH5 model was used to calculate the BER equations (zero-mile level and deterioration rates) for MOBILE.
The TECH model uses a "regime" approach to develop emission rates (as a function of vehicle mileage) by model-year group (i.e., 1981-1982 and 1983+) and technology (i.e., closed-loop multipoint fuel injection (MPFI/CL), closed-loop throttle-body injection (TBI/CL), closed-loop carbureted (CARB/CL), and open-loop). Four emitter groups (or regimes) are defined in the TECH model: normals, highs, very highs, and supers. Emission rates (by model-year group/technology) are determined by multiplying the emission rate of each emitter category by the fraction of each emitter category making up the fleet at mileage intervals corresponding to vehicle age. Thus, two primary inputs to the TECH model are the emitter-category emission rates and the emitter-category population growth rates. Once the model-year group/technology emission rates are calculated, model-year-specific emission factors (which are input to MOBILE5a) are generated by weighting the emission rates of each group by its expected fraction of the fleet.

Although the IM240-to-FTP conversion approach provides a considerably larger sample from which to develop BER equations for the MOBILE model, several potential shortcomings have been identified in evaluations sponsored by the American Petroleum Institute (API).^1,2* Thus, in Work Assignment 0-06 of Contract No. 68-C4-0056, EPA directed Sierra Research, Inc. (Sierra)** to perform an evaluation of ways in which the use of IM240 data for the development of basic emission rates could be improved. In addition, the Work Assignment called for an assessment of how IM240 data collected in an I/M area could be adjusted to a non-I/M basis, recommendations for incorporating off-cycle effects into the MOBILE model, and a review of methodologies by which state-generated IM240 data could be used to develop user-input basic emission rates for MOBILE. This report documents the evaluations performed under this Work Assignment.
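The regime weighting described above for the TECH model can be illustrated with a minimal sketch. It is not the TECH5 code; the function name and every number below are hypothetical placeholders chosen only to show the arithmetic (emitter-category emission rate multiplied by emitter-category fleet fraction, summed over the four regimes).

```python
# Hypothetical illustration of a "regime" weighting calculation.
# None of the numbers below come from the report.

def fleet_emission_rate(rates, fractions):
    """Weight emitter-category emission rates (g/mi) by fleet fractions."""
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("emitter-category fractions must sum to 1")
    return sum(rates[cat] * fractions[cat] for cat in rates)

# Placeholder HC emission rates (g/mi) for one group at one mileage interval.
rates = {"normal": 0.3, "high": 1.2, "very_high": 3.0, "super": 15.0}

# Placeholder fleet fractions of each regime at the same mileage interval.
fractions = {"normal": 0.80, "high": 0.10, "very_high": 0.09, "super": 0.01}

composite = fleet_emission_rate(rates, fractions)
print(round(composite, 3))
```

Repeating the calculation at successive mileage intervals, with fractions that shift toward the non-normal regimes, would trace out an emission rate as a function of mileage under these assumptions.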
* Superscripts denote references listed in Section 7 of this report.

** Sierra received assistance from subcontractors Air Improvement Resource (AIR) and Energy and Environmental Analysis (EEA) during the performance of this work assignment.

Organization of the Report

Following this introduction, Section 2 provides an overview of the methods used to develop basic emission rate equations for MOBILE5 from the Hammond IM240 data. Section 3 follows with a discussion of alternative methods for developing basic emission rates from IM240 data. An assessment of methods to adjust IM240 data collected in an I/M area to a non-I/M basis is contained in Section 4, while recommendations for incorporating off-cycle emissions into the MOBILE model are presented in Section 5. Section 6 is a discussion of how IM240 data collected by states could be used to develop user-input basic emission rates for MOBILE, and Section 7 lists the references cited in this report.

2. OVERVIEW OF EMISSION FACTORS DEVELOPMENT FOR MOBILE5

Before proposing a method (or methods) to develop basic emission rates for the MOBILE model from IM240 data, it is useful to review the procedure used to generate basic emission rates for MOBILE5. That approach, which is diagrammed in Figure 2-1, was based on converting IM240 data collected in Hammond, IN, to an FTP basis prior to the development of inputs for the TECH5 model. The conversion process (which consisted of a number of individual adjustments in addition to the development of IM240-to-FTP correlation equations) and the TECH5 inputs developed from those converted data are described in this section of the report.

Database Adjustments

Prior to the development and application of IM240-to-FTP correlations, several adjustments were made to the IM240 database so that it better reflected a national average mix of domestic and foreign vehicles.
In addition, vehicles that had missing or suspicious odometer readings were deleted, as were vehicles tested in March and April on days in which the temperature was 25°F or more above the monthly average. These adjustments are described below.

Foreign Manufacturers - Because the vehicles tested in Hammond did not accurately reflect the national average fraction of foreign vehicles, each foreign vehicle in the database was counted two to four times. This adjustment increased the 1981 and later model year light-duty vehicle sample size from 6,597 to 7,821.

Missing or Suspicious Mileage - A number of vehicles in the Hammond database had '0' or missing mileage and were deleted. In addition, vehicles that were coded as having an odometer reading > 300,000 miles were deleted. This adjustment decreased the database from 7,821 to 6,999 records.

Seasonal Outliers - Data collected on 14 test dates in March and April when the ambient temperature was 25°F or more above the monthly average were deleted because many of those vehicles were statistical outliers. (Excessive purge was thought to be influencing the IM240 results.) This affected a relatively small number of vehicles, and it resulted in decreasing the database from 6,999 to 6,826 records.

Figure 2-1
Outline of Emission Factor Development for MOBILE5*

1. IM240 Data Collected in Hammond, Indiana
2. Database Adjustments: non-representative foreign manufacturers; missing/suspicious mileage; seasonal outliers
3. Fuel/Temperature Adjustments: applied to get lane/tank fuel IM240s on a lab/Indolene basis
4. IM240-to-FTP Correlations: lab/Indolene IM240s correlated with lab/Indolene FTPs; cold-start function; application of residuals
5. Predicted FTP Scores from IM240 data
6. Emitter category emission levels; emitter category growth functions

* Discussion of components in shaded boxes follows.
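The mileage and seasonal screens described above can be sketched as a simple record filter. This is an illustrative sketch only: the record layout, field names, and test dates are hypothetical, and only the 300,000-mile cutoff comes from the text. The set of excluded dates stands in for the 14 warm test dates, which the report does not list.

```python
from datetime import date

MAX_ODOMETER = 300_000  # odometer cutoff quoted in the text

def screen_records(records, excluded_dates):
    """Drop records with missing, zero, or >300,000-mile odometers,
    and records tested on any of the excluded (unusually warm) dates."""
    kept = []
    for rec in records:
        odo = rec.get("odometer")
        if odo is None or odo <= 0 or odo > MAX_ODOMETER:
            continue
        if rec["test_date"] in excluded_dates:
            continue
        kept.append(rec)
    return kept

# Hypothetical records and dates, for illustration only.
records = [
    {"id": 1, "odometer": 45_000, "test_date": date(1992, 5, 4)},
    {"id": 2, "odometer": 0, "test_date": date(1992, 5, 4)},
    {"id": 3, "odometer": None, "test_date": date(1992, 6, 1)},
    {"id": 4, "odometer": 320_000, "test_date": date(1992, 6, 1)},
    {"id": 5, "odometer": 60_000, "test_date": date(1992, 3, 10)},
]
excluded_dates = {date(1992, 3, 10)}  # placeholder for a flagged warm day

kept_ids = [rec["id"] for rec in screen_records(records, excluded_dates)]
print(kept_ids)
```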
Fuel/Temperature Adjustments

Because EPA wished to develop the IM240-to-FTP correlations based on vehicles IM240 tested in a laboratory with Indolene, a method was needed to account for the differences between the lane and the lab before the correlation equations were applied to the Hammond lane IM240 data. For the Hammond database, it was felt that those differences were primarily related to tank fuel versus Indolene and the temperature differences occurring between the lane and the lab. (However, a number of other differences could also impact test variability between the lane and the lab, e.g., vehicle preconditioning procedures, inconsistent dynamometer settings, how well the IM240 speed-time trace is followed, etc.)

The fuel/temperature adjustments prepared for MOBILE5 were based on a subset of the Hammond vehicles that were tested at the lane on tank fuel and at the lab on Indolene. Adjustment factors were developed by season (i.e., March-April, May-June, July-September, and October-February) and the following emitter categories:

- Normal HC/CO - lane IM240 ≤ 1.64 g/mi HC and ≤ 13.6 g/mi CO,
- High HC/CO - lane IM240 > 1.64 g/mi HC or > 13.6 g/mi CO,
- Normal NOx - lane IM240 ≤ 2.0 g/mi NOx, and
- High NOx - lane IM240 > 2.0 g/mi NOx.

Once the data were segregated as outlined above, the mean emission levels for the lane/tank fuel scores and the lab/Indolene scores were determined. Adjustment factors were then developed from the ratio of these mean values. A summary of those adjustment factors is shown in Table 2-1.
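The ratio-of-means calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the vehicle scores are hypothetical, the function names are invented for the example, and only the HC/CO lane cutpoints (1.64 and 13.6 g/mi) are taken from the text. A single season is shown; the same calculation would be repeated per season.

```python
from statistics import mean

HC_CUT, CO_CUT = 1.64, 13.6  # lane IM240 cutpoints (g/mi) quoted in the text

def hc_co_group(lane_hc, lane_co):
    """Classify a vehicle as a normal or high HC/CO emitter from lane scores."""
    return "high" if lane_hc > HC_CUT or lane_co > CO_CUT else "normal"

def adjustment_factor(lane_scores, lab_scores):
    """Ratio of the mean lab/Indolene score to the mean lane/tank-fuel score."""
    return mean(lab_scores) / mean(lane_scores)

# Hypothetical paired tests: (lane HC, lane CO, lab HC), all in g/mi.
vehicles = [
    (0.40, 5.0, 0.30),
    (0.60, 8.0, 0.50),
    (2.00, 20.0, 1.80),
    (3.00, 10.0, 2.70),
]

# Segregate by emitter group, then take the ratio of means within each group.
factors = {}
for group in ("normal", "high"):
    subset = [v for v in vehicles if hc_co_group(v[0], v[1]) == group]
    factors[group] = adjustment_factor([v[0] for v in subset],
                                       [v[2] for v in subset])

print({g: round(f, 3) for g, f in factors.items()})
```

Note that a ratio of means is not the same as a mean of per-vehicle ratios; the sketch follows the ratio-of-means wording in the text.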
Table 2-1
Final Seasonal Fuel/Temperature Adjustments Used for MOBILE5a
(Ratio of Lab/Indolene IM240 Scores to Lane/Tank Fuel IM240 Scores)

                             Seasonal Adjustment Factor
Pollutant  Emitter Group  Mar-Apr  May-Jun  Jul-Sep  Oct-Feb
HC         Normal         0.766    0.884    0.823    0.880
HC         High           0.851    0.940    0.935    1.137
CO         Normal         1.072    1.007    0.792    1.036
CO         High           0.934    1.038    0.880    1.074
NOx        Normal         0.809    0.825    0.913    0.862
NOx        High           0.784    0.736    0.669    0.826

IM240-to-FTP Correlations

Once the Hammond lane IM240 data were adjusted to a lab/Indolene basis, correlation equations relating the IM240 to the FTP were applied to the data. The IM240-to-FTP correlations were based on a regression analysis of data collected from vehicles tested over the IM240 on Indolene and the FTP on Indolene. (The database used for the correlation analysis included vehicles from the Hammond program as well as vehicles tested in Ann Arbor.) The regressions were performed according to the following model-year groups and technology types:

- 1981-1982,
- 1981+ open-loop,
- 1983+ carbureted/closed-loop,
- 1983+ throttle-body injection/closed-loop, and
- 1983+ multipoint fuel-injection/closed-loop.

The HC and CO correlations were performed in log space with a cold-start offset ("X" in the equation below) that varied by technology, while the NOx correlations were based on a simple linear equation without a cold-start offset value:

$$\log_{10}(\mathrm{FTP}_{HC/CO} - X) = b + m \cdot \log_{10}(\mathrm{IM240}_{HC/CO})$$

$$\mathrm{FTP}_{NOx} = b + m \cdot \mathrm{IM240}_{NOx}$$

For cases in which (FTP_HC/CO - X) < 0.01, the IM240 score was substituted for (FTP_HC/CO - X). In this way, errors resulting from taking the logarithm of a negative number were avoided. In addition, if the intercept term was not statistically different from zero at the 95% confidence level, the regressions were re-run without an intercept. Table 2-2 summarizes the results of the correlation analysis.
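Applying the two correlation forms to a single adjusted score can be sketched as below. This is an illustrative sketch, not EPA's implementation: the function names and the input scores are hypothetical, the coefficients are those read from Table 2-2 for the 1983+ MPFI/CL group, and the seasonal factor is the Jul-Sep value for normal HC emitters from Table 2-1. The HC/CO form is simply the log-space equation solved for FTP.

```python
import math

def predict_ftp_hc_co(im240, x, b, m):
    """Solve log10(FTP - X) = b + m*log10(IM240) for FTP."""
    if im240 <= 0:
        raise ValueError("IM240 score must be positive for the log-space form")
    return x + 10 ** (b + m * math.log10(im240))

def predict_ftp_nox(im240, b, m):
    """Linear NOx form: FTP = b + m*IM240 (no cold-start offset)."""
    return b + m * im240

# Coefficients as read from Table 2-2 for 1983+ MPFI/CL vehicles.
HC_MPFI = {"x": 0.222, "b": 0.0, "m": 0.9520}
NOX_MPFI = {"b": 0.1250, "m": 0.7730}

# Jul-Sep adjustment for normal HC emitters, from Table 2-1.
SEASONAL_HC_NORMAL_JUL_SEP = 0.823

lane_hc = 0.50                                  # hypothetical lane score, g/mi
lab_hc = lane_hc * SEASONAL_HC_NORMAL_JUL_SEP   # put on a lab/Indolene basis
ftp_hc = predict_ftp_hc_co(lab_hc, **HC_MPFI)

lab_nox = 2.0                                   # hypothetical lab-basis score, g/mi
ftp_nox = predict_ftp_nox(lab_nox, **NOX_MPFI)

print(round(ftp_hc, 3), round(ftp_nox, 3))
```

Because the HC/CO form adds X after exponentiating, the predicted FTP value can never fall below the cold-start offset for the group.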
Table 2-2
IM240-to-FTP Correlation Equations Developed for MOBILE5

Pollutant  Model Year/Technology  N    X      b        m       R2
HC         1981-1982              58   0.309  0.1382   1.0715  0.909
HC         1981+ Open-Loop        24   0.315  0.1448   0.9654  0.879
HC         1983+ CARB/CL          73   0.195  0.0000   0.9745  0.905
HC         1983+ TBI/CL           224  0.180  0.0000   0.9840  0.873
HC         1983+ MPFI/CL          211  0.222  0.0000   0.9520  0.915
CO         1981-1982              58   2.140  0.0000   1.004   0.943
CO         1981+ Open-Loop        24   1.640  0.3090   0.851   0.904
CO         1983+ CARB/CL          73   1.579  0.0000   0.906   0.873
CO         1983+ TBI/CL           224  1.541  -0.1386  1.072   0.782
CO         1983+ MPFI/CL          211  1.696  0.0000   0.886   0.780
NOx        1981-1982              58   NA     0.2534   0.7737  0.825
NOx        1981+ Open-Loop        24   NA     0.0000   0.9306  0.976
NOx        1983+ CARB/CL          73   NA     0.0000   0.8925  0.961
NOx        1983+ TBI/CL           224  NA     0.0767   0.8234  0.901
NOx        1983+ MPFI/CL          266  NA     0.1250   0.7730  0.825

Correlation Adjustments

When the correlation equations were applied to the lane IM240 scores (which had been corrected to a lab/Indolene basis), two additional adjustments were made. First, the cold-start offset was assumed to be a function of vehicle odometer (although the correlations were performed with a constant X value); second, regression residuals were randomly applied to each data point.

Cold-Start Offset - The cold-start offset (X) values used in the above correlation equations were developed, by technology group, from the mean value of the difference between the FTP and the IM240 based on normal emitters with FTP values greater than the IM240 (i.e., the value of (FTP - IM240) was determined for each normal emitter, and the mean of the positive results was used as X). When the correlation equations were applied to the IM240 data, the value of X was adjusted to account for the effects of aging and mileage. Development of this adjustment for 1983+ model years is described below. (A slightly different procedure was used for 1981-1982 model-year vehicles.)

The value of X in the correlation equations reflects the cold-start offset at the mean mileage of the correlation sample.
At mileages below this mean, it follows that X should be decreased by some amount to account for the fact that the catalyst has been aged less and is expected to be more active. (Conversely, X should be increased at mileages above the mean.) Thus, the cold-start offset is actually X plus an increment that is a function of vehicle odometer, i.e.,

$$\text{X-Offset Function} = f(x) = X + f(\mathrm{Odometer})$$

EPA has defined f(Odometer) in the above equation to be "the difference of the model year means regression for normal emitters and a 'New' line created by connecting a point on the model year means regression line at the mean mileage of the correlation sample with the zero mile level used in MOBILE4.1." The X-offset function is therefore:

$$f(x) = X + \mathrm{ZML}_{New} - \mathrm{ZML}_{MY\,Means} + \mathrm{ODOM} \times (\mathrm{DET}_{New} - \mathrm{DET}_{MY\,Means})$$

where ZML and DET are the zero-mile level and slope of each line and ODOM is the odometer reading. A plot of the two lines described above for HC from multipoint fuel-injected vehicles is shown in Figure 2-2 (X_HC = 0.222 g/mi for that group).

Regression Residuals - Another adjustment made during the application of the correlation equations was the addition of randomized regression residuals, i.e.,

$$\log_{10}(\mathrm{FTP}_{HC/CO} - X) = b + m \cdot \log_{10}(\mathrm{IM240}_{HC/CO}) + res$$

$$\mathrm{FTP}_{NOx} = b + m \cdot \mathrm{IM240}_{NOx} + res$$

where "res" represents regression residuals from the correlation sample.

Figure 2-2
Change to the HC Cold-Start Offset as a Function of Mileage for 1983+ MPFI Vehicles

[Plot against odometer (10,000 miles) of the model year means regression line and the 'New' line based on the MOBILE4.1 ZML; the two lines cross at the mean mileage of the correlation sample. Cold-start offset = X + f(Odometer). At 0 miles, f(Odometer) = 0.269 - 0.308 = -0.039 g/mi. Below the mean mileage the cold-start offset is decreased relative to X; above the mean mileage it is increased relative to X.]

According to EPA, adding the residuals randomly to the FTP emission levels predicted by the correlation equations attempts to restore a distribution of predicted FTP values for a given IM240 score. Otherwise, there will be a single predicted FTP value for each IM240 score.
A distribution of predicted FTP scores and emission levels is important for some analyses, such as the determination of I/M credits. For example, if residuals were not applied, 100% of the FTP emissions from a certain emitter group could be identified on the basis of the IM240 score.

TECH5 Inputs

Once the Hammond data were converted to predicted FTP scores, the results were used to develop inputs to the TECH5 model (i.e., emitter-category emission rates and growth functions). The following emitter categories were used in TECH5 for HC and CO emissions:

- Normal HC/CO - HC ≤ 0.82 g/mi and CO ≤ 10.2 g/mi,
- High HC/CO - HC > 0.82 g/mi or CO > 10.2 g/mi,
- Very High HC/CO - HC > 1.64 g/mi or CO > 13.6 g/mi, and
- Super HC/CO - HC > 10.0 g/mi or CO > 150.0 g/mi.

NOx emissions were analyzed separately from HC and CO, with only two emitter categories being defined: normals (≤ 2.0 g/mi) and highs (> 2.0 g/mi).

The data were also segregated by the following technology groups:

- open-loop,
- carbureted/closed-loop,
- throttle-body injection (TBI)/closed-loop, and
- multipoint fuel-injection (MPFI)/closed-loop.

Finally, emission rates were determined separately for 1981-1982 model year vehicles and 1983+ model year vehicles.

HC/CO Emission Rates - For HC and CO, the emitter-category emission rates (i.e., zero-mile levels (ZMs) and deterioration rates (DRs)) were constructed as outlined below.

1. MOBILE4.1 ZMs were used for 1981-82 normals.
2. 1983+ DRs were used for 1981-82 normals, highs, and very highs.
3. Emission rates of normals were capped at the same rate for 1981-82 and 1983+ groups.
4. Normal caps were set at the maximum of the 1981-82 or 1983+ 100,000-mile levels calculated from the 1981-82 and 1983+ ZM and DR for normal emitters.
5. Deterioration rates that were negative and without significance were assumed to be zero.
6. Regression of carburetor very highs was performed for 1983-1988 model years only (although the regression results were applied to all 1983+ carbureted vehicles). Including 1989 resulted in a negative ZM.
7. A covariance analysis was used for fuel-injected very highs that resulted in the same DR but different ZM levels for the 1981-82 group and the 1983+ group. (This resulted in substantially higher HC and CO emission rates from the 1983+ group compared to the 1981-82 group.)
8. All model years were combined for supers.

NOx Emission Rates - The following procedure was used to develop NOx emitter-category emission rates:

1. 1981-82 model year normals used the MOBILE4.1 ZM, and the DR was determined from the mean emission level and mileage of the Hammond sample.
2. 1983+ model year normals used a covariance analysis that forced the deterioration rates to be equal for vehicles certified to 1.0 and 0.7 g/mi NOx. (This resulted in different zero-mile levels.)
3. High NOx emitters used DRs from the normal NOx emitters, and the ZM levels were back-calculated from the mean emission level and mileage of the Hammond sample.

Growth Functions - As important as the emission level of each emitter category are the growth functions assigned to those categories. For MOBILE5, EPA wanted to base emission control system deterioration on both vehicle age and mileage. This was done by using data from 1987 and later model years to establish the growth rate of non-normals (i.e., highs + very highs + supers) for mileages less than 50,000. For mileages above 50,000, data from the 1981-86 model years were used for the TBI and carbureted technology groups, while data from 1984-86 model years were used for MPFI vehicles. (EPA judged that pre-1984 MPFI represented "prototype" technology.)
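The two-segment growth-rate calculation, which the text that follows works through for the MPFI very highs + supers group, can be sketched as below. This is a minimal illustration rather than the TECH5 procedure itself: the function and parameter names are hypothetical, and the counts and mean mileages are the ones quoted in that worked example.

```python
def growth_rates(n_low, total_low, mean_miles_low,
                 n_high, total_high, mean_miles_high,
                 breakpoint=50_000):
    """Return (rate_below, rate_above) as fleet fraction per 10,000 miles.

    rate_below: emitter fraction in the low-mileage sample divided by that
        sample's mean mileage (a line through the origin).
    rate_above: slope of the line joining the fraction at the breakpoint
        to the (mean mileage, fraction) point of the high-mileage sample.
    """
    frac_low = n_low / total_low
    rate_below = frac_low / (mean_miles_low / 10_000)
    frac_at_break = rate_below * (breakpoint / 10_000)
    frac_high = n_high / total_high
    rate_above = (frac_high - frac_at_break) / (
        (mean_miles_high - breakpoint) / 10_000)
    return rate_below, rate_above

def emitter_fraction(miles, rate_below, rate_above, breakpoint=50_000):
    """Piecewise-linear emitter fraction at a given odometer reading."""
    if miles <= breakpoint:
        return rate_below * miles / 10_000
    at_break = rate_below * breakpoint / 10_000
    return at_break + rate_above * (miles - breakpoint) / 10_000

# MPFI very highs + supers: 155 of 1,716 vehicles at 28,182 mean miles
# (1987+), and 138 of 460 vehicles at 68,464 mean miles (1984-1986).
rate_below, rate_above = growth_rates(155, 1716, 28_182, 138, 460, 68_464)
print(round(rate_below, 5), round(rate_above, 5))
```

Evaluating `emitter_fraction` at the high-mileage sample's mean mileage returns that sample's observed fraction, which is a quick consistency check on the second slope.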
The method used to establish the emitter-category growth rates was based on first developing growth rates for the following emitter groups:

- supers,
- very highs + supers, and
- highs + very highs + supers.

Once these were established, individual emitter-category growth rates were determined by subtraction.

The analytical technique used to develop the growth functions for each of the above groups is best explained with an example. For the MPFI very highs + supers group, the following process was used. First, the <50,000 mile growth rate was established by determining the fraction of very highs + supers from all 1987+ MPFI data. In the Hammond sample, there were 155 very highs + supers out of 1,716 total vehicles in this group (i.e., 9.03%). This fraction was then divided by the average mileage of the group (28,182) to obtain a growth rate of 0.03205/10,000 miles.

The growth rate beyond 50,000 miles was calculated by first determining the fraction of very highs + supers in the 1984-1986 model year group (138/460, or 30.0%) and the average mileage of that group (68,464). The second growth rate was then calculated by linear extrapolation of a line connecting the fraction of very highs + supers at 50,000 miles (i.e., 5*0.03205, or 16.0%) and the point established from the >50,000 1984-86 group (i.e., 0.300 at 68,464 miles). This resulted in a >50,000 growth rate of 0.07568/10,000 miles.

3. ALTERNATIVE METHODS FOR USING IM240 DATA TO DEVELOP BASIC EMISSION RATES

In developing alternatives to the MOBILE5 methodology for using IM240 data to generate basic emission rate equations, the following approach was used. First, members of an ad hoc Emission Inventory Review Group (EIRG)* were asked to provide their thoughts on the strengths and weaknesses of the MOBILE5 methodology. In addition, they were asked to suggest alternatives to that methodology.
Their responses were then used to formulate an informal survey in which the MOBILE5 methods and proposed alternatives were ranked. The results of that survey helped focus the development of the recommended alternatives presented in this section.

The discussion below first presents a brief summary of the survey responses. That is followed by a description of each of the adjustments and calculations performed in the MOBILE5 approach, with a summary of the concerns and limitations expressed by the EIRG. Recommended alternatives conclude each discussion point. This discussion is structured in two parts: (1) the IM240-to-FTP conversion procedure, and (2) inputs to the TECH5 model.

Although the Scope of Work for this project called for the development of a single methodology for estimating MOBILE basic emission rates from IM240 data, in some cases it is impossible to recommend a single method without first reviewing the results of several alternatives. Thus, some portions of the following discussion contain more than one recommended approach.

Survey Summary

As described above, a questionnaire was circulated to members of the EIRG which summarized the methods used to develop basic emission rate equations for MOBILE5 and asked for a listing of strengths, weaknesses, and alternatives to each specific adjustment that was performed. (For a summary of the methods used for MOBILE5, refer to Section 2 of this report.) The responses to that questionnaire, which are summarized in Appendix A, helped form the basis of a survey that was distributed to the EIRG. In that survey, participants were asked to rank the importance of specific data adjustments and alternative methods that could be used in the development of basic emission rate equations from IM240 data.
The purpose of the survey was to provide a more objective ranking of the importance of adjustments and alternative methods, which would then help focus efforts to expand on some of the EIRG's recommendations.

* The EIRG was made up of individuals responsible for emission factor development from EPA's Office of Mobile Sources, the California Air Resources Board's Mobile Source Division, EEA, AIR, and Sierra.

Although surveys were not filled out by the entire EIRG, responses were received from six participants. A summary of those responses, with an average score for each question/recommendation, is contained in Table 3-1. In general, the results indicate that alternatives to the methods used to develop MOBILE5 basic emission rate equations from the Hammond IM240 data are preferred.

IM240-to-FTP Conversion Procedure

The first step in the development of basic emission rates for MOBILE5 was to convert the lane IM240 data collected in Hammond to an FTP basis. That process consisted of several steps, including adjustments for a non-representative mix of foreign and domestic vehicles, corrections for suspicious or missing mileages, and corrections to get the lane IM240 data (collected with tank fuel) on a laboratory/Indolene basis (these are thought of as temperature and fuel corrections). This last step was necessary because the IM240-to-FTP correlations were based on laboratory IM240 and FTP tests with a standard fuel (Indolene) at a standard temperature. Finally, correlation equations were applied to the lane data to generate simulated FTP scores for the entire lane IM240 database.

Below is a brief description of the methods used in the IM240-to-FTP conversion process for MOBILE5. Following the description of each adjustment/method is a summary of the concerns expressed by the EIRG and recommended alternatives.
Database Adjustments - Foreign Manufacturers - Because the vehicles tested in Hammond did not accurately reflect the national average fraction of foreign vehicles, each foreign vehicle in the database was counted two to four times.

Limitations and Concerns with Current Approach - The EIRG generally agreed that sampling biases should be accounted for if it can be established that durability differences are significant. The foreign/domestic split is only one possible bias, and it is possible that durability differences among engine families or among manufacturers are equally important.

Recommended Alternative - No consensus was reached among the survey respondents on how to proceed with a sample selection bias correction. However, it is clear that the first step is to determine if durability differences are significant. That can be done a number of different ways. For example, Table 3-2 presents the distribution of emitter categories (i.e., normals, highs, very highs, and supers) with and without the foreign vehicle adjustment used in MOBILE5. The table indicates that the impact of not making this adjustment is most pronounced for carbureted vehicles, with only slight changes occurring in the distribution of emitter categories for the fuel-injected technologies.

Table 3-1
Summary of IM240/Basic Emission Rate Survey Scores

Adjustment/Methodology                                                  Average Score

Database Adjustments - Weighting of Foreign Vehicles
  Is this adjustment needed?                                            3.2
  If so, what is the best approach?
    Current method                                                      1.8
    Manufacturer/engine family basis                                    3.8
    Foreign/domestic with more emphasis on tech type                    3.8

Database Adjustments - Missing or Suspicious Mileage
  Is this adjustment needed?                                            4.5
  If so, what is the best approach?
    Current method                                                      1.8
    Change to an age-based analysis                                     3.4
    Assign sample average mileage                                       2.8
    Assign average mileage based on vehicle age                         3.8

Database Adjustments - Seasonal Outliers
  Is this adjustment needed?                                            3.6
  If so, what is the best approach?
    Current method                                                      2.3
    Limit use of data to FTP temperature ranges                         3.5
    Establish a temperature range for each RVP "season"                 4.0
    Temperature correct the outliers                                    2.5
    Determine if purge is really higher during those periods            3.4
    Only reject data on basis of statistical/engr analyses              3.4

Fuel/Temperature Adjustments
  Is this adjustment needed?                                            3.6
  If so, what is the best approach?
    Current method                                                      2.0
    Choose records that are similar to FTP conditions                   3.0
    Multivariate analysis of differences                                3.6
    Quantify temperature effects independently of fuel                  3.8
    Statistical analysis of all external variables                      3.6
    Correlate FTP directly with lane IM240s                             2.4
    MOBILE temp/RVP factors to adjust lane IM240s                       2.4
    Split data into temperature regimes                                 2.8

IM240-to-FTP Correlations
  What is the best approach for correlating IM240 with FTP?
    Current method                                                      1.6
    Multiple correlations by emitter group                              3.6
    Explore different equational forms, choose best stats               3.8
    Base choice of equational form on most random variance              3.0
    Regress IM240 against individual FTP bags                           4.0
    Eliminate cold-start offset; use IM240 for FTP bags 2/3             4.5

Correlation Adjustments - Cold-Start Offset
  What is the best way of determining a cold-start offset?
    Current method                                                      1.2
    Regress bag 1 versus IM240 and use if statistically significant     3.2
    Use IM240 for non-start emissions; FTP for cold start               3.8
    Use IM240 for bag 2/3; bag 1 vs bag 2/3 from FTP data               3.7

Correlation Adjustments - Regression Residuals
  Should regression residuals be used in IM240/FTP regressions?         3.2
  If so, what is the best approach?
    Add in randomized residuals when applying correlation eqn           2.3
    Develop a probability distribution                                  3.3
    Use a log-normal or Weibull distribution of residuals               3.2

TECH5 Inputs - Emission Rates
  Should it be assumed that DRs are the same for different MYs?         2.2
  Is there any basis for using MOBILE4.1 rates in this analysis?        1.7
  Do the MY breakpoints adequately reflect developing versus
    mature technology?                                                  2.0

TECH5 Inputs - Growth Functions
  What is the best way to develop emitter growth functions?
    Current method                                                      1.2
    Add a third linear growth rate beyond 100,000 miles                 2.8
    Analyze the data in 10,000-mile bins                                4.7

Respondent Scores (listed in row order for each respondent; items a respondent did not score are omitted):
DJB: 3 3 3 5 4 3 4 3 2 4 3 4 3 1 2 2 3 4 3 3 3 2 2 4 3 4 3 3 4 5 2 2 4 3 4 3 4 3 5 4 4 2 4 4
RAR: 2 2 4 3 5 2 3 4 5 5 4 2 3 3 5 3 3 1 1 1 4 5 5 5 5 1 5 3 3 4 2 1 1 1 3 5
JL: 3 4 4 4 2 4 4 4 1 5 4 4 4 4 3 2 2 5
LSC: 3 1 4 4 5 1 4 1 2 3 2 3 3 2 4 4 3 2 3 4 3 4 3 3 3 2 4 4 2 4 3 2 3 3 3 2 2 2 2 1 1 3 1 3 4
ROD: 5 1 5 4 4 1 4 4 5 5 2 1 5 5 3 5 5 1 1 5 3 4 1 4 4 1 3 3 2 4 5 0 3 4 5 2 1 2 3 1 1 1 1 1 5
PLH: 3 2 3 3 5 2 2 2 5 4 2 4 5 2 4 2 5 2 4 3 4 4 3 2 2 1 3 4 3 3 5 1 3 5 4 3 3 5 4 1 1 1 1 3 5

Table 3-2
Effect of Multiple Counting of Foreign Vehicles on the Distribution of Emitter Categories by Technology Type for 1983 and Later Model Years

                                        Emitter Category
Technology         Total Data Points  Normal  High   V. High  Super
Multiple Foreigns*
  MPFI/CL          2208               0.788   0.077  0.131    0.004
  TBI/CL           1991               0.718   0.141  0.135    0.007
  CARB/CL          1654               0.540   0.149  0.303    0.008
  OPEN LOOP        252                0.214   0.210  0.567    0.008
Single Foreigns
  MPFI/CL          1742               0.776   0.082  0.138    0.005
  TBI/CL           1873               0.722   0.138  0.133    0.007
  CARB/CL          1344               0.503   0.158  0.331    0.008
  OPEN LOOP        196                0.189   0.194  0.607    0.010

* Represents data used for MOBILE5.

It is recommended that a similar analysis (or some type of analysis of variance) be performed on a manufacturer-specific basis to determine if durability differences exist. Based on the data presented in Table 3-2, it appears that this would be most important for carbureted vehicles. It would also be useful to determine if manufacturer-specific (or foreign/domestic) differences are more prevalent as a function of model year (e.g., do the early-1980s model year vehicles exhibit a greater emissions difference than late-1980s model year vehicles).
If those differences do exist, then the data should be weighted accordingly.

Database Adjustments - Missing or Suspicious Mileage - A number of vehicles in the Hammond database had '0' or missing mileage and were deleted. In addition, vehicles that were coded as having an odometer reading > 300,000 miles were deleted.

Limitations and Concerns with Current Approach - The primary concern with deleting these data points is that valid data are removed. This may be a particular problem for high-mileage vehicles, which are badly needed in the database.

Recommended Alternative - There was fairly strong sentiment that corrections for missing and suspicious mileage should be made. In terms of suspicious mileages, we recommend running an odometer "cleaning" routine on all vehicles to identify vehicles with unusually high or low mileage accumulation rates. That can be done by first estimating the age of the vehicle at the time it was tested based on the difference between the test date and the model year.* The age at the time of testing can be used to flag vehicles with mileage accumulation rates below 3,000 miles per year or above 30,000 miles per year for closer inspection. (Clearly, other mileage accumulation cutpoints could be used in this type of analysis.)

In the Hammond database, there were a significant number (about 10%) of vehicles with missing odometer readings. Thus, some method to estimate the mileage of those vehicles is recommended. The general consensus of the EIRG was to assign those vehicles the average mileage of the remaining vehicles in the database based on the age of the vehicle at the time it was tested.

A broader issue related to this adjustment is whether to develop emission factors based on vehicle age rather than accumulated mileage. Both accumulated mileage and vehicle age play a role in emissions deterioration and emitter-category growth functions.
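The odometer screening and mileage-fill steps recommended above could be sketched roughly as follows. This is only an illustration, not the routine used for MOBILE5: the record layout, function names, and the choice to average only the records that pass the screen within whole-year age groups are assumptions made here. The in-service date follows the report's model-year convention (October 1 through September 30, with April 1 of the model year as the assumed initial operating date).

```python
from datetime import date
from statistics import mean


def in_service_date(model_year, test_date):
    """Assumed first day of operation under the report's model-year convention."""
    my_start = date(model_year - 1, 10, 1)   # model year opens October 1 of the prior year
    my_end = date(model_year, 9, 30)         # and closes September 30
    if my_start <= test_date <= my_end:
        # Tested during its initial model year: midway between start and test date.
        return my_start + (test_date - my_start) / 2
    return date(model_year, 4, 1)            # midpoint of the model year


def age_years(model_year, test_date):
    return (test_date - in_service_date(model_year, test_date)).days / 365.25


def screen_odometers(records, low=3000.0, high=30000.0):
    """Return copies of the records with 'age' and 'status' fields added.

    status is 'missing' (no usable reading), 'suspect' (accumulation rate
    outside [low, high] miles per year), or 'ok'.
    """
    screened = []
    for rec in records:
        age = age_years(rec["model_year"], rec["test_date"])
        odo = rec.get("odometer")
        if not odo:                          # None and 0 both count as missing
            status = "missing"
        elif age <= 0 or not (low <= odo / age <= high):
            status = "suspect"
        else:
            status = "ok"
        screened.append({**rec, "age": age, "status": status})
    return screened


def fill_missing(screened):
    """Assign 'missing' records the mean odometer of 'ok' records of the same whole-year age."""
    by_age = {}
    for rec in screened:
        if rec["status"] == "ok":
            by_age.setdefault(round(rec["age"]), []).append(rec["odometer"])
    filled = []
    for rec in screened:
        key = round(rec["age"])
        if rec["status"] == "missing" and key in by_age:
            rec = {**rec, "odometer": mean(by_age[key]), "status": "estimated"}
        filled.append(rec)
    return filled
```

Records flagged as suspect are only marked, not deleted, so they remain available for the closer inspection the text calls for.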
(Emitter-category growth functions are discussed in detail below under "TECH5 Inputs.") To the extent that some deterioration of emission control systems is due to weathering effects, emitter-category growth would be best characterized by vehicle age. To the extent that deterioration is due to vehicle use, emitter-category growth would be best characterized by odometer reading. As currently structured, TECH5 and MOBILE5 use a fixed relationship between vehicle age and odometer so that only one of these variables can be used in determining the population of the various emitter categories. The average relationship between vehicle age and odometer reading shows that the average vehicle is driven fewer miles per year as it ages. Consequently, a nonlinear relationship between either of these variables and emitter-category population sizes represents the combined effects of both. As described below, this is the approach that is recommended for the development of the emitter-category growth functions.

If age-based versus odometer-based emission deterioration is still a concern, the analysis could be limited to vehicles that are within a certain fraction (or standard deviation) of the mean mileage for each vehicle age in the dataset. This is a reasonably easy adjustment to perform, but it has the disadvantage of eliminating real, valid data points.

* Model years are assumed to run from October 1 of the previous year to September 30 of the model year. The midpoint date of April 1 in the model year is assumed as the initial operating date of the vehicle. In cases where the vehicle is tested during its initial model year, it is assumed to have been placed in operation midway between the start of the model year and the test date.

Lane/Tank Fuel-to-Lab/Indolene Adjustments - Because it is desirable to develop the IM240-to-FTP correlation equations with vehicles operated in a lab on Indolene, a means to convert the lane/tank fuel IM240 scores to a lab/Indolene basis is needed. It is thought that this adjustment is primarily a fuel and temperature correction that accounts for the differences between the lane and the lab. For MOBILE5, this adjustment was developed from a subset of Hammond vehicles that were tested at the lane on tank fuel and in the lab on Indolene. Adjustment factors were developed by season (i.e., March-April, May-June, July-September, and October-February) and emitter category. In addition to the general fuel/temperature correction, data collected in March and April on 14 test dates when the ambient temperature was 25°F or more above the monthly average were deleted because many of those vehicles were statistical outliers. (Excessive purge was thought to be influencing the IM240 results.)

Limitations and Concerns with Current Approach - The EIRG's major complaint with the lane-to-lab adjustment utilized in the development of MOBILE5 emission factors is that it may not have accounted for all external variables affecting emissions (e.g., preconditioning effects). In terms of the deletion of data points collected on the aforementioned test dates, there was concern that true high emitters were deleted.

Recommended Alternative - Although a variety of alternatives were offered by the EIRG, none stood out as being vastly superior to the others. There was general agreement that temperature effects should be quantified separately from fuel effects, but the mechanism to do that is unclear given that fuel samples were not taken in the Hammond program. The easiest and most straightforward way is to consider only data that were collected within the FTP temperature range. However, this may vastly reduce the number of valid records. As a first cut on this adjustment, the number of tests performed outside of the FTP temperature range should be determined from the Hammond database. If too many tests are discarded, this approach would not be practicable.
(Thus, even if the Hammond IM240 database, which contains approximately 16,000 records, is severely diminished, the large volume of data available from operating IM240 programs could be used to fill the void.)

Another approach that has merit is to establish a different temperature range for each RVP "season" that would result in similar vapor generation rates. This would minimize the effect that excessive purge might have on IM240 emission rates. In addition, under hot stabilized operation (which, ideally, is the mode the vehicle is in during the IM240 test), the temperature impact on emissions is not significant. Establishing temperature ranges could be accomplished by analyzing fuel samples from a number of vehicles each week or month, and recording the test temperature for each vehicle and the diurnal temperature profile on each test date. For the Hammond database, it would be worthwhile to determine the availability of fuel volatility statistics for that area during the test program. This information is likely to be available for the summer (e.g., through RVP compliance testing), but the winter months may pose difficulties. A survey of refiners supplying fuel to that area may provide winter fuel specifications.

To serve as a check on possible preconditioning or excessive purge problems, a comparison of the IM240 bag 1 and bag 2 scores needs to be performed on the lane data. If the ratio (or difference) of bag 1 and bag 2 is outside of predetermined limits, that record should be discarded. A data set that may be useful to determine what those limits should be is the ASM/IM240 comparison test program conducted by EPA in Phoenix. In that program, roughly half of the vehicles were tested with the ASM first, while the other half were tested with the IM240 first. The IM240 scores that were collected immediately following the ASM test should represent a well-preconditioned subset of vehicles.
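The bag 1/bag 2 screen described above might look something like the sketch below. The ratio limits shown are placeholders chosen purely for illustration; the report leaves the actual limits to be determined (for example, from the Phoenix ASM/IM240 data). The record layout and function name are likewise assumptions.

```python
def split_by_bag_ratio(records, min_ratio=0.5, max_ratio=3.0):
    """Separate IM240 lane records by the ratio of bag 1 to bag 2 emissions.

    min_ratio and max_ratio are hypothetical limits, not values from the report.
    Records with a non-positive bag 2 score cannot be ratioed and are set aside.
    """
    kept, discarded = [], []
    for rec in records:
        bag1, bag2 = rec["bag1"], rec["bag2"]
        if bag2 > 0 and min_ratio <= bag1 / bag2 <= max_ratio:
            kept.append(rec)
        else:
            discarded.append(rec)
    return kept, discarded
```

Returning the discarded records alongside the kept ones makes it easy to check how much of the sample a given pair of limits would remove before committing to them.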
Correlation Equations - Once the Hammond lane IM240 data were adjusted to a lab/Indolene basis, correlation equations relating the IM240 to the FTP were applied to the data. The IM240-to-FTP correlations were based on a regression analysis of data collected from vehicles tested over the IM240 on Indolene and the FTP on Indolene. (The database used for the correlation analysis included vehicles from the Hammond program as well as vehicles tested in Ann Arbor.) The regressions were performed according to the following model year groups and technology types: 1981-1982, 1981+ open-loop, 1983+ carbureted/closed-loop, 1983+ throttle-body injection/closed-loop, and 1983+ multipoint fuel-injection/closed-loop. The HC and CO correlations were performed in log space with a cold-start offset ("X" in the equation below) that varied by technology, while the NOx correlations were based on a simple linear equation without a cold-start offset value:

$$\log_{10}(\mathrm{FTP}_{HC/CO} - X) = b + m \cdot \log_{10}(\mathrm{IM240}_{HC/CO}) + res$$

$$\mathrm{FTP}_{NOx} = b + m \cdot \mathrm{IM240}_{NOx} + res$$

For cases in which (FTP_HC/CO - X) < 0.01, the IM240 score was substituted for (FTP_HC/CO - X). In this way, errors resulting from taking the logarithm of a negative number were avoided. The "res" term in the equations above represents regression residuals from the correlation sample.

Limitations and Concerns with Current Approach - The EIRG expressed two primary concerns with the correlation method developed for MOBILE5. First, the use of the cold-start offset ("X" in the equation above) implies that the IM240 can predict vehicle emissions during cold operation. Since the IM240 is a hot test, it should be used only to estimate running emissions. Second, there was general discomfort with the use of the log-based equation.
In addition, it was not clear to some members of the EIRG that adding residuals was necessary for developing basic emission rate equations (particularly since these data were not used to generate the I/M identification rates used by TECH5 to develop the I/M credits matrices for MOBILE5).

Recommended Alternative - We recommend that IM240 data be used only to predict hot stabilized vehicle operation. That being the case, there remains a question of whether the IM240 should be correlated only with bag 2 of the FTP or with a combination of bag 2 and bag 3. Because a combination of bag 2 and bag 3 encompasses a broader range of vehicle operation (i.e., the speed during bag 2 never goes above 35 mph, with most operation below 30 mph), it is recommended that the correlation be performed between the IM240 and a "hot FTP" (i.e., bag 2 weighted 52.1% and bag 3 weighted 47.9%). Although bag 3 contains a start, the impact of that is minimal because the engine and emission control system do not cool off significantly during the 10-minute soak between bag 2 and bag 3.

The IM240/FTP regressions should be developed by exploring a number of different correlation equations and choosing the one(s) that gives the best agreement. (Different sets of data may have different equations giving the best agreement.) One possible form to consider is the use of separate regressions for the normal-emitting vehicles and the high-emitting vehicles. This can be done without the arbitrary selection of a break point by testing all possible data points as a break point. The final break point is selected as the one that provides the minimum error. In addition to exploring different functional forms for the regression equations, the technology groups chosen for MOBILE5 should be reevaluated.

Although it is unnecessary for the development of basic emission rates, it may be desired to add regression residuals to the correlation equations to obtain a more random distribution of predictions.
If this adjustment is made, it should be based on developing a distribution of residuals about the mean. It is likely that such a distribution would take the form of a log-normal distribution, with the longer "tail" being above the mean. For all cases in which residuals are applied, an evaluation of how the application of those residuals changes the overall distribution of emissions must be performed. If the mean emission levels are changed by adding in residuals, the method used to incorporate that effect must be reviewed and modified as necessary.

A final point related to the correlation analysis is how to account for cold start emissions. It is recommended that a cold start offset (i.e., bag 1 - bag 3) be developed from FTP data, with different factors being developed based on technology and emitter category. As discussed below (in the section on off-cycle emissions), it may be possible to determine cold start emissions from recent testing conducted by CARB on the LA92 cycle. The start component of the LA92 is much more representative of driving behavior during vehicle start-up than bag 1 of the FTP because it contains a wider range of in-use operation. Alternatively, a separate correlation between bag 1 and bag 2/bag 3 of the FTP could be developed using only FTP data. Since bag 2/bag 3 estimates are available through the IM240/FTP correlation outlined above, it would then be possible to determine bag 1 emissions. This approach has the advantage of being computationally simple, but it may not adequately describe emissions during vehicle start-up.

TECH5 Inputs

Once the IM240 data were converted to an FTP basis, the FTP-based results were used to develop inputs to the TECH5 model. There are two primary inputs to the TECH model that drive the computation of basic (i.e., non-I/M) emission rates for use in MOBILE: technology-specific emitter-category emission rates, and technology-specific emitter-category growth functions.
In TECH5 (which was used to develop basic emission rates for MOBILE5), four technology groups were used for 1981 and later model year vehicles: open-loop, carbureted/closed-loop, throttle-body fuel-injection (TBI)/closed-loop, and multipoint fuel-injection (MPFI)/closed-loop. The emitter categories used in TECH5 were defined as follows:

  Normal HC/CO - HC ≤ 0.82 g/mi and CO ≤ 10.2 g/mi,
  High HC/CO - HC > 0.82 g/mi or CO > 10.2 g/mi,
  Very High HC/CO - HC > 1.64 g/mi or CO > 13.6 g/mi, and
  Super HC/CO - HC > 10.0 g/mi or CO > 150.0 g/mi.

NOx emissions were analyzed separately from HC and CO, with only two emitter categories being defined: normals (≤ 2.0 g/mi) and highs (> 2.0 g/mi).

Emitter-Category Emission Rates - The emitter-category emission rates developed for TECH5 were based on a mix-and-match methodology that appeared somewhat arbitrary. Thus, most recommendations of the EIRG were directed at a more consistent approach to developing emitter-category emission rates.

Limitations and Concerns with Current Approach - The EIRG expressed concern about the use of MOBILE4.1 zero-mile levels for the 1981-1982 normal emitters in the development of emitter-category emission rates for MOBILE5. EPA indicated that this was done because using only the Hammond data for this group resulted in a zero-mile level that was above the emission standard. This occurred because there were few low-mileage 1981-1982 model year vehicles in the Hammond database, and the regression was being driven by vehicles with well over 50,000 miles. Another concern expressed by the EIRG was that 1983+ deterioration rates were used for the 1981-1982 model year group; it is not clear that this adequately reflects the difference between evolving (1981-1982) and mature (1983+) technologies, particularly for fuel-injected vehicles.
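For reference, the emitter-category definitions listed above can be written as a small classification routine. Because the High, Very High, and Super thresholds overlap, the sketch below assumes a vehicle is assigned to the most severe category whose threshold it exceeds; the function names are illustrative only.

```python
def hc_co_category(hc, co):
    """Classify a vehicle by FTP-basis HC and CO emissions in g/mi.

    Thresholds are checked from most to least severe so that each vehicle
    lands in the highest category it qualifies for (an assumption here).
    """
    if hc > 10.0 or co > 150.0:
        return "super"
    if hc > 1.64 or co > 13.6:
        return "very high"
    if hc > 0.82 or co > 10.2:
        return "high"
    return "normal"


def nox_category(nox):
    """NOx is treated separately, with only two categories."""
    return "high" if nox > 2.0 else "normal"
```

A vehicle exactly at a cutpoint (for example, HC = 0.82 g/mi) stays in the lower category, consistent with the ≤ / > form of the definitions.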
Recommended Alternative - As a first cut, the choice of emitter category cutpoints needs to be re-evaluated based on a statistical analysis of the data (e.g., through a "cluster" analysis) rather than multiples of the emission standards. (It is our understanding that this is being done under another work assignment; thus, alternative cutpoints for emitter categories were not investigated in this effort.) This re-evaluation should also be extended to the model-year and technology groups used for analysis.

Second, once the emitter categories and technology groups are chosen, emission rates should be determined independently for each - there is no reason to force deterioration rates of early 1980s model year vehicles to be the same as early 1990s model year vehicles. For cases in which data are sparse (e.g., low-mileage 1981-1982 model-year normals), it may be possible to bolster the data set with FTP data collected at the Ann Arbor laboratory. If the emitter-category cutpoints are chosen properly, a normal emitter's characteristic emission rate should be somewhat independent of the type of I/M program under which the vehicle was operating (i.e., an I/M program changes the distribution of vehicles among emitter categories but not necessarily the emission rate of those categories - this is the basic assumption used in California's emission factor and I/M benefits model, CALIMFAC).
Emitter-Category Growth Functions - In MOBILE5, the development of emitter-category growth functions relied on a very simplistic approach in which the fraction of non-normals as a function of mileage was determined by drawing a line through three points - from the origin through a point representing the non-normal fraction of 1987+ model year vehicles at that group's average mileage (for the < 50,000-mile growth rate), and from the point where that line crossed the 50,000-mile mark through a point representing the non-normal fraction of 1981-1986 model year vehicles at that group's average mileage (for the > 50,000-mile growth rate).

Limitations and Concerns with Current Approach - The EIRG expressed concern that the approach used in MOBILE5 is too sensitive to the post-50,000-mile sample distribution, and that the 50,000-mile break point is essentially arbitrary. Figure 3-1 compares the TECH5 very high+super emitter fractions versus the data collected in the Hammond program for MPFI vehicles. As the figure shows, the growth in the very high+super emitters is nonlinear and the TECH5 method tends to inflate emissions at higher mileages.

Figure 3-1
Comparison of Very High+Super Emitter Fractions - TECH5 vs. Hammond Data for MPFI/CL Vehicles
[Figure not reproduced. Horizontal axis: Odometer (10,000 miles). Emitter fractions based on Hammond data. NOTE: Numbers in parentheses indicate sample size.]

Recommended Alternative - From the survey responses received from the EIRG, there is very strong sentiment that the emitter-category growth functions should be developed from a statistical analysis of data broken up into 10,000-mile bins. It appears that there are sufficient data in the Hammond sample to do this with reasonable certainty up to 100,000 miles. However, the data beyond 100,000 miles should probably be segregated into 25,000-mile bins.
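A minimal sketch of the binning just described is shown below: 10,000-mile bins up to 100,000 miles and 25,000-mile bins beyond. The input format (odometer, emitter-category pairs) and the function names are assumptions for illustration. The vehicle count in each bin is kept alongside the non-normal fraction so that it is available for any sample-size weighting in a later fit.

```python
from collections import defaultdict


def mileage_bin(odometer):
    """Return the (lower, upper) mileage bounds of the bin containing odometer."""
    if odometer < 100_000:
        lo = int(odometer // 10_000) * 10_000
        return lo, lo + 10_000
    lo = 100_000 + int((odometer - 100_000) // 25_000) * 25_000
    return lo, lo + 25_000


def non_normal_fraction_by_bin(vehicles):
    """vehicles: iterable of (odometer, category) pairs.

    Returns {bin: (vehicle_count, non_normal_fraction)} ordered by mileage.
    """
    counts = defaultdict(lambda: [0, 0])     # bin -> [total, non-normal]
    for odometer, category in vehicles:
        tally = counts[mileage_bin(odometer)]
        tally[0] += 1
        if category != "normal":
            tally[1] += 1
    return {b: (n, k / n) for b, (n, k) in sorted(counts.items())}
```

The same tally could be kept per emitter category rather than for all non-normals combined if category-specific fractions are wanted.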
Any number of analytical approaches can be used to develop emitter-category growth functions. One method is to account for the possibility that the emitter-category growth functions could be a linear relation (as a function of mileage) or a nonlinear relation with either increasing or decreasing slope. This can be modeled with the following regression equation:

$$p_i = A + B\,(\mathrm{mile}) + C\,(\mathrm{mile})^{2} + D\,(\mathrm{mile})^{1/2}$$

where $p_i$ represents the fraction of emitter group i as a function of vehicle mileage; A, B, C, and D are regression constants; and mile represents vehicle mileage. The emitter-category growth functions should be developed using a weighted analysis where the weight for each mileage bin represents the total number of vehicles in that bin.

The SAS regression procedure REG, using the "adjusted R-squared" method, can be used with the above equation to determine the regime growth functions. This method computes regression results for all possible combinations of variables. Seven different regression equations are possible with this approach: linear term only (C = D = 0), quadratic term only (B = D = 0), square-root term only (B = C = 0), linear and quadratic terms (D = 0), linear and square-root terms (C = 0), quadratic and square-root terms (B = 0), and all terms.

An alternative to the above could be the development of a step-wise linear fit, which is similar to what was done for MOBILE5. However, such an analysis should not be constrained to a predetermined flex point (i.e., 50,000 miles), nor should it be constrained to only two lines.

Individual equations would be selected based on their regression statistics; however, since the sum of emitter-group fractions must equal 100%, the process and constraints outlined below must be followed.

1. Compute emitter-category populations using the equations selected for each emitter group.
2. Set any individual population fraction greater than 100% to 100%.
3. Set any negative individual population fraction to zero.
4. Normalize the population fractions resulting from steps 1 and 2 to a sum of 100%.

The adjustment process outlined above would be done prior to comparing the agreement of the regression results with the input data. The set of equations that provided the best match to the entire set of data would then be selected.

Treatment of Light-Duty Trucks

Light-duty-truck basic emission rate equations for EPA's MOBILE and CARB's EMFAC models have historically been based primarily on passenger car data. That is because there have not been enough emissions data collected on light-duty trucks to support an independent analysis. In the past, light-duty-truck emission rates have been determined by first evaluating passenger car emission rates by technology type (e.g., carbureted versus fuel-injected) and then calculating the light-duty-truck, model-year-specific emission rates by weighting the technology-specific passenger car rates by the expected technology mix for light-duty trucks. Since the introduction of new technology on light-duty trucks has generally lagged passenger cars by a few years (at least for pre-1990 model year vehicles), the basic emission rates for light-duty trucks were higher than for passenger cars. In addition, adjustments were also applied to account for the fact that light-duty trucks are certified to less stringent numerical emission standards. This latter adjustment is typically performed by applying a ratio of emission standards to the passenger car basic emission rate zero-mile level, while the deterioration rate is left unchanged.

There is concern that the above approach may understate emissions from light-duty trucks because deterioration rates (or, more properly, the growth rate of high-emitting vehicles) are based on passenger cars, which are generally subjected to a less severe duty cycle than light-duty trucks.
For that reason, we recommend that light-duty-truck emission rates be determined independently from passenger cars, using IM240 data collected from light-duty trucks. The data to do this will be available within the next few years as IM240 programs are implemented in various communities. It is unclear that the IM240-to-FTP conversion procedure would need to be tailored specifically for light-duty trucks, but certainly the emitter-category emission rates and growth functions (input to the TECH5 model) developed from the simulated FTP scores should be based on light-duty-truck data.

As a short-term alternative, IM240 data from Arizona, Colorado, and Maine should be reviewed to determine if there is a significant difference in emissions deterioration between cars and light trucks. If there is, then scaling factors, which are a function of vehicle mileage, could be developed and applied to the light-duty-truck basic emission rates (if they have been based on passenger car data).

4. ADJUSTING I/M DATA TO A NON-I/M BASIS

In the future, there are likely to be considerable IM240 data made available for emission factor development. Obviously, one source of those data is I/M programs that are using the IM240 procedure. However, EPA's I/M rule requires a program effectiveness evaluation that includes IM240 testing on a minimum of 0.1% of the vehicle fleet. Thus, programs not running the IM240 as part of their standard test protocol will be required to collect IM240 data on at least some vehicles.

One shortfall related to the use of state-generated IM240 data is that the data are from a fleet of vehicles subject to I/M,* while the basic emission rates used in the MOBILE model reflect a non-I/M condition. (I/M benefits are determined in MOBILE based on I/M test type, test frequency, compliance rate, waiver rate, etc.)
Thus, if the state-generated IM240 data are to be used in future versions of MOBILE, a method to adjust the I/M data to a non-I/M basis is needed.

* For areas that are implementing I/M programs for the first time in response to the I/M rule, the data collected in the first "cycle" could be considered non-I/M data. However, the majority of areas implementing enhanced I/M programs already have some kind of I/M program in place.

This section presents alternative views on how to adjust IM240 data collected in an I/M area to a non-I/M basis. Five different methods are discussed, with recommendations for both short-term and long-term approaches.

Use of Only Those Data Collected in Non-I/M Areas - One option for developing non-I/M basic emission rate equations is to continue using IM240 data collected in non-I/M areas, or in areas that are in their first I/M cycle. Non-I/M area testing could be accomplished through the use of a portable dynamometer in conjunction with a random pullover program. Alternatively, IM240 testing with a portable dynamometer could be linked with annual safety inspections for areas that have those inspections but do not have an I/M program. First-cycle IM240 data should be available in a few areas of the country that will be starting up I/M programs for the first time in the next year or two. One source of first-cycle IM240 data recently collected is Maine, which ran IM240 tests from July 1994 to the fall of that year.

Although this is the preferred method of developing non-I/M emission rates, it is not a workable long-term solution. The cost of operating a roving IM240 data collection program would likely be prohibitively expensive, and the availability of first-cycle I/M data will diminish in future years.

Use of Remote Sensing to Develop Emitter-Category Distributions - Although there is considerable support for remote sensing and the idea of determining in-use emission rates from RSD measurements is conceptually appealing, there remain serious obstacles to the use of this technology. In theory, RSD readings collected in a non-I/M area could be compared to RSD readings collected in an I/M area (i.e., the area in which IM240 data are collected), and the distribution of vehicles among emitter categories (based on RSD readings) in the I/M area could be tuned to match the non-I/M area distribution. In practice, there is so much variability in RSD measurements (e.g., from siting differences, equipment differences, driver behavior, etc.) that discerning a 20% to 30% difference in emissions as a result of an I/M program would be unlikely. For the reasons stated above, adjusting I/M data to a non-I/M basis using remote sensing measurements is not likely to provide an acceptable degree of certainty.

Development of a Statistical Model Similar to TECH5 or CALIMFAC - One approach to developing non-I/M IM240 scores from data collected in an I/M area is to develop a statistical model that accounts for all of the parameters considered in the current I/M models, i.e., essentially run TECH5 or CALIMFAC "backwards." In this approach, all of the constants in the emitter-category growth functions, emission rates, I/M inspection and repair effectiveness, etc. would be viewed as parameters that could be varied (within set bounds) to obtain the best possible agreement between predicted and measured emissions for the entire fleet.

The optimization process would continuously compare the TECH predictions with the actual measured emissions for the set of data vehicles. The parameters in the model would be adjusted, based on the error in this comparison, using the algorithms of the optimization procedure. A new comparison would be made between measured emissions and those predicted using the new set of model parameters. This process would be repeated until a satisfactory agreement between predictions and data was obtained.
The optimization process for a new version of TECH would be initiated by using the parameters from the previous version.

There are many different approaches to optimization in this type of situation. The problem would be non-linear because emissions are the product of emitter-category populations and emissions: the emission rate parameters and the emitter-category population parameters would interact in multiplicative terms, resulting in non-linearity. Thus, linear programming (which is generally convergent) could not be used, and a non-linear optimization technique would be necessary.

Clearly, one of the drawbacks to this approach is that it would take a fairly significant effort to develop and maintain such a model. Thus, it is unclear that this approach could be used in the short term, and long-term usage would depend on EPA's commitment to support such a model.

Short-Term Recommendation - Continued Use of Non-I/M Data - In the short term (i.e., the next year or two), the most reasonable approach to developing non-I/M emission rates is the continued use of IM240 data collected in the first cycle of I/M programs. This is most important for the development of emitter-category growth functions, which really drive overall emission deterioration rates. For emitter-category emission rates, the differentiation between data collected in non-I/M versus I/M programs is less important. In fact, mixing the non-I/M and I/M data from Hammond would bolster the database used for MOBILE5. To serve as a check on the growth functions derived from the Hammond non-I/M data, IM240 data from the Maine program (or other first-cycle programs) could be used.

Long-Term Recommendation - Analysis of Repair Cycle Data - Although preferred, the continued use of IM240 data collected in non-I/M areas is probably not a valid long-term option for developing non-I/M emission factors. Because of that, a means to account for I/M effects is needed.
An alternative to RSD data analysis or a large statistical program is to simply calculate the repair cycle benefits observed in an operating IM240 program, and assume that emission deterioration between repair cycles is equal to the emission deterioration in the absence of an I/M program. This approach is illustrated in Figure 4-1, which shows the classical sawtooth pattern associated with I/M test and repair.

The data used to perform this analysis should be available as part of the I/M program data collection responsibilities outlined in the I/M rule. Each I/M test record must contain the vehicle identification number and the category of test performed (i.e., initial test, first retest, etc.). Thus, it will be possible to determine the before-repair state of vehicles in each repair cycle (the top points of the sawtooth illustrated in Figure 4-1) and the after-repair state of vehicles (the bottom points of the sawtooth in Figure 4-1). Assuming that the deterioration observed from one cycle to the next (or one age to the next) is relatively independent of the I/M program, adding the repair benefit from the previous cycle (e.g., "A" in Figure 4-1) to the before-repair point of the current cycle (i.e., the top of line "B" in Figure 4-1) would give the non-I/M emission rate.

Figure 4-1
Effect of an I/M Program on Emissions as a Function of Repair Cycle
[Figure not reproduced. Horizontal axis: Vehicle Age/Odometer.]

To describe the concept, the above discussion focuses on average emission rates; however, it would also be possible to use this approach for developing emitter-category growth functions, or to superimpose an emitter-category distribution associated with each of the A through E offsets on the next cycle or vehicle age. This approach offers the advantage of being conceptually simple, and the information needed to perform the calculations should be available with the IM240 emissions data collected by states operating an IM240 test program.
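The sawtooth calculation described above can be illustrated with a short sketch. It works on average before-repair and after-repair emission rates per cycle and adds the repair benefits of earlier cycles to the current cycle's before-repair point. Accumulating the benefits over all preceding cycles (A, then A+B, and so on) is an interpretation adopted here for cycles beyond the second; the text itself states the rule only for the immediately preceding cycle. The function name and input layout are hypothetical.

```python
def non_im_rates(before_repair, after_repair):
    """Estimate non-I/M emission rates from I/M repair-cycle averages.

    before_repair[k] and after_repair[k] are the average emission rates at the
    top and bottom of the sawtooth in cycle k (same units, same length).
    Assumes between-cycle deterioration is unaffected by the I/M program and
    that repair benefits from all earlier cycles accumulate.
    """
    estimates = []
    cumulative_benefit = 0.0
    for before, after in zip(before_repair, after_repair):
        # Non-I/M estimate: what the fleet would emit had earlier repairs not occurred.
        estimates.append(before + cumulative_benefit)
        # The benefit of this cycle's repairs carries forward to later cycles.
        cumulative_benefit += before - after
    return estimates
```

In the first cycle the estimate is simply the before-repair rate, which is consistent with treating first-cycle data as non-I/M data.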
A key assumption in the approach outlined above is that emissions deterioration (or, similarly, the growth rate of high-emitting vehicles) between one I/M cycle and the next is independent of the I/M program. This is the same assumption used in the current version of CARB's CALIMFAC model (Reference 3), and was based on an analysis of CARB's First I/M Evaluation Program "recapture" vehicles. These vehicles were tested and repaired during the program, then were returned to the test laboratory after approximately six months in customer service. This analysis showed that, with the exception of pre-1975 model year vehicles, post-repair emissions deteriorate at essentially the same rate as pre-repair emissions in the tested vehicles.

To further validate the above assumption, an analysis of CARB's Second I/M Evaluation Program should be performed. (This program is commonly referred to as the "1,100-Car Study.") In Phase 1 of that project (conducted from January 1991 to March 1992), approximately 1,100 vehicles that initially failed an I/M test received an FTP before and after repair. Phase 2 of the project involved FTP testing of recaptured vehicles after one year, while Phase 3 of the project involved FTP testing of vehicles after two years (prior to their next regularly scheduled biennial inspection). Approximately 750 vehicles were tested in Phase 2, and 500 were tested in Phase 3. This database represents a fairly robust sample of between-inspection tests, but it has never been thoroughly analyzed for this purpose.

It may also be possible to use the approach described above in conjunction with the IM240 data collected as part of each state's I/M program evaluation requirements (i.e., the 0.1% testing requirement in §51.353 of the rule) to develop non-I/M emission rates from data collected in an I/M area.
However, these IM240 data are supposed to be collected at the time of initial inspection, so after-repair data would likely not be available for those vehicles, i.e., only the top points in the "sawtooth" illustrated in Figure 4-1 would be available for analysis. The bottom points of the sawtooth could be estimated based on additional data analysis and reporting required in the I/M rule. Section 51.353 also requires states to perform a program evaluation that includes an assessment of the effectiveness of repairs performed on vehicles that failed the tailpipe emission test. Depending upon the level of detail included in that assessment, it may be possible to use that evaluation to estimate the repair benefit illustrated by the letters A through E in Figure 4-1.

5. INCORPORATING "OFF-CYCLE" EMISSIONS INTO MOBILE

In the past four years, there has been an extensive effort on the part of EPA and CARB to better understand in-use driving behavior. That effort has led to the development of alternative drive cycles that include higher speeds and acceleration rates than are included in the FTP. It is generally recognized that vehicle operation under these more severe conditions results in higher emissions than occur using the FTP. Because the MOBILE model is based on emission data collected over the FTP, EPA has requested an evaluation of methods that could be used to incorporate off-cycle emissions in the next version of the MOBILE model.

A significant limitation to developing a method to account for off-cycle emissions is the lack of data that have thus far been collected over alternative driving cycles. To date, there have been two primary test programs that have collected emissions data over alternative cycles: (1) EPA and industry testing to support the supplemental FTP rulemaking, and (2) CARB "Unified Cycle" (LA92) testing to support inventory development.
Presented below is a brief description of these programs and our recommendations for incorporating off-cycle emissions into the MOBILE model.

Supplemental FTP Rulemaking - In February of this year, EPA published a Notice of Proposed Rulemaking (NPRM) recommending revisions to the FTP. That rule would require vehicle manufacturers to conduct a Supplemental Federal Test Procedure (SFTP) which includes three new driving cycles (or "bags") to control emissions during air conditioning usage, intermediate soak times and vehicle start-up, and aggressive driving. Only the effects of vehicle start-up and aggressive (or off-cycle) driving are being considered in this Work Assignment.

Under contract to EPA and CARB, Sierra has developed a number of different driving cycles from instrumented vehicle data collected in Baltimore and chase car data collected in Los Angeles. Those cycles include: a start cycle ("ST01") that is representative of the first four minutes of vehicle operation; an aggressive driving cycle ("REP05") that reflects speeds and accelerations not covered by the LA4 cycle; and a remnant ("REM01") cycle, which is intended to represent the balance of in-use driving not already covered by the ST01 and REP05. In addition, two "composite" cycles have been developed that capture the range of speed and acceleration events observed in the drive cycle databases - the EPA Composite cycle (based on Baltimore and Los Angeles data) and the LA92 cycle (based on only Los Angeles data). Ideally, proper weighting of the ST01, REP05, and REM01 cycles would result in equivalency with the EPA Composite cycle.

During the development of the SFTP rulemaking, EPA tested eight well-maintained 1991-1993 model year vehicles over the FTP, ST01, REP05, and REM01 cycles. These vehicles were also tested on two driving cycles that represented extreme acceleration and speed profiles.
One of those cycles was developed by EPA/industry ("HL07"), and one was developed by CARB ("ARB02"). By weighting the ST01, REP05, and REM01 cycles according to the fraction of VMT represented by these cycles, EPA found that emissions increased by 0.04 g/mi NMHC, 2.8 g/mi CO, and 0.08 g/mi NOx relative to the hot FTP results. (The average hot FTP emission rate for these vehicles was 0.04 g/mi HC, 1.6 g/mi CO, and 0.19 g/mi NOx.)

Following the completion of EPA's testing, auto manufacturers sponsored an emission test program. That effort consisted of 26 late-model vehicles that were tested on the FTP, REP05, HL07, and ARB02. Little testing was conducted on the REM01 since the focus of the EPA/industry effort was on developing a control cycle (i.e., certification), and the REM01 cycle was thought of as an inventory cycle.

Based on the results of the above test programs, a high-speed/load transient control cycle was developed (termed "US06"), which is a 600-second test composed of segments of the REP05 and the ARB02 cycles. It should be noted that this cycle was developed with the intent of controlling emissions from aggressive driving and transient operation. It was not developed for the purpose of evaluating in-use emissions.

CARB Unified Cycle Test Program - In the summer of 1992, Sierra recorded in-use speed-time profiles of randomly selected vehicles that were followed by a chase car. During this chase car study, which was sponsored by CARB, data were collected over a mix of road routes designed to represent all travel occurring in the Los Angeles area. These data were then used to develop a "composite" driving cycle (the LA92 cycle) designed to match the overall speed-acceleration distribution observed in the Los Angeles data set. To date, CARB has performed FTP and LA92* emission tests on roughly 250 vehicles during two separate test programs.
As part of CARB's 12th In-Use Surveillance Program, 170 1983 and later model year vehicles were tested over the LA92 cycle. In addition, CARB conducted a special test program that ran from late 1993 to mid-1994 in which 80 1971 and later model year vehicles were tested on the LA92 and the FTP. Clearly, the CARB testing offers a much larger and more representative database from which to develop off-cycle corrections than does EPA's SFTP program.

In addition to the data already collected, CARB is planning in the next year to test 75 vehicles over the FTP, the LA92, and eight different speed cycles developed from the Los Angeles chase car data. CARB also has plans to test approximately 250 vehicles over the FTP and LA92 cycles in its next in-use surveillance project. Many of these data are likely to be available prior to the next major release of the MOBILE model.

* CARB performs the LA92 emission test in a manner similar to the FTP. The test begins with a cold start, and emissions from the first 300 seconds of the cycle are collected in bag 1. Emissions from the remainder of the LA92 cycle are collected in bag 2. The vehicle is then allowed to soak with the engine off for 10 minutes, and the first 300 seconds of the LA92 are re-run, comprising bag 3 of the test. CARB computes a composite LA92 emission rate by assuming 43% of starts are cold starts and 57% of starts are hot starts. This is the same approach used to compute a weighted FTP score. However, because bags 1 and 3 of the LA92 test are much shorter than bags 1 and 3 of the FTP (1.2 versus 3.6 miles) and bag 2 of the LA92 is longer than bag 2 of the FTP (8.6 versus 3.9 miles), the factors used to weight each bag's g/mi emission rate are much different for the FTP and the LA92.
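The difference in bag weighting between the FTP and the LA92 follows directly from the bag distances and the 43%/57% cold/hot start split quoted above. The sketch below is a hedged illustration: the function names are hypothetical, and the weighting form (mass-weighted by start fraction and bag distance) is an assumption chosen because it reproduces the customary FTP weighting; it is not quoted from CARB's procedure.

```python
def bag_weights(bag_miles, cold_fraction=0.43):
    """Return the weights applied to each bag's g/mi emission rate.

    bag_miles: (bag1, bag2, bag3) distances in miles.
    Bag 1 (cold start) counts for cold_fraction of starts, bag 3 (hot
    start) for the remainder; bag 2 follows every start.
    """
    d1, d2, d3 = bag_miles
    hot_fraction = 1.0 - cold_fraction
    weighted_miles = (cold_fraction * d1, d2, hot_fraction * d3)
    total = sum(weighted_miles)
    return tuple(w / total for w in weighted_miles)


def composite_rate(bag_rates, bag_miles, cold_fraction=0.43):
    """Composite g/mi rate from per-bag g/mi rates."""
    weights = bag_weights(bag_miles, cold_fraction)
    return sum(w * r for w, r in zip(weights, bag_rates))


FTP_MILES = (3.6, 3.9, 3.6)    # bag distances cited in the text
LA92_MILES = (1.2, 8.6, 1.2)

for name, miles in (("FTP", FTP_MILES), ("LA92", LA92_MILES)):
    w1, w2, w3 = bag_weights(miles)
    print(f"{name}: bag 1 = {w1:.3f}, bag 2 = {w2:.3f}, bag 3 = {w3:.3f}")
```

Under these assumptions, bag 2 carries roughly half of the FTP composite but close to 90% of the LA92 composite, which is why the same bag-level results can produce quite different composite scores on the two cycles.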
Recommended Approach - Although it may be tempting to rely on the data collected as part of the SFTP development process to make adjustments to MOBILE for off-cycle effects, there are a number of problems associated with the use of those data. First, only eight vehicles have been tested over the full complement of cycles thought to capture start-up and off-cycle events. Although the industry data were more robust in terms of the number of vehicles tested, those vehicles were tested only on the FTP, REP05, HL07, and ARB02 cycles. Additionally, SFTP data collected in the future will likely be over the US06 cycle, which is a combination of the REP05 and ARB02 cycles and was not designed to represent in-use driving. The use of the US06 data for in-use emission estimates would require some kind of correlation or adjustment to get the data on a REP05-cycle basis. Any adjustment of that kind would introduce additional uncertainty into the results. Finally, and most importantly, simply weighting the ST01, REP05, and REM01 cycles does not best reflect the proper mix of speed and acceleration observed in the chase car and instrumented vehicle databases. That mix is better represented by one of the composite cycles developed by Sierra.

Because of the data deficiencies in the SFTP test program, it is much more appropriate to develop off-cycle corrections from the LA92 emissions data collected by CARB for the purposes of estimating in-use emissions. Since the LA92 cycle matches the acceleration/speed profiles from all in-use vehicle operation (at least for Los Angeles), a ratio of the LA92 results to the corresponding FTP results will provide a good indication of the emissions increase associated with off-cycle events.
Although it could be argued that the use of a data set developed with the EPA Composite cycle (which incorporated in-use driving patterns in Baltimore and Los Angeles) would be more appropriate, sufficient data are not available to characterize vehicle operation and emissions over this cycle.

In terms of developing an off-cycle correction factor, the CARB data would allow a reasonable accounting for possible emission differences by technology. The 1993-94 special test program included 12 pre-1975 model year vehicles, 31 1975 to 1980 model year vehicles, and 37 1981 to 1992 model year vehicles. All of the vehicles tested over the LA92 in the 12th Surveillance Program (approximately 170) were from the 1983 and later model years. With LA92 and FTP tests conducted on slightly over 200 1981 and later model year vehicles, it would be possible to investigate differences by fuel delivery technology and perhaps by emitter category. This approach will become more attractive as additional data are collected by CARB.

Depending on the way in which start emissions are treated in the next version of MOBILE, the actual development of vehicle start-up and off-cycle correction factors could be performed in a number of different ways. For example, if start emissions are separated from running emissions, then bag 1 of the LA92 could be correlated with bag 1 of the FTP (e.g., through a regression analysis). This is particularly important since bag 1 of the LA92 is much more reflective of vehicle start-up operation than the FTP bag 1. If start emissions are treated as an offset, the difference between bag 1 and bag 3 of the LA92 could be compared to the difference between bag 1 and bag 3 of the FTP. Alternatively, it may be desirable to determine start emissions directly from bag 1 (or bag 1 - bag 3) of the LA92 data without considering the FTP data.
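The start-offset comparison mentioned above (bag 1 minus bag 3 on each cycle) can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the function names, the input layout, and the decision to express each offset in grams per start by multiplying the g/mi difference by the bag distance (1.2 miles for the LA92, 3.6 miles for the FTP). The sample values are hypothetical, not test data.

```python
LA92_BAG_MILES = 1.2   # bags 1 and 3 of the LA92
FTP_BAG_MILES = 3.6    # bags 1 and 3 of the FTP


def start_offset_grams(bag1_g_per_mi, bag3_g_per_mi, bag_miles):
    """Cold-start offset in grams: extra mass emitted in bag 1 vs. bag 3."""
    return (bag1_g_per_mi - bag3_g_per_mi) * bag_miles


def offset_ratio(tests):
    """Ratio of mean LA92 start offset to mean FTP start offset.

    tests: iterable of (la92_bag1, la92_bag3, ftp_bag1, ftp_bag3),
    all in g/mi, one tuple per vehicle tested on both cycles.
    """
    la92_total = 0.0
    ftp_total = 0.0
    for la92_b1, la92_b3, ftp_b1, ftp_b3 in tests:
        la92_total += start_offset_grams(la92_b1, la92_b3, LA92_BAG_MILES)
        ftp_total += start_offset_grams(ftp_b1, ftp_b3, FTP_BAG_MILES)
    if ftp_total == 0:
        raise ValueError("FTP start offset is zero; ratio is undefined")
    # The vehicle count cancels, so the ratio of sums equals the ratio of means.
    return la92_total / ftp_total


# Hypothetical HC results (g/mi) for two vehicles.
sample = [(3.0, 0.5, 1.0, 0.2), (4.0, 0.6, 1.3, 0.3)]
print(f"LA92/FTP start-offset ratio: {offset_ratio(sample):.2f}")
```

In practice the comparison would be made separately by technology group (and perhaps emitter category), as the text suggests for the other correction factors.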
Hot stabilized emissions can be corrected for off-cycle events by comparing a combination of bags 2 and 3 of the LA92 cycle (i.e., a "hot LA92") to a combination of bags 2 and 3 of the FTP.* A correction factor can be developed by taking a simple ratio of the LA92 results to the FTP results or through a regression analysis. As with the start correction, the data should be segregated by technology and perhaps emitter category (i.e., "normal" versus "high" emitters). Although this adjustment inherently includes a speed correction, the principal adjustment is to account for the failure of the FTP to adequately cover the full range of speeds and accelerations occurring in customer service.

* In our opinion, the combination of bags 2 and 3 is more representative of stabilized operation than bag 2 alone.

###

6. USE OF STATE-GENERATED IM240 DATA IN MOBILE

This Work Assignment also called for a review of methodologies that states could use to develop locality-specific basic emission rates for use in MOBILE. The development of locality-specific basic emission rates has the obvious advantage of allowing an area to accurately represent its fleet of light-duty vehicles, while minimizing the reliance on certain relations in the MOBILE (and TECH) models that have been developed with the intent of reflecting national averages. In addition, developing locality-specific emission rates has the potential to better reflect the impact of a particular I/M program on vehicular emissions. However, a number of issues must be considered in order to have confidence that the local predictions are more representative of an area than the estimates obtained by simply running MOBILE.

There are two steps involved in developing basic emission rate equations from state-generated IM240 data. First, the IM240 data need to be collected and converted to an FTP basis. Second, the simulated FTP scores need to be analyzed to develop basic emission rate equations.
Although most of the procedures that would have to be followed to generate locality-specific emission rates from IM240 data have already been discussed in this document, this section reviews the areas of particular importance that would have to be considered in such an analysis.

Data Collection and Development of Simulated FTP Scores

As discussed previously, there are potentially two sources of IM240 data that will be available from which to develop basic emission rate equations. Obviously, if a state has an I/M program based on IM240 exhaust measurements, the data collected in the program can be used. For states not conducting IM240 testing as part of the standard I/M program, IM240 data will be available from the program evaluation requirements in Section 51.353 of the I/M rule (i.e., 0.1% of the subject vehicle fleet must be tested each year over the IM240 cycle or another transient mass emission test approved as equivalent).

Although ready access to these data makes the development of locality-specific emission rates attractive, there are a number of additional pieces of data and test requirements that would be needed before the data could be used to develop simulated FTP scores. Because it is unlikely that states would have the resources to conduct FTP tests on a subset of vehicles tested at the I/M lane, EPA-generated IM240-to-FTP correlation equations would have to be used. Since those correlations are based on IM240 tests conducted in a laboratory environment on a standard test fuel (i.e., Indolene), the state-collected IM240 data would have to be adjusted to reflect a standard fuel and temperature. To ensure that proper data are available to correctly predict FTP-based emission rates from IM240 data, states should be required to collect a number of pieces of information related to the fleet of vehicles they intend to use for emission factor development.
This includes the following:

- Ambient temperature should be recorded for all vehicles tested. The maximum and minimum daily temperatures should also be recorded.
- Fuel samples should be collected and analyzed for a subset of vehicles included in the database.

By analyzing the test temperature and fuel parameters, a determination can be made as to whether RVP/temperature interactions (which may lead to excessive purge) are having an inordinate influence on the results. Those test records that are outside of a predetermined RVP/temperature window could be excluded from the analysis. In addition, the analysis of other fuel parameters (e.g., oxygenates, aromatics, sulfur) might allow base fuel/Indolene correction factors to be developed from the reformulated gasoline Complex Model (or from the data that were used to formulate the Complex Model, with correlations developed based on Bag 3 of the FTP).

In terms of data collection efforts, several other issues would need to be considered by the states. First, only full IM240 tests should be used for emission factor development. With the implementation of fast-pass and fast-fail algorithms, it is unclear how many vehicles will receive a full IM240 test in an operating program and whether those vehicles that do receive a full IM240 will be representative of the entire fleet. Thus, we recommend a means to ensure that vehicles selected for emission factor development be chosen at random (e.g., every 20th vehicle tested at the lane) and identified as emission factor vehicles. In addition, those vehicles should be tested over the complete IM240 cycle regardless of whether they pass or fail the fast-pass or fast-fail cutpoints. If this procedure is not followed, very clean and very dirty vehicles will not be properly represented in the emission factor data set. Second, vehicles selected for emission factor development should be run over a short preconditioning cycle prior to the IM240 test (e.g., two to three minutes at 40 mph).
This would help ensure that vehicles that may have cooled off in the queue are back up to operating temperature before being tested. Finally, information on technology type (e.g., carbureted, throttle-body injection, multipoint injection; open-loop, closed-loop) would be needed if the IM240 data gathered in the program are to be used to forecast emissions. It may be possible to determine technology type with a VIN decoding routine;* however, if states do not have access to such a program, then technology information would also have to be recorded for each vehicle used for emission factor development.

* The I/M rule requires VINs to be recorded for each I/M test record. For 17-character VINs (which have been the standard since the early 1980s), the 9th character represents the "check digit," which is intended to verify the accuracy of the VIN. The check digit is determined by a mathematical routine in which each VIN character is assigned a number, which is then multiplied by a preset value based on its position in the VIN. These products are then summed and divided by 11, and the remainder represents the VIN check digit. To ensure the accuracy of the VINs collected in I/M lanes, it is recommended that an electronic cleaning routine be used to verify the VIN check digit.

Development of Basic Emission Rates from Simulated FTP Scores

Once the IM240 data are converted to an FTP basis, emission rate equations can be developed. If only the current year rates are desired, the analytical technique would be fairly straightforward. The data would first be sorted by vehicle type (i.e., car versus truck) and model year. Next, the average pre-inspection emission rate would be determined. (If the data are from an operating IM240 program, there should be a field indicating whether the test is a baseline or retest; IM240 data collected as part of the 0.1% requirement are supposed to reflect emissions immediately prior to inspection.)
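As an aside on the VIN cleaning routine recommended in the data collection discussion, the check-digit verification can be sketched as follows. The report describes the routine only in general terms; the character values, the position weights, and the use of "X" for a remainder of 10 are supplied here from the North American 17-character VIN convention and should be treated as assumptions to confirm against the governing regulation. The function names are hypothetical.

```python
# Numeric value assigned to each VIN character (I, O, and Q are not allowed).
TRANSLITERATION = {str(d): d for d in range(10)}
TRANSLITERATION.update(zip("ABCDEFGH", range(1, 9)))
TRANSLITERATION.update(zip("JKLMN", range(1, 6)))
TRANSLITERATION.update({"P": 7, "R": 9})
TRANSLITERATION.update(zip("STUVWXYZ", range(2, 10)))

# Multiplier for each of the 17 positions; position 9 (the check digit) is 0.
WEIGHTS = (8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2)


def vin_check_digit(vin):
    """Compute the expected 9th character of a 17-character VIN."""
    vin = vin.strip().upper()
    if len(vin) != 17:
        raise ValueError("VIN must have 17 characters")
    # KeyError is raised for characters that cannot appear in a VIN.
    total = sum(TRANSLITERATION[c] * w for c, w in zip(vin, WEIGHTS))
    remainder = total % 11
    return "X" if remainder == 10 else str(remainder)


def vin_is_valid(vin):
    """True if the recorded 9th character matches the computed check digit."""
    try:
        return vin_check_digit(vin) == vin.strip().upper()[8]
    except (KeyError, ValueError):
        return False


print(vin_is_valid("1M8GDM9AXKP042788"))   # commonly cited example VIN
print(vin_is_valid("1M8GDM9A1KP042788"))   # same VIN with a wrong check digit
```

A lane data system could flag records failing this test for re-entry rather than discarding them, since a single mistyped character is the usual cause.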
The effect of the I/M program on each model-year emission rate would then be estimated based on the fraction of failures and the benefit of repair (taking into account waivered vehicles). The repair benefit could be determined from an analysis of pre-repair and post-repair I/M data, which should be available from the repair effectiveness analysis required in the I/M rule. Finally, the model-year-specific emission rates would be determined from the pre-inspection emission rates and the after-repair emission rates based on whether the I/M program in place is an annual program or a biennial program. For an annual program, the model-year-specific emission rate would be calculated as follows:

$$\mathrm{ER}_{AA} = \mathrm{Fraction}_{Pass} \times \mathrm{ER}_{Pre\text{-}Pass} + (1 - \mathrm{Fraction}_{Pass}) \times \frac{\mathrm{ER}_{Pre\text{-}Fail} + \mathrm{ER}_{After\text{-}Rep}}{2}$$

where:

$\mathrm{ER}_{AA}$ = Annual average emission rate,
$\mathrm{ER}_{Pre\text{-}Pass}$ = Pre-inspection emission rate for passing vehicles,
$\mathrm{ER}_{Pre\text{-}Fail}$ = Pre-inspection emission rate for failing vehicles,
$\mathrm{ER}_{After\text{-}Rep}$ = After-repair emission rate, and
$\mathrm{Fraction}_{Pass}$ = Fraction of vehicles passing the inspection.

For a biennial program, the following equation would be used:

$$\mathrm{ER}_{BA} = \mathrm{Fraction}_{Inspected} \times \mathrm{ER}_{AA} + (1 - \mathrm{Fraction}_{Inspected}) \times \mathrm{ER}_{Pre\text{-}All}$$

where $\mathrm{ER}_{BA}$ is the model-year-specific emission rate for a biennial program, $\mathrm{ER}_{AA}$ is the annual average rate defined above, $\mathrm{Fraction}_{Inspected}$ is the fraction of vehicles inspected in a given year (i.e., 50% in a perfectly biennial program), and $\mathrm{ER}_{Pre\text{-}All}$ is the pre-inspection emission rate of all vehicles.

The method described above provides only the mean model-year emission rates for one calendar year. Thus, if the data were collected in 1995, the model-year-specific emission rates could only be used in conjunction with MOBILE to develop a calendar year 1995 emission estimate (i.e., by inputting the model-year rates as zero-mile levels and specifying zero for deterioration rates).
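The annual and biennial calculations described above can be written as two small Python functions. This is a minimal sketch: the function and argument names are chosen here for readability, and the sample inputs are hypothetical values for a single model year, not results from any I/M program.

```python
def annual_average_rate(fraction_pass, er_pre_pass, er_pre_fail, er_after_rep):
    """Model-year emission rate (g/mi) under an annual I/M program.

    Passing vehicles stay at their pre-inspection rate; failing vehicles
    are represented by the midpoint of their pre-inspection and
    after-repair rates.
    """
    failing_average = (er_pre_fail + er_after_rep) / 2
    return fraction_pass * er_pre_pass + (1 - fraction_pass) * failing_average


def biennial_average_rate(fraction_inspected, er_annual, er_pre_all):
    """Model-year emission rate (g/mi) under a biennial I/M program.

    Vehicles inspected this year take the annual-program rate; the rest
    remain at the pre-inspection rate of all vehicles.
    """
    return fraction_inspected * er_annual + (1 - fraction_inspected) * er_pre_all


# Hypothetical inputs for one model year (g/mi).
fraction_pass = 0.8
er_pre_pass, er_pre_fail, er_after_rep = 0.5, 2.0, 1.0
er_pre_all = fraction_pass * er_pre_pass + (1 - fraction_pass) * er_pre_fail

er_annual = annual_average_rate(fraction_pass, er_pre_pass, er_pre_fail, er_after_rep)
er_biennial = biennial_average_rate(0.5, er_annual, er_pre_all)
print(f"annual: {er_annual:.3f} g/mi, biennial: {er_biennial:.3f} g/mi")
```

The pre-inspection rate of all vehicles is computed here as the pass/fail weighted average of the two pre-inspection rates, which is an assumption of the example rather than a requirement of the method.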
Clearly, not being able to forecast emissions is a significant shortcoming of the above approach, and a means to develop emission rates described by a zero-mile level and a deterioration rate is needed.

To develop emission factors that can be used with MOBILE to forecast emissions, a method similar to that described above could be used. First, the simulated FTP data would be sorted by vehicle type, age (or odometer), and technology (e.g., carbureted, throttle-body injection, multipoint injection). Pre-inspection and after-repair emission rates would then be determined by vehicle age (or odometer), which would result in a plot similar to that illustrated in Figure 4-1 for each technology. A single emission value would be determined for each vehicle age by weighting the pre-inspection and after-repair points based on whether the I/M program in effect has an annual or biennial inspection frequency. A regression analysis would then be performed on these points to develop zero-mile levels and deterioration rates as a function of vehicle technology. Next, model-year emission factors would be calculated by weighting the technology-specific rates by the mix of those technologies observed in the fleet. In performing this analysis, care would have to be taken to ensure that the vehicles included in calculations had been certified to the same emission levels.

To account for future emission standards, the zero-mile levels would be adjusted by the ratio of future-to-current standards. (Deterioration rates would remain unchanged, as they represent the I/M program in effect.) Note that this approach provides a future inventory that includes the impact of an I/M program, but it does not predict the benefit of the I/M program.

Once the model-year zero-mile levels and deterioration rates are determined, they can be input to MOBILE (as user-input emission rates) and the model run.
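The regression and technology-weighting steps described above can be sketched with an ordinary least-squares fit. This is an illustrative outline under stated assumptions: the function names, the technology labels, the odometer units (10,000-mile increments), and all numerical values are hypothetical, and a simple straight-line fit is assumed even though other functional forms could be used.

```python
def fit_zml_dr(odometer_10k, emission_rates):
    """Least-squares line through (odometer, emission rate) points.

    Returns (zero_mile_level, deterioration_rate): the intercept in g/mi
    and the slope in g/mi per 10,000 miles.
    """
    n = len(odometer_10k)
    if n < 2 or n != len(emission_rates):
        raise ValueError("need at least two matched points")
    mean_x = sum(odometer_10k) / n
    mean_y = sum(emission_rates) / n
    sxx = sum((x - mean_x) ** 2 for x in odometer_10k)
    if sxx == 0:
        raise ValueError("odometer values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(odometer_10k, emission_rates))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fleet_weighted(params_by_tech, tech_mix):
    """Weight technology-specific (ZML, DR) pairs by fleet fractions."""
    zml = sum(tech_mix[t] * params_by_tech[t][0] for t in tech_mix)
    dr = sum(tech_mix[t] * params_by_tech[t][1] for t in tech_mix)
    return zml, dr


# Hypothetical weighted emission values by odometer for two technologies.
params = {
    "MPFI": fit_zml_dr([0, 5, 10], [0.30, 0.55, 0.80]),
    "TBI": fit_zml_dr([0, 5, 10], [0.50, 0.95, 1.40]),
}
zml, dr = fleet_weighted(params, {"MPFI": 0.75, "TBI": 0.25})

# Future standards: scale only the zero-mile level (ratio is hypothetical).
future_zml = zml * 0.6
print(f"ZML = {zml:.3f}, DR = {dr:.3f}, future ZML = {future_zml:.3f}")
```

Keeping the fit and the fleet weighting as separate steps makes it straightforward to rerun the weighting with a different technology mix for each model year.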
Note that since these rates already account for the presence of the I/M program, the I/M options in MOBILE would not be invoked.

Summary

Although the potential exists for states to develop locality-specific basic emission rates from IM240 data collected as part of an operating I/M program or the program evaluation requirements of the I/M rule, it is unclear how many states will attempt to do this. This judgment is based primarily on the following two factors.

- At this time, it appears that only a small number of states are likely to include IM240 testing in their enhanced I/M programs; thus, available IM240 data will come from the program evaluation requirements (i.e., 0.1% of the subject fleet must be tested). This results in a much smaller number of test records upon which to perform the analyses described above.
- Based on the information presented above, a significant investment in time and resources will be required on the part of states to develop basic emission rate equations from IM240 data.

###

7. REFERENCES

1. "Investigation of MOBILE5a Emission Factors: Evaluation of IM240-to-FTP Correlation and Base Emission Rate Equations," Prepared by Sierra Research for the American Petroleum Institute, API Publication Number 4605, June 1994.
2. "Investigation of MOBILE5a Emission Factors: Assessment of Exhaust and Nonexhaust Emission Factor Methodologies and Oxygenate Effects," Prepared by Systems Application International for the American Petroleum Institute, API Publication Number 4603, June 1994.
3. "Development of the CALIMFAC California I/M Benefits Model," Prepared by Sierra Research for the California Air Resources Board, Report No. SR-91-01-01, January 1991.
###

APPENDIX A

EIRG'S RESPONSES TO QUESTIONNAIRE ON DEVELOPING BASIC EMISSION RATES FROM IM240 DATA

Database Adjustments

Weight Foreign Manufacturers - Because the vehicles tested in Hammond did not accurately reflect the national average fraction of foreign vehicles, each foreign vehicle in the database was counted 2 to 4 times.

Strengths:
1. Accounts for fleet mix biases in testing areas.
2. Foreign vehicles generally have a much lower DF than domestic.
3. Accounts for under-represented manufacturers. Comparison with the non-weighted results indicated a net increase in non-normals, which was most pronounced for carbureted closed-loop technology.
4. Important to account for foreign/domestic split because of differences in quality, durability, etc.

Weaknesses:
1. No area using MOBILE will match the assumed national average fleet, and there is no way to account for this.
2. Variations among engine families are just as significant as foreign/domestic.
3. Has it been established that foreign vehicles are a bias? If it is important, why not treat those vehicles as a separate technology group? What about displacement, mileage, etc.? Seems like an arbitrary adjustment.
4. Method used may be based on a poor sample and not reflect representative mix of foreign vehicles.

Alternatives:
1. This is a second-order effect - ignore it.
2. Predict base emission rates on a manufacturer/model year basis.
3. Develop emission factors separately for foreign/domestic and allow users the option to input that parameter.
4. More sophisticated analysis by engine family or by groups of engine families.
5. This should not be a problem for tech-group-specific analyses.
6. Whether this correction is applied depends on how significant any technology or durability differences are. Is foreign vs. domestic enough, or should all individual manufacturers be weighted? If differences are significant, use sampling theory to pick the optimum sample for desired weighting.
7. Develop a method to check representativeness of the available foreign data, e.g., technology, manufacturer, age. Compare with a more robust sample in a more representative area and then modify the weighting factors.

Missing or Suspicious Mileage - A number of vehicles in the Hammond database had '0' or missing mileage and were deleted. In addition, vehicles that were coded as having an odometer reading > 300,000 miles were deleted.

Strengths:
1. Removes bad data that could incorrectly influence emissions vs. mileage regressions, particularly for zero mileage.
2. Prevents compromising odometer-based relationships.
3. Avoids large statistical impacts from inclusion of extreme and likely erroneous mileages.
4. Obvious way of screening questionable data.

Weaknesses:
1. Some suspicious data could be valid data points.
2. Eliminating records reduces sample size.
3. High-mileage, poor condition vehicles may be more likely to have odometer problems.
4. Deletes valid data.
5. May remove actual high-mileage vehicles, which are badly needed in the database.
6. Limits sample size.

Alternatives:
1. Group vehicles according to age for some statistics (e.g., fraction of high emitters at 5 years vs. 50,000 miles). In this way, incorrect mileages are not an issue.
2. Compare odometer-based relationships with these vehicles classified by the mean mileage for that model year (i.e., assign to them the mean mileage by model year or age).
3. Leave vehicles in the database and assign to them the average mileage of the remaining vehicles. This does not affect the slope of the regressions, just the y-intercept.
4. Use an age-odometer algorithm to identify suspected erroneous data. This works for both high and low mileage vehicles.
5. Generate mileage as a function of age, but consider each year's distribution of travel. Look at that distribution in the database; if it is OK (i.e., not too wide), take mean values as a function of age and use that.
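The mileage screen and one of the suggested alternatives (assigning the model-year mean mileage instead of deleting the record) can be sketched as follows. This is an illustration only: the record layout, field names, and function names are hypothetical, and the 300,000-mile cutoff is the one quoted above for the Hammond database.

```python
def screen_mileage(records, max_miles=300_000):
    """Split test records into (kept, rejected) by odometer plausibility.

    A record is rejected if its odometer is missing, zero or negative,
    or above max_miles.
    """
    kept, rejected = [], []
    for rec in records:
        miles = rec.get("odometer")
        if miles is None or miles <= 0 or miles > max_miles:
            rejected.append(rec)
        else:
            kept.append(rec)
    return kept, rejected


def impute_mean_mileage(kept, rejected):
    """Give rejected records the mean odometer of kept records
    from the same model year, and mark them as imputed."""
    totals = {}
    for rec in kept:
        miles_sum, count = totals.get(rec["model_year"], (0.0, 0))
        totals[rec["model_year"]] = (miles_sum + rec["odometer"], count + 1)
    imputed = []
    for rec in rejected:
        if rec["model_year"] in totals:
            miles_sum, count = totals[rec["model_year"]]
            imputed.append({**rec, "odometer": miles_sum / count,
                            "odometer_imputed": True})
    return imputed


# Hypothetical lane records.
records = [
    {"model_year": 1988, "odometer": 60_000},
    {"model_year": 1988, "odometer": 80_000},
    {"model_year": 1988, "odometer": 0},
    {"model_year": 1985, "odometer": None},
    {"model_year": 1988, "odometer": 350_000},
]
kept, rejected = screen_mileage(records)
print(len(kept), "kept;", len(rejected), "rejected;",
      len(impute_mean_mileage(kept, rejected)), "recoverable by imputation")
```

Flagging imputed records, rather than silently overwriting the odometer field, keeps it possible to rerun the regressions with and without them and compare the results.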
Seasonal Outliers - Data collected on 14 test dates in March and April when the ambient temperature was 25°F or more above the monthly average were deleted because many of those vehicles were statistical outliers. (Excessive purge was thought to be influencing the IM240 results.) Strengths -. 1. Excessive purge is added separately in MOBILE (i.e., through temperature/RVP corrections) and must be eliminated from data used to estimate base emission rates. 2. Excessive purge is a problem at high RVP and temperature; this method solves it. 3. Rejection of statistical outliers with data errors enhances the value of the database. 4. Solves the problem of excessive purge effects. Weaknesses \ 1. Reduces sample size. 2. Deletes valid data, particularly when temperatures are high in the spring or fall. High ozone episodes can occur during these times and improved emission factors under these conditions are worth some effort to develop. 3. True high emitters may be deleted. A-3 ------- 4. Rejection of true outliers reduces the accuracy of the database. 5. If it happened 14 out of 60 days, are these true outliers? 6. This type of activity occurs in the real world. How do the models account for this effect? Alternatives: 1. If sample is large enough, only use data collected within the FTP temperature range for the IM240/FTP correlation and base emission rates. IM240 data at different temperatures could be used to develop temperature/fuel correction factors. 2. If this is a real problem in the spring and fall, perhaps there should be a correction factor. 3. Determine if purge is higher during these times than on hot mid- summer days. Estimate vapor generation during running conditions and diurnals under both conditions using actual temperatures and estimates of local RVP. If spring/fall vapor generation (and thus purge) is relatively high, then fuel RVP is likely an important factor to include in the analysis (along with temperature). 
If both seasons show similar vapor generation levels, retain data and correct for temperature.
4. Temperature-correct the outliers to see if that gives more realistic results.
5. Data rejection should be based on a combination of statistical plus engineering analysis. The procedure of rejecting all data when performance problems are suspected is a good one. The long-range goal of IM240/FTP correlation will probably require some temperature correction correlation. The problem of excessive purge may be reduced with ORVR-sized canisters (or increased if the vehicle was just refueled).
6. Develop seasonal emission factors (i.e., summer, winter, spring/fall) based on temperatures and fuels reflective of those seasons.
7. Set some test temperature range for each RVP "season" within which data are used for FTP correlations.
8. Rather than deleting data out of hand, compute the temperature/fuel impact and use the results to validate the performance of the model.

Fuel/Temperature Adjustments

Because EPA wished to develop the IM240-to-FTP correlations based on vehicles IM240-tested in a laboratory with Indolene, a method was needed to account for the differences between the lane and the lab before the correlation equations were applied to the Hammond lane IM240 data. For the Hammond database, it was felt that those differences were primarily related to tank fuel versus Indolene and the temperature differences occurring between the lane and the lab. (However, a number of other differences could also impact test variability between the lane and the lab, e.g., vehicle preconditioning procedures, inconsistent dynamometer settings, how well the IM240 speed-time trace is followed, etc.) The fuel/temperature adjustments prepared for MOBILE5 were based on a subset of the Hammond vehicles that were tested at the lane on tank fuel and at the lab on Indolene.
Adjustment factors were developed by season (i.e., March-April, May-June, July-September, and October-February) and the following emitter categories:
  Normal HC/CO - lane IM240 ≤ 1.64 g/mi HC and ≤ 13.6 g/mi CO,
  High HC/CO - lane IM240 > 1.64 g/mi HC or > 13.6 g/mi CO,
  Normal NOx - lane IM240 ≤ 2.0 g/mi NOx, and
  High NOx - lane IM240 > 2.0 g/mi NOx.
Once the data were segregated as outlined above, the mean emission levels for the lane/tank fuel scores and the lab/Indolene scores were determined. Adjustment factors were then developed from the ratio of these mean values.

Strengths:
1. Since MOBILE adjusts for fuel and temperature separately, the base emission rates must be adjusted to FTP conditions. This approach is simple and easy to understand.
2. At least some accounting for major differences.
3. Only compares large sets of data. Test-to-test and vehicle-to-vehicle variability is reduced.
4. Any adjustment that accounts for variation due to external factors is helpful in the overall correlation.
5. Accounts for differences between the lane and the lab.
6. Some accounting for seasonal impacts.

Weaknesses:
1. The two-step adjustment adds uncertainty. A simple adjustment may not be appropriate. The emission groupings were not chosen for best results. No technology groupings.
2. Two-variable analysis may explain only part of the difference on specific vehicles.
3. Includes possible offset in lab and lane measurements. Merges fuel and temperature effects, when temperature is known and fuel specifications (at the lane) are not. Fuel effects are sufficiently difficult to assess in controlled experiments with multiple tests on repeatable vehicles. Probably impossible to determine under the test conditions existing here.
4. One set of factors is used to correct for fuel/temperature when going from lane IM240 to lab IM240, and a different set of factors when going from lab FTP to real-world FTP.
Shouldn't factors be similar to fuel/temperature factors for FTP bag 3?
5. Was this part of a comprehensive study to determine the effects of different external variables?
6. Data from Hammond were extremely variable and not all of that variability could be reasonably explained by temperature and fuel effects (e.g., 20% of the vehicles had lane/tank fuel IM240s and lab/Indolene IM240s for HC and/or CO differ by more than 3 times).
7. These adjustments appear trivial considering a) cloning of foreign vehicles, b) using an average "X" in the regression equation, c) the "X" is of questionable merit, d) log space was used, and e) residuals are applied.
8. Not clear how the other differences - vehicle preconditioning, etc. - are accounted for when analyzing test results. Also, average temperature may mask the effect of unusual swings.

Alternatives:
1. Fuel samples would aid in the adjustment and allow comparisons between fuels and temperature.
2. If sample size allows, choosing only records that are similar to FTP conditions may make the adjustment less important.
3. Possibly develop new emitter groupings or technology groupings.
4. Develop a more sophisticated multivariate analysis of differences using an engineering model.
5. Quantify temperature effect independently from fuel effect (i.e., IM240 versus temperature correlation). Use measured ambient temperatures at the time of the IM240 test. Do not segregate by season, given use of actual test temperature. Fuel-related reasons for segregating by season appear to be weak. While volatility changes with season, so does average temperature, in a compensating manner. Largest volatility-related effects will be on cold/hot days within a season, not across seasons. Given no knowledge of the lane fuel parameters, the fuel effect will be part of the constant in the temperature regression.
If, on average, in-use fuel is somewhat "dirtier" than Indolene, then the lane emissions will generally be higher than the lab measurements, temperature effects aside.
6. Should try to do a statistical analysis to determine the significance of all possible external variables on the final correlation, then account for those variables with the statistically significant impacts.
7. Correlate FTP directly with lane IM240 scores, using only those conditions (i.e., temperature and fuel) that reasonably match the FTP.
8. Some accounting for inconsistent preconditioning should be considered. For example, look at the IM240 bag 1 vs bag 2 scores and possibly delete record if difference is outside a predetermined window. Alternatively, compare lane bag results to lab bag results. If the difference is large in bag 1 but not bag 2, a preconditioning problem could have existed.
9. Use MOBILE temperature and RVP correction factors (for bag 3 or a combination of bags 2 and 3) to adjust the lane scores to an FTP temperature and Indolene basis, or at least use this information as a reality check on the factors developed with the test data. The MOBILE approach (or similar temperature/fuel factors developed specifically from IM240 tests) could possibly also be used in a state-based IM240-to-FTP analysis. (It is unlikely that states would have the resources to run the lab/Indolene IM240s for generating their own fuel/temperature corrections.)
10. Split the data into more temperature-specific regimes based on values recorded each day (i.e., look at weather data) and base the corrections on the temperature regimes.

IM240-to-FTP Correlations

Once the Hammond lane IM240 data were adjusted to a lab/Indolene basis, correlation equations relating the IM240 to the FTP were applied to the data. The IM240-to-FTP correlations were based on a regression analysis of data collected from vehicles tested over the IM240 on Indolene and the FTP on Indolene.
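The lane-to-lab adjustment referred to above is a ratio of mean lab/Indolene scores to mean lane/tank-fuel scores within each season and emitter category. A minimal sketch of that arithmetic follows, assuming paired lane and lab IM240 scores are available for each vehicle. The data values, tuple layout, and function names are hypothetical; only the ratio-of-means idea and the > 1.64 g/mi HC or > 13.6 g/mi CO cutpoint for "high" emitters come from the text.

```python
from collections import defaultdict


def classify_hc_co(lane_hc, lane_co):
    """Emitter category from lane IM240 scores (g/mi), per the stated cutpoints."""
    return "high" if lane_hc > 1.64 or lane_co > 13.6 else "normal"


def adjustment_factors(paired):
    """paired: iterable of (season, category, lane_score, lab_score).

    Returns {(season, category): mean(lab) / mean(lane)}.
    """
    lane_scores = defaultdict(list)
    lab_scores = defaultdict(list)
    for season, category, lane, lab in paired:
        lane_scores[(season, category)].append(lane)
        lab_scores[(season, category)].append(lab)
    factors = {}
    for key in lane_scores:
        mean_lane = sum(lane_scores[key]) / len(lane_scores[key])
        mean_lab = sum(lab_scores[key]) / len(lab_scores[key])
        factors[key] = mean_lab / mean_lane
    return factors


def adjust(lane_score, season, category, factors):
    """Put a lane/tank-fuel IM240 score on a lab/Indolene basis."""
    return lane_score * factors[(season, category)]


# Hypothetical paired HC scores (g/mi), invented only to show the calculation.
paired = [
    ("Mar-Apr", "normal", 0.50, 0.40),
    ("Mar-Apr", "normal", 0.30, 0.24),
    ("Mar-Apr", "high", 3.0, 2.7),
    ("Mar-Apr", "high", 5.0, 4.5),
]
factors = adjustment_factors(paired)
print(factors)
print(adjust(1.0, "Mar-Apr", "high", factors))
```

Because the factor is a ratio of group means, it corrects the average of a group rather than any individual vehicle, which is the point several respondents raise about vehicle-to-vehicle variability.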
(The database used for the correlation analysis included vehicles from the Hammond program as well as vehicles tested in Ann Arbor.) The regressions were performed according to the following model year groups and technology types: 1981-1982, 1981+ open-loop, 1983+ carbureted/closed-loop, 1983+ throttle-body injection/closed-loop, and 1983+ multipoint fuel-injection/closed-loop. The HC and CO correlations were performed in log space with a cold start offset ("X" in the equation below) that varied by technology, while the NOx correlations were based on a simple linear equation without a cold start offset value:

$$\log_{10}(\mathrm{FTP}_{HC/CO} - X) = b + m \cdot \log_{10}(\mathrm{IM240}_{HC/CO})$$

$$\mathrm{FTP}_{NOx} = b + m \cdot \mathrm{IM240}_{NOx}$$

For cases in which $(\mathrm{FTP}_{HC/CO} - X) < 0.01$, the IM240 score was substituted for $(\mathrm{FTP}_{HC/CO} - X)$. In this way, errors resulting from taking the logarithm of a negative number were avoided. (A discussion of the cold start offset is included in the next section.)

Strengths:
1. For an average, these correlations should provide a good estimate of average FTP emissions.
2. Cold start emission excess could be unrelated to hot start emissions. Any relationship between hot and cold emissions will automatically be included in the slope.
3. Use of different technology group regressions for limited number of groups is a good balance between sample size and accounting for different vehicles.
4. This is a relatively straightforward and easy technique.
5. It's slick and simplifies the use of the data.

Weaknesses:
1. To the extent that individual predicted FTP values are used, these correlations are only good for averages. These technology groupings were not chosen for the best correlations. One fit was used for all emitter groups.
2. Non-linear relationships were not investigated.
3. Unclear what analyses were performed to decide on logarithmic relationship for HC and CO and the linear relationship for NOx.
4.
The log-based equation is equivalent to:

$$\mathrm{FTP}_{HC/CO} = X + 10^{b}\,[\mathrm{IM240}_{HC/CO}]^{m}$$

Is this a realistic regression? (For b=0 and m=1, it gives a simple regression.)
5. Has it been established that disaggregation by technology groupings is justified?
6. There is really no connection between the IM240 and cold start. Cold start should be directly calculated from FTP data.
7. This method implies that the IM240 is being defined as equivalent to a "no-start" FTP, and there is no basis for this. The fact that FTP-X can be negative bears this out.
8. Calculating X from mean[FTP - IM240] implicitly assumes that the IM240 is equal to a "hot FTP." Is this reasonable?
9. It is not at all clear that X does a good job of accounting for the cold start offset. The fact that there were problems with negative numbers suggests that it did not.

Alternatives:
1. Develop multiple correlations separately for emitter groups, possibly for new technology groupings.
2. Explore different equational forms, but it is unclear that statistics would improve.
3. Perform both log and linear regressions and examine the variance about the regression line. The approach that shows a variance that is constant and randomly distributed about the regression line regardless of IM240 level is preferred, regardless of the correlation coefficient. If a log function is still preferred for HC and CO, then switch to the more complex approach.
4. Regress individual FTP bag data against IM240 level. Use of "X" should not be necessary for bags 2 and 3, and may not be necessary for bag 1. Again, be sure that the assumed functional relationship meets the basic assumptions necessary for performing a regression.
5. Manufacturers have claimed that catalyst washcoat technology was significantly improved in the latter 1980s. The analysis should explore another major model-year group.
6. Try different regression formulae and pick the one with the best statistics.
7.
Get rid of the cold start offset in the IM240 correlation and only use the IM240 to predict bag 2 and/or bag 3 (or, alternatively, a "Hot FTP", i.e., [0.521*Bag 2 + 0.479*Bag 3] = b + m*IM240). The cold start offset could then be calculated from available FTP data, with consideration for emitter groups.
8. Focus on the relationship between IM240 and bags 2 and 3.

Correlation Adjustments

When the correlation equations were applied to the lane IM240 scores (which had been corrected to a lab/Indolene basis), two additional adjustments were made. First, the cold start offset was assumed to be a function of vehicle odometer reading (although the correlations were performed with a constant X value), and second, regression residuals were randomly applied to each data point.

Cold Start Offset - The cold start offset (X) values used in the above correlation equations were developed, by technology group, from the mean value of the difference between the FTP and the IM240 for normal emitters with FTP values greater than the IM240 (i.e., the value of (FTP - IM240) was determined for each normal emitter, and the mean of the positive results was used as X). When the correlation equations were applied to the IM240 data, the value of X was adjusted to account for the effects of aging and mileage. The way that this adjustment was developed for 1983+ model years is described below. (A slightly different procedure was used for 1981-1982 model year vehicles.)

The value of X in the correlation equations reflects the cold start offset at the mean mileage of the correlation sample. At mileages less than this mean, it follows that X should be decreased by some amount to account for the fact that the catalyst has been aged less and is expected to be more active. (Conversely, X should be increased at mileages above the mean.)
Thus, the cold start offset is actually X plus an increment that is a function of vehicle odometer, i.e.,

$$\text{X-Offset Function} = f(x) = X + f(\mathrm{Odometer})$$

EPA has defined f(Odometer) in the above equation to be "the difference of the model year means regression for normal emitters and a 'New' line created by connecting a point on the model year means regression line at the mean mileage of the correlation sample with the zero mile level used in MOBILE4.1." The X-offset function is therefore:

$$f(x) = X + \mathrm{ZML}_{\mathrm{MOBILE4.1}} - \mathrm{ZML}_{\mathrm{MY\,Means}} + \mathrm{ODOM} \times (\mathrm{DET}_{\mathrm{New}} - \mathrm{DET}_{\mathrm{MY\,Means}})$$

Strengths:
1. Simple in concept and allows IM240 data to directly replace FTP data in the TECH model.
2. Will yield directionally consistent results.
3. Accounts for cold start emissions which would not be measured in IM240 (unless IM240 vehicle is not warm).
4. Calculates a cold start offset that is a function of vehicle age/mileage.

Weaknesses:
1. Clumsy handling. May not reflect the "true" effects of cold starts (and hot starts).
2. Why is MOBILE4.1 used as the "gold" standard?
3. Creates a mileage effect based on the results of two unrelated analyses. The effect is then extrapolated far beyond the mileage at which data have been collected.
4. Looks pretty hokey.
5. Hard to tell. What do statistics for measured versus computed cold start FTP emissions look like?
6. Really odd way to perform this adjustment. Why was MOBILE4.1 brought into this analysis at all?
7. It is unclear as to why X is defined as being independent of emissions (i.e., it's based on normals) but varies with age/odometer. Isn't age important only because emissions deteriorate accordingly?
8. Continues to use "X", which is a poor surrogate for cold start.

Alternatives:
1. Use IM240 data only to estimate non-start emission rates. Use other data for start emissions directly.
2. First determine whether cold start emissions are related to hot start emissions (e.g., regress bag 1 versus IM240).
If such a relationship exists, then it can be determined directly from the regression. If not, cold start emissions must be determined from bag 1 data. IM240 data should not be used, nor should estimated FTP data from IM240 data. The IM240/FTP relationship is too uncertain, and clearly produces higher in-use FTP estimates versus previous FTP measurements (i.e., MOBILE4.1). The slope of cold start emissions versus mileage would be due in part to the change in methodology and overestimate of the mileage effect.
3. Use IM240 to correlate with bag 2/bag 3 of the FTP. Develop separate correlation between bag 1 and bag 2/bag 3 using only FTP data.
4. Reexamine from scratch other possible adjustments (e.g., multiplicative versus additive). Look at obtaining actual data on cold start offset versus odometer as opposed to MOBILE4.1 correlation.
5. Develop a cold start offset that is entirely separate from the IM240 data. Using FTP data, this could be done in a number of different ways.
6. Low-mileage cold start offset can be determined from bag FTP results or from new car FTP versus IM240 tests.

Regression Residuals - Another adjustment made during the application of the correlation equations was the addition of randomized regression residuals, i.e.,

$$\log_{10}(\mathrm{FTP}_{HC/CO} - X) = b + m \cdot \log_{10}(\mathrm{IM240}_{HC/CO}) + \mathrm{res}$$

$$\mathrm{FTP}_{NOx} = b + m \cdot \mathrm{IM240}_{NOx} + \mathrm{res}$$

where "res" represents regression residuals from the correlation sample. According to EPA, adding the residuals randomly to the FTP emission levels predicted by the correlation equations attempts to restore a distribution of predicted FTP values for a given IM240 score. Otherwise, there will be a single predicted FTP value for each IM240 score. A distribution of predicted FTP scores and emission levels is important for some analyses, such as the determination of I/M credits. For example, if residuals were not applied, 100% of the FTP emissions from a certain emitter group could be identified on the basis of the IM240 score.

Strengths:
1. Without some adjustment, the individual predicted FTP values will tend to clump around the mean, making any evaluations that depend on emission distributions (e.g., number of high emitters) suspect. Residuals are actual observed distribution effects.
2. A relatively simple non-parametric way to model emission distributions.
3. Agree with concept.
4. Good for IM240 ID rates - not necessarily for FTP analysis.
5. Introduces a "distribution" back into the data.
6. Converting emission data from normal to log space, regressing in log space, and then converting back to normal emissions tends to yield a lower average emission level when compared to a simple average of the original data. This occurs because the average of the logarithms is akin to a geometric mean, which is always lower than the arithmetic mean when some variability is present. When the goal is to determine a relative (i.e., percent) change in emissions, then this reduction in the mean is not a problem. However, if the goal is to estimate absolute emission levels, then the reduction is a problem, since the atmospheric impact is the average of the emissions, not their logarithms. In this case, EPA desired to develop absolute estimates of FTP emissions from IM240 emissions. EPA may have added the residuals in order to compensate for the inherent downward bias in the logarithmic analysis relative to the atmospheric effect.
7. Simple to implement.

Weaknesses:
1. Since the residuals are randomly applied, the analysis cannot be replicated without a mapping of which vehicles used which residual value.
2. Too dependent on the individual points and character of the database used; could have problems with homoskedasticity.
3. Should add regression residual to value without cold start offset. Log space could cause problems.
4. The residuals were developed in log space, so the sum of these residuals (in log space) was zero.
However, when the antilog was taken in the regression equation, it led to a net increase in predicted FTP emissions (relative to the non-residual equation). This significantly influenced the fraction of normals and non-normals in the predicted FTP database (e.g., for the MPFI group, the fraction of normal emitters was 78.8% with the residuals applied, and 90.2% without the residuals applied). Is this consistent with the relative difference between a linear regression and the log-space regression?
5. It is not clear that any analyses were done to demonstrate that application of residuals in fact yielded results that matched the atmospheric impact of the original data.
6. Assuming this was done rigorously, this is a good technique. However, it is compromised by the relatively large "X" effect, which is not rigorous.
7. Not clear that the use of the residuals in log space provides a representative distribution of predicted FTP scores. In fact, it seems unlikely to do so.

Alternatives:
1. Rather than adjusting each predicted FTP value individually, a "probability" distribution might be developed which could be used to predict what portion of the fleet with a given IM240 score was above or below a given FTP score.
2. Could use log-normal formulation or Weibull distribution, with x and a derived from data. (Earlier suggested by EEA to EPA but rejected as being too complex.)
3. Need to ask how this changes the overall distribution. There is some initial distribution of IM240 scores. Without the addition of the residuals this distribution will not change when FTP values are computed. Do they change when residuals are added? If so, are the results reasonable?
4. This is a good idea, although it is unclear that it really needs to be done if all the database is used for is base emission rates. If it is desired to do this, make sure that application of residuals does not unintentionally skew results.
5.
Compare the arithmetic means of the original FTP data and the estimated FTP levels using the IM240/FTP correlation. Do this with and without the residuals added back in. Use the technique that matches the original data best.

TECH5 Inputs

Once the Hammond data were converted to predicted FTP scores, the results were used to develop inputs to the TECH5 model (i.e., emitter category emission rates and growth functions). The following emitter categories were used in TECH5 for HC and CO emissions:
  Normal HC/CO - HC < 0.82 g/mi and CO < 10.2 g/mi,
  High HC/CO - HC > 0.82 g/mi or CO > 10.2 g/mi,
  Very High HC/CO - HC > 1.64 g/mi or CO > 13.6 g/mi, and
  Super HC/CO - HC > 10.0 g/mi or CO > 150.0 g/mi.
NOx emissions were analyzed separately from HC and CO, with only two emitter categories being defined - normals (< 2.0 g/mi) and highs (> 2.0 g/mi). The data were also segregated by the following technology groups: open-loop, carbureted/closed-loop, throttle-body injection (TBI)/closed-loop, and multipoint fuel-injection (MPFI)/closed-loop. Finally, emission rates were determined separately for 1981-1982 model-year vehicles and 1983+ model-year vehicles.

HC/CO Emission Rates - For HC and CO, the emitter category emission rates (i.e., zero-mile level (ZM) and deterioration rates (DRs)) were constructed as follows:
1. MOBILE4.1 ZMs were used for 1981-82 normals.
2. 1983+ DRs were used for 1981-82 normals, highs, and very highs.
3. Emission rates of normals were capped at the same rate for 1981-82 and 1983+ groups.
4. Normal caps were set at the maximum of the 1981-82 or 1983+ 100,000-mile levels calculated from the 1981-82 and 1983+ ZM and DR for normal emitters.
5. Deterioration rates that were negative and without significance were assumed to be zero.
6. Regression of carburetor very highs was performed for 1983-1988 model years only (although the regression results were applied to all 1983+ carbureted vehicles). Including 1989 resulted in a negative ZM.
7.
A covariance analysis was used for fuel-injected very highs that resulted in the same DR but different ZM levels for the 1981-82 group and the 1983+ group. (This resulted in substantially higher HC and CO emission rates from the 1983+ group compared to the 1981-82 group.)
8. All model years were combined for supers.

Strengths:
1. Allows detailed evaluation of the effects of high emitters and the effect of control programs (i.e., I/M).
2. Accounts for mileage impact on emission rates.
3. Recognizes that technology changes/improvements impact deterioration rates and creates a structure to account for the effect.

Weaknesses:
1. Emitter groups should be statistically chosen instead of based on emission standards. Technology groupings need to be selected based on emission performance. User input in MOBILE has no impact on the assumptions used for the base emission rates.
2. Mix and match approach is not defensible. Should use the same data set for all analyses of regime sizes and emission levels.
3. Could double-count impact of emission deterioration and regime growth.
4. Many of the assumptions on when technology changed appear arbitrary and do not account for differential performance that may occur within the defined groups (distributional effects).

Alternatives:
1. Possibly develop new emitter groups and technology groups.
2. Incorporate this function into the MOBILE code.
3. Use more regimes so that emissions are not a function of odometer in a given regime.
4. Use the same data set for all analyses. In cases where data are sparse, say so and do the best you can with what you've got. In some cases, if you are slim on data it probably means there are not that many in the fleet (e.g., 1981-82 MPFI vehicles) so the impact on fleet-average emission estimates (which is ultimately what we are trying to figure out here) is minimal.
On the other hand, there probably were sufficient data for the 1981-82 carbureted group to analyze by itself and not use the 1983+ DRs. If there was concern about the number of normals in this group (which was probably low, given the fact the 1981-82 vehicles were 10 years old when tested), why not pull in some of the FTP data from Ann Arbor testing to represent low mileage normals? It's really the emitter category growth functions that drive the deterioration rates anyway.
5. Need to discuss too many issues, e.g., making FI DRs the same for different groups does not seem reasonable unless it can be proven that the assumption holds.

NOx Emission Rates - The following procedure was used to develop NOx emitter category emission rates:
1. 1981-82 model-year normals used the MOBILE4.1 ZM and the DR was determined from the mean emission level and mileage of the Hammond sample.
2. 1983+ model-year normals used a covariance analysis that forced the deterioration rates to be equal for vehicles certified to 1.0 and 0.7 g/mi NOx. (This resulted in different zero-mile levels.)
3. High NOx emitters used DRs from the normal NOx emitters, and the ZM levels were back-calculated from the mean emission level and mileage of the Hammond sample.

Strengths:
1. Using the ZM levels ties the results to actual FTP data.
2. Covariance analysis can check the null hypothesis that DRs are the same for different standards.
3. Simple approach.

Weaknesses:
1. NOx emission estimates are significantly affected by several very high NOx emitters that are in the 8-12 g/mi range for IM240. It is unclear if they received accompanying FTP tests. Engine-out NOx for most of these vehicles should be in the 4-5 g/mi range. There has been no explanation as to why these were so high - there should be a review of the database to see if they were improperly tested at the I/M lane.
2. Why not compute DRs for highs? Method used could lead to unrealistically high zero-mile emissions for highs.
3.
The assumptions drive the emission estimates and it is not clear how well it represents real-world occurrences - how is this validated?

Alternatives:
1. Accept hypothesis that NOx emission DFs may be zero or negative?
2. Compute DFs for each group as shown in the data.

Growth Functions - As important as the emission level of each emitter category are the growth functions assigned to those categories. For MOBILE5, EPA wanted to base emission control system deterioration on both vehicle age and mileage. This was done by using data from the 1987 and later model years to establish the growth rate of non-normals (i.e., highs + very highs + supers) for mileages less than 50,000. For mileages above 50,000, data from the 1981-86 model years were used for the TBI and carbureted technology groups, while data from 1984-86 model years were used for MPFI vehicles. (EPA judged that pre-1984 MPFI represented "prototype" technology.)

The method used to establish the emitter category growth rates was based on first developing growth rates for the following emitter groups: supers, very highs + supers, and highs + very highs + supers. Once these were established, individual emitter category growth rates were determined by subtraction.

The analytical technique used to develop the growth functions for each of the above groups is best explained with an example. For the MPFI very highs+supers group, the following process was used. First, the <50,000 mile growth rate was established by determining the fraction of very highs+supers from all 1987+ MPFI data. In the Hammond sample, there were 155 very highs+supers out of 1,716 total vehicles in this group (i.e., 9.03%). This fraction was then divided by the average mileage of the group (28,182) to obtain a growth rate of 0.03205/10,000 miles.
The growth rate beyond 50,000 was calculated by first determining the fraction of very highs+supers in the 1984-1986 model year group (138/460, or 30.0%) and the average mileage of that group (68,464). The second growth rate was then calculated by linear extrapolation of a line connecting the fraction of very highs+supers at 50,000 miles (i.e., 5*0.03205, or 16.0%) and the point established from the >50,000 1984-86 group (i.e., 0.300 at 68,464 miles). This resulted in a >50,000 growth rate of 0.07568/10,000 miles.

Strengths:
1. Simple, yet accounts for non-linearity.
2. Straightforward technique, but it is unclear how this is a function of both age and mileage.
3. It gets numbers that can be used in the model.
4. It's an easy procedure and gives the 50,000-mile kink that has been assumed for years.
5. Simple to do.

Weaknesses:
1. Does not account for "shape" of curve at very high mileages.
2. This technique is too sensitive to post-50,000 mile sample distribution.
3. The 50,000-mile break point is essentially arbitrary. (The only objective reason to choose it is that it represents the certification mileage.) Because of relatively low mileages of samples, calculated slopes are extrapolated far beyond the range of the data and can drive important policy decisions. Second slope is likely overestimated based on the assumption of zero high emitters at zero miles. This assumes that there is a significant number of high emitters at 1,000 miles and minimizes the projected number of high emitters at 50,000 miles. This latter fact in turn maximizes second slope.
4. It does not account for variation in growth rate or zero population at zero miles.
5. The 50,000-mile kink is not supported by the data. This method artificially inflates the second DR.
6. Linear growth rate is not obvious; it's more likely to tail off at high mileage. The method has not been validated and it is extremely sensitive to the two bins under and over 50,000 miles.

Alternatives:
1.
Cap number of highs, very highs, and supers? Add third linear growth beyond 100,000 miles? Non-linear fit?
2. Statistical analysis of p(high,super) in 10,000-mile bins.
3. Break data into small, but meaningful mileage increments (e.g., 10,000 mile increments). Plot fractions of high emitters and emissions of normals and highs. Perform linear and non-linear regressions to determine if the slope at higher mileage is increasing or decreasing and, if so, whether the effect is statistically significant.
4. Break data into 10,000-mile bins (at least up to 100K), and determine the fraction of each emitter category in each bin. Use a regression analysis to develop emitter category growth functions.
5. Break the data into more bins and track the growth rate and revisit the assumptions about the model year groups included in the analysis (i.e., it seems to mix mileage and model year when it should just be mileage).
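The two-slope growth function worked through earlier in this section for the MPFI very highs+supers group can be reproduced with a short calculation. The sketch below uses only the counts and average mileages quoted in the text. The variable and function names are illustrative, and the zero fraction at zero miles is the assumption implied by dividing the observed fraction by the average mileage.

```python
BREAK_MILES = 50_000  # break point between the two linear segments

# Segment 1 (<50,000 miles): all 1987+ MPFI vehicles in the Hammond sample.
frac_low = 155 / 1716                      # about 9.03% very highs + supers
rate_low = frac_low / (28_182 / 10_000)    # fraction per 10,000 miles

# Segment 2 (>50,000 miles): 1984-86 MPFI vehicles.
frac_high = 138 / 460                      # 30.0% at an average of 68,464 miles
frac_at_break = rate_low * (BREAK_MILES / 10_000)
rate_high = (frac_high - frac_at_break) / ((68_464 - BREAK_MILES) / 10_000)


def fraction_very_high_plus_super(miles):
    """Piecewise-linear fraction of very highs + supers at a given odometer."""
    if miles <= BREAK_MILES:
        return rate_low * miles / 10_000
    return frac_at_break + rate_high * (miles - BREAK_MILES) / 10_000


print(f"{rate_low:.5f}")       # 0.03205 per 10,000 miles
print(f"{frac_at_break:.3f}")  # 0.160 at 50,000 miles
print(f"{rate_high:.5f}")      # 0.07568 per 10,000 miles
```

Evaluating `fraction_very_high_plus_super` well beyond 68,464 miles makes the extrapolation concern raised in the weaknesses concrete: the second slope is fixed by a single point, so any shift in the average mileage or the fraction of the 1984-86 group changes every projected value above 50,000 miles.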