United States Environmental Protection Agency
Air and Radiation
EPA420-R-95-004
September 1995
Development of a Methodology for Estimating Basic Emission Rates for
Use in the MOBILE Emission Factor Model
-------
SR95-09-03
Development of a Methodology for Estimating
Basic Emission Rates for Use in the
MOBILE Emission Factor Model
prepared for:
U.S. Environmental Protection Agency
under Contract No. 68-C4-0056
Work Assignment No. 0-06
September 30, 1995
prepared by:
Phil Heirigs
Robert G. Dulla
Sierra Research, Inc.
1801 J Street
Sacramento, CA 95814
(916)444-6666
Although the information described in this report has been funded wholly or in part by the United
States Environmental Protection Agency under Contract No. 68-C4-0056, it has not been
subjected to the Agency's peer and administrative review and is being released for information
purposes only. It therefore may not necessarily reflect the views of the Agency and no official
endorsement should be inferred.
-------
Table of Contents
1. Introduction
   Background
   Organization of the Report
2. Overview of Emission Factors Development for MOBILE5
   Database Adjustments
   Fuel/Temperature Adjustments
   IM240-to-FTP Correlations
   Correlation Adjustments
   TECH5 Inputs
3. Alternative Methods for Using IM240 Data to Develop Basic
   Emission Rates
   Survey Summary
   IM240-to-FTP Conversion Procedure
   TECH5 Inputs
   Treatment of Light-Duty Trucks
4. Adjusting I/M Data to a Non-I/M Basis
5. Incorporating "Off-Cycle" Emissions into MOBILE
6. Use of State-Generated IM240 Data in MOBILE
   Data Collection and Development of Simulated FTP Scores
   Development of Basic Emission Rates from Simulated FTP Scores
7. References
Appendix A - EIRG's Responses to Questionnaire on Developing
   Basic Emission Rates from IM240 Data
-------
List of Figures
2-1  Outline of Emission Factor Development for MOBILE5
2-2  Change to the HC Cold-Start Offset as a Function of
     Mileage for 1983+ MPFI Vehicles
3-1  Comparison of Very High + Super Emitter Fractions -
     TECH5 vs. Hammond Data for MPFI/CL Vehicles
4-1  Effect of an I/M Program on Emissions as a Function of
     Repair Cycle
List of Tables
2-1  Final Seasonal Fuel/Temperature Adjustments Used for
     MOBILE5a (Ratio of Lab/Indolene IM240 Scores to Lane/
     Tank Fuel IM240 Scores)
2-2  IM240-to-FTP Correlation Equations Developed for MOBILE5
3-1  Summary of IM240/Basic Emission Rate Survey Scores
3-2  Effect of Multiple Counting of Foreign Vehicles on the
     Distribution of Emitter Categories by Technology Type for
     1983 and Later Model Years
-------
1. INTRODUCTION
Background
With the release of MOBILE5, the U.S. Environmental Protection Agency
(EPA) made a significant departure from the historical method of using
its Emission Factors database to develop exhaust basic emission rate
(BER) equations (i.e., the non-I/M emission rates in the model). In
previous versions of MOBILE, data used for the BERs were collected
through a process often referred to as "surveillance" testing, where
vehicle owners are randomly contacted (usually by letter) and asked to
give up their cars for a week of testing. Over the years, EPA has
become concerned that the vehicles it receives for the Emission Factors
testing are not representative of the in-use fleet, particularly with
respect to the fraction of poorly maintained, high-emitting vehicles.
This has been primarily attributed to a sample selection bias, e.g., if
a vehicle owner knows that his or her car has been poorly maintained or
has been tampered with, he or she will not voluntarily submit it for
emissions testing.
To overcome sample bias concerns (and to provide a much larger sample
for analysis), EPA used IM240 emissions data collected during the
initial two years of an inspection and maintenance (I/M) program in
Hammond, IN, to develop the exhaust basic emission rate equations for
MOBILE5a.* It was felt that this approach would provide a less biased
sample because all vehicle owners had to participate in the state-run
portion of the program. EPA then recruited vehicles from the state-run
testing lanes for the EPA tests.
Because all of the exhaust emission relations contained in MOBILE (e.g.,
temperature corrections, speed corrections, etc.) are based on FTP
testing with certification fuel (Indolene), a means to convert the IM240
data collected at the lane on tank fuel to an FTP/Indolene basis was
needed. This conversion process was a multi-step procedure, consisting
of the steps listed below.
• Factors that accounted for the differences in ambient
  temperatures and fuel characteristics between conditions
  experienced during IM240 testing at the I/M lane and IM240
  testing in the laboratory were developed from a subset of
  Hammond lane vehicles.
• Those factors were used to convert all the Hammond lane IM240
  data (tested with tank fuel) to a laboratory/Indolene IM240
  basis.
• Correlation equations between IM240 emissions on Indolene
  measured in the lab and FTP values on Indolene in the lab were
  developed from a sample of vehicles.
• These correlation equations were then applied to all of the
  Hammond IM240 data (first adjusted for fuel and temperature
  differences) to put all data on an FTP/Indolene basis.
* Vehicles were tested in their first I/M "cycle," and therefore the
data represent emissions from a non-I/M fleet.
-------
Once the IM240-to-FTP conversion process described above was completed,
the TECH5 model was used to calculate the BER equations (zero-mile level
and deterioration rates) for MOBILE. The TECH model uses a "regime"
approach to develop emission rates (as a function of vehicle mileage) by
model-year group (i.e., 1981-1982 and 1983+) and technology (i.e.,
closed-loop multipoint fuel injection (MPFI/CL), closed-loop
throttle-body injection (TBI/CL), closed-loop carbureted (CARB/CL), and
open-loop). Four emitter groups (or regimes) are defined in the TECH model:
normals, highs, very highs, and supers. Emission rates (by model-year
group/technology) are determined by multiplying the emission rate of
each emitter category by the fraction of each emitter category making up
the fleet at mileage intervals corresponding to vehicle age. Thus, two
primary inputs to the TECH model are the emitter-category emission rates
and the emitter-category population growth rates. Once the model year
group/technology emission rates are calculated, model-year-specific
emission factors (which are input to MOBILE5a) are generated by
weighting the emission rates of each group by its expected fraction of
the fleet.
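The regime weighting described above can be illustrated with a short Python sketch. It is only an illustration of the arithmetic, not the TECH5 code; the function name and every numeric value are hypothetical placeholders and are not taken from this report.

```python
def fleet_emission_rate(rates, fractions):
    """Weighted emission rate (g/mi) for one model-year group/technology
    at one mileage: sum over regimes of (regime rate x regime fraction)."""
    if abs(sum(fractions.values()) - 1.0) > 1e-6:
        raise ValueError("emitter-category fractions must sum to 1")
    return sum(rates[regime] * fractions[regime] for regime in rates)

# Hypothetical HC emission rates (g/mi) and fleet fractions at one mileage
example_rates = {"normal": 0.3, "high": 1.2, "very_high": 3.0, "super": 15.0}
example_fractions = {"normal": 0.80, "high": 0.10, "very_high": 0.09, "super": 0.01}

example_rate = fleet_emission_rate(example_rates, example_fractions)
print(round(example_rate, 3))
```

Repeating the calculation at successive mileage intervals, with the fractions updated by the growth functions, would trace out the emission rate as a function of mileage.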
Although the IM240-to-FTP conversion approach provides a considerably
larger sample from which to develop BER equations for the MOBILE model,
several potential shortcomings have been identified in evaluations
sponsored by the American Petroleum Institute (API).^1,2* Thus, in Work
Assignment 0-06 of contract #68-C4-0056, EPA directed Sierra Research,
Inc. (Sierra)** to perform an evaluation of ways in which the use of
IM240 data for the development of basic emission rates could be
improved. In addition, the Work Assignment called for an assessment of
how IM240 data collected in an I/M area could be adjusted to a non-I/M
basis, recommendations for incorporating off-cycle effects into the
MOBILE model, and a review of methodologies by which state-generated
IM240 data could be used to develop user-input basic emission rates for
MOBILE. This report documents the evaluations performed under this Work
Assignment.
* Superscripts denote references listed in Section 7 of this report.
** Sierra received assistance from subcontractors Air Improvement
Resource (AIR) and Energy and Environmental Analysis (EEA) during the
performance of this work assignment.
-------
Organization of the Report
Following this introduction, Section 2 provides an overview of the
methods used to develop basic emission rate equations for MOBILE5 from
the Hammond IM240 data. Section 3 follows with a discussion of
alternative methods for developing basic emission rates from IM240 data.
An assessment of methods to adjust IM240 data collected in an I/M area
to a non-I/M basis is contained in Section 4, while recommendations for
incorporating off-cycle emissions into the MOBILE model are presented in
Section 5. Section 6 is a discussion of how IM240 data collected by
states could be used to develop user-input basic emission rates for
MOBILE, and Section 7 lists the references cited in this report.
###
-------
2. OVERVIEW OF EMISSION FACTORS DEVELOPMENT
FOR MOBILE5
Before proposing a method (or methods) to develop basic emission rates
for the MOBILE model from IM240 data, it is useful to review the
procedure used to generate basic emission rates for MOBILE5. That
approach, which is diagrammed in Figure 2-1, was based on converting
IM240 data collected in Hammond, IN, to an FTP basis prior to the
development of inputs for the TECH5 model. The conversion process
(which consisted of a number of individual adjustments in addition to
the development of IM240-to-FTP correlation equations) and the TECH5
inputs developed from those converted data are described in this section
of the report.
Database Adjustments
Prior to the development and application of IM240-to-FTP correlations,
several adjustments were made to the IM240 database so that it better
reflected a national average mix of domestic and foreign vehicles. In
addition, vehicles that had missing or suspicious odometer readings were
deleted, as were vehicles tested in March and April on days in which the
temperature was 25°F or more above the monthly average. These
adjustments are described below.
Foreign Manufacturers - Because the vehicles tested in Hammond did not
accurately reflect the national average fraction of foreign vehicles,
each foreign vehicle in the database was counted two to four times.
This adjustment increased the 1981 and later model year light-duty
vehicle sample size from 6,597 to 7,821.
Missing or Suspicious Mileage - A number of vehicles in the Hammond
database had '0' or missing mileage and were deleted. In addition,
vehicles that were coded as having an odometer reading > 300,000 miles
were deleted. This adjustment decreased the database from 7,821 to
6,999 records.
Seasonal Outliers - Data collected on 14 test dates in March and April
when the ambient temperature was 25°F or more above the monthly average
were deleted because many of those vehicles were statistical outliers.
(Excessive purge was thought to be influencing the IM240 results.) This
affected a relatively small number of vehicles, and it resulted in
decreasing the database from 6,999 to 6,826 records.
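The mileage and seasonal-outlier screens described above amount to simple record filters. The sketch below shows one way they might be expressed; the record layout, field names, and sample dates are hypothetical, and the foreign-vehicle weighting step is not shown.

```python
def screen_records(records, excluded_dates=frozenset(), max_odometer=300_000):
    """Drop records with missing or zero mileage, an odometer reading above
    max_odometer, or a test date flagged as a seasonal outlier."""
    kept = []
    for rec in records:
        odometer = rec.get("odometer")
        if not odometer:                      # None or 0: missing mileage
            continue
        if odometer > max_odometer:           # suspicious mileage
            continue
        if rec.get("test_date") in excluded_dates:
            continue                          # unusually warm spring test date
        kept.append(rec)
    return kept

# Hypothetical records; dates and readings are illustrative only
sample = [
    {"id": 1, "odometer": 45_000, "test_date": "03-12"},
    {"id": 2, "odometer": 0, "test_date": "05-02"},
    {"id": 3, "odometer": 350_000, "test_date": "06-20"},
    {"id": 4, "odometer": 82_000, "test_date": "04-07"},
    {"id": 5, "odometer": None, "test_date": "07-01"},
]
screened = screen_records(sample, excluded_dates={"04-07"})
print([rec["id"] for rec in screened])
```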
-------
Figure 2-1
Outline of Emission Factor Development for MOBILE5*
1. IM240 data collected in Hammond, Indiana
2. Database adjustments: non-representative foreign manufacturers,
   missing/suspicious mileage, seasonal outliers
3. Fuel/temperature adjustments: applied to get lane/tank fuel IM240s
   on a lab/Indolene basis
4. IM240-to-FTP correlations: lab/Indolene IM240s correlated with
   lab/Indolene FTPs; cold-start function; application of residuals
5. Predicted FTP scores from IM240 data
6. TECH5 inputs: emitter category emission levels and emitter category
   growth functions
* Discussion of components in shaded boxes follows.
-------
Fuel/Temperature Adjustments
Because EPA wished to develop the IM240-to-FTP correlations based on
vehicles IM240 tested in a laboratory with Indolene, a method was needed
to account for the differences between the lane and the lab before the
correlation equations were applied to the Hammond lane IM240 data. For
the Hammond database, it was felt that those differences were primarily
related to tank fuel versus Indolene and the temperature differences
occurring between the lane and the lab. (However, a number of other
differences could also impact test variability between the lane and the
lab, e.g., vehicle preconditioning procedures, inconsistent dynamometer
settings, how well the IM240 speed-time trace is followed, etc.)
The fuel/temperature adjustments prepared for MOBILE5 were based on a
subset of the Hammond vehicles that were tested at the lane on tank fuel
and at the lab on Indolene. Adjustment factors were developed by season
(i.e., March-April, May-June, July-September, and October-February) and
the following emitter categories:
• Normal HC/CO - lane IM240 ≤ 1.64 g/mi HC and ≤ 13.6 g/mi CO,
• High HC/CO - lane IM240 > 1.64 g/mi HC or > 13.6 g/mi CO,
• Normal NOx - lane IM240 ≤ 2.0 g/mi NOx, and
• High NOx - lane IM240 > 2.0 g/mi NOx.
Once the data were segregated as outlined above, the mean emission
levels for the lane/tank fuel scores and the lab/Indolene scores were
determined. Adjustment factors were then developed from the ratio of
these mean values. A summary of those adjustment factors is shown in
Table 2-1.
Table 2-1
Final Seasonal Fuel/Temperature Adjustments Used for MOBILE5a
(Ratio of Lab/Indolene IM240 Scores to Lane/Tank Fuel IM240 Scores)

            Emitter     Seasonal Adjustment Factor
Pollutant   Group       Mar-Apr   May-Jun   Jul-Sep   Oct-Feb
HC          Normal      0.766     0.884     0.823     0.880
HC          High        0.851     0.940     0.935     1.137
CO          Normal      1.072     1.007     0.792     1.036
CO          High        0.934     1.038     0.880     1.074
NOx         Normal      0.809     0.825     0.913     0.862
NOx         High        0.784     0.736     0.669     0.826
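As a rough illustration of how the Table 2-1 factors would be applied to a single lane test, the following Python sketch classifies a vehicle by its lane IM240 scores and multiplies each score by the matching seasonal factor. The function names are hypothetical, and the sketch assumes that the joint HC/CO emitter group selects the factor for both HC and CO.

```python
# Table 2-1 factors: ratio of lab/Indolene to lane/tank-fuel IM240 scores
FACTORS = {
    ("HC", "normal"):  {"Mar-Apr": 0.766, "May-Jun": 0.884, "Jul-Sep": 0.823, "Oct-Feb": 0.880},
    ("HC", "high"):    {"Mar-Apr": 0.851, "May-Jun": 0.940, "Jul-Sep": 0.935, "Oct-Feb": 1.137},
    ("CO", "normal"):  {"Mar-Apr": 1.072, "May-Jun": 1.007, "Jul-Sep": 0.792, "Oct-Feb": 1.036},
    ("CO", "high"):    {"Mar-Apr": 0.934, "May-Jun": 1.038, "Jul-Sep": 0.880, "Oct-Feb": 1.074},
    ("NOx", "normal"): {"Mar-Apr": 0.809, "May-Jun": 0.825, "Jul-Sep": 0.913, "Oct-Feb": 0.862},
    ("NOx", "high"):   {"Mar-Apr": 0.784, "May-Jun": 0.736, "Jul-Sep": 0.669, "Oct-Feb": 0.826},
}

def season_for_month(month):
    """Map a calendar month (1-12) to the seasons used in Table 2-1."""
    if month in (3, 4):
        return "Mar-Apr"
    if month in (5, 6):
        return "May-Jun"
    if month in (7, 8, 9):
        return "Jul-Sep"
    return "Oct-Feb"

def adjust_to_lab_basis(hc, co, nox, month):
    """Convert lane/tank-fuel IM240 scores (g/mi) to a lab/Indolene basis."""
    season = season_for_month(month)
    # Assumption: the joint HC/CO group applies to both the HC and CO factors
    hc_co_group = "high" if (hc > 1.64 or co > 13.6) else "normal"
    nox_group = "high" if nox > 2.0 else "normal"
    return {
        "HC": hc * FACTORS[("HC", hc_co_group)][season],
        "CO": co * FACTORS[("CO", hc_co_group)][season],
        "NOx": nox * FACTORS[("NOx", nox_group)][season],
    }

print(adjust_to_lab_basis(1.0, 10.0, 1.0, month=7))
```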
-------
IM240-to-FTP Correlations
Once the Hammond lane IM240 data were adjusted to a lab/Indolene basis,
correlation equations relating the IM240 to the FTP were applied to the
data. The IM240-to-FTP correlations were based on a regression analysis
of data collected from vehicles tested over the IM240 on Indolene and
the FTP on Indolene. (The database used for the correlation analysis
included vehicles from the Hammond program as well as vehicles tested in
Ann Arbor.) The regressions were performed according to the following
model-year groups and technology types:
1981-1982,
1981+ open-loop,
1983+ carbureted/closed-loop,
1983+ throttle-body injection/closed-loop, and
1983+ multipoint fuel-injection/closed-loop.
The HC and CO correlations were performed in log space with a cold-start
offset ("X" in the equation below) that varied by technology, while the
NOx correlations were based on a simple linear equation without a cold-
start offset value:
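$$\log_{10}(\mathrm{FTP}_{HC/CO} - X) = b + m \cdot \log_{10}(\mathrm{IM240}_{HC/CO})$$
$$\mathrm{FTP}_{NOx} = b + m \cdot \mathrm{IM240}_{NOx}$$
For cases in which $(\mathrm{FTP}_{HC/CO} - X) < 0.01$, the IM240 score was substituted
for $(\mathrm{FTP}_{HC/CO} - X)$. In this way, errors resulting from taking the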
logarithm of a negative number were avoided. In addition, if the
intercept term was not statistically different from zero at the 95%
confidence level, the regressions were re-run without an intercept.
Table 2-2 summarizes the results of the correlation analysis.
Table 2-2
IM240-to-FTP Correlation Equations Developed for MOBILE5

            Model Year/
Pollutant   Technology         N     X       b         m        R2
HC          1981-1982          58    0.309   0.1382    1.0715   0.909
HC          1981+ Open-Loop    24    0.315   0.1448    0.9654   0.879
HC          1983+ CARB/CL      73    0.195   0.0000    0.9745   0.905
HC          1983+ TBI/CL       224   0.180   0.0000    0.9840   0.873
HC          1983+ MPFI/CL      211   0.222   0.0000    0.9520   0.915
CO          1981-1982          58    2.140   0.0000    1.004    0.943
CO          1981+ Open-Loop    24    1.640   0.3090    0.851    0.904
CO          1983+ CARB/CL      73    1.579   0.0000    0.906    0.873
CO          1983+ TBI/CL       224   1.541   -0.1386   1.072    0.782
CO          1983+ MPFI/CL      211   1.696   0.0000    0.886    0.780
NOx         1981-1982          58    NA      0.2534    0.7737   0.825
NOx         1981+ Open-Loop    24    NA      0.0000    0.9306   0.976
NOx         1983+ CARB/CL      73    NA      0.0000    0.8925   0.961
NOx         1983+ TBI/CL       224   NA      0.0767    0.8234   0.901
NOx         1983+ MPFI/CL      266   NA      0.1250    0.7730   0.825
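To make the use of the Table 2-2 coefficients concrete, the sketch below solves the log-space equation for the predicted FTP value and evaluates the linear NOx form. It uses the 1983+ MPFI/CL coefficients from the table; the function names are hypothetical, and the mileage adjustment to X and the regression residuals are not included.

```python
import math

# Coefficients for 1983+ MPFI/CL vehicles, from Table 2-2
HC_MPFI = {"X": 0.222, "b": 0.0000, "m": 0.9520}
CO_MPFI = {"X": 1.696, "b": 0.0000, "m": 0.886}
NOX_MPFI = {"b": 0.1250, "m": 0.7730}   # no cold-start offset for NOx

def ftp_from_im240_log(im240, X, b, m):
    """HC or CO: invert Log10(FTP - X) = b + m*Log10(IM240) for FTP (g/mi)."""
    if im240 <= 0:
        raise ValueError("IM240 score must be positive to take its logarithm")
    return X + 10 ** (b + m * math.log10(im240))

def ftp_from_im240_linear(im240, b, m):
    """NOx: FTP = b + m*IM240 (g/mi)."""
    return b + m * im240

print(ftp_from_im240_log(1.0, **HC_MPFI))      # HC at a lab IM240 of 1.0 g/mi
print(ftp_from_im240_linear(1.0, **NOX_MPFI))  # NOx at a lab IM240 of 1.0 g/mi
```

With b = 0 and an IM240 score of 1.0 g/mi, the logarithmic term vanishes, so the predicted HC value is simply X + 1.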
-------
Correlation Adjustments
When the correlation equations were applied to the lane IM240 scores
(which had been corrected to a lab/Indolene basis), two additional
adjustments were made. First, the cold-start offset was assumed to be a
function of vehicle odometer (although the correlations were performed
with a constant X value); second, regression residuals were randomly
applied to each data point.
Cold-Start Offset - The cold-start offset (X) values used in the above
correlation equations were developed, by technology group, from the mean
value of the difference between the FTP and the IM240 based on normal
emitters with FTP values greater than the IM240 (i.e., the value of
(FTP - IM240) was determined for each normal emitter, and the mean of
the positive results was used as X). When the correlation equations
were applied to the IM240 data, the value of X was adjusted to account
for the effects of aging and mileage. Development of this adjustment
for 1983+ model years is described below. (A slightly different
procedure was used for 1981-1982 model-year vehicles.)
The value of X in the correlation equations reflects the cold-start
offset at the mean mileage of the correlation sample. At mileages below
this mean, it follows that X should be decreased by some amount to
account for the fact that the catalyst has been aged less and is
expected to be more active. (Alternatively, X should be increased at
mileages above the mean.) Thus, the cold-start offset is actually X
plus an increment that is a function of vehicle odometer, i.e.,
$$\text{X-Offset Function} = f(x) = X + f(\mathrm{Odometer})$$
EPA has defined f(Odometer) in the above equation to be "the difference
of the model year means regression for normal emitters and a 'New' line
created by connecting a point on the model year means regression line at
the mean mileage of the correlation sample with the zero mile level used
in MOBILE4.1." The X-offset function is therefore:
$$f(x) = X + ZML_{\mathrm{New}} - ZML_{\mathrm{MY\,Means}} + ODOM \cdot (DET_{\mathrm{New}} - DET_{\mathrm{MY\,Means}})$$
A plot of the two lines described above for HC from multipoint
fuel-injected vehicles is shown in Figure 2-2 ($X_{HC}$ = 0.222 g/mi for that
group).
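The X-offset function can be sketched numerically. Because the 'New' line is drawn from the MOBILE4.1 zero-mile level to the point on the model year means line at the mean mileage, the two lines cross at the mean mileage, and f(Odometer) reduces to (ZML_New - ZML_MYMeans) * (1 - ODOM / mean mileage). The Python sketch below relies on that reduction. The zero-mile levels (0.269 and 0.308 g/mi) and X (0.222 g/mi) are the values shown for HC from 1983+ MPFI vehicles; the mean mileage of 5.0 (in 10,000-mile units) is a hypothetical placeholder, since the report does not state it here.

```python
def cold_start_offset(odometer, x, zml_new, zml_means, mean_mileage):
    """Return X + f(Odometer), with odometer and mean_mileage in 10,000 miles.

    f(Odometer) is the 'New' line minus the model year means line; the two
    lines intersect at mean_mileage, so f is zero there.
    """
    f_odometer = (zml_new - zml_means) * (1.0 - odometer / mean_mileage)
    return x + f_odometer

X_HC = 0.222          # g/mi, 1983+ MPFI/CL (Table 2-2)
ZML_NEW = 0.269       # g/mi, MOBILE4.1 zero-mile level (Figure 2-2)
ZML_MEANS = 0.308     # g/mi, model year means regression intercept (Figure 2-2)
MEAN_MILEAGE = 5.0    # hypothetical mean mileage, in 10,000 miles

for odo in (0.0, 5.0, 10.0):
    print(odo, round(cold_start_offset(odo, X_HC, ZML_NEW, ZML_MEANS, MEAN_MILEAGE), 3))
```

Under these assumptions the offset is below X at low mileage, equal to X at the mean mileage, and above X beyond it, which matches the behavior described in the text.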
Regression Residuals - Another adjustment made during the application of
the correlation equations was the addition of randomized regression
residuals, i.e.,
$$\log_{10}(\mathrm{FTP}_{HC/CO} - X) = b + m \cdot \log_{10}(\mathrm{IM240}_{HC/CO}) + res$$
$$\mathrm{FTP}_{NOx} = b + m \cdot \mathrm{IM240}_{NOx} + res$$
where "res" represents regression residuals from the correlation sample.
-------
Figure 2-2
Change to the HC Cold-Start Offset as a Function
of Mileage for 1983+ MPFI Vehicles
[Plot against odometer (10,000 miles) showing the model year means
regression line and the 'New' line based on the MOBILE4.1 ZML, which
intersect at the mean mileage. Cold-start offset = X + f(Odometer).
At 0 miles, f(Odometer) = 0.269 - 0.308 = -0.039 g/mi. Below the mean
mileage the cold-start offset is decreased relative to X; above the
mean mileage it is increased relative to X.]
According to EPA, adding the residuals randomly to the FTP emission
levels predicted by the correlation equations attempts to restore a
distribution of predicted FTP values for a given IM240 score.
Otherwise, there will be a single predicted FTP value for each IM240
score. A distribution of predicted FTP scores and emission levels is
important for some analyses, such as the determination of I/M credits.
For example, if residuals were not applied, 100% of the FTP emissions
from a certain emitter group could be identified on the basis of the
IM240 score.
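The effect of adding randomized residuals can be illustrated as follows. This is a sketch only: it assumes each residual is drawn uniformly, with replacement, from the pool of correlation-sample residuals, and the residual values shown are hypothetical. For HC and CO the residual is added in log space, as in the equations above.

```python
import math
import random

def simulate_ftp(im240, x, b, m, residuals, rng):
    """Predicted HC or CO FTP (g/mi) with one randomly drawn log-space residual."""
    res = rng.choice(residuals)
    return x + 10 ** (b + m * math.log10(im240) + res)

# Hypothetical log-space residuals standing in for the correlation sample
hypothetical_residuals = [-0.10, -0.03, 0.0, 0.04, 0.12]

rng = random.Random(0)  # seeded so the illustration is repeatable
draws = [
    simulate_ftp(1.0, 0.222, 0.0, 0.952, hypothetical_residuals, rng)
    for _ in range(5)
]
print([round(d, 3) for d in draws])
```

Every call uses the same IM240 score of 1.0 g/mi, yet the predicted FTP values differ from draw to draw, which is the distribution the residuals are intended to restore.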
TECH5 Inputs
Once the Hammond data were converted to predicted FTP scores, the
results were used to develop inputs to the TECH5 model (i.e., emitter-
category emission rates and growth functions). The following emitter
categories were used in TECH5 for HC and CO emissions:
• Normal HC/CO - HC ≤ 0.82 g/mi and CO ≤ 10.2 g/mi,
• High HC/CO - HC > 0.82 g/mi or CO > 10.2 g/mi,
• Very High HC/CO - HC > 1.64 g/mi or CO > 13.6 g/mi, and
• Super HC/CO - HC > 10.0 g/mi or CO > 150.0 g/mi.
NOx emissions were analyzed separately from HC and CO, with only two
emitter categories being defined: normals (≤ 2.0 g/mi) and highs
(> 2.0 g/mi).
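The emitter-category cutpoints above can be expressed as a small classifier. The sketch assumes that the categories are mutually exclusive and that a vehicle is assigned to the most severe category whose cutpoint it exceeds; the function names are hypothetical.

```python
def classify_hc_co(hc, co):
    """Assign an HC/CO emitter category from predicted FTP scores (g/mi)."""
    if hc > 10.0 or co > 150.0:
        return "super"
    if hc > 1.64 or co > 13.6:
        return "very high"
    if hc > 0.82 or co > 10.2:
        return "high"
    return "normal"

def classify_nox(nox):
    """Assign a NOx emitter category from a predicted FTP score (g/mi)."""
    return "high" if nox > 2.0 else "normal"

print(classify_hc_co(0.5, 5.0), classify_hc_co(0.9, 5.0),
      classify_hc_co(0.5, 14.0), classify_hc_co(11.0, 5.0))
```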
-------
The data were also segregated by the following technology groups:
open-loop,
carbureted/closed-loop,
throttle-body injection (TBI)/closed-loop, and
multipoint fuel-injection (MPFI)/closed-loop.
Finally, emission rates were determined separately for 1981-1982 model
year vehicles and 1983+ model year vehicles.
HC/CO Emission Rates - For HC and CO, the emitter category emission
rates (i.e., zero-mile level (ZM) and deterioration rates (DRs)) were
constructed as outlined below.
1. MOBILE4.1 ZMs were used for 1981-82 normals.
2. 1983+ DRs were used for 1981-82 normals, highs, and very highs.
3. Emission rates of normals were capped at the same rate for
1981-82 and 1983+ groups.
4. Normal caps were set at the maximum of the 1981-82 or 1983+
100,000-mile levels calculated from the 1981-82 and 1983+ ZM
and DR for normal emitters.
5. Deterioration rates that were negative and without significance
were assumed to be zero.
6. Regression of carburetor very highs was performed for 1983-1988
model years only (although the regression results were applied
to all 1983+ carbureted vehicles). Including 1989 resulted in
a negative ZM.
7. A covariance analysis was used for fuel-injected very highs
that resulted in the same DR but different ZM levels for the
1981-82 group and the 1983+ group. (This resulted in
substantially higher HC and CO emission rates from the 1983+
group compared to the 1981-82 group.)
8. All model years were combined for supers.
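Items 3 and 4 of the list above describe a cap on normal-emitter emission rates. A minimal sketch of that logic follows, assuming the usual linear form (emission rate = ZM + DR x mileage) with the DR expressed per 10,000 miles. All ZM and DR values are hypothetical placeholders, not values from the report.

```python
def level_at(zm, dr, miles_10k):
    """Linear emission rate (g/mi) at a mileage given in 10,000-mile units."""
    return zm + dr * miles_10k

def normal_cap(zm_8182, dr_8182, zm_83plus, dr_83plus):
    """Cap = larger of the two groups' 100,000-mile levels for normals."""
    return max(level_at(zm_8182, dr_8182, 10.0),
               level_at(zm_83plus, dr_83plus, 10.0))

def capped_normal_rate(zm, dr, miles_10k, cap):
    """Normal-emitter rate, held at the cap once the linear level exceeds it."""
    return min(level_at(zm, dr, miles_10k), cap)

# Hypothetical ZM (g/mi) and DR (g/mi per 10,000 miles) values
cap = normal_cap(0.30, 0.02, 0.25, 0.03)
print(cap)
print(capped_normal_rate(0.30, 0.02, 5.0, cap))
print(capped_normal_rate(0.30, 0.02, 15.0, cap))
```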
NOx Emission Rates - The following procedure was used to develop NOx
emitter category emission rates:
1. 1981-82 model year normals used the MOBILE4.1 ZM and the DR was
determined from the mean emission level and mileage of the
Hammond sample.
2. 1983+ model year normals used a covariance analysis that forced
the deterioration rates to be equal for vehicles certified to
1.0 and 0.7 g/mi NOx. (This resulted in different zero-mile
levels.)
3. High NOx emitters used DRs from the normal NOx emitters, and
the ZM levels were back-calculated from the mean emission level
and mileage of the Hammond sample.
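The back-calculation in item 3 can be written out directly: if the emission rate is assumed linear in mileage, the zero-mile level is the sample mean emission level minus the deterioration accumulated at the sample mean mileage. The sketch below uses hypothetical numbers and assumes the DR is expressed per 10,000 miles.

```python
def back_calculated_zm(mean_emission, mean_miles_10k, dr):
    """ZM (g/mi) implied by a sample mean emission level and mean mileage,
    given a deterioration rate borrowed from another emitter category."""
    return mean_emission - dr * mean_miles_10k

# Hypothetical high-NOx sample: mean of 3.0 g/mi at 60,000 miles, with a
# DR of 0.05 g/mi per 10,000 miles taken from the normal emitters
zm_high_nox = back_calculated_zm(3.0, 6.0, 0.05)
print(round(zm_high_nox, 3))
```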
Growth Functions - Equally as important as the emission level of each
emitter category are the growth functions assigned to those categories.
For MOBILE5, EPA wanted to base emission control system deterioration on
both vehicle age and mileage. This was done by using data from 1987 and
later model years to establish the growth rate of non-normals (i.e.,
highs + very highs + supers) for mileages less than 50,000. For
mileages above 50,000, data from the 1981-86 model years were used for
the TBI and carbureted technology groups, while data from 1984-86 model
years were used for MPFI vehicles. (EPA judged that pre-1984 MPFI
represented "prototype" technology.)
The method used to establish the emitter-category growth rates was based
on first developing growth rates for the following emitter groups:
supers,
very highs + supers, and
highs + very highs + supers.
Once these were established, individual emitter-category growth rates
were determined by subtraction.
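The subtraction step can be shown with a short sketch. The cumulative growth rates passed in below are hypothetical values chosen for illustration (fraction of fleet per 10,000 miles), and the function name is an assumption.

```python
def split_cumulative_rates(supers, very_highs_plus_supers, all_non_normals):
    """Recover individual emitter-category growth rates from the three
    cumulative groups by successive subtraction."""
    return {
        "super": supers,
        "very_high": very_highs_plus_supers - supers,
        "high": all_non_normals - very_highs_plus_supers,
    }

# Hypothetical cumulative growth rates per 10,000 miles
individual = split_cumulative_rates(0.001, 0.032, 0.050)
print({k: round(v, 3) for k, v in individual.items()})
```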
The analytical technique used to develop the growth functions for each
of the above groups is best explained with an example. For the MPFI
very highs+supers group, the following process was used. First, the
<50,000 mile growth rate was established by determining the fraction of
very highs+supers from all 1987+ MPFI data. In the Hammond sample,
there were 155 very highs+supers out of 1,716 total vehicles in this
group (i.e., 9.03%). This fraction was then divided by the average
mileage of the group (28,182) to obtain a growth rate of 0.03205/10,000
miles. The growth rate beyond 50,000 miles was calculated by first
determining the fraction of very highs+supers in the 1984-1986 model
year group (138/460, or 30.0%) and the average mileage of that group
(68,464). The second growth rate was then calculated by linear
extrapolation of a line connecting the fraction of very highs+supers at
50,000 miles (i.e., 5*0.03205, or 16.0%) and the point established from
the >50,000 1984-86 group (i.e., 0.300 at 68,464 miles). This resulted
in a >50,000 growth rate of 0.07568/10,000 miles.
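The arithmetic in this example can be reproduced with a few lines of Python. The counts and mileages are those quoted above for the MPFI very highs+supers group; the function name and argument names are hypothetical.

```python
def two_segment_growth(n_low, total_low, mean_miles_low,
                       n_high, total_high, mean_miles_high,
                       breakpoint_miles=50_000):
    """Return (growth rate below breakpoint, growth rate above breakpoint),
    each as a fleet fraction per 10,000 miles."""
    # Below the breakpoint: fraction divided by average mileage
    rate_low = (n_low / total_low) / (mean_miles_low / 10_000)
    # Fraction reached at the breakpoint under the first growth rate
    fraction_at_break = rate_low * breakpoint_miles / 10_000
    # Above the breakpoint: slope of the line from the breakpoint to the
    # (mean mileage, fraction) point of the higher-mileage group
    rate_high = ((n_high / total_high) - fraction_at_break) / (
        (mean_miles_high - breakpoint_miles) / 10_000)
    return rate_low, rate_high

rate_low, rate_high = two_segment_growth(155, 1716, 28_182, 138, 460, 68_464)
print(round(rate_low, 5), round(rate_high, 4))
```

The two results agree with the 0.03205 and 0.07568 per 10,000 miles quoted in the text to within rounding.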
###
-------
3. ALTERNATIVE METHODS FOR USING IM240 DATA TO
DEVELOP BASIC EMISSION RATES
In developing alternatives to the MOBILE5 methodology for using IM240
data to generate basic emission rate equations, the following approach
was used. First, members of an ad hoc Emission Inventory Review Group
(EIRG)* were asked to provide their thoughts on the strengths and
weaknesses of the MOBILE5 methodology. In addition, they were asked to
suggest alternatives to that methodology. Their responses were then
used to formulate an informal survey in which the MOBILE5 methods and
proposed alternatives were ranked. The results of that survey helped
focus the development of the recommended alternatives presented in this
section.
The discussion below first presents a brief summary of the survey
responses. That is followed by a description of each of the adjustments
and calculations performed in the MOBILE5 approach, with a summary of
the concerns and limitations expressed by the EIRG. Recommended
alternatives conclude each discussion point. This discussion is
structured in two parts: (1) the IM240-to-FTP conversion procedure, and
(2) inputs to the TECH5 model. Although the Scope of Work for this
project called for the development of a single methodology for
estimating MOBILE basic emission rates from IM240 data, in some cases it
is impossible to recommend a single method without first reviewing the
results of several alternatives. Thus, some portions of the following
discussion contain more than one recommended approach.
Survey Summary
As described above, a questionnaire was circulated to members of the
EIRG which summarized the methods used to develop basic emission rate
equations for MOBILE5 and asked for a listing of strengths, weaknesses,
and alternatives to each specific adjustment that was performed. (For a
summary of the methods used for MOBILE5, refer to Section 2 of this
report.) The responses to that questionnaire, which are summarized in
Appendix A, helped form the basis of a survey that was distributed to
the EIRG. In that survey, participants were asked to rank the
importance of specific data adjustments and alternative methods that
could be used in the development of basic emission rate equations from
IM240 data. The purpose of the survey was to provide a more objective
ranking of the importance of adjustments and alternative methods, which
would then help focus efforts to expand on some of the EIRG's
recommendations.
* The EIRG was made up of individuals responsible for emission factor
development from EPA's Office of Mobile Sources, the California Air
Resources Board's Mobile Source Division, EEA, AIR, and Sierra.
-------
Although surveys were not filled out by the entire EIRG, responses were
received from six participants. A summary of those responses, with an
average score for each question/recommendation, is contained in
Table 3-1. In general, the results indicate that alternatives to the
methods used to develop MOBILE5 basic emission rate equations from the
Hammond IM240 data are preferred.
IM240-to-FTP Conversion Procedure
The first step in the development of basic emission rates for MOBILE5
was to convert the lane IM240 data collected in Hammond to an FTP basis.
That process consisted of several steps, including adjustments for a
non-representative mix of foreign and domestic vehicles, corrections for
suspicious or missing mileages, and corrections to get the lane IM240
data (collected with tank fuel) on a laboratory/Indolene basis (these
are thought of as temperature and fuel corrections). This last step was
necessary because the IM240-to-FTP correlations were based on laboratory
IM240 and FTP tests with a standard fuel (Indolene) at a standard
temperature. Finally, correlation equations were applied to the lane
data to generate simulated FTP scores for the entire lane IM240
database.
Below is a brief description of the methods used in the IM240-to-FTP
conversion process for MOBILE5. Following the description of each
adjustment/method is a summary of the concerns expressed by the EIRG and
recommended alternatives.
Database Adjustments - Foreign Manufacturers - Because the vehicles
tested in Hammond did not accurately reflect the national average
fraction of foreign vehicles, each foreign vehicle in the database was
counted two to four times.
Limitations and Concerns with Current Approach - The EIRG generally
agreed that sampling biases should be accounted for if it can be
established that durability differences are significant. The
foreign/domestic split is only one possible bias, and it is possible
that durability differences among engine families or among manufacturers
are equally important.
Recommended Alternative - The survey respondents reached no consensus
on how to proceed with a sample selection bias
correction. However, it is clear that the first step is to determine if
durability differences are significant. That can be done in a number of
different ways. For example, Table 3-2 presents the distribution of
emitter categories (i.e., normals, highs, very highs, and supers) with
and without the foreign vehicle adjustment used in MOBILE5. The table
indicates that the impact of not making this adjustment is most
pronounced for carbureted vehicles, with only slight changes occurring
in the distribution of emitter categories for the fuel-injected
technologies.
-------
Table 3-1
Summary of IM240/Basic Emission Rate Survey Scores
Adjustment/Methodology
Database Adjustments - Weighting of Foreign Vehicles
Is this adjustment needed?
If so, what is the best approach?
Current method
Manufacturer/engine family basis
Foreign/domestic with more emphasis on tech type
Database Adjustments - Missing or Suspicious Mileage
Is this adjustment needed?
If so, what is the best approach?
Current method
Change to an age-based analysis
Assign sample average mileage
Assign average mileage based on vehicle age
Database Adjustments - Seasonal Outliers
Is this adjustment needed?
If so, what is the best approach?
Current method
Limit use of data to FTP temperature ranges
Establish a temperature range for each RVP "season"
Temperature correct the outliers
Determine if purge is really higher during those periods
Only reject data on basis of statistical/engr analyses
Fuel/Temperature Adjustments
Is this adjustment needed?
If so, what is the best approach?
Current method
Choose records that are similar to FTP conditions
Multivariate analysis of differences
Quantify temperature effects independently of fuel
Statistical analysis of all external variables
Correlate FTP directly with lane IM240s
MOBILE temp/RVP factors to adjust lane IM240s
Split data into temperature regimes
IM240-to-FTP Correlations
What is the best approach for correlating IM240 with FTP?
Current method
Multiple correlations by emitter group
Explore different equational forms, choose best stats
Base choice of equational form on most random variance
Regress IM240 against individual FTP bags
Eliminate cold-start offset; use IM240 for FTP bags 2/3
Correlation Adjustments - Cold-Start Offset
What is the best way of determining a cold-start offset?
Current method
Regress bag 1 versus IM240 and use if statistically significant
Use IM240 for non-start emissions; FTP for cold start
Use IM240 for bag 2/3; bag 1 vs bag 2/3 from FTP data
Correlation Adjustments - Regression Residuals
Should regression residuals be used in IM240/FTP regressions?
If so, what is the best approach?
Add in randomized residuals when applying correlation eqn
Develop a probability distribution
Use a log-normal or Weibull distribution of residuals
TECH5 Inputs - Emission Rates
Should it be assumed that DRs are the same for different MYs?
Is there any basis for using MOBILE4.1 rates in this analysis?
Do the MY breakpoints adequately reflect developing versus
mature technology?
TECH5 Inputs - Growth Functions
What is the best way to develop emitter growth functions?
Current method
Add a third linear growth rate beyond 100,000 miles
Analyze the data in 10,000-mile bins
Respondent Scores (values listed in table row order)
DJB: 3 3 3 5 4 3 4 3 2 4 3 4 3 1 2 2 3 4 3 3 3 2 2 4 3 4 3 3 4 5 2 2 4 3 4 3 4 3 5 4 4 2 4 4
RAR: 2 2 4 3 5 2 3 4 5 5 4 2 3 3 5 3 3 1 1 1 4 5 5 5 5 1 5 3 3 4 2 1 1 1 3 5
JL: 3 4 4 4 2 4 4 4 1 5 4 4 4 4 3 2 2 5
LSC: 3 1 4 4 5 1 4 1 2 3 2 3 3 2 4 4 3 2 3 4 3 4 3 3 3 2 4 4 2 4 3 2 3 3 3 2 2 2 2 1 1 3 1 3 4
ROD: 5 1 5 4 4 1 4 4 5 5 2 1 5 5 3 5 5 1 1 5 3 4 1 4 4 1 3 3 2 4 5 0 3 4 5 2 1 2 3 1 1 1 1 1 5
PLH: 3 2 3 3 5 2 2 2 5 4 2 4 5 2 4 2 5 2 4 3 4 4 3 2 2 1 3 4 3 3 5 1 3 5 4 3 3 5 4 1 1 1 1 3 5
Average Score: 3.2 1.8 3.8 3.8 4.5 1.8 3.4 2.8 3.8 3.6 2.3 3.5 4.0 2.5 3.4 3.4 3.6 2.0 3.0 3.6 3.8 3.6 2.4 2.4 2.8 1.6 3.6 3.8 3.0 4.0 4.5 1.2 3.2 3.8 3.7 3.2 2.3 3.3 3.2 2.2 1.7 2.0 1.2 2.8 4.7
Table 3-2
Effect of Multiple Counting of Foreign Vehicles on the Distribution of
Emitter Categories by Technology Type for 1983 and Later Model Years
                        Total              Emitter Category
                        Data      ----------------------------------
Technology              Points    Normal    High    V. High    Super
Multiple Foreigns (a)
  MPFI/CL               2208      0.788     0.077   0.131      0.004
  TBI/CL                1991      0.718     0.141   0.135      0.007
  CARB/CL               1654      0.540     0.149   0.303      0.008
  OPEN LOOP              252      0.214     0.210   0.567      0.008
Single Foreigns
  MPFI/CL               1742      0.776     0.082   0.138      0.005
  TBI/CL                1873      0.722     0.138   0.133      0.007
  CARB/CL               1344      0.503     0.158   0.331      0.008
  OPEN LOOP              196      0.189     0.194   0.607      0.010
(a) Represents data used for MOBILE5.
It is recommended that a similar analysis (or some type of analysis of
variance) be performed on a manufacturer-specific basis to determine if
durability differences exist. Based on the data presented in Table 3-2,
it appears that this would be most important for carbureted vehicles.
It would also be useful to determine if manufacturer-specific (or
foreign/domestic) differences are more prevalent as a function of model
year (e.g., do early-1980s model year vehicles exhibit a greater
emissions difference than late-1980s model year vehicles?). If those
differences do exist, then the data should be weighted accordingly.
Database Adjustments - Missing or Suspicious Mileage - A number of
vehicles in the Hammond database had '0' or missing mileage and were
deleted. In addition, vehicles that were coded as having an odometer
reading > 300,000 miles were deleted.
Limitations and Concerns with Current Approach - The primary concern
with deleting these data points is that valid data are removed. This
may be a particular problem for high-mileage vehicles, which are badly
needed in the database.
Recommended Alternative - There was fairly strong sentiment that
corrections for missing and suspicious mileage should be made. In terms
of suspicious mileages, we recommend running an odometer "cleaning"
routine on all vehicles to identify vehicles with unusually high or low
mileage accumulation rates. That can be done by first estimating the
age of the vehicle at the time it was tested based on the difference
between the test date and the model year.* The age at the time of
testing can be used to flag vehicles with mileage accumulation rates
below 3,000 miles per year or above 30,000 miles per year for closer
inspection. (Clearly, other mileage accumulation cutpoints could be
used in this type of analysis.)
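
The odometer "cleaning" check described above can be sketched as follows. This is a minimal illustration, not the routine used for MOBILE5: the function names are hypothetical, the in-service date follows the model-year convention given in the report's footnote, and the 3,000 and 30,000 miles-per-year cutpoints are simply the example values from the text.

```python
from datetime import date

def vehicle_age_years(model_year, test_date):
    """Estimate vehicle age at test time (hypothetical helper).

    Assumes model years run October 1 to September 30 and that a vehicle
    enters service on April 1 of its model year, per the report's footnote.
    """
    start_of_model_year = date(model_year - 1, 10, 1)
    in_service = date(model_year, 4, 1)
    if test_date <= date(model_year, 9, 30):
        # Tested during its initial model year: assume service began midway
        # between the start of the model year and the test date.
        in_service = start_of_model_year + (test_date - start_of_model_year) / 2
    # Guard against zero or negative ages from bad dates.
    return max((test_date - in_service).days, 1) / 365.25

def flag_suspicious_mileage(odometer, model_year, test_date,
                            low=3000.0, high=30000.0):
    """Return True if miles per year falls outside the [low, high] cutpoints."""
    rate = odometer / vehicle_age_years(model_year, test_date)
    return rate < low or rate > high
```

Flagged records would then be set aside for closer inspection rather than deleted outright, consistent with the recommendation above.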
In the Hammond database, there were a significant number (about 10%) of
vehicles with missing odometer readings. Thus, some method to estimate
the mileage of those vehicles is recommended. The general consensus of
the EIRG was to assign those vehicles the average mileage of the
remaining vehicles in the database based on the age of the vehicle at
the time it was tested.
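
One way to implement the imputation favored by the EIRG is shown below. It is a sketch under stated assumptions: the record layout (a dictionary with "age" in whole years and "odometer" in miles, or None when missing) is hypothetical, and vehicles whose age group has no valid odometer readings are left unfilled.

```python
from collections import defaultdict

def impute_missing_mileage(records):
    """Fill missing odometer readings with the mean of same-age vehicles."""
    totals = defaultdict(lambda: [0.0, 0])  # age -> [sum of miles, count]
    for rec in records:
        if rec["odometer"] is not None:
            totals[rec["age"]][0] += rec["odometer"]
            totals[rec["age"]][1] += 1
    mean_by_age = {age: s / n for age, (s, n) in totals.items()}

    filled = []
    for rec in records:
        odometer = rec["odometer"]
        if odometer is None:
            # Remains None when no vehicle of this age has a valid reading.
            odometer = mean_by_age.get(rec["age"])
        filled.append({**rec, "odometer": odometer})
    return filled
```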
A broader issue related to this adjustment is whether to develop
emission factors based on vehicle age rather than accumulated mileage.
Both accumulated mileage and vehicle age play a role in emissions
deterioration and emitter-category growth functions. (Emitter-category
growth functions are discussed in detail below under "TECHS Inputs.")
To the extent that some deterioration of emission control systems is due
to weathering effects, emitter-category growth would be best
characterized by vehicle age. To the extent that deterioration is due
to vehicle use, emitter-category growth would be best characterized by
odometer reading. As currently structured, TECH5 and MOBILE5 use a
fixed relationship between vehicle age and odometer so that only one of
these variables can be used in determining the population of the various
emitter categories. The average relationship between vehicle age and
odometer reading shows that the average vehicle is driven fewer miles
per year as it ages. Consequently, a nonlinear relationship between
either of these variables and emitter-category population sizes
represents the combined effects of both. As described below, this is
the approach that is recommended for the development of the emitter-
category growth functions.
If age-based versus odometer-based emission deterioration is still a
concern, the analysis could be limited to vehicles that are within a
certain fraction (or standard deviation) of the mean mileage for each
vehicle age in the dataset. This is a reasonably easy adjustment to
perform, but it has the disadvantage of eliminating real, valid data
points.
Lane/Tank Fuel-to-Lab/Indolene Adjustments - Because it is desirable to
develop the IM240-to-FTP correlation equations with vehicles operated in
a lab on Indolene, a means to convert the lane/tank fuel IM240 scores to
a lab/Indolene basis is needed. It is thought that this adjustment is
primarily a fuel and temperature correction that accounts for the
differences between the lane and the lab. For MOBILE5, this adjustment
was developed from a subset of Hammond vehicles that were tested at the
lane on tank fuel and in the lab on Indolene. Adjustment factors were
developed by season (i.e., March-April, May-June, July-September, and
October-February) and emitter category.
In addition to the general fuel/temperature correction, data collected
in March and April on 14 test dates when the ambient temperature was
25°F or more above the monthly average were deleted because many of
those vehicles were statistical outliers. (Excessive purge was thought
to be influencing the IM240 results.)
* Model years are assumed to run from October 1 of the previous year to
September 30 of the model year. The midpoint date of April 1 in the
model year is assumed as the initial operating date of the vehicle. In
cases where the vehicle is tested during its initial model year, it is
assumed to have been placed in operation midway between the start of the
model year and the test date.
Limitations and Concerns with Current Approach - The EIRG's major
complaint with the lane-to-lab adjustment utilized in the development of
MOBILE5 emission factors is that it may not have accounted for all
external variables affecting emissions (e.g., preconditioning effects).
In terms of the deletion of data points collected on the aforementioned
test dates, there was concern that true high emitters were deleted.
Recommended Alternative - Although a variety of alternatives were
offered by the EIRG, none stood out as being vastly superior to the
others. There was general agreement that temperature effects should be
quantified separately from fuel effects, but the mechanism to do that is
unclear given that fuel samples were not taken in the Hammond program.
The easiest and most straightforward way is to consider only data that
were collected within the FTP temperature range. However, this may
vastly reduce the number of valid records. As a first cut on this
adjustment, the number of tests performed outside of the FTP temperature
range should be determined from the Hammond database. If too many tests
are discarded, this approach would not be practicable. (However, even if
the Hammond IM240 database, which contains approximately 16,000 records,
is severely diminished, the large volume of data available from
operating IM240 programs could be used to fill the void.)
Another approach that has merit is to establish a different temperature
range for each RVP "season" that would result in similar vapor
generation rates. This would minimize the effect that excessive purge
might have on IM240 emission rates. In addition, under hot stabilized
operation (which, ideally, is the mode the vehicle is in during the
IM240 test), the temperature impact on emissions is not significant.
Establishing temperature ranges could be accomplished by analyzing fuel
samples from a number of vehicles each week or month, and recording the
test temperature for each vehicle and the diurnal temperature profile on
each test date. For the Hammond database, it would be worthwhile to
determine the availability of fuel volatility statistics for that area
during the test program. This information is likely to be available for
the summer (e.g., through RVP compliance testing), but the winter months
may pose difficulties. A survey of refiners supplying fuel to that area
may provide winter fuel specifications.
To serve as a check on possible preconditioning or excessive purge
problems, a comparison of the IM240 bag 1 and bag 2 scores needs to be
performed on the lane data. If the ratio (or difference) of bag 1 and
bag 2 is outside of predetermined limits, that record should be
discarded. A data set that may be useful to determine what those limits
should be is the ASM/IM240 comparison test program conducted by EPA in
Phoenix. In that program, roughly half of the vehicles were tested with
the ASM first, while the other half were tested with the IM240 first.
The IM240 scores that were collected immediately following the ASM test
should represent a well-preconditioned subset of vehicles.
Correlation Equations - Once the Hammond lane IM240 data were adjusted
to a lab/Indolene basis, correlation equations relating the IM240 to the
FTP were applied to the data. The IM240-to-FTP correlations were based
on a regression analysis of data collected from vehicles tested over the
IM240 on Indolene and the FTP on Indolene. (The database used for the
correlation analysis included vehicles from the Hammond program as well
as vehicles tested in Ann Arbor.) The regressions were performed
according to the following model year groups and technology types:
1981-1982,
1981+ open-loop,
1983+ carbureted/closed-loop,
1983+ throttle-body injection/closed-loop, and
1983+ multipoint fuel-injection/closed-loop.
The HC and CO correlations were performed in log space with a cold-start
offset ("X" in the equation below) that varied by technology, while the
NOx correlations were based on a simple linear equation without a cold-
start offset value:
    \log_{10}(FTP_{HC/CO} - X) = b + m \cdot \log_{10}(IM240_{HC/CO}) + res
    FTP_{NOx} = b + m \cdot IM240_{NOx} + res
For cases in which (FTP_{HC/CO} - X) < 0.01, the IM240 score was
substituted for (FTP_{HC/CO} - X). In this way, errors resulting from
taking the logarithm of a negative number were avoided. The "res" term
in the equations above represents regression residuals from the
correlation sample.
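
To make the two equation forms concrete, the sketch below shows how they would be applied once b, m, and X are known. The function names are hypothetical and no coefficient values are given here, because the report does not list them in this section; the residual term defaults to zero.

```python
import math

def regression_target_hc_co(ftp, im240, x):
    """Dependent variable for the HC/CO fit: log10(FTP - X).

    When (FTP - X) < 0.01 the IM240 score is substituted, as described
    above. Assumes the IM240 score is positive.
    """
    value = ftp - x
    if value < 0.01:
        value = im240
    return math.log10(value)

def predict_ftp_hc_co(im240, b, m, x, res=0.0):
    """Invert the log-space form: FTP = X + 10**(b + m*log10(IM240) + res)."""
    if im240 <= 0:
        raise ValueError("IM240 score must be positive for a log-space fit")
    return x + 10 ** (b + m * math.log10(im240) + res)

def predict_ftp_nox(im240, b, m, res=0.0):
    """Linear NOx form with no cold-start offset."""
    return b + m * im240 + res
```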
Limitations and Concerns with Current Approach - The EIRG expressed two
primary concerns with the correlation method developed for MOBILE5.
First, the use of the cold start offset ("X" in the equation above)
implies that the IM240 can predict vehicle emissions during cold
operation. Since the IM240 is a hot test, it should be used only to
estimate running emissions. Second, there was general discomfort with
the use of the log-based equation. In addition, it was not clear to
some members of the EIRG that adding residuals was necessary for
developing basic emission rate equations (particularly since these data
were not used to generate the I/M identification rates used by TECH5 to
develop the I/M credits matrices for MOBILE5).
Recommended Alternative - We recommend that IM240 data be used only to
predict hot stabilized vehicle operation. That being the case, there
remains a question of whether the IM240 should be correlated only with
bag 2 of the FTP or with a combination of bag 2 and bag 3. Because a
combination of bag 2 and bag 3 encompasses a broader range of vehicle
operation (i.e., the speed during bag 2 never goes above 35 mph, with
most operation below 30 mph), it is recommended that the correlation be
performed between the IM240 and a "hot FTP" (i.e., bag 2 weighted 52.1%
and bag 3 weighted 47.9%). Although bag 3 contains a start, the impact
of that is minimal because the engine and emission control system do not
cool off significantly during the 10-minute soak between bag 2 and
bag 3.
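
As a worked illustration of the "hot FTP" weighting, the short function below combines bag 2 and bag 3 results using the 52.1% and 47.9% weights stated above. The function name is hypothetical.

```python
def hot_ftp(bag2, bag3):
    """Weighted hot-stabilized FTP score in the same units as the inputs."""
    return 0.521 * bag2 + 0.479 * bag3
```

For example, bag 2 and bag 3 scores of 0.20 and 0.40 g/mi combine to 0.521(0.20) + 0.479(0.40) = 0.2958 g/mi.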
The IM240/FTP regressions should be developed by exploring a number of
different correlation equations and choosing the one(s) that gives the
best agreement. (Different sets of data may have different equations
giving the best agreement.) One possible form to consider is the use of
separate regressions for the normal-emitting vehicles and the high-
emitting vehicles. This can be done without the arbitrary selection of
a break point by testing all possible data points as a break point. The
final break point is selected as the one that provides the minimum
error. In addition to exploring different functional forms for the
regression equations, the technology groups chosen for MOBILE5 should be
reevaluated.
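
The break-point search described above can be sketched as an exhaustive scan. This is an illustration only: it assumes simple linear fits on each side of the break and uses the combined sum of squared errors as the "minimum error" criterion, and the function names and the minimum group size are hypothetical choices.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (intercept, slope, sse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, sse

def best_break_point(im240, ftp, min_points=3):
    """Try every split of the IM240-sorted data and keep the lowest total SSE.

    Returns (break_value, total_sse), where break_value is the lowest IM240
    score assigned to the upper (high-emitter) regression.
    """
    pairs = sorted(zip(im240, ftp))
    best = None
    for k in range(min_points, len(pairs) - min_points + 1):
        _, _, sse_low = fit_line(*zip(*pairs[:k]))
        _, _, sse_high = fit_line(*zip(*pairs[k:]))
        total = sse_low + sse_high
        if best is None or total < best[1]:
            best = (pairs[k][0], total)
    return best
```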
Although it is unnecessary for the development of basic emission rates,
it may be desired to add regression residuals to the correlation
equations to obtain a more random distribution of predictions. If this
adjustment is made, it should be based on developing a distribution of
residuals about the mean. It is likely that such a distribution would
take the form of a log-normal distribution, with the longer "tail" being
above the mean. For all cases in which residuals are applied, an
evaluation of how the application of those residuals changes the overall
distribution of emissions must be performed. If the mean emission
levels are changed by adding in residuals, the method used to
incorporate that effect must be reviewed and modified as necessary.
A final point related to the correlation analysis is how to account for
cold start emissions. It is recommended that a cold start offset (i.e.,
bag 1 - bag 3) be developed from FTP data, with different factors being
developed based on technology and emitter category. As discussed below
(in the section on off-cycle emissions), it may be possible to determine
cold start emissions from recent testing conducted by CARB on the LA92
cycle. The start component of the LA92 is much more representative of
driving behavior during vehicle start-up than bag 1 of the FTP because
it contains a wider range of in-use operation. Alternatively, a
separate correlation between bag 1 and bag 2/bag 3 of the FTP could be
developed using only FTP data. Since bag 2/bag 3 estimates are
available through the IM240/FTP correlation outlined above, it would
then be possible to determine bag 1 emissions. This approach has the
advantage of being computationally simple, but it may not adequately
describe emissions during vehicle start-up.
TECH5 Inputs
Once the IM240 data were converted to an FTP basis, the FTP-based
results were used to develop inputs to the TECH5 model. There are two
primary inputs to the TECH model that drive the computation of basic
(i.e., non-I/M) emission rates for use in MOBILE: technology-specific
emitter-category emission rates, and technology-specific emitter-
category growth functions. In TECH5 (which was used to develop basic
emission rates for MOBILE5), four technology groups were used for 1981
and later model year vehicles:
open-loop,
carbureted/closed-loop,
throttle-body fuel-injection (TBI)/closed-loop, and
multipoint fuel-injection (MPFI)/closed-loop.
The emitter categories used in TECH5 were defined as follows:
Normal HC/CO - HC ≤ 0.82 g/mi and CO ≤ 10.2 g/mi,
High HC/CO - HC > 0.82 g/mi or CO > 10.2 g/mi,
Very High HC/CO - HC > 1.64 g/mi or CO > 13.6 g/mi, and
Super HC/CO - HC > 10.0 g/mi or CO > 150.0 g/mi.
NOx emissions were analyzed separately from HC and CO, with only two
emitter categories being defined: normals (≤ 2.0 g/mi) and
highs (> 2.0 g/mi).
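
The category definitions above translate directly into a classification routine. In this sketch the function names are hypothetical, and because the High, Very High, and Super thresholds are nested, each vehicle is assigned to the most severe category it satisfies; that tie-breaking rule is an assumption rather than something stated in the text.

```python
def hc_co_category(hc, co):
    """Classify a vehicle from FTP-basis HC and CO scores in g/mi."""
    if hc > 10.0 or co > 150.0:
        return "super"
    if hc > 1.64 or co > 13.6:
        return "very high"
    if hc > 0.82 or co > 10.2:
        return "high"
    return "normal"

def nox_category(nox):
    """Classify a vehicle from its FTP-basis NOx score in g/mi."""
    return "high" if nox > 2.0 else "normal"
```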
Emitter-Category Emission Rates - The emitter category emission rates
developed for TECH5 were based on a mix-and-match methodology that
appeared somewhat arbitrary. Thus, most recommendations of the EIRG
were directed at a more consistent approach to developing emitter-
category emission rates.
Limitations and Concerns with Current Approach - The EIRG expressed
concern about the use of MOBILE4.1 zero-mile levels for the 1981-1982
normal emitters in the development of emitter-category emission rates
for MOBILE5. EPA indicated that this was done because using only the
Hammond data for this group resulted in a zero-mile level that was above
the emission standard. This occurred because there were few low-mileage
1981-1982 model year vehicles in the Hammond database, and the
regression was being driven by vehicles with well over 50,000 miles.
Another concern expressed by the EIRG was that 1983+ deterioration rates
were used for the 1981-1982 model year group; it is not clear that this
adequately reflects the difference between evolving (1981-1982) and
mature (1983+) technologies, particularly for fuel-injected vehicles.
Recommended Alternative - As a first cut, the choice of emitter category
cutpoints needs to be re-evaluated based on a statistical analysis of
the data (e.g., through a "cluster" analysis) rather than multiples of
the emission standards. (It is our understanding that this is being
done under another work assignment; thus, alternative cutpoints for
emitter categories were not investigated in this effort.) This re-
evaluation should also be extended to the model-year and technology
groups used for analysis. Second, once the emitter categories and
technology groups are chosen, emission rates should be determined
independently for each; there is no reason to force deterioration rates
of early-1980s model year vehicles to be the same as those of
early-1990s model year vehicles. For cases in which data are sparse
(e.g., low-mileage
1981-1982 model-year normals), it may be possible to bolster the data
set with FTP data collected at the Ann Arbor laboratory. If the
emitter-category cutpoints are chosen properly, a normal emitter's
characteristic emission rate should be somewhat independent of the type
of I/M program under which the vehicle was operating (i.e., an I/M
program changes the distribution of vehicles among emitter categories
but not necessarily the emission rate of those categories - this is the
basic assumption used in California's emission factor and I/M benefits
model, CALIMFAC).
Emitter-Category Growth Functions - In MOBILE5, the development of
emitter-category growth functions relied on a very simplistic approach
in which the fraction of non-normals as a function of mileage was
determined by drawing a line through three points - from the origin
through a point representing the non-normal fraction of 1987+ model year
vehicles at that group's average mileage (for the < 50,000-mile growth
rate), and from the point where that line crossed the 50,000-mile mark
through a point representing the non-normal fraction of 1981-1986 model
year vehicles at that group's average mileage (for the > 50,000-mile
growth rate).
Limitations and Concerns with Current Approach - The EIRG expressed
concern that the approach used in MOBILE5 is too sensitive to the
post-50,000-mile sample distribution, and that the 50,000-mile break
point is essentially arbitrary. Figure 3-1 compares the TECH5 very
high+super emitter fractions versus the data collected in the Hammond
program for MPFI vehicles. As the figure shows, the growth in the very
high+super emitters is nonlinear and the TECH5 method tends to inflate
emissions at higher mileages.
Figure 3-1
Comparison of Very High+Super Emitter Fractions
TECH5 vs. Hammond Data for MPFI/CL Vehicles
[Figure not reproduced: very high+super emitter fraction plotted against
odometer (10,000 miles) for the TECH5 growth function and for Hammond
data grouped by model year and mileage.]
* Emitter fractions based on Hammond data.
NOTE: Numbers in parentheses indicate sample size.
Recommended Alternative - From the survey responses received from the
EIRG, there is very strong sentiment that the emitter-category growth
functions should be developed from a statistical analysis of data broken
up into 10,000-mile bins. It appears that there are sufficient data in
the Hammond sample to do this with reasonable certainty up to 100,000
miles. However, the data beyond 100,000 miles should probably be
segregated into 25,000-mile bins.
Any number of analytical approaches can be used to develop emitter-
category growth functions. One method is to account for the possibility
that the emitter-category growth functions could be a linear relation
(as a function of mileage) or a nonlinear relation with either
increasing or decreasing slope. This can be modeled with the following
regression equation:
    P_i = A + B(mile) + C(mile)^2 + D(mile)^{1/2}
where P_i represents the fraction of emitter group i as a function of
vehicle mileage; A, B, C, and D are regression constants; and mile
represents vehicle mileage. The emitter-category growth functions
should be developed using a weighted analysis where the weight for each
mileage bin represents the total number of vehicles in that bin.
The SAS regression procedure REG, using the "adjusted R-squared" method,
can be used with the above equation to determine the regime growth
functions. This method computes regression results for all possible
combinations of variables. Seven different regression equations are
possible with this approach:
linear term only (C = D = 0),
quadratic term only (B = D = 0),
square-root term only (B = C = 0),
linear and quadratic terms (D = 0),
linear and square-root terms (C = 0),
quadratic and square-root terms (B = 0), and
all terms.
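
The all-subsets selection can be illustrated outside of SAS. The sketch below is not a reproduction of PROC REG: it is a plain-Python weighted least-squares fit of each of the seven candidate forms, ranked by adjusted R-squared, with bin vehicle counts as weights. All function names are hypothetical, mileage is assumed to be expressed in units such as 10,000 miles, and more bins than regression terms are required.

```python
from itertools import combinations

# Candidate terms of P_i = A + B(mile) + C(mile)^2 + D(mile)^(1/2).
TERMS = {
    "linear": lambda m: m,
    "quadratic": lambda m: m ** 2,
    "sqrt": lambda m: m ** 0.5,
}

def solve(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def weighted_fit(miles, fracs, weights, names):
    """Weighted least squares; returns (coefficients, adjusted R-squared)."""
    rows = [[1.0] + [TERMS[t](m) for t in names] for m in miles]
    k = len(names) + 1  # number of fitted constants, including A
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights))
             for j in range(k)] for i in range(k)]
    xtwy = [sum(w * r[i] * y for r, w, y in zip(rows, weights, fracs))
            for i in range(k)]
    coef = solve(xtwx, xtwy)
    ybar = sum(w * y for w, y in zip(weights, fracs)) / sum(weights)
    sse = sum(w * (y - sum(c * v for c, v in zip(coef, r))) ** 2
              for r, w, y in zip(rows, weights, fracs))
    sst = sum(w * (y - ybar) ** 2 for w, y in zip(weights, fracs))
    n = len(miles)
    return coef, 1.0 - (sse / (n - k)) / (sst / (n - 1))

def best_growth_function(miles, fracs, weights):
    """Fit all seven term combinations; return (terms, coef, adj_r2) of the best."""
    candidates = [c for r in (1, 2, 3) for c in combinations(TERMS, r)]
    fits = ((names,) + weighted_fit(miles, fracs, weights, names)
            for names in candidates)
    return max(fits, key=lambda fit: fit[2])
```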
An alternative to the above could be the development of a step-wise
linear fit, which is similar to what was done for MOBILE5. However,
such an analysis should not be constrained to a predetermined flex point
(i.e., 50,000 miles), nor should it be constrained to only two lines.
Individual equations would be selected based on their regression
statistics; however, since the sum of emitter-group fractions must equal
100%, the process and constraints outlined below must be followed.
1. Compute emitter-category populations using the equations
selected for each emitter group.
2. Set any individual population fraction greater than 100% to
100%.
3. Set any negative individual population fraction to zero.
4. Normalize the population fractions resulting from the preceding
steps to a sum of 100%.
The adjustment process outlined above would be done prior to comparing
the agreement of the regression results with the input data. The set of
equations that provided the best match to the entire set of data would
then be selected.
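
The constraint steps listed above (cap at 100%, floor at zero, then normalize) can be sketched as a single function. Fractions are expressed on a 0 to 1 scale, and the function name and dictionary layout are hypothetical.

```python
def normalize_fractions(raw):
    """Clip each predicted emitter-category fraction to [0, 1], then rescale
    so that the fractions sum to 1."""
    clipped = {name: min(max(value, 0.0), 1.0) for name, value in raw.items()}
    total = sum(clipped.values())
    if total == 0:
        raise ValueError("all predicted fractions are zero or negative")
    return {name: value / total for name, value in clipped.items()}
```

For example, raw predictions of 0.90, 0.15, -0.05, and 0.00 for the normal, high, very high, and super categories become 0.857, 0.143, 0, and 0 after adjustment.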
Treatment of Light-Duty Trucks
Light-duty-truck basic emission rate equations for EPA's MOBILE and
CARB's EMFAC models have historically been based primarily on passenger
car data. That is because there have not been enough emissions data
collected on light-duty trucks to support an independent analysis. In
the past, light-duty-truck emission rates have been determined by first
evaluating passenger car emission rates by technology type (e.g.,
carbureted versus fuel-injected) and then calculating the light-duty-
truck, model-year-specific emission rates by weighting the technology-
specific passenger car rates by the expected technology mix for light-
duty trucks. Since the introduction of new technology on light-duty
trucks has generally lagged passenger cars by a few years (at least for
pre-1990 model year vehicles), the basic emission rates for light-duty
trucks were higher than for passenger cars. In addition, adjustments
were also applied to account for the fact that light-duty trucks are
certified to less stringent numerical emission standards. This latter
adjustment is typically performed by applying a ratio of emission
standards to the passenger car basic emission rate zero-mile level,
while the deterioration rate is left unchanged.
There is concern that the above approach may understate emissions from
light-duty trucks because deterioration rates (or, more properly, the
growth rate of high-emitting vehicles) are based on passenger cars,
which are generally subjected to a less severe duty cycle than light-
duty trucks. For that reason, we recommend that light-duty-truck
emission rates be determined independently from passenger cars, using
IM240 data collected from light-duty trucks. The data to do this will
be available within the next few years as IM240 programs are implemented
in various communities. It is unclear that the IM240-to-FTP conversion
procedure would need to be tailored specifically for light-duty trucks,
but certainly the emitter category emission rates and growth functions
(input to the TECH5 model) developed from the simulated FTP scores
should be based on light-duty-truck data. As a short-term alternative,
IM240 data from Arizona, Colorado, and Maine should be reviewed to
determine if there is a significant difference in emissions
deterioration between cars and light trucks. If there is, then scaling
factors, which are a function of vehicle mileage, could be developed and
applied to the light-duty-truck basic emission rates (if they have been
based on passenger car data).
4. ADJUSTING I/M DATA TO A NON-I/M BASIS
In the future, there are likely to be considerable IM240 data made
available for emission factor development. Obviously, one source of
those data is I/M programs that are using the IM240 procedure. However,
EPA's I/M rule requires a program effectiveness evaluation that includes
IM240 testing on a minimum of 0.1% of the vehicle fleet. Thus, programs
not running the IM240 as part of their standard test protocol will be
required to collect IM240 data on at least some vehicles.
One shortfall related to the use of state-generated IM240 data is that
the data are from a fleet of vehicles subject to I/M,* while the basic
emission rates used in the MOBILE model reflect a non-I/M condition.
(I/M benefits are determined in MOBILE based on I/M test type, test
frequency, compliance rate, waiver rate, etc.) Thus, if the state-
generated IM240 data are to be used in future versions of MOBILE, a
method to adjust the I/M data to a non-I/M basis is needed.
This section presents alternative views on how to adjust IM240 data
collected in an I/M area to a non-I/M basis. Five different methods are
discussed, with recommendations for both short-term and long-term
approaches.
Use of Only Those Data Collected in Non-I/M Areas - One option for
developing non-I/M basic emission rate equations is to continue using
IM240 data collected in non-I/M areas, or in areas that are in their
first I/M cycle. Non-I/M area testing could be accomplished through the
use of a portable dynamometer in conjunction with a random pullover
program. Alternatively, IM240 testing with a portable dynamometer could
be linked with annual safety inspections for areas that have those
inspections but do not have an I/M program. First-cycle IM240 data
should be available in a few areas of the country that will be starting
up I/M programs for the first time in the next year or two. One source
of first-cycle IM240 data recently collected is Maine, which ran IM240
tests from July 1994 to the fall of that year.
Although this is the preferred method of developing non-I/M emission
rates, it is not a workable long-term solution. The cost of operating a
roving IM240 data collection program would likely be prohibitive, and
the availability of first-cycle I/M data will diminish in future years.
Use of Remote Sensing to Develop Emitter-Category Distributions -
Although there is considerable support for remote sensing and the idea
of determining in-use emission rates from RSD measurements is
conceptually appealing, there remain serious obstacles to the use of
this technology. In theory, RSD readings collected in a non-I/M area
could be compared to RSD readings collected in an I/M area (i.e., the
area in which IM240 data are collected), and the distribution of
vehicles among emitter categories (based on RSD readings) in the I/M
area could be tuned to match the non-I/M area distribution. In
practice, there is so much variability in RSD measurements (e.g., from
siting differences, equipment differences, driver behavior, etc.) that
discerning a 20% to 30% difference in emissions as a result of an I/M
program would be unlikely.
For the reasons stated above, adjusting I/M data to a non-I/M basis
using remote sensing measurements is not likely to provide an acceptable
degree of certainty.
* For areas that are implementing I/M programs for the first time in
response to the I/M rule, the data collected in the first "cycle" could
be considered non-I/M data. However, the majority of areas implementing
enhanced I/M programs already have some kind of I/M program in place.
Development of a Statistical Model Similar to TECH5 or CALIMFAC - One
approach to developing non-I/M IM240 scores from data collected in an
I/M area is to develop a statistical model that accounts for all of the
parameters considered in the current I/M models, i.e., essentially run
TECH5 or CALIMFAC "backwards." In this approach, all of the constants
in the emitter-category growth functions, emission rates, I/M inspection
and repair effectiveness, etc. would be viewed as parameters that could
be varied (within set bounds) to obtain the best possible agreement
between predicted and measured emissions for the entire fleet.
The optimization process would continuously compare the TECH predictions
with the actual measured emissions for the set of data vehicles. The
parameters in the model would be adjusted, based on the error in this
comparison, using the algorithms of the optimization procedure. A new
comparison would be made between measured emissions and those predicted
using the new set of model parameters. This process would be repeated
until a satisfactory agreement between predictions and data was
obtained. The optimization process for a new version of TECH would be
initiated by using the parameters from the previous version. There are
many different approaches to optimization in this type of situation.
The problem would be non-linear because fleet emissions are the product
of emitter-category populations and emission rates: the emission rate
parameters and the emitter-category population parameters would interact
in multiplicative terms, resulting in non-linearity. Thus, linear
programming (which is generally convergent) could not be used, and a
non-linear optimization technique would be necessary.
Clearly, one of the drawbacks to this approach is that it would take a
fairly significant effort to develop and maintain such a model. Thus,
it is unclear that this approach could be used in the short term, and
long-term usage would depend on EPA's commitment to support such a
model.
Short-Term Recommendation - Continued Use of Non-I/M Data - In the short
term (i.e., the next year or two), the most reasonable approach to
developing non-I/M emission rates is the continued use of IM240 data
collected in the first cycle of I/M programs. This is most important
for the development of emitter-category growth functions, which really
drive overall emission deterioration rates. For emitter-category
emission rates, the differentiation between data collected in non-I/M
versus I/M programs is less important. In fact, mixing the non-I/M and
I/M data from Hammond would bolster the database used for MOBILE5. To
serve as a check on the growth functions derived from the Hammond
non-I/M data, IM240 data from the Maine program (or other first-cycle
programs) could be used.
Long-Term Recommendation - Analysis of Repair Cycle Data - Although
preferred, the continued use of IM240 data collected in non-I/M areas is
probably not a valid long-term option for developing non-I/M emission
factors. Because of that, a means to account for I/M effects is needed.
An alternative to RSD data analysis or a large statistical program is to
simply calculate the repair cycle benefits observed in an operating
IM240 program, and assume that emission deterioration between repair
cycles is equal to the emission deterioration in the absence of an I/M
program. This approach is illustrated in Figure 4-1, which shows the
classical sawtooth pattern associated with I/M test and repair. The
data used to perform this analysis should be available as part of the
I/M program data collection responsibilities outlined in the I/M rule.
Each I/M test record must contain the vehicle identification number and
the category of test performed (i.e., initial test, first retest, etc.).
Thus, it will be possible to determine the before-repair state of
vehicles in each repair cycle (the top points of the sawtooth
illustrated in Figure 4-1) and the after-repair state of vehicles (the
bottom points of the sawtooth in Figure 4-1). Assuming that the
deterioration observed from one cycle to the next (or one age to the
next) is relatively independent of the I/M program, adding the repair
benefit from the previous cycle (e.g., "A" in Figure 4-1) to the before-
repair point of the current cycle (i.e., the top of line "B" in Figure
4-1) would give the non-I/M emission rate.
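The sawtooth adjustment described above can be sketched in a few lines of Python. This is an illustration only: the function name and the sample values are hypothetical, and carrying the repair benefits forward cumulatively from cycle to cycle is one reading of the approach (the text states only the single-cycle case, in which "A" is added to the top of line "B").

```python
def non_im_emission_rates(before_repair, after_repair):
    """Estimate non-I/M emission rates from I/M sawtooth points.

    before_repair[i] and after_repair[i] are the average emission rates
    (e.g., g/mi) at the top and bottom of the sawtooth for repair cycle i.
    Assumes deterioration between cycles is independent of the I/M
    program, so the repair benefits of all earlier cycles are added back
    (an assumed, cumulative reading of the approach).
    """
    if len(before_repair) != len(after_repair):
        raise ValueError("need one after-repair point per before-repair point")
    rates = []
    accumulated_benefit = 0.0
    for before, after in zip(before_repair, after_repair):
        rates.append(before + accumulated_benefit)
        accumulated_benefit += before - after  # repair benefit of this cycle
    return rates


# Hypothetical average rates (g/mi) for three repair cycles.
print(non_im_emission_rates([1.0, 1.2, 1.4], [0.7, 0.9, 1.1]))
```

For the first cycle the estimate is simply the before-repair point, since no repairs have yet been made.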
Figure 4-1
Effect of an I/M Program on Emissions
as a Function of Repair Cycle
[Figure not reproduced. Horizontal axis: Vehicle Age/Odometer.]
To describe the concept, the above discussion focuses on average
emission rates; however, it would also be possible to use this approach
for developing emitter-category growth functions, or to superimpose an
emitter-category distribution associated with each of the A through E
offsets on the next cycle or vehicle age. This approach offers the
advantage of being conceptually simple, and the information needed to
perform the calculations should be available with the IM240 emissions
data collected by states operating an IM240 test program.
A key assumption in the approach outlined above is that emissions
deterioration (or, similarly, the growth rate of high-emitting vehicles)
between one I/M cycle and the next is independent of the I/M program.
This is the same assumption used in the current version of CARB's
CALIMFAC model (Reference 3), and was based on an analysis of CARB's First I/M
Evaluation Program "recapture" vehicles. These vehicles were tested and
repaired during the program, then were returned to the test laboratory
after approximately six months in customer service. This analysis
showed that, with the exception of pre-1975 model year vehicles, post-
repair emissions deteriorate at essentially the same rate as pre-repair
emissions in the tested vehicles.
To further validate the above assumption, an analysis of CARB's Second
I/M Evaluation Program should be performed. (This program is commonly
referred to as the "1,100-Car Study.") In Phase 1 of that project
(conducted from January 1991 to March 1992), approximately 1,100
vehicles that initially failed an I/M test received an FTP before and
after repair. Phase 2 of the project involved FTP testing of recaptured
vehicles after one year, while Phase 3 of the project involved FTP
testing of vehicles after two years (prior to their next regularly
scheduled biennial inspection). Approximately 750 vehicles were tested
in Phase 2, and 500 were tested in Phase 3. This database represents a
fairly robust sample of between-inspection tests, but it has never been
thoroughly analyzed for this purpose.
It may also be possible to use the approach described above in
conjunction with the IM240 data collected as part of each state's I/M
program evaluation requirements (i.e., the 0.1% testing requirement in
§51.353 of the rule) to develop non-I/M emission rates from data
collected in an I/M area. However, these IM240 data are supposed to be
collected at the time of initial inspection, so after-repair data would
likely not be available for those vehicles, i.e., only the top points in
the "sawtooth" illustrated in Figure 4-1 would be available for
analysis. The bottom points of the sawtooth could be estimated based on
additional data analysis and reporting required in the I/M rule.
Section 51.353 also requires states to perform a program evaluation that
includes an assessment of the effectiveness of repairs performed on
vehicles that failed the tailpipe emission test. Depending upon the
level of detail included in that assessment, it may be possible to use
that evaluation to estimate the repair benefit illustrated by the
letters A through E in Figure 4-1.
5. INCORPORATING "OFF-CYCLE" EMISSIONS INTO MOBILE
In the past four years, there has been an extensive effort on the part
of EPA and CARB to better understand in-use driving behavior. That
effort has led to the development of alternative drive cycles that
include higher speeds and acceleration rates than are included in the
FTP. It is generally recognized that vehicle operation under these more
severe conditions results in higher emissions than occur using the FTP.
Because the MOBILE model is based on emission data collected over the
FTP, EPA has requested an evaluation of methods that could be used to
incorporate off-cycle emissions in the next version of the MOBILE model.
A significant limitation to developing a method to account for off-cycle
emissions is the lack of data that have thus far been collected over
alternative driving cycles. To date, there have been two primary test
programs that have collected emissions data over alternative cycles:
(1) EPA and industry testing to support the supplemental FTP rulemaking,
and (2) CARB "Unified Cycle" (LA92) testing to support inventory
development. Presented below is a brief description of these programs
and our recommendations for incorporating off-cycle emissions into the
MOBILE model.
Supplemental FTP Rulemaking - In February of this year, EPA published a
Notice of Proposed Rulemaking (NPRM) recommending revisions to the FTP.
That rule would require vehicle manufacturers to conduct a Supplemental
Federal Test Procedure (SFTP) which includes three new driving cycles
(or "bags") to control emissions during air conditioning usage,
intermediate soak times and vehicle start-up, and aggressive driving.
Only the effects of vehicle start-up and aggressive (or off-cycle)
driving are being considered in this Work Assignment.
Under contract to EPA and CARB, Sierra has developed a number of
different driving cycles from instrumented vehicle data collected in
Baltimore and chase car data collected in Los Angeles. Those cycles
include:
- a start cycle ("ST01") that is representative of the first four
  minutes of vehicle operation;
- an aggressive driving cycle ("REP05") that reflects speeds and
  accelerations not covered by the LA4 cycle; and
- a remnant ("REM01") cycle, which is intended to represent the
  balance of in-use driving not already covered by the ST01 and
  REP05.
In addition, two "composite" cycles have been developed that capture the
range of speed and acceleration events observed in the drive cycle
databases - the EPA Composite cycle (based on Baltimore and Los Angeles
data) and the LA92 cycle (based on only Los Angeles data). Ideally,
proper weighting of the ST01, REP05, and REM01 cycles would result in
equivalency with the EPA Composite cycle.
During the development of the SFTP rulemaking, EPA tested eight well-
maintained 1991-1993 model year vehicles over the FTP, ST01, REP05, and
REM01 cycles. These vehicles were also tested on two driving cycles
that represented extreme acceleration and speed profiles. One of those
cycles was developed by EPA/industry ("HL07"), and one was developed by
CARB ("ARB02"). By weighting the ST01, REP05, and REM01 cycles
according to the fraction of VMT represented by these cycles, EPA found
that emissions increased by 0.04 g/mi NMHC, 2.8 g/mi CO, and 0.08 g/mi
NOx relative to the hot FTP results. (The average hot FTP emission rate
for these vehicles was 0.04 g/mi HC, 1.6 g/mi CO, and 0.19 g/mi NOx.)
Following the completion of EPA's testing, auto manufacturers sponsored
an emission test program. That effort consisted of 26 late-model
vehicles that were tested on the FTP, REP05, HL07, and ARB02. Little
testing was conducted on the REM01 since the focus of the EPA/industry
effort was on developing a control cycle (i.e., certification), and the
REM01 cycle was thought of as an inventory cycle.
Based on the results of the above test programs, a high-speed/load
transient control cycle was developed (termed "US06") which is a
600-second test comprised of segments of the REP05 and the ARB02 cycles.
It should be noted that this cycle was developed with the intent of
controlling emissions from aggressive driving and transient operation.
It was not developed for the purpose of evaluating in-use emissions.
CARB Unified Cycle Test Program - In the summer of 1992, Sierra recorded
in-use speed-time profiles of randomly selected vehicles that were
followed by a chase car. During this chase car study, which was
sponsored by CARB, data were collected over a mix of road routes
designed to represent all travel occurring in the Los Angeles area.
These data were then used to develop a "composite" driving cycle (the
LA92 cycle) designed to match the overall speed-acceleration
distribution observed in the Los Angeles data set. To date, CARB has
performed FTP and LA92* emission tests on roughly 250 vehicles during
two separate test programs. As part of CARB's 12th In-Use Surveillance
Program, 170 1983 and later model year vehicles were tested over the
LA92 cycle. In addition, CARB conducted a special test program that ran
from late 1993 to mid-1994 in which 80 1971 and later model year
vehicles were tested on the LA92 and the FTP. Clearly, the CARB testing
offers a much larger and more representative database from which to
develop off-cycle corrections than does EPA's SFTP program.
* CARB performs the LA92 emission test in a manner similar to the FTP.
The test begins with a cold start, and emissions from the first 300
seconds of the cycle are collected in bag 1. Emissions from the
remainder of the LA92 cycle are collected in bag 2. The vehicle is then
allowed to soak with the engine off for 10 minutes, and the first 300
seconds of the LA92 are re-run, comprising bag 3 of the test. CARB
computes a composite LA92 emission rate by assuming 43% of starts are
cold starts and 57% of starts are hot starts. This is the same approach
used to compute a weighted FTP score. However, because bags 1 and 3 of
the LA92 test are much shorter than bags 1 and 3 of the FTP (1.2 versus
3.6 miles) and bag 2 of the LA92 is longer than bag 2 of the FTP (8.6
versus 3.9 miles), the factors used to weight each bag's g/mi emission
rate are much different for the FTP and the LA92.
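The difference in bag weighting between the FTP and the LA92 can be worked out from the bag distances and the 43%/57% cold/hot start split quoted in the note on the LA92 test procedure. The sketch below assumes the usual FTP composite form, (0.43 x bag 1 mass + bag 2 mass + 0.57 x bag 3 mass) divided by the correspondingly weighted distance, with bags 1 and 3 covering the same distance; the function name is hypothetical.

```python
def bag_weights(start_bag_miles, stabilized_bag_miles,
                cold_fraction=0.43, hot_fraction=0.57):
    """Return (w1, w2, w3), the weights applied to each bag's g/mi rate.

    Composite g/mi = (0.43*m1 + m2 + 0.57*m3) / (0.43*d1 + d2 + 0.57*d3),
    where m is bag mass (g) and d is bag distance (mi), with d1 == d3.
    Substituting m = rate * d gives the per-bag weights below.
    """
    d1 = d3 = start_bag_miles
    d2 = stabilized_bag_miles
    total = cold_fraction * d1 + d2 + hot_fraction * d3
    return (cold_fraction * d1 / total, d2 / total, hot_fraction * d3 / total)


# Bag distances (miles) as quoted in the text.
ftp = bag_weights(3.6, 3.9)
la92 = bag_weights(1.2, 8.6)
print("FTP  weights:", [round(w, 3) for w in ftp])
print("LA92 weights:", [round(w, 3) for w in la92])
```

With these distances, bag 2 carries roughly half of the FTP composite but close to 90% of the LA92 composite, which is why the two sets of weighting factors differ so much.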
In addition to the data already collected, CARB is planning in the next
year to test 75 vehicles over the FTP, the LA92, and eight different
speed cycles developed from the Los Angeles chase car data. CARB also
has plans to test approximately 250 vehicles over the FTP and LA92
cycles in its next in-use surveillance project. Many of these data are
likely to be available prior to the next major release of the MOBILE model.
Recommended Approach - Although it may be tempting to rely on the data
collected as part of the SFTP development process to make adjustments to
MOBILE for off-cycle effects, there are a number of problems associated
with the use of those data. First, only eight vehicles have been tested
over the full complement of cycles thought to capture start-up and off-
cycle events. Although the industry data were more robust in terms of
the number of vehicles tested, those vehicles were tested only on the
FTP, REP05, HL07, and ARB02 cycles. Additionally, SFTP data collected
in the future will likely be over the US06 cycle, which is a combination
of the REP05 and ARB02 cycles and was not designed to represent in-use
driving. The use of the US06 data for in-use emission estimates would
require some kind of correlation or adjustment to get the data on a
REP05-cycle basis. Any adjustment of that kind would introduce
additional uncertainty into the results. Finally, and most importantly,
simply weighting the ST01, REP05, and REM01 does not best reflect the
proper mix of speed and acceleration observed in the chase car and
instrumented vehicle databases. That mix is better represented by one
of the composite cycles developed by Sierra.
Because of the data deficiencies in the SFTP test program, it is much
more appropriate to develop off-cycle corrections from the LA92
emissions data collected by CARB for the purposes of estimating in-use
emissions. Since the LA92 cycle matches the acceleration/speed profiles
from all in-use vehicle operation (at least for Los Angeles), a ratio of
the LA92 results to the corresponding FTP results will provide a good
indication of the emissions increase associated with off-cycle events.
Although it could be argued that the use of a data set developed with
the EPA Composite cycle (which incorporated in-use driving patterns in
Baltimore and Los Angeles) would be more appropriate, sufficient data
are not available to characterize vehicle operation and emissions over
this cycle.
In terms of developing an off-cycle correction factor, the CARB data
would allow a reasonable accounting for possible emission differences by
technology. The 1993-94 special test program included 12 pre-1975 model
year vehicles, 31 1975 to 1980 model year vehicles, and 37 1981 to 1992
model year vehicles. All of the vehicles tested over the LA92 in the
12th Surveillance Program (approximately 170) were from the 1983 and
later model years. With LA92 and FTP tests conducted on slightly over
200 1981 and later model year vehicles, it would be possible to
investigate differences by fuel delivery technology and perhaps by
emitter category. This approach will become more attractive as
additional data are collected by CARB.
Depending on the way in which start emissions are treated in the next
version of MOBILE, the actual development of vehicle start-up and off-
cycle correction factors could be performed in a number of different
ways. For example, if start emissions are separated from running
emissions, then bag 1 of the LA92 could be correlated with bag 1 of the
FTP (e.g., through a regression analysis). This is particularly
important since bag 1 of the LA92 is much more reflective of vehicle
start-up operation than the FTP bag 1. If start emissions are treated
as an offset, the difference between bag 1 and bag 3 of the LA92 could
be compared to the difference between bag 1 and bag 3 of the FTP.
Alternatively, it may be desirable to determine start emissions directly
from bag 1 (or bag 1 - bag 3) of the LA92 data without considering the
FTP data.
Hot stabilized emissions can be corrected for off-cycle events by
comparing a combination of bags 2 and 3 of the LA92 cycle (i.e., a "hot
LA92") to a combination of bags 2 and 3 of the FTP.* A correction
factor can be developed by taking a simple ratio of the LA92 results to
the FTP results or through a regression analysis. As with the start
correction, the data should be segregated by technology and perhaps
emitter category (i.e., "normal" versus "high" emitters). Although this
adjustment inherently includes a speed correction, the principal
adjustment is to account for the failure of the FTP to adequately cover
the full range of speeds and accelerations occurring in customer
service.
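The simple-ratio form of the hot stabilized correction can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the record layout, the group labels, and the choice of a ratio of group means (rather than a mean of per-vehicle ratios or a regression) are all assumptions.

```python
from collections import defaultdict


def off_cycle_factors(records):
    """Compute LA92-to-FTP correction factors by group.

    records: iterable of (group, hot_ftp_g_per_mi, hot_la92_g_per_mi),
    where group is a technology and/or emitter-category label.
    Returns {group: mean hot LA92 rate / mean hot FTP rate}.
    """
    sums = defaultdict(lambda: [0.0, 0.0])
    for group, ftp, la92 in records:
        sums[group][0] += ftp
        sums[group][1] += la92
    # A ratio of sums equals a ratio of means within each group.
    return {group: la92_sum / ftp_sum
            for group, (ftp_sum, la92_sum) in sums.items()
            if ftp_sum > 0}


# Hypothetical paired test results (g/mi).
sample = [("TBI", 1.0, 1.5), ("TBI", 3.0, 4.5), ("MPFI", 0.5, 0.6)]
print(off_cycle_factors(sample))
```

A ratio of means gives high emitters more influence than a mean of ratios would, which is one reason to segregate the data by emitter category before computing the factor.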
###
* In our opinion, the combination of bags 2 and 3 is more representative
of stabilized operation than bag 2 alone.
6. USE OF STATE-GENERATED IM240 DATA IN MOBILE
This Work Assignment also called for a review of methodologies that
states could use to develop locality-specific basic emission rates for
use in MOBILE. The development of locality-specific basic emission
rates has the obvious advantage of allowing an area to accurately
represent its fleet of light-duty vehicles, while minimizing the
reliance on certain relations in the MOBILE (and TECH) models that have
been developed with the intent of reflecting national averages. In
addition, developing locality-specific emission rates has the potential
to better reflect the impact of a particular I/M program on vehicular
emissions. However, a number of issues must be considered in order to
have confidence that the local predictions are more representative of an
area than the estimates obtained by simply running MOBILE.
There are two steps involved in developing basic emission rate equations
from state-generated IM240 data. First, the IM240 data need to be
collected and converted to an FTP basis. Second, the simulated FTP
scores need to be analyzed to develop basic emission rate equations.
Although most of the procedures that would have to be followed to
generate locality-specific emission rates from IM240 data have already
been discussed in this document, this section reviews the areas of
particular importance that would have to be considered in such an
analysis.
Data Collection and Development of Simulated FTP Scores
As discussed previously, there are potentially two sources of IM240 data
that will be available from which to develop basic emission rate
equations. Obviously, if a state has an I/M program based on IM240
exhaust measurements, the data collected in the program can be used.
For states not conducting IM240 testing as part of the standard I/M
program, IM240 data will be available from the program evaluation
requirements in Section 51.353 of the I/M rule (i.e., 0.1% of the
subject vehicle fleet must be tested each year over the IM240 cycle or
another transient mass emission test approved as equivalent). Although
ready access to these data makes the development of locality-specific
emission rates attractive, there are a number of additional pieces of
data and test requirements that would be needed before the data could be
used to develop simulated FTP scores.
Because it is unlikely that states would have the resources to conduct
FTP tests on a subset of vehicles tested at the I/M lane, EPA-generated
IM240-to-FTP correlation equations would have to be used. Since those
correlations are based on IM240 tests conducted in a laboratory
environment on a standard test fuel (i.e., Indolene), the state-
collected IM240 data would have to be adjusted to reflect a standard
fuel and temperature. To ensure that proper data are available to
correctly predict FTP-based emission rates from IM240 data, states
should be required to collect a number of pieces of information related
to the fleet of vehicles they intend to use for emission factor
development. This includes the following:
- Ambient temperature should be recorded for all vehicles tested.
  The maximum and minimum daily temperatures should also be
  recorded.
- Fuel samples should be collected and analyzed for a subset of
  vehicles included in the database.
By analyzing the test temperature and fuel parameters, a determination
can be made as to whether RVP/temperature interactions (which may lead
to excessive purge) are having an inordinate influence on the results.
Those test records that are outside of a predetermined RVP/temperature
window could be excluded from the analysis. In addition, the analysis
of other fuel parameters (e.g., oxygenates, aromatics, sulfur) might
allow base fuel/Indolene correction factors to be developed from the
reformulated gasoline Complex Model (or from the data that were used to
formulate the Complex Model, with correlations developed based on Bag 3
of the FTP).
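A screening step of the kind described above might look like the following sketch. The record fields and function name are hypothetical, and the window limits are left to the analyst; the RVP limit in the usage lines is a placeholder, not a recommended value. It also assumes each test record has already been assigned an RVP (for example, from fuel sampled on the test date).

```python
def screen_by_rvp_temperature(records, temp_range_f, rvp_range_psi):
    """Split IM240 test records into (kept, excluded).

    Each record is a dict with 'temp_f' (ambient test temperature) and
    'rvp_psi' (fuel RVP assigned to the test). Both ranges are inclusive
    (low, high) tuples chosen by the analyst.
    """
    t_lo, t_hi = temp_range_f
    r_lo, r_hi = rvp_range_psi
    kept, excluded = [], []
    for rec in records:
        inside = (t_lo <= rec["temp_f"] <= t_hi
                  and r_lo <= rec["rvp_psi"] <= r_hi)
        (kept if inside else excluded).append(rec)
    return kept, excluded


# Hypothetical records; 68-86 F is the FTP temperature range, and the
# RVP limits are illustrative placeholders only.
tests = [
    {"temp_f": 75, "rvp_psi": 8.5},
    {"temp_f": 95, "rvp_psi": 8.5},
    {"temp_f": 75, "rvp_psi": 11.0},
]
kept, excluded = screen_by_rvp_temperature(tests, (68, 86), (0.0, 9.0))
print(len(kept), "kept;", len(excluded), "excluded")
```

Returning the excluded records rather than discarding them keeps them available for developing temperature/fuel correction factors.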
In terms of data collection efforts, several other issues would need to
be considered by the states. First, only full IM240 tests should be
used for emission factor development. With the implementation of fast-
pass and fast-fail algorithms, it is unclear how many vehicles will
receive a full IM240 test in an operating program and whether those
vehicles that do receive a full IM240 will be representative of the
entire fleet. Thus, we recommend a means to ensure that vehicles
selected for emission factor development be chosen at random (e.g.,
every 20th vehicle tested at the lane) and identified as emission factor
vehicles. In addition, those vehicles should be tested over the
complete IM240 cycle regardless of whether they pass or fail the fast-
pass or fast-fail cutpoints. If this procedure is not followed, very
clean and very dirty vehicles will not be properly represented in the
emission factor data set. Second, vehicles selected for emission factor
development should be run over a short preconditioning cycle prior to
the IM240 test (e.g., two to three minutes at 40 mph). This would help
ensure that vehicles that may have cooled off in the queue are back up
to operating temperature before being tested. Finally, information on
technology type (e.g., carbureted, throttle-body injection, multipoint
injection; open-loop, closed-loop) would be needed if the IM240 data
gathered in the program are to be used to forecast emissions. It may be
possible to determine technology type with a VIN decoding routine;*
however, if states do not have access to such a program, then technology
information would also have to be recorded for each vehicle used for
emission factor development.
* The I/M rule requires VINs to be recorded for each I/M test record.
For 17-character VINs (which have been the standard since the early
1980s), the 9th character represents the "check digit," which is
intended to verify the accuracy of the VIN. The check digit is
determined by a mathematical routine in which each VIN character is
assigned a number, which is then multiplied by a preset value based on
its position in the VIN. These products are then summed and divided by
11, and the remainder represents the VIN check digit. To ensure the
accuracy of the VINs collected in I/M lanes, it is recommended that an
electronic cleaning routine be used to verify the VIN check digit.
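An electronic cleaning routine for the VIN check digit could be as simple as the sketch below. The character values, the position weights, and the use of "X" for a remainder of 10 are not given in this report; they are taken from the federal VIN requirements (49 CFR Part 565) as we understand them and should be confirmed against that rule before use. The function name is hypothetical.

```python
# Numeric value assigned to each VIN character (I, O, and Q are not used).
_VALUES = {
    **{str(d): d for d in range(10)},
    **dict(zip("ABCDEFGH", range(1, 9))),
    **dict(zip("JKLMN", range(1, 6))),
    "P": 7, "R": 9,
    **dict(zip("STUVWXYZ", range(2, 10))),
}
# Multiplier for each of the 17 positions (position 9 is the check digit).
_WEIGHTS = (8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2)


def vin_check_digit_ok(vin):
    """Return True if a 17-character VIN has a consistent check digit."""
    vin = vin.strip().upper()
    if len(vin) != 17 or any(ch not in _VALUES for ch in vin):
        return False
    remainder = sum(_VALUES[ch] * w for ch, w in zip(vin, _WEIGHTS)) % 11
    expected = "X" if remainder == 10 else str(remainder)
    return vin[8] == expected


print(vin_check_digit_ok("1M8GDM9AXKP042788"))
print(vin_check_digit_ok("1M8GDM9A1KP042788"))
```

Records that fail the check would be flagged for correction or excluded before any VIN decoding is attempted.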
Development of Basic Emission Rates from Simulated FTP Scores
Once the IM240 data are converted to an FTP basis, emission rate
equations can be developed. If only the current year rates are desired,
the analytical technique would be fairly straightforward. The data
would first be sorted by vehicle type (i.e., car versus truck) and model
year. Next, the average pre-inspection emission rate would be
determined. (If the data are from an operating IM240 program, there
should be a field indicating whether the test is a baseline or retest;
IM240 data collected as part of the 0.1% requirement are supposed to
reflect emissions immediately prior to inspection.) The effect of the
I/M program on each model-year emission rate would then be estimated
based on the fraction of failures and the benefit of repair (taking into
account waivered vehicles). The repair benefit could be determined from
an analysis of pre-repair and post-repair I/M data which should be
available from the repair effectiveness analysis required in the I/M
rule. Finally, the model-year-specific emission rates would be
determined from the pre-inspection emission rates and the after-repair
emission rates based on whether the I/M program in place is an annual
program or a biennial program. For an annual program, the model-year-
specific emission rate would be calculated as follows:
$$\text{ER}_{\text{AA}} = \text{Fraction}_{\text{Pass}} \times \text{ER}_{\text{Pre-Pass}} + (1 - \text{Fraction}_{\text{Pass}}) \times \frac{\text{ER}_{\text{Pre-Fail}} + \text{ER}_{\text{After-Rep}}}{2}$$
where:
$\text{ER}_{\text{AA}}$ = Annual average emission rate,
$\text{ER}_{\text{Pre-Pass}}$ = Pre-inspection emission rate for passing vehicles,
$\text{ER}_{\text{Pre-Fail}}$ = Pre-inspection emission rate for failing vehicles,
$\text{ER}_{\text{After-Rep}}$ = After-repair emission rate, and
$\text{Fraction}_{\text{Pass}}$ = Fraction of vehicles passing the inspection.
For a biennial program, the following equation would be used:
$$\text{ER}_{\text{BA}} = \text{Fraction}_{\text{Inspected}} \times \text{ER}_{\text{AA}} + (1 - \text{Fraction}_{\text{Inspected}}) \times \text{ER}_{\text{Pre-All}}$$
where $\text{ER}_{\text{BA}}$ is the model-year-specific emission rate for a biennial
program, $\text{ER}_{\text{AA}}$ is the annual average rate defined above,
$\text{Fraction}_{\text{Inspected}}$ is the fraction of vehicles inspected in a given
year (i.e., 50% in a perfectly biennial program), and $\text{ER}_{\text{Pre-All}}$ is
the pre-inspection emission rate of all vehicles.
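The annual and biennial equations above translate directly into code. The sketch below uses hypothetical function names and made-up input values purely to show the arithmetic.

```python
def annual_average_rate(fraction_pass, er_pre_pass, er_pre_fail, er_after_rep):
    """Model-year emission rate for an annual I/M program.

    Passing vehicles stay at their pre-inspection rate; failing vehicles
    are assigned the midpoint of their pre-inspection and after-repair
    rates.
    """
    return (fraction_pass * er_pre_pass
            + (1.0 - fraction_pass) * (er_pre_fail + er_after_rep) / 2.0)


def biennial_average_rate(fraction_inspected, er_aa, er_pre_all):
    """Model-year emission rate for a biennial I/M program.

    Vehicles inspected this year get the annual average rate; the rest
    stay at the pre-inspection rate of all vehicles.
    """
    return fraction_inspected * er_aa + (1.0 - fraction_inspected) * er_pre_all


# Hypothetical rates in g/mi: 80% pass at 0.5; failures drop from 2.0 to 1.0.
er_aa = annual_average_rate(0.8, 0.5, 2.0, 1.0)
er_pre_all = 0.8 * 0.5 + 0.2 * 2.0  # pre-inspection rate of all vehicles
er_ba = biennial_average_rate(0.5, er_aa, er_pre_all)
print(round(er_aa, 3), round(er_ba, 3))
```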
The method described above provides only the mean model-year emission
rates for one calendar year. Thus, if the data were collected in 1995,
the model-year-specific emission rates could only be used in conjunction
with MOBILE to develop a calendar year 1995 emission estimate (i.e., by
inputting the model-year rates as zero-mile levels and specifying zero
for deterioration rates). Clearly, not being able to forecast emissions
is a significant shortcoming of the above approach, and a means to
develop emission rates described by a zero-mile level and a
deterioration rate is needed.
To develop emission factors that can be used with MOBILE to forecast
emissions, a method similar to that described above could be used.
First, the simulated FTP data would be sorted by vehicle type, age (or
odometer), and technology (e.g., carbureted, throttle-body injection,
multipoint injection). Pre-inspection and after-repair emission rates
would then be determined by vehicle age (or odometer), which would
result in a plot similar to that illustrated in Figure 4-1 for each
technology. A single emission value would be determined for each
vehicle age by weighting the pre-inspection and after-repair points
based on whether the I/M program in effect has an annual or biennial
inspection frequency. A regression analysis would then be performed on
these points to develop zero-mile levels and deterioration rates as a
function of vehicle technology. Next, model-year emission factors would
be calculated by weighting the technology-specific rates by the mix of
those technologies observed in the fleet. In performing this analysis,
care would have to be taken to ensure that the vehicles included in
calculations had been certified to the same emission levels. To account
for future emission standards, the zero-mile levels would be adjusted by
the ratio of future-to-current standards. (Deterioration rates would
remain unchanged, as they represent the I/M program in effect.) Note
that this approach provides a future inventory that includes the impact
of an I/M program, but it does not predict the benefit of the I/M
program.
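The regression and technology-weighting steps described above can be sketched as follows. This is a minimal illustration under assumptions: a straight-line fit by ordinary least squares, an odometer axis in units of 10,000 miles, and hypothetical function names, technology labels, and data.

```python
def fit_line(odometer, rates):
    """Ordinary least-squares line through (odometer, rate) points.

    Returns (zero_mile_level, deterioration_rate), i.e., the intercept
    and the slope per unit of odometer.
    """
    n = len(odometer)
    if n < 2 or n != len(rates):
        raise ValueError("need at least two paired points")
    mean_x = sum(odometer) / n
    mean_y = sum(rates) / n
    sxx = sum((x - mean_x) ** 2 for x in odometer)
    if sxx == 0:
        raise ValueError("odometer values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(odometer, rates))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fleet_rate(tech_fits, tech_mix, odometer):
    """Weight technology-specific lines by each technology's fleet fraction."""
    return sum(frac * (tech_fits[tech][0] + tech_fits[tech][1] * odometer)
               for tech, frac in tech_mix.items())


# Hypothetical weighted emission points (g/mi) by odometer (10,000 mi).
fits = {
    "TBI": fit_line([0, 1, 2], [1.0, 1.5, 2.0]),
    "MPFI": fit_line([0, 1, 2], [0.5, 0.75, 1.0]),
}
print(fits)
print(fleet_rate(fits, {"TBI": 0.4, "MPFI": 0.6}, odometer=2))
```

The intercept and slope for each model year would then be supplied to MOBILE as user-input zero-mile levels and deterioration rates.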
Once the model-year zero-mile levels and deterioration rates are
determined, they can be input to MOBILE (as user-input emission rates)
and the model run. Note that since these rates already account for the
presence of the I/M program, the I/M options in MOBILE would not be
invoked.
Summary
Although the potential exists for states to develop locality-specific
basic emission rates from IM240 data collected as part of an operating
I/M program or the program evaluation requirements of the I/M rule, it
is unclear how many states will attempt to do this. This judgment is
based primarily on the following two factors.
- At this time, it appears that only a small number of states are
  likely to include IM240 testing in their enhanced I/M programs;
  thus, available IM240 data will come from the program
  evaluation requirements (i.e., 0.1% of the subject fleet must
  be tested). This results in a much smaller number of test
  records upon which to perform the analyses described above.
- Based on the information presented above, a significant
  investment in time and resources will be required on the part
  of states to develop basic emission rate equations from IM240
  data.
###
7. REFERENCES
1. "Investigation of MOBILE5a Emission Factors: Evaluation of IM240-
to-FTP Correlation and Base Emission Rate Equations," Prepared by
Sierra Research for the American Petroleum Institute, API
Publication Number 4605, June 1994.
2. "Investigation of MOBILE5a Emission Factors: Assessment of Exhaust
and Nonexhaust Emission Factor Methodologies and Oxygenate
Effects," Prepared by Systems Application International for the
American Petroleum Institute, API Publication Number 4603, June
1994.
3. "Development of the CALIMFAC California I/M Benefits Model,"
Prepared by Sierra Research for the California Air Resources
Board, Report No. SR-91-01-01, January 1991.
###
APPENDIX A
EIRG'S RESPONSES TO QUESTIONNAIRE ON DEVELOPING BASIC
EMISSION RATES FROM IM240 DATA
Database Adjustments
Weight Foreign Manufacturers - Because the vehicles tested in Hammond
did not accurately reflect the national average fraction of foreign
vehicles, each foreign vehicle in the database was counted 2 to 4 times.
Strengths:
1. Accounts for fleet mix biases in testing areas.
2. Foreign vehicles generally have a much lower DF than domestic.
3. Accounts for under-represented manufacturers. Comparison with the
non-weighted results indicated a net increase in non-normals,
which was most pronounced for carbureted closed-loop technology.
4. Important to account for foreign/domestic split because of
differences in quality, durability, etc.
Weaknesses:
1. No area using MOBILE will match the assumed national average
fleet, and there is no way to account for this.
2. Variations among engine families are just as significant as
foreign/domestic.
3. Has it been established that foreign vehicles are a bias? If it
is important, why not treat those vehicles as a separate
technology group. What about displacement, mileage, etc? Seems
like an arbitrary adjustment.
4. Method used may be based on a poor sample and not reflect
representative mix of foreign vehicles.
Alternatives:
1. This is a second-order effect - ignore it.
2. Predict base emission rates on a manufacturer/model year basis.
3. Develop emission factors separately for foreign/domestic and allow
users the option to input that parameter.
4. More sophisticated analysis by engine family or by groups of
engine families.
5. This should not be a problem for tech-group-specific analyses.
6. Whether this correction is applied depends on how significant any
technology or durability differences are. Is foreign vs. domestic
enough, or should all individual manufacturers be weighted. If
differences are significant, use sampling theory to pick the
optimum sample for desired weighting.
7. Develop a method to check representativeness of the available
foreign data, e.g., technology, manufacturer, age. Compare with a
more robust sample in a more representative area and then modify
the weighting factors.
Missing or Suspicious Mileage - A number of vehicles in the Hammond
database had '0' or missing mileage and were deleted. In addition,
vehicles that were coded as having an odometer reading > 300,000 miles
were deleted.
Strengths:
1. Removes bad data that could incorrectly influence emissions vs.
mileage regressions, particularly for zero mileage.
2. Prevents compromising odometer-based relationships.
3. Avoids large statistical impacts from inclusion of extreme and
likely erroneous mileages.
4. Obvious way of screening questionable data.
Weaknesses:
1. Some suspicious data could be valid data points.
2. Eliminating records reduces sample size.
3. High-mileage, poor condition vehicles may be more likely to have
odometer problems.
4. Deletes valid data.
5. May remove actual high-mileage vehicles, which are badly needed in
the database.
6. Limits sample size.
Alternatives:
1. Group vehicles according to age for some statistics (e.g.,
fraction of high emitters at 5 years vs. 50,000 miles). In this
way, incorrect mileages are not an issue.
2. Compare odometer-based relationships with these vehicles
classified by the mean mileage for that model year (i.e., assign
to them the mean mileage by model year or age).
3. Leave vehicles in the database and assign to them the average
mileage of the remaining vehicles. This does not affect the slope
of the regressions, just the y-intercept.
4. Use an age-odometer algorithm to identify suspected erroneous
data. This works for both high and low mileage vehicles.
5. Generate mileage as a function of age, but consider each year's
distribution of travel. Look at that distribution in the
database; if it is OK (i.e., not too wide), take mean values as a
function of age and use that.
Seasonal Outliers - Data collected on 14 test dates in March and April
when the ambient temperature was 25°F or more above the monthly average
were deleted because many of those vehicles were statistical outliers.
(Excessive purge was thought to be influencing the IM240 results.)
Strengths:
1. Excessive purge is added separately in MOBILE (i.e., through
temperature/RVP corrections) and must be eliminated from data used
to estimate base emission rates.
2. Excessive purge is a problem at high RVP and temperature; this
method solves it.
3. Rejection of statistical outliers with data errors enhances the
value of the database.
4. Solves the problem of excessive purge effects.
Weaknesses:
1. Reduces sample size.
2. Deletes valid data, particularly when temperatures are high in the
spring or fall. High ozone episodes can occur during these times
and improved emission factors under these conditions are worth
some effort to develop.
3. True high emitters may be deleted.
4. Rejection of true outliers reduces the accuracy of the database.
5. If it happened 14 out of 60 days, are these true outliers?
6. This type of activity occurs in the real world. How do the models
account for this effect?
Alternatives:
1. If sample is large enough, only use data collected within the FTP
temperature range for the IM240/FTP correlation and base emission
rates. IM240 data at different temperatures could be used to
develop temperature/fuel correction factors.
2. If this is a real problem in the spring and fall, perhaps there
should be a correction factor.
3. Determine if purge is higher during these times than on hot mid-
summer days. Estimate vapor generation during running conditions
and diurnals under both conditions using actual temperatures and
estimates of local RVP. If spring/fall vapor generation (and thus
purge) is relatively high, then fuel RVP is likely an important
factor to include in the analysis (along with temperature). If
both seasons show similar vapor generation levels, retain data and
correct for temperature.
4. Temperature-correct the outliers to see if that gives more
realistic results.
5. Data rejection should be based on a combination of statistical
plus engineering analysis. The procedure of rejecting all data
when performance problems are suspected is a good one. The long-
range goal of IM240/FTP correlation will probably require some
temperature correction correlation. The problem of excessive
purge may be reduced with ORVR-sized canisters (or increased if
the vehicle was just refueled).
6. Develop seasonal emission factors (i.e., summer, winter,
spring/fall) based on temperatures and fuels reflective of those
seasons.
7. Set some test temperature range for each RVP "season" within which
data are used for FTP correlations.
8. Rather than deleting data out of hand, compute the
temperature/fuel impact and use the results to validate the
performance of the model.
-------
Fuel/Temperature Adjustments
Because EPA wished to develop the IM240-to-FTP correlations based on
vehicles tested over the IM240 in a laboratory on Indolene, a method was needed
to account for the differences between the lane and the lab before the
correlation equations were applied to the Hammond lane IM240 data. For
the Hammond database, it was felt that those differences were primarily
related to tank fuel versus Indolene and the temperature differences
occurring between the lane and the lab. (However, a number of other
differences could also impact test variability between the lane and the
lab, e.g., vehicle preconditioning procedures, inconsistent dynamometer
settings, how well the IM240 speed-time trace is followed, etc.)
The fuel/temperature adjustments prepared for MOBILE5 were based on a
subset of the Hammond vehicles that were tested at the lane on tank fuel
and at the lab on Indolene. Adjustment factors were developed by season
(i.e., March-April, May-June, July-September, and October-February) and
the following emitter categories:
Normal HC/CO - lane IM240 ≤ 1.64 g/mi HC and ≤ 13.6 g/mi CO,
High HC/CO - lane IM240 > 1.64 g/mi HC or > 13.6 g/mi CO,
Normal NOx - lane IM240 ≤ 2.0 g/mi NOx, and
High NOx - lane IM240 > 2.0 g/mi NOx.
Once the data were segregated as outlined above, the mean emission
levels for the lane/tank fuel scores and the lab/Indolene scores were
determined. Adjustment factors were then developed from the ratio of
these mean values.
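The ratio-of-means calculation can be sketched as below. This is only an illustration under stated assumptions: the tuple layout and function name are hypothetical, and the direction of the ratio (lab/Indolene mean divided by lane/tank-fuel mean, so that multiplying a lane score by the factor moves it toward a lab basis) is an assumption, since the text says only that the factors came from the ratio of the two means.

```python
from collections import defaultdict
from statistics import mean


def adjustment_factors(paired_tests):
    """Compute one adjustment factor per (season, emitter category) cell.

    paired_tests: iterable of (season, category, lane_score, lab_score),
    one tuple per vehicle tested both at the lane and at the lab.
    Returns {(season, category): mean(lab) / mean(lane)}.
    The direction of the ratio is an assumption (see lead-in).
    """
    lane = defaultdict(list)
    lab = defaultdict(list)
    for season, category, lane_score, lab_score in paired_tests:
        lane[(season, category)].append(lane_score)
        lab[(season, category)].append(lab_score)
    return {key: mean(lab[key]) / mean(lane[key]) for key in lane}
```

Because each factor is a ratio of group means rather than a mean of per-vehicle ratios, a few vehicles with very low lane scores do not dominate the result.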
Strengths:
1. Since MOBILE adjusts for fuel and temperature separately, the base
emission rates must be adjusted to FTP conditions. This approach
is simple and easy to understand.
2. At least some accounting for major differences.
3. Only compares large sets of data. Test-to-test and vehicle-to-
vehicle variability is reduced.
4. Any adjustment that accounts for variation due to external factors
is helpful in the overall correlation.
5. Accounts for differences between the lane and the lab.
6. Some accounting for seasonal impacts.
Weaknesses:
1. The two-step adjustment adds uncertainty. A simple adjustment may
not be appropriate. The emission groupings were not chosen for
best results. No technology groupings.
2. Two-variable analysis may explain only part of the difference on
specific vehicles.
3. Includes possible offset in lab and lane measurements. Merges
fuel and temperature effects, when temperature is known and fuel
specifications (at the lane) are not. Fuel effects are
sufficiently difficult to assess in controlled experiments with
multiple tests on repeatable vehicles. Probably impossible to
determine under the test conditions existing here.
4. One set of factors is used to correct for fuel/temperature when
going from lane IM240 to lab IM240, and a different set of factors
when going from lab FTP to real-world FTP. Shouldn't factors be
similar to fuel/temperature factors for FTP bag 3?
5. Was this part of a comprehensive study to determine the effects of
different external variables?
6. Data from Hammond were extremely variable and not all of that
variability could be reasonably explained by temperature and fuel
effects (e.g., 20% of the vehicles had lane/tank fuel IM240s and
lab/Indolene IM240s for HC and/or CO differ by more than 3 times).
7. These adjustments appear trivial considering a) cloning of foreign
vehicles, b) using an average "X" in the regression equation,
c) the "X" is of questionable merit, d) log space was used, and
e) residuals are applied.
8. Not clear how the other differences - vehicle preconditioning,
etc. - are accounted for when analyzing test results. Also,
average temperature may mask the effect of unusual swings.
Alternatives:
1. Fuel samples would aid in the adjustment and allow comparisons
between fuels and temperature.
2. If sample size allows, choosing only records that are similar to
FTP conditions may make the adjustment less important.
3. Possibly develop new emitter groupings or technology groupings.
4. Develop a more sophisticated multivariate analysis of differences
using an engineering model.
5. Quantify temperature effect independently from fuel effect (i.e.,
IM240 versus temperature correlation). Use measured ambient
temperatures at the time of the IM240 test. Do not segregate by
season, given use of actual test temperature. Fuel-related
reasons for segregating by season appear to be weak. While
volatility changes with season, so does average temperature, in a
compensating manner. Largest volatility-related effects will be
on cold/hot days within a season, not across seasons. Given no
knowledge of the lane fuel parameters, the fuel effect will be
part of the constant in the temperature regression. If, on
average, in-use fuel is somewhat "dirtier" than Indolene, then the
lane emissions will generally be higher than the lab measurements,
temperature effects aside.
6. Should try to do a statistical analysis to determine the
significance of all possible external variables on the final
correlation, then account for those variables with the
statistically significant impacts.
7. Correlate FTP directly with lane IM240 scores, using only those
conditions (i.e., temperature and fuel) that reasonably match the
FTP.
8. Some accounting for inconsistent preconditioning should be
considered. For example, look at the IM240 bag 1 vs bag 2 scores
and possibly delete record if difference is outside a pre-
determined window. Alternatively, compare lane bag results to lab
bag results. If the difference is large in bag 1 but not bag 2, a
preconditioning problem could have existed.
9. Use MOBILE temperature and RVP correction factors (for bag 3 or a
combination of bags 2 and 3) to adjust the lane scores to an FTP
temperature and Indolene basis, or at least use this information
as a reality check on the factors developed with the test data.
The MOBILE approach (or similar temperature/fuel factors developed
specifically from IM240 tests) could possibly also be used in a
state-based IM240-to-FTP analysis. (It is unlikely that states
would have the resources to run the lab/Indolene IM240s for
generating their own fuel/temperature corrections.)
10. Split the data into more temperature-specific regimes based on
values recorded each day (i.e., look at weather data) and base the
corrections on the temperature regimes.
-------
IM240-to-FTP Correlations
Once the Hammond lane IM240 data were adjusted to a lab/Indolene basis,
correlation equations relating the IM240 to the FTP were applied to the
data. The IM240-to-FTP correlations were based on a regression analysis
of data collected from vehicles tested over the IM240 on Indolene and
the FTP on Indolene. (The database used for the correlation analysis
included vehicles from the Hammond program as well as vehicles tested in
Ann Arbor.) The regressions were performed according to the following
model year groups and technology types:
1981-1982,
1981+ open-loop,
1983+ carbureted/closed-loop,
1983+ throttle-body injection/closed-loop, and
1983+ multipoint fuel-injection/closed-loop.
The HC and CO correlations were performed in log space with a cold start
offset ("X" in the equation below) that varied by technology, while the
NOx correlations were based on a simple linear equation without a cold
start offset value:
Log10(FTP_HC/CO - X) = b + m*Log10(IM240_HC/CO)
FTP_NOx = b + m*IM240_NOx
For cases in which (FTP_HC/CO - X) < 0.01, the IM240 score was substituted
for (FTP_HC/CO - X). In this way, errors resulting from taking the
logarithm of a negative number were avoided. (A discussion of the cold
start offset is included in the next section.)
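The two equation forms, and the substitution rule used when fitting, can be sketched as follows. The function names are hypothetical, and the coefficients b, m, and X must come from the regressions themselves; none are given here. IM240 scores are assumed to be positive so that the logarithm is defined.

```python
import math


def regression_target(ftp, im240, x):
    """Dependent variable for the HC/CO fit: log10(FTP - X).

    When (FTP - X) < 0.01, the IM240 score is substituted for (FTP - X),
    which avoids taking the logarithm of a non-positive number.
    """
    diff = ftp - x
    if diff < 0.01:
        diff = im240
    return math.log10(diff)


def predict_ftp_hc_co(im240, b, m, x):
    """Invert log10(FTP - X) = b + m*log10(IM240) to obtain FTP."""
    return x + 10 ** (b + m * math.log10(im240))


def predict_ftp_nox(im240, b, m):
    """Linear NOx form with no cold start offset: FTP = b + m*IM240."""
    return b + m * im240
```

With b = 0 and m = 1 the HC/CO prediction reduces to FTP = X + IM240, which is a quick sanity check on the inversion.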
Strengths:
1. For an average, these correlations should provide a good estimate
of average FTP emissions.
2. Cold start emission excess could be unrelated to hot start
emissions. Any relationship between hot and cold emissions will
automatically be included in the slope.
3. Use of different technology group regressions for limited number
of groups is a good balance between sample size and accounting for
different vehicles.
4. This is a relatively straightforward and easy technique.
5. It's slick and simplifies the use of the data.
Weaknesses:
1. To the extent that individual predicted FTP values are used, these
correlations are only good for averages. These technology
groupings were not chosen for the best correlations. One fit was
used for all emitter groups.
2. Non-linear relationships were not investigated.
3. Unclear what analyses were performed to decide on logarithmic
relationship for HC and CO and the linear relationship for NOx.
4. The log-based equation is equivalent to:
FTP_HC/CO = X + 10^b * [IM240_HC/CO]^m
Is this a realistic regression? (For b=0 and m=1, it gives a
simple regression.)
5. Has it been established that disaggregation by technology
groupings is justified?
6. There is really no connection between the IM240 and cold start.
Cold start should be directly calculated from FTP data.
7. This method implies that the IM240 is being defined as equivalent
to a "no-start" FTP, and there is no basis for this. The fact
that FTP-X can be negative bears this out.
8. Calculating X from mean[FTP - IM240] implicitly assumes that the
IM240 is equal to a "hot FTP." Is this reasonable?
9. It is not at all clear that X does a good job of accounting for
the cold start offset. The fact that there were problems with
negative numbers suggests that it did not.
Alternatives:
1. Develop multiple correlations separately for emitter groups,
possibly for new technology groupings.
2. Explore different equational forms, but it is unclear that
statistics would improve.
3. Perform both log and linear regressions and examine the variance
about the regression line. The approach that shows a variance
that is constant and randomly distributed about the regression
line regardless of IM240 level is preferred, regardless of the
correlation coefficient. If a log function is still preferred for
HC and CO, then switch to the more complex approach.
4. Regress individual FTP bag data against IM240 level. Use of "X"
should not be necessary for bags 2 and 3, and may not be necessary
for bag 1. Again, be sure that the assumed functional
relationship meets the basic assumptions necessary for performing
a regression.
5. Manufacturers have claimed that catalyst washcoat technology was
significantly improved in the latter 1980s. The analysis should
explore another major model-year group.
6. Try different regression formulae and pick the one with the best
statistics.
7. Get rid of the cold start offset in the IM240 correlation and only
use the IM240 to predict bag 2 and/or bag 3 (or, alternatively, a
"Hot FTP", i.e., [0.521*Bag 2 + 0.479*Bag 3] = b + m*IM240). The
cold start offset could then be calculated from available FTP
data, with consideration for emitter groups.
8. Focus on the relationship between IM240 and bags 2 and 3.
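The "Hot FTP" alternative suggested in the list above can be sketched as a weighted bag average regressed on the IM240 score. This is an illustrative sketch only: the function names are hypothetical, and the ordinary least-squares fit is a generic one rather than a procedure taken from the report.

```python
def hot_ftp(bag2, bag3):
    """Weighted 'Hot FTP' score: 0.521*Bag 2 + 0.479*Bag 3."""
    return 0.521 * bag2 + 0.479 * bag3


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = b + m*x; returns (b, m)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    return mean_y - m * mean_x, m


def fit_hot_ftp(im240_scores, bag2_scores, bag3_scores):
    """Fit Hot FTP = b + m*IM240 from paired FTP bag and IM240 results."""
    hot = [hot_ftp(b2, b3) for b2, b3 in zip(bag2_scores, bag3_scores)]
    return fit_line(im240_scores, hot)
```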
-------
Correlation Adjustments
When the correlation equations were applied to the lane IM240 scores
(which had been corrected to a lab/Indolene basis), two additional
adjustments were made. First, the cold start offset was assumed to be a
function of vehicle odometer reading (although the correlations were
performed with a constant X value), and second, regression residuals
were randomly applied to each data point.
Cold Start Offset - The cold start offset (X) values used in the above
correlation equations were developed, by technology group, from the mean
value of the difference between the FTP and the IM240 for normal
emitters with FTP values greater than the IM240 (i.e., the value of (FTP
- IM240) was determined for each normal emitter, and the mean of the
positive results was used as X). When the correlation equations were
applied to the IM240 data, the value of X was adjusted to account for
the effects of aging and mileage. The way that this adjustment was
developed for 1983+ model years is described below. (A slightly
different procedure was used for 1981-1982 model year vehicles.)
The value of X in the correlation equations reflects the cold start
offset at the mean mileage of the correlation sample. At mileages less
than this mean, it follows that X should be decreased by some amount to
account for the fact that the catalyst has been aged less and is
expected to be more active. (Conversely, X should be increased at
mileages above the mean.) Thus, the cold start offset is actually X
plus an increment that is a function of vehicle odometer, i.e.,
X-Offset Function = f(x) = X + f(Odometer)
EPA has defined f(Odometer) in the above equation to be "the difference
of the model year means regression for normal emitters and a 'New' line
created by connecting a point on the model year means regression line
at the mean mileage of the correlation sample with the zero mile level
used in MOBILE4.1." The X-offset function is therefore:
f(x) = X + ZML_MOB4.1 - ZML_MYMeans + ODOM*(DET_New - DET_MYMeans)
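The construction of the X-offset function can be sketched as follows. This is a reading of the quoted definition, not EPA's code: the function and argument names are hypothetical, the odometer is assumed to be expressed in the same units as the deterioration rates (for example, 10,000-mile units), and any numbers passed in are purely illustrative.

```python
def new_line_slope(zml_mob41, zml_mymeans, det_mymeans, mean_odometer):
    """Slope of the 'New' line.

    The line joins the MOBILE4.1 zero-mile level (at zero miles) to the
    point on the model-year-means regression line at the mean mileage of
    the correlation sample.
    """
    level_at_mean = zml_mymeans + det_mymeans * mean_odometer
    return (level_at_mean - zml_mob41) / mean_odometer


def x_offset(x, odometer, zml_mob41, zml_mymeans, det_new, det_mymeans):
    """X plus the difference between the 'New' line and the MY-means line."""
    return x + zml_mob41 - zml_mymeans + odometer * (det_new - det_mymeans)
```

By construction the two lines cross at the mean mileage of the correlation sample, so the function returns X unchanged there; it returns less than X at lower mileage and more at higher mileage whenever the MOBILE4.1 zero-mile level lies below the model-year-means intercept.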
Strengths:
1. Simple in concept and allows IM240 data to directly replace FTP
data in the TECH model.
2. Will yield directionally consistent results.
3. Accounts for cold start emissions which would not be measured in
IM240 (unless IM240 vehicle is not warm).
4. Calculates a cold start offset that is a function of vehicle
age/mileage.
Weaknesses:
1. Clumsy handling. May not reflect the "true" effects of cold
starts (and hot starts).
2. Why is MOBILE4.1 used as the "gold" standard?
3. Creates a mileage effect based on the results of two unrelated
analyses. The effect is then extrapolated far beyond the mileage
at which data have been collected.
4. Looks pretty hokey.
5. Hard to tell. What do statistics for measured versus computed
cold start FTP emissions look like?
6. Really odd way to perform this adjustment. Why was MOBILE4.1
brought into this analysis at all?
7. It is unclear as to why X is defined as being independent of
emissions (i.e., it's based on normals) but varies with
age/odometer. Isn't age important only because emissions
deteriorate accordingly?
8. Continues to use "X", which is a poor surrogate for cold start.
Alternatives:
1. Use IM240 data only to estimate non-start emission rates. Use
other data for start emissions directly.
2. First determine whether cold start emissions are related to hot
start emissions (e.g., regress bag 1 versus IM240). If such a
relationship exists, then it can be determined directly from the
regression. If not, cold start emissions must be determined from
bag 1 data. IM240 data should not be used, nor should estimated
FTP data from IM240 data. The IM240/FTP relationship is too
uncertain, and clearly produces higher in-use FTP estimates versus
previous FTP measurements (i.e., MOBILE4.1). The slope of cold
start emissions versus mileage would be due in part to the change
in methodology and overestimate of the mileage effect.
3. Use IM240 to correlate with bag 2/bag 3 of the FTP. Develop
separate correlation between bag 1 and bag 2/bag 3 using only FTP
data.
4. Reexamine from scratch other possible adjustments (e.g.,
multiplicative versus additive). Look at obtaining actual data on
cold start offset versus odometer as opposed to MOBILE4.1
correlation.
5. Develop a cold start offset that is entirely separate from the
IM240 data. Using FTP data, this could be done in a number of
different ways.
6. Low-mileage cold start offset can be determined from bag FTP
results or from new car FTP versus IM240 tests.
Regression Residuals - Another adjustment made during the application of
the correlation equations was the addition of randomized regression
residuals, i.e.,
Log10(FTP_HC/CO - X) = b + m*Log10(IM240_HC/CO) + res
FTP_NOx = b + m*IM240_NOx + res
where "res" represents regression residuals from the correlation sample.
According to EPA, adding the residuals randomly to the FTP emission
levels predicted by the correlation equations attempts to restore a
distribution of predicted FTP values for a given IM240 score.
Otherwise, there will be a single predicted FTP value for each IM240
score. A distribution of predicted FTP scores and emission levels is
important for some analyses, such as the determination of I/M credits.
For example, if residuals were not applied, 100% of the FTP emissions
from a certain emitter group could be identified on the basis of the
IM240 score.
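A minimal sketch of applying a randomly drawn residual to the HC/CO prediction is shown below. The function name is hypothetical, and drawing with replacement from the list of correlation-sample residuals is an assumption; the report does not say how the random assignment was made. A seeded generator is used so that the draw can be repeated.

```python
import math
import random


def predict_with_residual(im240, b, m, x, residuals, rng):
    """Predicted HC/CO FTP with one randomly drawn log-space residual.

    residuals: log-space residuals from the correlation sample.
    rng: a random.Random instance; seeding it makes the draw repeatable.
    Drawing with replacement is an assumption (see lead-in).
    """
    res = rng.choice(residuals)
    return x + 10 ** (b + m * math.log10(im240) + res)


# Usage: the same seed reproduces the same sequence of residual draws.
rng = random.Random(12345)
```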
Strengths:
1. Without some adjustment, the individual predicted FTP values will
tend to clump around the mean, making any evaluations that depend
on emission distributions (e.g., number of high emitters) suspect.
Residuals are actual observed distribution effects. A relatively
simple non-parametric way to model emission distributions.
2. Agree with concept.
3. Good for IM240 ID rates - not necessarily for FTP analysis.
4. Introduces a "distribution" back into the data.
5. Converting emission data from normal to log space, regressing in
log space, and then converting back to normal emissions tends to
yield a lower average emission level when compared to a simple
average of the original data. This occurs because the average of
the logarithms is akin to a geometric mean, which is always lower
than the arithmetic mean when some variability is present. When
the goal is to determine a relative (i.e., percent) change in
emissions, then this reduction in the mean is not a problem.
However, if the goal is to estimate absolute emission levels, then
the reduction is a problem, since the atmospheric impact is the
average of the emissions, not their logarithms. In this case, EPA
desired to develop absolute estimates of FTP emissions from IM240
emissions. EPA may have added the residuals in order to
compensate for the inherent downward bias in the logarithmic
analysis relative to the atmospheric effect.
6. Simple to implement.
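The downward bias described in the list above (averaging in log space and converting back yields a geometric mean, which falls below the arithmetic mean when the data vary) can be illustrated with a short sketch. The emission values in the usage lines are invented solely for illustration.

```python
import math
from statistics import mean


def geometric_mean(values):
    """Back-transformed mean of the base-10 logarithms of positive values."""
    return 10 ** mean(math.log10(v) for v in values)


# Illustrative (made-up) emission levels in g/mi with one high emitter.
sample = [0.2, 0.5, 4.0]
arithmetic = mean(sample)
geometric = geometric_mean(sample)  # lower than the arithmetic mean
```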
Weaknesses:
1. Since the residuals are randomly applied, the analysis cannot be
replicated without a mapping of which vehicles used which residual
value.
2. Too dependent on the individual points and character of the
database used; could have problems with homoskedasticity.
3. Should add regression residual to value without cold start offset.
Log space could cause problems.
4. The residuals were developed in log space, so the sum of these
residuals (in log space) was zero. However, when the antilog was
taken in the regression equation, it led to a net increase in
predicted FTP emissions (relative to the non-residual equation).
This significantly influenced the fraction of normals and non-
normals in the predicted FTP database (e.g., for the MPFI group,
the fraction of normal emitters was 78.8% with the residuals
applied, and 90.2% without the residuals applied). Is this
consistent with the relative difference between a linear
regression and the log-space regression?
5. It is not clear that any analyses were done to demonstrate that
application of residuals in fact yielded results that matched the
atmospheric impact of the original data.
6. Assuming this was done rigorously, this is a good technique.
However, it is compromised by the relatively large "X" effect,
which is not rigorous.
7. Not clear that the use of the residuals in log space provides a
representative distribution of predicted FTP scores. In fact, it
seems unlikely to do so.
Alternatives:
1. Rather than adjusting each predicted FTP value individually, a
"probability" distribution might be developed which could be used
to predict what portion of the fleet with a given IM240 score was
above or below a given FTP score.
2. Could use log-normal formulation or Weibull distribution, with x
and a derived from data. (Earlier suggested by EEA to EPA but
rejected as being too complex.)
3. Need to ask how this changes the overall distribution. There is
some initial distribution of IM240 scores. Without the addition
of the residuals this distribution will not change when FTP values
are computed. Do they change when residuals are added? If so,
are the results reasonable?
4. This is a good idea, although it is unclear that it really needs
to be done if all the database is used for is base emission rates.
If it is desired to do this, make sure that application of
residuals does not unintentionally skew results.
5. Compare the arithmetic means of the original FTP data and the
estimated FTP levels using the IM240/FTP correlation. Do this
with and without the residuals added back in. Use the technique
that matches the original data best.
-------
TECH5 Inputs
Once the Hammond data were converted to predicted FTP scores, the
results were used to develop inputs to the TECH5 model (i.e., emitter
category emission rates and growth functions). The following emitter
categories were used in TECH5 for HC and CO emissions:
Normal HC/CO - HC < 0.82 g/mi and CO < 10.2 g/mi,
High HC/CO - HC > 0.82 g/mi or CO > 10.2 g/mi,
Very High HC/CO - HC > 1.64 g/mi or CO > 13.6 g/mi, and
Super HC/CO - HC > 10.0 g/mi or CO > 150.0 g/mi.
NOx emissions were analyzed separately from HC and CO, with only two
emitter categories being defined - normals (< 2.0 g/mi) and highs (>
2.0 g/mi).
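The emitter category cutpoints can be expressed as a simple classifier. Two points are assumptions: the listed HC/CO categories overlap as written, so the sketch assigns each vehicle to the most severe category it qualifies for, and a score exactly equal to a cutpoint is placed in the lower category. The function names are hypothetical.

```python
def hc_co_category(hc, co):
    """Classify a vehicle by predicted FTP HC and CO (g/mi).

    Categories are checked from most to least severe (an assumption,
    since the listed definitions overlap).
    """
    if hc > 10.0 or co > 150.0:
        return "super"
    if hc > 1.64 or co > 13.6:
        return "very high"
    if hc > 0.82 or co > 10.2:
        return "high"
    return "normal"


def nox_category(nox):
    """Classify a vehicle by predicted FTP NOx (g/mi)."""
    return "high" if nox > 2.0 else "normal"
```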
The data were also segregated by the following technology groups:
open-loop,
carbureted/closed-loop,
throttle-body injection (TBI)/closed-loop, and
multipoint fuel-injection (MPFI)/closed-loop.
Finally, emission rates were determined separately for 1981-1982 model-
year vehicles and 1983+ model-year vehicles.
HC/CO Emission Rates - For HC and CO, the emitter category emission
rates (i.e., zero-mile level (ZM) and deterioration rates (DRs)) were
constructed as follows:
1. MOBILE4.1 ZMs were used for 1981-82 normals.
2. 1983+ DRs were used for 1981-82 normals, highs, and very highs.
3. Emission rates of normals were capped at the same rate for 1981-82
and 1983+ groups.
4. Normal caps were set at the maximum of the 1981-82 or 1983+
100,000-mile levels calculated from the 1981-82 and 1983+ ZM and
DR for normal emitters.
5. Deterioration rates that were negative and without significance
were assumed to be zero.
6. Regression of carburetor very highs was performed for 1983-1988
model years only (although the regression results were applied to
all 1983+ carbureted vehicles). Including 1989 resulted in a
negative ZM.
7. A covariance analysis was used for fuel-injected very highs that
resulted in the same DR but different ZM levels for the 1981-82
group and the 1983+ group. (This resulted in substantially higher
HC and CO emission rates from the 1983+ group compared to the
1981-82 group.)
8. All model years were combined for supers.
Strengths:
1. Allows detailed evaluation of the effects of high emitters and the
effect of control programs (i.e., I/M).
2. Accounts for mileage impact on emission rates.
3. Recognizes that technology changes/improvements impact
deterioration rates and creates a structure to account for the
effect.
Weaknesses:
1. Emitter groups should be statistically chosen instead of based on
emission standards. Technology groupings need to be selected
based on emission performance. User input in MOBILE has no impact
on the assumptions used for the base emission rates.
2. Mix and match approach is not defensible. Should use the same
data set for all analyses of regime sizes and emission levels.
3. Could double-count impact of emission deterioration and regime
growth.
4. Many of the assumptions on when technology changed appear
arbitrary and do not account for differential performance that may
occur within the defined groups (distributional effects).
Alternatives:
1. Possibly develop new emitter groups and technology groups.
2. Incorporate this function into the MOBILE code.
3. Use more regimes so that emissions are not a function of odometer
in a given regime.
4. Use the same data set for all analyses. In cases where data are
sparse, say so and do the best you can with what you've got. In
some cases, if you are slim on data it probably means there are
not that many in the fleet (e.g., 1981-82 MPFI vehicles) so the
impact on fleet-average emission estimates (which is ultimately
what we are trying to figure out here) is minimal. On the other
hand, there probably were sufficient data for the 1981-82
carbureted group to analyze by itself and not use the 1983+ DRs.
If there was concern about the number of normals in this group
(which was probably low, given the fact the 1981-82 vehicles were
10 years old when tested), why not pull in some of the FTP data
from Ann Arbor testing to represent low mileage normals. It's
really the emitter category growth functions that drive the
deterioration rates anyway.
5. Need to discuss too many issues, e.g., making FI DRs the same for
different groups does not seem reasonable unless it can be proven
that the assumption holds.
NOx Emission Rates - The following procedure was used to develop NOx
emitter category emission rates:
1. 1981-82 model-year normals used the MOBILE4.1 ZM and the DR was
determined from the mean emission level and mileage of the Hammond
sample.
2. 1983+ model-year normals used a covariance analysis that forced
the deterioration rates to be equal for vehicles certified to 1.0
and 0.7 g/mi NOx. (This resulted in different zero-mile levels.)
3. High NOx emitters used DRs from the normal NOx emitters, and the
ZM levels were back-calculated from the mean emission level and
mileage of the Hammond sample.
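One reading of steps 1 and 3 above is that the emission line is forced through the sample's mean point (mean mileage, mean emission level), so fixing either the zero-mile level or the deterioration rate determines the other. The sketch below follows that reading, which is an interpretation rather than a documented EPA procedure; the function names are hypothetical and mileage is assumed to be in the same units as the deterioration rate.

```python
def dr_from_zero_mile(zero_mile, mean_emission, mean_mileage):
    """Deterioration rate of the line through (0, ZM) and the sample mean."""
    return (mean_emission - zero_mile) / mean_mileage


def zero_mile_from_dr(dr, mean_emission, mean_mileage):
    """Zero-mile level back-calculated from a fixed DR and the sample mean."""
    return mean_emission - dr * mean_mileage
```

The second function shows how a borrowed DR that is too small pushes the back-calculated zero-mile level upward, which is the concern raised for high emitters.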
Strengths:
1. Using the ZM levels ties the results to actual FTP data.
2. Covariance analysis can check the null hypothesis that DRs are the
same for different standards.
3. Simple approach.
Weaknesses:
1. NOx emission estimates are significantly affected by several very
high NOx emitters that are in the 8-12 g/mi range for IM240. It
is unclear if they received accompanying FTP tests. Engine-out
NOx for most of these vehicles should be in the 4-5 g/mi range.
There has been no explanation as to why these were so high - there
should be a review of the database to see if they were improperly
tested at the I/M lane.
2. Why not compute DRs for highs? Method used could lead to
unrealistically high zero-mile emissions for highs.
3. The assumptions drive the emission estimates and it is not clear
how well it represents real-world occurrences - how is this
validated?
Alternatives:
1. Accept hypothesis that NOx emission DFs may be zero or negative?
2. Compute DFs for each group as shown in the data.
Growth Functions - As important as the emission level of each emitter
category are the growth functions assigned to those categories. For
MOBILE5, EPA wanted to base emission control system deterioration on
both vehicle age and mileage. This was done by using data from the 1987
and later model years to establish the growth rate of non-normals (i.e.,
highs + very highs + supers) for mileages less than 50,000. For
mileages above 50,000, data from the 1981-86 model years were used for
the TBI and carbureted technology groups, while data from 1984-86 model
years were used for MPFI vehicles. (EPA judged that pre-1984 MPFI
represented "prototype" technology.)
The method used to establish the emitter category growth rates was based
on first developing growth rates for the following emitter groups:
supers,
very highs + supers, and
highs + very highs + supers.
Once these were established, individual emitter category growth rates
were determined by subtraction.
The analytical technique used to develop the growth functions for each
of the above groups is best explained with an example. For the MPFI
very highs+supers group, the following process was used. First, the
<50,000 mile growth rate was established by determining the fraction of
very highs+supers from all 1987+ MPFI data. In the Hammond sample,
there were 155 very highs+supers out of 1,716 total vehicles in this
group (i.e., 9.03%). This fraction was then divided by the average
mileage of the group (28,182) to obtain a growth rate of 0.03205/10,000
miles. The growth rate beyond 50,000 was calculated by first
determining the fraction of very highs+supers in the 1984-1986 model
year group (138/460, or 30.0%) and the average mileage of that group
(68,464). The second growth rate was then calculated by linear
extrapolation of a line connecting the fraction of very highs+supers at
50,000 miles (i.e., 5*0.03205, or 16.0%) and the point established from
the >50,000 1984-86 group (i.e., 0.300 at 68,464 miles). This resulted
in a >50,000 growth rate of 0.07568/10,000 miles.
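The two-segment calculation in this example can be reproduced with a short sketch. The function and argument names are hypothetical; the inputs in the usage line are the counts and mean mileages quoted in the text.

```python
def growth_rates(n_low, total_low, mean_miles_low,
                 n_high, total_high, mean_miles_high,
                 break_miles=50_000):
    """Return (rate below break, rate above break) per 10,000 miles.

    The first rate is the emitter fraction divided by the group's mean
    mileage (a line through the origin). The second is the slope of the
    line from the projected fraction at the break point to the fraction
    observed in the higher-mileage group at its mean mileage.
    """
    frac_low = n_low / total_low
    rate_below = frac_low / (mean_miles_low / 10_000)
    frac_at_break = rate_below * (break_miles / 10_000)
    frac_high = n_high / total_high
    rate_above = (frac_high - frac_at_break) / (
        (mean_miles_high - break_miles) / 10_000
    )
    return rate_below, rate_above


# MPFI very highs+supers: 1987+ group, then 1984-86 group.
rate_below, rate_above = growth_rates(155, 1716, 28_182, 138, 460, 68_464)
```

Because the second slope depends on the difference between two fractions over a short mileage span, small changes in either group's composition move it substantially.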
Strengths:
1. Simple, yet accounts for non-linearity.
2. Straightforward technique, but it is unclear how this is a
function of both age and mileage.
3. It gets numbers that can be used in the model.
4. It's an easy procedure and gives the 50,000-mile kink that has
been assumed for years.
5. Simple to do.
Weaknesses:
1. Does not account for "shape" of curve at very high mileages.
2. This technique is too sensitive to post-50,000 mile sample
distribution.
3. The 50,000-mile break point is essentially arbitrary. (The only
objective reason to choose it is that it represents the
certification mileage.) Because of relatively low mileages of
samples, calculated slopes are extrapolated far beyond the range
of the data and can drive important policy decisions. Second
slope is likely overestimated based on the assumption of zero high
emitters at zero miles. This assumes that there is a significant
number of high emitters at 1,000 miles and minimizes the projected
number of high emitters at 50,000 miles. This latter fact in turn
maximizes second slope.
4. It does not account for variation in growth rate or zero
population at zero miles.
5. The 50,000-mile kink is not supported by the data. This method
artificially inflates the second DR.
6. Linear growth rate is not obvious; it's more likely to tail off at
high mileage. The method has not been validated and it is
extremely sensitive to the two bins under and over 50,000 miles.
Alternatives:
1. Cap number of highs, very highs, and supers? Add third linear
growth beyond 100,000 miles? Non-linear fit?
2. Statistical analysis of p(high,super) in 10,000-mile bins.
3. Break data into small, but meaningful mileage increments (e.g.,
10,000 mile increments). Plot fractions of high emitters and
emissions of normals and highs. Perform linear and non-linear
regressions to determine if the slope at higher mileage is
increasing or decreasing and, if so, whether the effect is
statistically significant.
4. Break data into 10,000-mile bins (at least up to 100K), and
determine the fraction of each emitter category in each bin. Use
a regression analysis to develop emitter category growth
functions.
5. Break the data into more bins and track the growth rate and
revisit the assumptions about the model year groups included in
the analysis (i.e., it seems to mix mileage and model year when it
should just be mileage).
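The binning suggested in several of the alternatives above can be sketched as follows. The record layout (pairs of odometer reading and a flag for membership in the emitter category of interest) and the function name are assumptions; the resulting bin fractions would then be passed to whatever regression is chosen.

```python
from collections import Counter


def fraction_by_bin(records, bin_miles=10_000, max_miles=100_000):
    """Fraction of flagged vehicles in each mileage bin.

    records: iterable of (odometer_miles, is_in_category) pairs.
    Vehicles at or above max_miles are left out of the binned summary.
    Returns {bin_index: fraction}, where bin 0 covers 0 to 9,999 miles
    (for the default bin width), bin 1 covers 10,000 to 19,999, and so on.
    """
    totals = Counter()
    hits = Counter()
    for odometer, flagged in records:
        if odometer >= max_miles:
            continue
        index = int(odometer // bin_miles)
        totals[index] += 1
        hits[index] += bool(flagged)
    return {index: hits[index] / totals[index] for index in sorted(totals)}
```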
-------