United States Environmental Protection Agency
Air and Radiation
EPA420-R-01-020
April 2001
M6.EVP.003

            Evaluating Multiple Day
            Diurnal Evaporative
            Emissions Using RTD
            Tests

-------

                               M6.EVP.003
                                    Phil Enns

                         Assessment and Standards Division
                      Office of Transportation and Air Quality
                       U.S. Environmental Protection Agency
                                    NOTICE

    This technical report does not necessarily represent final EPA decisions or positions.
It is intended to present technical analysis of issues using data which are currently available.
         The purpose in the release of such reports is to facilitate the exchange of
      technical information and to inform the public of technical developments which
        may form the basis for a final EPA decision, position, or regulatory action.

-------
                             ABSTRACT
     In parallel reports (M6.EVP.001 and M6.EVP.002), EPA
estimated the diurnal emissions produced by vehicles that have
been parked for up to one full day  (24 hours).   This report
documents the method used in MOBILE6 for estimating the diurnal
emissions from vehicles parked for more than 24 hours (i.e.,
multiple days).

     This report was originally released (as a draft) in January
1999.  This current version is the final revision of that draft.
This final revision incorporates suggestions and comments
received from stakeholders during the 60-day review period and
from peer reviewers.

-------
                         TABLE OF CONTENTS

 1.0 Introduction and Background
 2.0 Data Sources
 3.0 Methodology
     3.1  Model Form
     3.2  Model Estimation
 4.0 Initial Results
     4.1  Effect of Fuel Metering, Pressure/Purge,
          and Model Year
     4.2  Special Cases
          4.2.1 Diurnal Emissions for Periods Longer
                than Three Days
          4.2.2 Diurnal Emissions of
                "Gross Liquid Leakers" (GLLs)
          4.2.3 Diurnal Emissions of Vehicles Certified
                Using the Enhanced Evaporative Test
                Procedure ("ETPs")
          4.2.4 Avoid Having "FAILING" Vehicles With
                Lower Emissions Than "PASSING" Vehicles
          4.2.5 Setting a Lower Bound for the Ratios of
                Consecutive Days
          4.2.6 Comparison to MOBILE5
 5.0 Conclusion

APPENDICES
 A.  Statistical Output Supporting Tables 2 and 3
 B.  Peer Review Comments from H. T. McAdams
 C.  Comments from Stakeholders

-------
                  Evaluating Multiple Day Diurnal
              Evaporative Emissions Using RTD Tests
                   Report Number  M6.EVP.003

                            Phil Enns
           U.S. EPA Assessment  and Standards Division
1.0   INTRODUCTION and BACKGROUND

     This report documents an analysis of diurnal evaporative
emissions from light-duty vehicles (LDVs)  and light-duty trucks
(LDTs)  occurring over periods of more than one day.   Results of
this study will be used in MOBILE6 in conjunction with estimates
of vehicle and truck activity and estimates of evaporative
emissions for shorter periods to obtain total diurnal emission
values.

     The underlying causes of diurnal evaporative emissions are
discussed at length in several reports (references 1, 2, and 3).
By definition, diurnals are those emissions associated with daily
temperature change, its effect on vaporization of a vehicle's
fuel, and the expansion of fuel vapor.  The evolution of
technology and regulations is assumed to influence diurnal
emission rates.  These trends also are discussed in the
references cited above.  In the modeling of multiple day diurnals
presented here, several categories of vehicles are considered,
based on model year, fuel metering, and the vehicle's performance
on the purge and pressure tests.  These are chosen to achieve
consistency with groupings employed in the MOBILE emissions
inventory model.


2.0   DATA SOURCES

     In this analysis, EPA considered real-time diurnal (RTD)
test data from testing programs (i.e., work assignments)
performed under contract for EPA.  The data consist of hourly
values of HC emissions (in grams) measured under varying
conditions of fuel Reid vapor pressure (RVP) and ambient
temperature.  (The actual test results are provided with the
report identified in reference 1.)  Daily totals are obtained
directly from these hourly values.

1  Landman, L., "Evaluating Resting Loss and Diurnal Evaporative
   Emissions Using RTD Tests," Report No. M6.EVP.001.

2  [...] P.L. and R.G. Dulla, "Analysis of Real-Time Evaporative
   Emissions Data," Sierra Research, Report No. SR97-12-01,
   December 1997.

3  Haskew, H.H. and T.F. Liberty, "Diurnal Emissions from In-Use
   Vehicles," Coordinating Research Council, CRC E-9, January 1998.

     The RTD testing performed for EPA was done by its testing
contractor  (Automotive Testing Laboratories) over the course of
five (5)  work assignments from 1994 through 1996  (performed under
three different EPA contracts).   A total of 119 light-duty
vehicles (LDVs)  and light-duty trucks  (LDTs) were tested in these
programs.   (That number was reduced to 118 because the status of
one of the test vehicles on the purge and pressure tests could
not be determined.)   Table 1  (below) displays the distribution of
those 118 vehicles and individual tests by several parameters.
Of special interest is the length of the tests, ranging from 33
to 72 hours.
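                              Table 1
                 Distribution of EPA Vehicles and Tests

                                        Test Duration (hours)
 MODEL   FUEL      PURGE/     33 Hours     38 Hours     72 Hours       Total
 YEAR    METERING  PRESSURE   Veh  Tests   Veh  Tests   Veh  Tests   Veh  Tests
 ------  --------  --------   ---  -----   ---  -----   ---  -----   ---  -----
 Pre-80  CARB      F/P          1      6    --     --    --     --     1      6
 Pre-80  CARB      P/F          2     12     1      4     1      4     4     20
 Pre-80  CARB      P/P          1      6    --     --    --     --     1      6
 80-85   CARB      F/P          5     24    --     --    --     --     5     24
 80-85   CARB      P/F          5     19    --     --    --     --     5     19
 80-85   CARB      P/P         --     --     2      8     6     27     8     35
 80-85   FI        F/P          4     21    --     --    --     --     4     21
 80-85   FI        P/F          2     12    --     --    --     --     2     12
 80-85   FI        P/P          3     12    --     --    --     --     3     12
 86-95   CARB      F/P          1      4    --     --    --     --     1      4
 86-95   CARB      P/F          3     12    --     --    --     --     3     12
 86-95   CARB      P/P          2      6    --     --     1      1     3      7
 86-95   FI        F/P         17     96     1      4     1      6    19    106
 86-95   FI        P/F         19     96     1      4     1      4    21    104
 86-95   FI        P/P         20     88     2      8    16     80    38    176
 ALL (Totals)                  85    414     7     28    26    122   118    564
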
More complete descriptions of these data are found in the reports
cited earlier.

     In addition, the two EPA vehicles identified as "gross
liquid leakers" (GLLs) are omitted from these analyses.  The
emissions of these vehicles are large, tending to skew estimates
for non-leakers, while the mechanisms by which emissions are
produced are quite different for the two groups.  EPA will treat
multiple day diurnal emissions from gross liquid leakers as
unchanging from day to day.

     Other reports on diurnal emissions utilize data from a
second set of testing programs performed for the Coordinating
Research Council  (CRC).    (See the report identified in reference
1.)  However, because all those additional tests were run for 24
hours only, and thus yield no information on multiple day
emissions, they were not used in this current study.

     CRC conducted another RTD program (Task VE-4) in which ten
1992-97 model year PFI vehicles (passing both the pressure and
purge tests) were tested using the full 72-hour RTD test.  These
vehicles would add results only to the stratum for which EPA
already had the largest amount of test data.  Also, the
information provided with these results was not sufficiently
detailed to allow the EPA and CRC data to be merged.


3.0   METHODOLOGY

     This work involves estimating the change in diurnal
evaporative emissions from the first day to later days.  In the
MOBILE model, these estimates will be used to determine emissions
for full Days 2 and 3 given total emissions based on Day 1.
These in turn can be subdivided into hourly values as needed.

     Factors influencing the RTD (and diurnal)  emissions from
individual vehicles include fuel metering technology, model year
groupings, and outcome of purge and pressure tests performed on
the vehicle.  Ambient temperature and fuel volatility also are
known to play a central role.

     The results of the RTD tests allow us to estimate both the
diurnal emissions and the resting loss emissions.  (The diurnal
emissions are the total RTD results minus the resting loss
emissions.)  In these analyses, we are actually modeling the
changes in the RTD results (for each day).  Thus, after
predicting the RTD for Days 2 and 3, we must subtract the
corresponding estimated resting loss emissions to obtain the
diurnal emissions.
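
     The subtraction described above can be sketched as follows.
This is only an illustration: the function name and the gram
values are hypothetical and are not taken from the test data.

```python
def diurnal_from_rtd(rtd_grams, resting_loss_grams):
    """Full-day diurnal emissions = full-day RTD result minus resting loss."""
    return rtd_grams - resting_loss_grams


# Hypothetical illustration values (not data from this report).
predicted_rtd_day2 = 12.0   # grams, predicted full-day RTD result for Day 2
resting_loss_day2 = 2.5     # grams, estimated resting loss for Day 2

diurnal_day2 = diurnal_from_rtd(predicted_rtd_day2, resting_loss_day2)
print(f"Day 2 diurnal: {diurnal_day2:.2f} g")
```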


3.1   Model Form

     In the previous draft of this report (dated January 1999),
EPA modeled the natural logarithm of emissions as a linear
function of the factors influencing the RTD emissions (described
in Section 3.0).  Although this approach has a number of
advantages, it also has some significant weaknesses (as
identified by two of the reviewers).

     Therefore, in response to comments from two of  the
reviewers, EPA altered the form of  the model.  In MOBILE6, the
diurnal emissions for a successive  day are modeled as a linear
combination of:

       1.  the midpoint temperature  (in degrees Fahrenheit) of the
          day  (i.e., the mean of the maximum and minimum daily
          temperatures),

       2.  the Reid vapor pressure  (RVP) of the tank  fuel in
          pounds per square inch (psi),

       3.  the product of the midpoint temperature and the RVP
           (i.e., an interaction term), and

       4.  the full-day (predicted)   diurnal emissions of the
          previous day.

     Several dummy variables were used to produce different sets
of vehicle parameters, thus creating the categories (strata)
that are used in MOBILE6.  These (categorical) variables (used to
switch on or off the factors of fuel delivery, purge/pressure
test status, and model year range) are:

        •  the status  ("Pass" or "Fail") of the evaporative
          control system of the vehicle, based on the performance
          of two functional tests  (pressure test and purge test),

        •  the fuel delivery system  of the vehicle  (i.e., fuel
          injected versus carbureted), and

        •  two variables to distinguish among the three model year
          ranges (i.e., pre-1980, 1980-85, and 1986-95).

Thus, the form used to model the full-day diurnal emissions (in
grams) of the second day ("D2") is given below as equation [1]:

    D2 = A + (B*T) + (C*R) + (D*T*R) + (E*D1)     [1]

Where:
    T   =  Midpoint Temperature (i.e., [Min_Temp + Max_Temp] / 2)
           (in degrees Fahrenheit)
    R   =  Fuel RVP (in psi)
    D1  =  RTD test result (grams) for the first day
    A, B, C, D, E  =  regression coefficients (given in Table 2).

Dividing equation [1]  by the diurnal emissions of the first day
("D1")  yields  an estimate of the ratio of the diurnal emissions
of the second day to the first day.  MOBILE6 actually uses these
ratios to estimate the diurnal emissions of the second day.
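
     Written out, the division described above gives the
following form.  It is only a restatement of equation [1]; the
subscripted symbols D_1 and D_2 are the first and second day
emissions, and the unsubscripted D is the interaction coefficient.

```latex
\frac{D_2}{D_1} \;=\; E \;+\; \frac{A + B\,T + C\,R + D\,T\,R}{D_1}
```

In this form the ratio equals the coefficient E plus a term that
depends on temperature, RVP, and the size of the first day
emissions.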

     Similarly, the form used to model the full-day diurnal
emissions (in grams) of the third day ("D3") is given below as
equation [2]:

    D3 = A + (B*T) + (C*R) + (D*T*R) + (E*D2)     [2]

Dividing equation [2] by the diurnal emissions of the second day
("D2") yields an estimate of the ratio of the diurnal emissions
of the third day to the second day.  MOBILE6 actually uses these
ratios to estimate the diurnal emissions of the third (and later)
days.

3.2   Model Estimation

     The above models were fitted using ordinary least squares
regression.  Including the diurnal emissions of the previous day
(which account for additional variation) effectively fits a
different intercept term to each vehicle and helps produce
sharper estimates of the coefficients shown above.  The goal of
the analysis was to obtain point estimates of the linear
combinations of the type shown in equations [1] and [2] (in
Section 3.1).  That approach is adopted in the analysis reported
below.

     Because the available data include tests of varying length,
it is difficult to compare emission values from all tests for
the purpose of estimating full day changes.  In particular,
complete 72-hour tests are available in only six of the model
year, technology, and purge/pressure test status categories.
However, as seen in Table 1, there are a large number of 33-hour
and 38-hour tests, and these provide more complete coverage of
the categories.  These tests give some indication of change in
evaporative emissions from the first day to the second.  One way
to use these data is to consider only the first nine hours of
each day, since the 33-hour tests give only that number of hours
in Day 2.  If it is assumed that the total emissions in the first
nine hours are comparable across days, then the effective data
set numbers 564 tests (almost a five-fold increase).
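
     The nine-hour comparison described above can be sketched as
follows.  The function name and the hourly values are
hypothetical; the sketch assumes each test is stored as a list of
hourly HC values (grams) beginning at the first hour of Day 1.

```python
def nine_hour_totals(hourly_grams, hours_per_window=9):
    """Sum the first `hours_per_window` hourly values of each 24-hour day.

    A day is skipped when the test ended before its window was complete.
    """
    totals = []
    for start in range(0, len(hourly_grams), 24):
        window = hourly_grams[start:start + hours_per_window]
        if len(window) == hours_per_window:
            totals.append(sum(window))
    return totals


# Hypothetical 33-hour test with 0.5 g measured in every hour.
test_33h = [0.5] * 33
day_totals = nine_hour_totals(test_33h)   # one total for Day 1, one for Day 2
print(day_totals)
```

Under this layout a 33-hour or 38-hour test contributes a Day 1
and a Day 2 total, and a 72-hour test contributes totals for all
three days.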


4.0   INITIAL RESULTS

     The two models (i.e., equations  [1]  and [2])  were fitted to
the 9-hour data described above, one for Days  1 and 2,  and the
other for Days 2 and 3.  Also, these two models were fitted to
the  (smaller)  72-hour data set.  Regression coefficient estimates
were computed using the SAS GLM procedure.


4.1   Effect of Fuel Metering, Pressure/Purge Status, and Model Year

     In modeling the Day_2 or Day_3 diurnal emissions  (i.e.,
equations [1]  and [2]), neither of  the model  year (categorical)
terms is statistically significant.  Therefore, as a first step
toward simplification, the model year factor was removed from the
analysis of both models.

     In the regression analysis of Day-2 versus Day-1 RTD
emissions, after removing the two categorical model year  terms
and refitting the first model  (equation  [1]) ,  all of the
resulting  (analytical) terms are statistically significant.
Also, the categorical variable for fuel metering  (FM) is
significant at the five percent level.

     However, two of the three purge/pressure groupings ("Fail
Pressure" and "Fail Only Purge") do not differ significantly
(when the test results are compared on a pairwise basis).  As a
result, they were combined into a single group "FAIL"  (fail one
or both tests).   This combined group  ("FAIL") does differ
significantly from the remaining group "PASS"  (pass both  tests).
Therefore, a further simplification is used in which a vehicle is
classified as "PASS"  (pass both tests) or  "FAIL"  (fail one or
both tests).   The output of this regression analysis is given  in
Appendix A.  The coefficients produced by  this regression are
shown below in Table 2.

                             Table 2
                Coefficients to Estimate Second Day Diurnal
                             By Strata

              Passing Both Purge       Failing Either Purge
                 and Pressure              or Pressure
   Coeff        FI        Carb            FI        Carb
   -----      ------     ------         ------     ------
     A         47.48      49.06          48.61      50.19
     B         -0.70      -0.70          -0.70      -0.70
     C         -8.11      -8.11          -8.11      -8.11
     D          0.12       0.12           0.12       0.12
     E          0.74       0.74           0.64       0.64

NOTE:
      The values for the coefficients "B," "C," and "D" in the
      preceding table do not change by stratum.  However, we may
      want the flexibility of allowing them to vary as more data
      become available.
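
     As an illustration, equation [1] can be evaluated with the
Table 2 coefficients as sketched below.  The dictionary layout,
the function name, and the input values (84 degrees F, 9.0 psi,
10 grams) are hypothetical choices made for this sketch.  Because
the temperature and RVP terms are large and offsetting, results
computed from these two-decimal coefficients can differ noticeably
from results computed with the unrounded estimates in Appendix A.

```python
# Table 2 coefficients as printed, keyed by (purge/pressure status, fuel metering).
TABLE2 = {
    ("PASS", "FI"):   {"A": 47.48, "B": -0.70, "C": -8.11, "D": 0.12, "E": 0.74},
    ("PASS", "CARB"): {"A": 49.06, "B": -0.70, "C": -8.11, "D": 0.12, "E": 0.74},
    ("FAIL", "FI"):   {"A": 48.61, "B": -0.70, "C": -8.11, "D": 0.12, "E": 0.64},
    ("FAIL", "CARB"): {"A": 50.19, "B": -0.70, "C": -8.11, "D": 0.12, "E": 0.64},
}


def day2_emissions(status, fuel, temp_f, rvp_psi, day1_grams):
    """Equation [1]: D2 = A + B*T + C*R + D*T*R + E*D1."""
    c = TABLE2[(status, fuel)]
    return (c["A"] + c["B"] * temp_f + c["C"] * rvp_psi
            + c["D"] * temp_f * rvp_psi + c["E"] * day1_grams)


# Hypothetical inputs: midpoint temperature, fuel RVP, and first day result.
d1 = 10.0
d2 = day2_emissions("PASS", "FI", temp_f=84.0, rvp_psi=9.0, day1_grams=d1)
ratio_day2_to_day1 = d2 / d1
print(f"D2 = {d2:.2f} g, ratio = {ratio_day2_to_day1:.3f}")
```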

     However, following this same approach (i.e., removing the
two categorical model year terms and then refitting the second
model) yields terms that are NOT statistically significant.  For
this "Day 2 to Day 3" analysis, the purge/pressure test terms are
not statistically significant, possibly because most (23 of the
26) of the vehicles tested for the full three days were from the
single pass/pass purge/pressure group.  The categorical variable
distinguishing between the fuel-injected vehicles and the
carbureted vehicles was also NOT statistically significant,
possibly because most (18 of the 26) of the vehicles tested for
all three days were from the fuel-injected group.

     After removing the  (categorical) variables  for fuel  delivery
system and for the results on the Purge/Pressure tests,  the
regression was run once again.  For  this Day-3 versus  Day-2
analysis, all of the terms are statistically significant.   The
output of this analysis is given in  Appendix A.   The results of
this regression analysis are shown  (below)  in Table 3.


                              Table 3

               Coefficients to Estimate Third Day Diurnal
                           For ALL Strata

                    Coefficients     Values
                    ------------     ------
                         A            12.25
                         B            -0.21
                         C            -2.61
                         D             0.04
                         E             0.81

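
     Equation [2] can be evaluated with the Table 3 coefficients
in the same way.  This is a sketch only: the function name and
the input values are hypothetical, and the two-decimal
coefficients are used as printed.

```python
# Table 3 coefficients as printed (one set for all strata).
TABLE3 = {"A": 12.25, "B": -0.21, "C": -2.61, "D": 0.04, "E": 0.81}


def day3_emissions(temp_f, rvp_psi, day2_grams):
    """Equation [2]: D3 = A + B*T + C*R + D*T*R + E*D2."""
    c = TABLE3
    return (c["A"] + c["B"] * temp_f + c["C"] * rvp_psi
            + c["D"] * temp_f * rvp_psi + c["E"] * day2_grams)


# Hypothetical inputs: 84 degrees F midpoint, 9.0 psi RVP, 10 g on Day 2.
d3 = day3_emissions(temp_f=84.0, rvp_psi=9.0, day2_grams=10.0)
print(f"D3 = {d3:.2f} g")
```
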
4.2   Special Cases

     Several situations are not covered by actual test data.  In
these cases, EPA made assumptions on how to handle them in the
MOBILE6 model.

4.2.1  Diurnal Emissions for Periods Longer than Three Days

     For MOBILE6, EPA assumes  that the diurnal  emissions
stabilize following Day  3.  That  is,  for  the  relatively small
number of vehicles parked for  more than three days,  the diurnal
emissions for the fourth and later days will  be identical  to the
diurnal emissions of the third day.

     This appears reasonable since the equations (models)
described by Tables 2 and 3 do not predict  large changes among
the first three days of  diurnal emissions.  For the  case
represented by the largest number of  tests  (i.e.,  fuel  injected
vehicles that pass both  pressure  and  purge  tests), an argument
could be made for modeling continued  positive but decreasing
changes in diurnal evaporative emissions  for  succeeding days.
That is not proposed here since we lack data  with which to form
estimates.
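
     The stabilization assumption can be sketched as follows.  The
function name and the gram values are hypothetical.

```python
def daily_diurnals(day1, day2, day3, n_days):
    """Diurnal emissions for days 1..n_days; days after Day 3 repeat Day 3."""
    if n_days < 1:
        raise ValueError("n_days must be at least 1")
    first_three = [day1, day2, day3]
    return [first_three[min(day, 3) - 1] for day in range(1, n_days + 1)]


# Hypothetical values for a vehicle parked five days.
profile = daily_diurnals(10.0, 9.0, 8.5, n_days=5)
print(profile)
```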

4.2.2 Diurnal Emissions of "Gross  Liquid Leakers" (GLLs)

     In a series of parallel reports  (see  the  report  identified
in reference 1) ,  we noted that for  a  small  number of  vehicles,
the primary mechanism of evaporative  emissions was  the
substantial leakage of liquid gasoline  (as  opposed  to simply
vapor leaks).   In each of those reports, such  vehicles  were
referred to as "Gross Liquid Leakers"  (GLLs).

     For MOBILE6, EPA assumes that  the  quantity  of  leaking
gasoline will remain unchanged for  each day that the  vehicle is
parked.  Thus, the diurnal emissions  for GLLs  for each  day will
be identical to the diurnal emissions of the first  day.


4.2.3  Diurnal Emissions of  Vehicles Certified Using the Enhanced
      Evaporative Test Procedure ("ETPs")

     Beginning with the  1996 model  year, manufacturers  began
phasing in vehicles certified to the  new enhanced evaporative
test procedure (ETPs).   Since these ETPs were  designed  to meet  a
more stringent set of evaporative standards, the assumptions
(used in MOBILE6) predict very low  diurnal  emissions  for the
first day  (assuming that the evaporative control system is
functioning properly).   These first day diurnal  emission values
are lower than the averages used to generate the model  (i.e.,
equation [1]).  Thus,  these vehicles are outside the  range of the
sample data.

     MOBILE6 would normally model second day diurnal  emissions
from ETP vehicles with properly functioning evaporative control
systems using the first  column in Table 2  (fuel-injected vehicles
that pass both the purge and pressure tests).  However,  applying
this equation results in predicting the diurnal  emissions from
the second day to be substantially  higher  than actually measured
(in other programs).  The limited amount of actual data on these
vehicles suggests that there is little if any difference among the
emissions for the three days of the RTD test.

     Therefore, in MOBILE6, the diurnal emissions for ETPs with
properly functioning evaporative control systems for  all days
will be identical to the diurnal emissions  of  the first day.
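
     The two constant-emission cases (GLLs in Section 4.2.2 and
properly functioning ETP vehicles in this section) can be sketched
together.  The class labels, the function name, and the gram
values are hypothetical.

```python
# Vehicle classes whose diurnal emissions are held at the Day 1 value.
CONSTANT_CLASSES = {"GLL", "ETP"}


def constant_profile(vehicle_class, day1_grams, n_days):
    """Repeat the first day diurnal emissions for every parked day."""
    if vehicle_class not in CONSTANT_CLASSES:
        raise ValueError("class is not treated as constant from day to day")
    return [day1_grams] * n_days


# Hypothetical first day values.
print(constant_profile("GLL", 25.0, 3))
print(constant_profile("ETP", 0.4, 4))
```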


4.2.4  Avoid Having "FAILING" Vehicles With  Lower Emissions Than
      "PASSING" Vehicles

     For the scenarios actually tested, equations [1] and [2]
predict that the diurnal emissions of vehicles failing either the
purge or pressure tests will be higher than those of vehicles
passing both tests (all other parameters being equal).  However,
it is mathematically possible for those equations to predict
higher diurnal emissions for "passing" vehicles than for
"failing" vehicles.  Since this situation is not reasonable, we
will limit ("cap") the diurnal emissions such that:

        •  The diurnal emissions of the vehicles that fail only
          the purge test will not exceed the diurnal emissions of
          the vehicles that fail the pressure test.

        •  The diurnal emissions of the vehicles that pass both
          the purge and pressure tests will not exceed the
          diurnal emissions of the vehicles that fail only the
          purge test.
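
     The two caps above can be sketched as a pair of min()
operations applied in order.  The function name and the gram
values are hypothetical; this is not necessarily how MOBILE6
implements the caps.

```python
def apply_status_caps(pass_both, fail_purge_only, fail_pressure):
    """Enforce pass_both <= fail_purge_only <= fail_pressure (grams)."""
    # Vehicles failing only purge may not exceed vehicles failing pressure.
    fail_purge_only = min(fail_purge_only, fail_pressure)
    # Vehicles passing both tests may not exceed vehicles failing only purge.
    pass_both = min(pass_both, fail_purge_only)
    return pass_both, fail_purge_only, fail_pressure


# Hypothetical predictions in which the ordering is violated.
capped = apply_status_caps(pass_both=12.0, fail_purge_only=11.0, fail_pressure=10.0)
print(capped)
```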


4.2.5  Setting a Lower Bound  for the Ratios of Consecutive Days

     For the scenarios actually tested, the ratios of consecutive
day diurnal emissions predicted by equations [1] and [2] closely
approximate the ratios of the actual test vehicles.  However, it
is mathematically possible for those ratios to be much too small
for some untested combination of factors.  Hence, we will limit
("cap") the ratios such that:

        •  The ratio of Day_2 to Day_1 will not be smaller than
          the "E" coefficient of equation [1].

        •  The ratio of Day_3 to Day_2 will not be smaller than
          the "E" coefficient of equation [2].
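
     The lower bound can be sketched as follows.  The function
name and the input values are hypothetical.  Since dividing
equation [1] or [2] by the previous day's emissions leaves E plus
a remainder, the bound takes effect only when the combined
intercept, temperature, and RVP terms are negative.

```python
def floored_ratio(predicted_next_day, previous_day, e_coefficient):
    """Ratio of consecutive-day emissions, not allowed to fall below E."""
    if previous_day <= 0:
        raise ValueError("previous day emissions must be positive")
    return max(predicted_next_day / previous_day, e_coefficient)


# Hypothetical predictions, using E = 0.74 from Table 2 (PASS strata).
print(floored_ratio(13.81, 10.0, 0.74))   # ratio is above the bound
print(floored_ratio(5.0, 10.0, 0.74))     # ratio is raised to the bound
```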


4.2.6  Comparison to MOBILE5

     Equations [1] and [2] predict smaller changes in day-to-day
diurnal emissions in MOBILE6 than were predicted in MOBILE5.
However, the predictions in MOBILE5 were based on theoretical
models rather than on actual multi-day testing.  EPA believes
that these new predictions, which are based on actual multi-day
diurnal (i.e., 72-hour RTD) tests, are more realistic.  Thus, EPA
will use in MOBILE6 the factors described in this report.


5.0   CONCLUSION

     Day-to-day diurnal evaporative emissions are found to change
over the first  three days for several combinations of a vehicle's
fuel delivery system and pressure/purge test status.  Temperature
and fuel vapor pressure effects also are evident.  Estimates of
these changes based on equations [1]  and [2]  (and Tables  2  and
3),  as modified by the special cases in Section 4.2, are used in
MOBILE6.

     The MOBILE model distinguishes between resting loss and
diurnal evaporative emissions.  The analysis presented here takes
a simplified approach, treating resting losses as constant so
that any change from one day to the next is entirely due to the
diurnal.

     In the parallel report entitled "Modeling Hourly Diurnal
Emissions and Interrupted Diurnal Emissions Based on Real-Time
Diurnal Data" (M6.EVP.002),  EPA states that a vehicle would be
undergoing the second day of a multi-day diurnal if the diurnal
began no later than 8 AM of the previous day which is equivalent
to the engine being shut off by 6 AM of the previous day.

     For MOBILE6 to actually use estimates of multi-day diurnal
emissions, we must know, for each hour of the day (or for at
least the 18 hours between 6 AM and midnight), the percent of the
fleet that has been soaking for "n" hours (n = 1, 2, 3, ..., 72).
The analysis that yields this distribution of fleet activity can
be found in report number M6.FLT.006 (entitled "Soak Length
Activity Factors for Diurnal Emissions").

-------
                                Appendix A

                    Statistical Output Supporting Table 2
                       General Linear Models Procedure

Dependent Variable: HC2

                                Sum of            Mean
Source              DF         Squares          Square    F Value    Pr > F
Model                7     55268.47026      7895.49575     557.04    0.0001
Error              530      7512.27975        14.17411
Corrected Total    537     62780.75001

          R-Square         C.V.      Root MSE      HC2 Mean
          0.880341     39.77774      3.764852      9.464721

Source              DF     Type III SS     Mean Square    F Value    Pr > F
TMP                  1       368.73554       368.73554      26.01    0.0001
RVP                  1       439.01450       439.01450      30.97    0.0001
TMP*RVP              1       647.61647       647.61647      45.69    0.0001
HC1                  1     11882.95930     11882.95930     838.36    0.0001
PS                   1       105.43909       105.43909       7.44    0.0066
HC1*PS               1       170.37081       170.37081      12.02    0.0006
FM                   1       209.38302       209.38302      14.77    0.0001

                                  T for H0:                 Std Error of
Parameter           Estimate    Parameter=0    Pr > |T|       Estimate
INTERCEPT        47.48019419           4.09      0.0001    11.61899741
TMP              -0.70187039          -5.10      0.0001     0.13760917
RVP              -8.10865443          -5.57      0.0001     1.45699245
TMP*RVP           0.11755584           6.76      0.0001     0.01739135
HC1               0.74038591          28.95      0.0001     0.02557077
PS                1.12814549           2.73      0.0066     0.41363019
HC1*PS           -0.10093455          -3.47      0.0006     0.02911322
FM                1.58318070           3.84      0.0001     0.41191509
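
     The summary statistics in the output above follow from the
sums of squares by standard regression arithmetic.  The sketch
below recomputes several of them from the printed values; the
variable names are hypothetical, and the relations used (mean
square = SS/DF, F = model MS / error MS, R-square = model SS /
total SS, t = estimate / standard error) are the usual
definitions rather than anything specific to this report.

```python
import math

# Values copied from the Table 2 output above.
model_ss, model_df = 55268.47026, 7
error_ss, error_df = 7512.27975, 530
hc2_mean = 9.464721

model_ms = model_ss / model_df              # model mean square
error_ms = error_ss / error_df              # error mean square
f_value = model_ms / error_ms               # overall F statistic
r_square = model_ss / (model_ss + error_ss)
root_mse = math.sqrt(error_ms)
cv_percent = 100.0 * root_mse / hc2_mean    # coefficient of variation

# t statistic for the HC1 (previous day) term: estimate / standard error.
t_hc1 = 0.74038591 / 0.02557077

print(f"F = {f_value:.2f}, R-square = {r_square:.6f}")
print(f"Root MSE = {root_mse:.6f}, C.V. = {cv_percent:.5f}")
print(f"t(HC1) = {t_hc1:.2f}")
```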

                    Statistical Output Supporting Table 3
                       General Linear Models Procedure

Dependent Variable: HC3

                                Sum of            Mean
Source              DF         Squares          Square    F Value    Pr > F
Model                4     6375.824176     1593.956044     562.77    0.0001
Error              113      320.053957        2.832336
Corrected Total    117     6695.878132

          R-Square         C.V.      Root MSE      HC3 Mean
          0.952201     24.79568      1.682955      6.787288

Source              DF     Type III SS     Mean Square    F Value    Pr > F
TMP                  1        6.987337        6.987337       2.47    0.1191
RVP                  1        9.853014        9.853014       3.48    0.0648
TMP*RVP              1       17.397714       17.397714       6.14    0.0147
HC2                  1     2692.480471     2692.480471     950.62    0.0001

                                  T for H0:                 Std Error of
Parameter           Estimate    Parameter=0    Pr > |T|       Estimate
INTERCEPT        12.24739357           1.12      0.2659    10.95272601
TMP              -0.20580431          -1.57      0.1191     0.13103008
RVP              -2.61263214          -1.87      0.0648     1.40076763
TMP*RVP           0.04206389           2.48      0.0147     0.01697211
HC2               0.80699924          30.83      0.0001     0.02617395

-------
                           Appendix B


         Response to Peer Review Comments from H. T. McAdams


     This report was formally peer reviewed by one peer reviewer
(H. T. McAdams).   In this appendix, comments from H. T. McAdams
are reproduced in plain text, and EPA's responses to those
comments are interspersed in indented italics.

     In order to respond to  (and incorporate) comments from the
     peer reviewer  (as well as from stakeholders), this final
     version of the report has changed substantially from the
     earlier draft version that was reviewed.  Some of those
     changes have resulted in many of the comments no longer
     being applicable.


               ************************************

      Evaluating Multiple Day Diurnal Evaporative Emissions
                         Using RTD Tests

                               By

                            Phil Enns

                     Report Number  M6.EVP.003

                       Review and Comments
                               By
                          H.  T.  McAdams


1.  INTRODUCTION AND SUMMARY

Report Number M6.EVP.003 is herein reviewed in accordance with a
letter postmarked February 17, 1999 from Mr. Philip A. Lorang,
Environmental Protection Agency (EPA) to Mr. H. T. McAdams,
AccaMath Services.  The reviewer is tasked to address report
clarity, overall methodology, appropriateness of the data sets
used, statistical and analytical methodology and the
appropriateness of conclusions,  with specific attention to data
stratification and predictive equations.

These topics are summarized briefly here, and are discussed in
more detail, as is deemed necessary, in the body of the report.

   *  The report is well written,  concise and readable.  Notation
     is simple and easy to identify and follow.

   *  The overall methodology is consistent with that employed in
     other, similar EPA reports, specifically M6.EVP.001 and
     M6.EVP.005.  The review offers some modifications for
     possible improvement.

   *  The datasets used,  as in M6.EVP.001 and M6.EVP.005, are far
     from ideal, but considerable ingenuity is displayed in
     adapting the available data to the questions at hand.  In
     the review, suggestions are made for possibly extracting
     even more information from the data.

   *  Statistical approaches other than regression analysis should
     be considered, inasmuch as the "variables" of interest are
     discrete rather than continuous. If regression is used,
     logarithmic transformation of the response variable may not
     be necessary and could even have a biasing effect on
     emission estimates.  Alternative approaches are outlined.

   *  The general thrust of the conclusions is that evaporative
     emissions vary from day to day.  However, the quantitative
     extent of that variation and how long it takes to decay is
     subject to question.


It is to be understood that many of the criticisms of M6.EVP.001
and M6.EVP.005 apply to the report being reviewed. For example,
in the reports previously reviewed, it was indicated that error
bounds are essential to a complete statistical analysis and that
care should be taken in stating levels of significance.  The
report now being reviewed, however, clearly states that the
objective of the statistical analysis is to provide point
estimates of the various quantities of interest. That being the
case, this review eschews the consideration of confidence bounds.
Nevertheless, it is recommended that such concerns be considered
in subsequent modifications of the MOBILE model.

2. ANALYSIS AND DISCUSSION

The report is subject to several methodological difficulties.
Whether regression analysis should be the procedure of choice is
a legitimate question,  particularly in view of the discrete
nature of most of the "variables." And, if a linear model is to
be used, is logarithmic transformation appropriate under the
present circumstances?  A least-squares fit in log space does not
guarantee a least-squares fit in the original space, nor does it
ensure unbiased or minimum variance estimates. Much depends on
the nature of the error distribution and the nature of the
response to incremental changes in the predictor variables.

Just how serious these concerns are can not be ascertained
without further computations considered to be beyond the scope of
this review.  It is recommended, however, that other approaches
be tried, specifically regression without log transformation, and

possibly straightforward Analysis of Variance (ANOVA).   It is
also suggested that residuals and R-Square be computed both in
log space and in the inverse space and that these be compared
with results of a non-linear methodology for fitting the data to
an equation. Finally,  an example is given to show how sampling
experiments can be helpful in selecting the most appropriate
model.

     We agree.  We have replaced the logarithmic approach with a
     simple linear approach.


     2.1 Stratification

In view of the fact that most of the variables are dichotomous,
estimating day-to-day changes in evaporative emissions comes down
to a matter of vehicle classification. Observable classification
features, such as fuel-metering systems,  model year and pass/fail
status re purge and pressure tests provide natural groupings. So
far as evaporative emission characteristics are concerned,
however, some of these groups may be indistinguishable from
others. A primary objective of statistical analysis, therefore,
is to determine the minimum number of vehicle classes or strata
to span the evaporative emission characteristics of the fleet.

Viewed in this light,  the problem would seem to be a candidate
for straightforward Analysis of Variance (ANOVA).  In such
instances, the relative magnitude of within-group and between-group
variation is the major discriminant, and there exists a variety
of procedures
and associated software for dealing with such problems.
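
The within-group versus between-group comparison described above can be sketched as a one-way ANOVA F ratio. This is a minimal illustration only: the function name and the three small strata are hypothetical and are not taken from the RTD data.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F ratio, between-group df, within-group df) for a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: spread of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical emission values for three candidate strata (illustration only).
strata = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_ratio, df_b, df_w = one_way_anova_f(strata)
print(f_ratio, df_b, df_w)
```

A large F ratio suggests that the strata differ by more than their internal scatter; strata whose pairwise comparison yields a small ratio are candidates for merging.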

     We have followed this suggestion by reducing the number of
     strata in the analysis of "Day_3 to Day_2" data (see the
     current version of Table 3).


     2.2 Model Estimation

The methodology employed in the report is the usual General
Linear Model  (GLM).  The response variable, however, is not
diurnal emissions but log(emissions).  [Note: here the notation
convention is that log (not ln) refers to natural logarithms].
Logarithms of the emission observations were regressed on the
variables listed at the top of page 2  in the report. According to
the report,  an advantage of this representation is that when the
equation is differentiated, the derived equation, after being
multiplied by 100,  yields the percent  change in emissions for a
unit change in predictor variables.

Justification for this interpretation is given in the Appendix of
the report.  The derivation given there is mathematically correct,
but is usually applied to continuous variables like RVP or
temperature. In the present instance, only two of the variables
are continuous, and even those two are irrelevant, because they
are not entered as interacting with the DAY variable. Strictly
speaking, then, the variables that are "differentiated" with
respect to DAY do not have a derivative in the usual sense of the
word.  Rather, the appropriate mathematical discipline is the
calculus of finite differences.

The difference in the value of the emission function when a
predictor variable is set at its extremes  (0 and 1) is what plays
the role of "derivative." If there are three points in the
function's domain, as there are for the DAY variable, differences
can be computed for the second two points just as well as for the
first two points. Also, it would be possible to compute a second
order difference as the difference between those differences, and
that quantity would be analogous to a second derivative in the
case of a continuous variable.
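
The distinction between a derivative and a finite difference can be made concrete with a short sketch. The day values and the coefficients below are hypothetical, chosen only for illustration. For a 0/1 predictor in a log-linear model, the exact percent change is 100*(exp(b) - 1), which the derivative-based figure 100*b approximates only when b is small.

```python
import math


def first_differences(values):
    """Differences between successive points, e.g. Day 2 - Day 1 and Day 3 - Day 2."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


def second_difference(values):
    """Difference of the two first differences for a three-point domain."""
    d = first_differences(values)
    return d[1] - d[0]


def percent_change_exact(b):
    """Exact percent change in the response when a 0/1 predictor goes from 0 to 1."""
    return 100.0 * (math.exp(b) - 1.0)


def percent_change_derivative(b):
    """Percent change implied by differentiating the log-linear model."""
    return 100.0 * b


# Hypothetical three-day sequence of emissions (illustration only).
day_values = [10.0, 12.0, 15.0]
print(first_differences(day_values), second_difference(day_values))

# Hypothetical log-space coefficients: one small, one large.
for b in (0.10, 0.70):
    print(b, percent_change_derivative(b), percent_change_exact(b))
```

The two percent-change figures nearly coincide for the small coefficient and diverge widely for the large one, which is the practical content of the finite-difference remark.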

     We agree.  We have dropped the differential approach.

These fine distinctions are not of any particular consequence
except to point out that in the discrete realm, we are dealing
with simple differences between quantities, and that ratios can
be formed without having to resort to logarithmic
transformations, which bring their own nuances (and nuisances) to
the scene.

An indication of the difficulties that might be encountered is
found in the following quotation from the report, page 6.

     The apparent decrease in the failing fuel-injected mean from
     Day 1 to Day 2 appears inconsistent with the finding of a
     13.3% rate of increase.  This is explained by the fact that
     the percent change is derived from the logarithms of the
     individual emission levels, which has a disproportionate
     effect [on] larger emission values.  For these two
     subsamples, the means of the logarithms increase  (from 1.59
     to 1.76)  as expected.


This discrepancy is a wake-up call for the possibility of other
difficulties associated with a logarithmic transformation. If
logarithms can cause "disproportionate effects" here, then
perhaps they may be causing difficulties elsewhere, in a way that
is not apparent.  Indeed, a closer look at the nature of the
transformation and how it affects the analysis of the present
data is in order.


     2.2.1 How Log Transformation Affects Sample Means and Variances

As has been noted, the list of predictor variables reveals the
fact that all but two of the predictor variables are discrete and
dichotomous. For discrete variables, regression is little more
than just a way of classifying data into sets of observations and
finding the means for those sets.

It is for this reason that it is appropriate to examine how a
logarithmic transformation affects the mean and variance of a
simple column of data. The mean is the least-squares estimate of
a measure of central tendency, and, if the data are samples from
a normal distribution, then the least-squares estimate is also
the maximum likelihood estimate.  Let us not forget,
however, what sample space we are working in.  The sample mean is
the most likely estimate of the mean of the population of the
logarithms of the emissions, but the sample mean, when
exponentiated, does not provide the most likely mean of the
population of emissions expressed in appropriate units such as gm/mi.

Under certain circumstances,  logarithmic transformation has
definite advantages in the analysis of emission data and has been
extensively used for that purpose. The Complex Model for
Reformulated Gasoline (RFG) is a good example. Like any good
medicine though, it can have some nasty side effects.  Mostly,
these effects arise from the fact that what is a summation in
emission space becomes an iterated multiplication in log space.
That being the case, computing the mean in log space is
equivalent to computing the geometric mean in the real world -
that is, the Nth root of the product of N numbers, N being the
sample size.

Consider a normally distributed population of logarithms with
mean 0 and variance 1. It can be shown that, if m and s denote
respectively the mean and standard deviation of the logarithms,
then the mean M and variance V of the antilogs are:

          M = exp(m + (1/2) s^2)                            (1)

          V = exp(2m + 2 s^2) - exp(2m + s^2)               (2)

Note that both the mean M and the variance V of the antilogs are
functions of both the mean m and the variance s^2 of the
logarithms. Of particular note is the fact that if we just
exponentiate the mean logarithm m, we will underestimate the
mean in emission space.  There is an added (1/2) s^2 in the argument of
the exponential that leads to the curious property that the
greater the variance of the logarithms, the more the mean of the
antilogs is inflated.
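
Equations (1) and (2) can be evaluated directly. The sketch below uses hypothetical function names and takes m = 0 and s = 1 as the example case. It compares the mean of the antilogs with the value obtained by simply exponentiating m, which is the geometric mean.

```python
import math


def antilog_mean(m, s):
    """Equation (1): mean of the antilogs of normally distributed logarithms."""
    return math.exp(m + 0.5 * s ** 2)


def antilog_variance(m, s):
    """Equation (2): variance of the antilogs of normally distributed logarithms."""
    return math.exp(2 * m + 2 * s ** 2) - math.exp(2 * m + s ** 2)


def exponentiated_log_mean(m):
    """Naive back-transform of the mean logarithm (the geometric mean)."""
    return math.exp(m)


m, s = 0.0, 1.0
print(antilog_mean(m, s), antilog_variance(m, s), exponentiated_log_mean(m))
```

With s = 0 the first and third quantities coincide; for any larger s the naive back-transform falls below the mean given by equation (1).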

From (1),  therefore, it is clear that if one computes the mean m
for a sample of N normally distributed logarithms, there is only
one circumstance under which it is legitimate to take the
exponential of m as the mean M of the antilogarithms. That
circumstance is when the variance of the sample of logarithms is
zero (0).   In other words, the transformation is legitimate only
if all items in the sample are identical.  Otherwise, the
transformed logarithm yields the geometric mean, and, of course,
if all the observations are identical, then the arithmetic and
geometric means are the same.

The impact of these  equations  can be better appreciated by
conducting a simple  sampling simulation.  A sample of 10,000
random numbers was drawn from  a normal  distribution with mean 0
and variance 1. Then the 10,000 random numbers were treated as
logarithms and their mean and  standard deviation were computed.
Next, the exponentials of the  10,000 random logarithms were
computed and were treated as if they were emissions expressed in
appropriate physical units,  such as gm/mi.   In the table below,
sample and population statistics for logs and antilogs are
compared.


   COMPARISON OF SAMPLE MEAN AND VARIANCE OF 10,000 RANDOM SAMPLES

                                                         Mean   Variance
   Sample data for 10,000 random logarithms N(0,1)    -0.0095     1.0049
   Exponential transforms of the above statistics      0.9985     2.7316

   Statistics for exponentials of 10,000 logs          1.6482     4.9785

   Population parameters for N(0,1) logarithms              0          1
   Transforms (exponentials) of above parameters            1     2.7183


It is clear that if  we transform the mean logarithm just by
taking its antilogarithm,  then the resulting number, if reported
as the mean, is biased downward  (from 1.6482 to 0.9985). From
Equation (1), it is clear that if the variance of the logarithms
has any nonzero value, then the antilog transform of the mean
logarithm is smaller than the mean computed directly from the
data.  Similarly, if we  take the antilogarithm of the variance
and report it as the variance  of the antilogarithms, it too is
biased.
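
A sampling experiment of this kind can be reproduced along the following lines. This is a sketch under stated assumptions, not the reviewer's original computation: the seed and the random number generator are arbitrary choices, so the figures will differ somewhat from those in the table.

```python
import math
import random
import statistics

# Arbitrary seed, chosen only so that the sketch is repeatable.
rng = random.Random(12345)

# Draw 10,000 "logarithms" from N(0, 1) and exponentiate them.
logs = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
antilogs = [math.exp(v) for v in logs]

mean_log = statistics.fmean(logs)
var_log = statistics.variance(logs)

naive_mean = math.exp(mean_log)            # exponentiated mean logarithm
direct_mean = statistics.fmean(antilogs)   # mean computed in emission space
corrected_mean = math.exp(mean_log + 0.5 * var_log)  # equation (1) with sample values

print("mean and variance of logs:", mean_log, var_log)
print("naive back-transform:     ", naive_mean)
print("direct mean of antilogs:  ", direct_mean)
print("equation (1) estimate:    ", corrected_mean)
```

The naive back-transform should land near 1, while the direct mean and the equation (1) estimate should both land near exp(0.5), or about 1.65.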

How much these flaws affect  the computation of the day to day
changes in evaporative emissions is not known, because the
required computation is  beyond the scope of this assignment.
Inasmuch as our interest is  in the ratio of one day's emissions
to another day's emissions,  there may be a compensating effect
that tends to alleviate  part (but not all)  of the error.
Nevertheless, it is  suggested  that the data be reconsidered with
the above considerations in  mind.

     We agree.  We have  replaced the logarithmic approach with a
     simple linear approach.


     2.2.2 How Logarithmic Transformation Affects Regression

It is correctly remarked in  the report  that regression provides a
least-squares fit to the data.   However, the parameters estimated
by least squares apply to log space, not to the real world of
emissions. Perhaps the most disturbing aspect of the log
transformation is that the errors are multiplicative rather than
additive.

For example, consider the linear form of the model, as shown on
page two  (2) of M6.EVP.003:

     y = b_0 + b_1 x_1 + b_2 x_2 + ... + b_k x_k                  (3)

where

           y = log(emissions)

As written in equation (3), log(emissions) is expressed
deterministically as a function of x_1, x_2, ..., x_k.  Actually, as
is well known, the equation has an error term:

     y = b_0 + b_1 x_1 + b_2 x_2 + ... + b_k x_k + err            (4)

where err denotes a random error assumed to be normally
distributed.  Then, transforming (4) back into emissions space,
we have:

     Emissions = exp(b_0 + b_1 x_1 + b_2 x_2 + ... + b_k x_k) * exp(err)    (5)

Equation  (5) makes it clear that the greater the emissions, the
greater the error.
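
A small numerical sketch may help show why a multiplicative error grows with the emission level. The log-space error of 0.2 and the two emission levels are hypothetical values chosen for illustration.

```python
import math


def emissions_with_error(linear_predictor, err):
    """Equation (5): back-transformed response with a multiplicative error term."""
    return math.exp(linear_predictor) * math.exp(err)


def absolute_error(linear_predictor, err):
    """Deviation, in emission units, caused by a given log-space error."""
    return emissions_with_error(linear_predictor, err) - math.exp(linear_predictor)


err = 0.2  # hypothetical log-space error, identical for both cases
low = absolute_error(math.log(1.0), err)    # error-free emissions of 1 unit
high = absolute_error(math.log(10.0), err)  # error-free emissions of 10 units
print(low, high, high / low)
```

The same log-space error produces an emission-space error ten times larger when the underlying emission level is ten times larger.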

This behavior is not necessarily disadvantageous, however. If a
random variable really is such that its variance increases with
its mean value, then a logarithmic transformation tends to
stabilize the variance.  This behavior is a legitimate basis for
performing a log transformation of the dependent variable when
that variable is regressed on one or more predictor variables.

However, if variance is independent of the population mean, then
the log transformation is disadvantageous, because it is
equivalent to giving the larger values in a sample greater weight
in determining the regression coefficients. Consequently, the
regression coefficients are biased in favor of minimizing the
larger residuals at the expense of the smaller residuals.  It is
as if we are performing a weighted least squares estimation in
which the weights are proportional to the numbers being fitted.

Performing least squares regression of logarithms is also
appropriate in another circumstance.  In simple regression, it is
assumed that for a given incremental change in a predictor
variable, the response increment will be the same regardless of
how large the response is.  In some instances, however, the
response is proportional to the value of the response variable.
This phenomenon,  of course, is the basis for interpreting the
regression coefficients as the fractional or percentage change in
the dependent variable for a unit change in the predictor
variable. The Appendix in M6.EVP.003 contains a proof of this
interpretation.

The best of both worlds is realized when both of the above
circumstances are realized simultaneously. It is conjectured that
that circumstance rarely occurs.

It is probably fair to say that logarithmic transformation is
most frequently selected not for either of these reasons, but
because it simplifies the analysis.  The General Linear Model can
be applied, and we are spared the difficulties of performing a
nonlinear minimization of errors.

In the case under consideration, in which most of the variables
are dichotomous, it is hard to see why log transformation is
preferred to emission space.  A hypothetical problem was set up
to demonstrate what might happen if only dummy variables are
present.  The dataset for this demonstration is in Appendix I
[renamed Appendix B-1 in this report].

The problem data set contains twenty observations in which the
response variable is dependent on three dummy variables that
assume the values of either 0 or 1. A simple model is one in
which the response variable y is regressed on the dichotomous
variables x_1, x_2, and x_3:

             y  = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3                  (6)

As an alternative, the response variable was logarithmically
transformed:

             yy = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3                  (7)

where yy = log(y).

  The regression coefficients under the two models are as
follows:

                                y-space    yy-space

              Intercept        17.6572      2.7624
              x1               12.7385      0.7430
              x2                3.6462      0.1803
              x3               -9.3615     -0.5391

Residuals were computed for both regressions (see Appendix B-1) and
are plotted in Figure 1.  R-square is 0.8203 in y-space and 0.7981
in yy-space (log space).

                          y-space     yy-space*

      Sum of residuals     0.0000      3.5901
      Residual Std.        3.5202      4.9356

     * Computed as the difference between observed responses and
       exponentiated responses as calculated from the log model.

As predicted, when the responses computed from the log model are
transformed to their equivalents in y-space, the residuals are
biased and have a larger variance than when computed directly
from the simple model in y-space.
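
The demonstration can be repeated with a few lines of code. The sketch below is not the reviewer's original program. It fits models (6) and (7) by ordinary least squares to the twenty observations as read from Appendix B-1, then compares residuals in y-space. The helper names are hypothetical, and the normal equations are solved by plain Gaussian elimination to avoid any library dependency.

```python
import math

# Demonstration data (x1, x2, x3, y) as read from Appendix B-1.
DATA = [
    (0, 0, 0, 13.4), (0, 0, 0, 17.6), (0, 0, 1, 10.5), (0, 0, 1, 7.1),
    (0, 1, 0, 20.6), (0, 1, 0, 31.3), (0, 1, 1, 8.0), (0, 1, 1, 8.8),
    (1, 0, 0, 30.2), (1, 0, 0, 29.2), (1, 0, 1, 18.3), (1, 0, 1, 24.7),
    (1, 1, 0, 32.3), (1, 1, 0, 31.1), (1, 1, 1, 27.6), (1, 1, 1, 20.1),
    (0, 1, 0, 22.4), (1, 1, 1, 25.3), (1, 0, 1, 24.8), (1, 1, 1, 27.1),
]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows, responses):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, responses)) for i in range(k)]
    return solve(xtx, xty)


def predict(rows, coef):
    return [sum(c * v for c, v in zip(coef, r)) for r in rows]


def r_square(observed, residuals):
    centre = sum(observed) / len(observed)
    total = sum((v - centre) ** 2 for v in observed)
    return 1.0 - sum(e * e for e in residuals) / total


X = [(1, x1, x2, x3) for x1, x2, x3, _ in DATA]  # leading 1 is the intercept
y = [row[3] for row in DATA]
log_y = [math.log(v) for v in y]

coef_y = ols(X, y)        # model (6), fitted in y-space
coef_log = ols(X, log_y)  # model (7), fitted in log space

resid_y = [obs - fit for obs, fit in zip(y, predict(X, coef_y))]
# Log-model predictions are exponentiated before residuals are taken in y-space.
resid_back = [obs - math.exp(fit) for obs, fit in zip(y, predict(X, coef_log))]

print("y-space coefficients:  ", [round(c, 4) for c in coef_y])
print("log-space coefficients:", [round(c, 4) for c in coef_log])
print("sum of residuals (y, back-transformed):", sum(resid_y), sum(resid_back))
print("R-square in y-space:", r_square(y, resid_y))
```

The residuals of the direct fit sum to zero by construction, whereas the back-transformed residuals of the log fit do not, which is the bias referred to above.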

The intent of these demonstrations is to emphasize the
desirability of doing some "experimental statistics" before
deciding on whether the analysis is best served in log space or
antilog space.  In the case of M6.EVP.003,  it is suggested that a
simple linear model be considered as an alternative to the log-
based model.  The effort required is small enough to be executed
now, and its implications should be kept in mind in any future
modifications of the MOBILE model.


     2.2.3 How About Vehicle Effects?

According to the wording of the report, "a vehicle factor was
included" in the model.  The report goes on to say, "This
effectively fits a different intercept term to each vehicle and
helps produce sharper  [sic] estimates of the  [regression]
coefficients."  No interpretation or measure is given to clarify
what the term "sharper" implies.

Removing vehicle effects is essential in a case like this. In the
development of the Complex Model for Reformulated Gasoline (RFG),
vehicle-to-vehicle differences accounted for some 95% of the
variance of the response variable.  What is lacking in the
present instance is a similar comparison of vehicle effects and
effects induced by other sources. Beyond the above quotation
regarding "a vehicle factor" and an allusion to intercepts, no
further explanation of dealing with that factor is given.  And,
there is no mention of vehicle effects in the computer print-outs
of Tables 2 and 3.

     The "vehicle factor"  that was used in the analysis was the
     prior day's actual RTD emissions (identified in Appendix A
     as "HC1" and "HC2").  These "vehicle factors" were used in
     both the previous draft and this final version.

The conventional way of dealing with extraneous variables like
vehicles is to enter each as a "dummy variable" having two
states, 0 and 1, representing respectively absence or presence of
that vehicle. There is a loss of one degree of freedom for each
vehicle. Vehicle degrees of freedom are not accounted for in the
printouts,  nor is there any data showing the magnitude of the
effects of either individual or aggregated vehicles. The report
needs to be more explicit on this point, not only for the purpose
of quantifying what is meant by "sharper" estimates, but also to
see how vehicle effects compare with DAY effects and other
sources of variance.
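
The dummy-variable coding described above can be sketched as follows. The vehicle identifiers are hypothetical. The sketch shows that each vehicle contributes one 0/1 column, and that these columns add up, row by row, to a column of ones.

```python
def vehicle_dummies(vehicle_ids):
    """Return the distinct vehicles and one 0/1 indicator column per vehicle."""
    levels = sorted(set(vehicle_ids))
    columns = {
        level: [1 if v == level else 0 for v in vehicle_ids] for level in levels
    }
    return levels, columns


# Hypothetical design: three vehicles, each observed on three days.
vehicle_ids = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
levels, columns = vehicle_dummies(vehicle_ids)

# Each observation belongs to exactly one vehicle, so the indicator
# columns sum to 1 in every row, duplicating an intercept column.
row_sums = [sum(columns[lv][i] for lv in levels) for i in range(len(vehicle_ids))]
intercept_column = [1] * len(vehicle_ids)

print(levels)
print(row_sums == intercept_column)
print("degrees of freedom used by vehicle terms:", len(levels))
```

Because the indicator columns reproduce a column of ones, a model containing every vehicle dummy together with an intercept is not of full rank, and one of those terms has to be dropped or treated as inestimable.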

Finally, no value is given for the intercept in the computer
printouts. Presumably that is because the matrix of the normal
equations is less than full rank, because the sum of the vehicle
vectors is equal to the intercept vector.  In that case, either
the intercept or some other effect must be taken as inestimable.
For present purposes, lack of an intercept is of no concern,
because interest is concentrated on the ratio of emissions from
one day to another. To scale model estimates up to real-world
values, however, an estimate of baseline emissions would be
required.


3. APPROPRIATENESS OF CONCLUSIONS

A conclusion of the report is that day-to-day evaporative
emissions are found to change over the first three days for
several classes of vehicles. The data, though not ideal, support
this conclusion, but AccaMath believes that estimates of the
extent of these changes could be improved, even with the present
data.  Least satisfactory of the conclusions, perhaps,  is the
implication that after three days the change over time abruptly
ends.  It would seem natural to expect that after the changes
have peaked they might decay exponentially and gradually approach
an asymptote.  There is no support for the sudden ending.
Perhaps there should be one or more really long-term tests to
indicate the actual growth and decay curve. Admittedly, this is
not a job for the day after tomorrow, but it might be put on the agenda
for later consideration.

     Running RTD tests for periods longer than 72 hours  (to test
     this hypothesis) is under consideration.  When such testing
     is performed, the results will be considered in future
     models.

We have also raised doubts, in our review of M6.EVP.001 and
M6.EVP.005,  about the validity of the resting loss concept. Those
objections carry over to the multiple day tests of diurnal
losses, but their implication is not addressed here.

     We continue to disagree with this reviewer on the issue of
     resting loss emissions.

To the extent that report conclusions provide estimates of the
multiple day effects, it is believed that the estimates can be
refined with relatively little effort.  Approaches applicable to
that refinement are spelled out and to some extent demonstrated
in this review.

4. REFERENCES

Time and resources did not permit extensive review of references
applicable to M6.EVP.003.  Reviews of M6.EVP.001 and M6.EVP.005,
however, are incorporated as part of the present review, to the
extent that comments on those two documents deal with
corresponding issues in M6.EVP.003.  Following is a list of
references affecting this review.

1)  Landman, L. C., Evaluating Resting Loss and Diurnal
Evaporative Emissions Using RTD Tests. Document Number
M6.EVP.001. (Draft) November 20, 1998.

2)  McAdams, H. T., Review of Draft Report M6.EVP.001. February
1999.

3)  Landman, L. C., Modeling Diurnal and Resting Loss Emissions
from Vehicles Certified to the Enhanced Evaporative Standards.
Document Number M6.EVP.005. (Draft) October 1, 1998.

4)  McAdams, H. T., Review of Draft Report M6.EVP.005. February
1999.

-------
                              Appendix B-1

                     Appendix to Peer Review Report
                            DEMONSTRATION DATA

              X1        X2        X3           Y      log(Y)

               0         0         0        13.4      2.5953
               0         0         0        17.6      2.8679
               0         0         1        10.5      2.3514
               0         0         1         7.1      1.9601
               0         1         0        20.6      3.0253
               0         1         0        31.3      3.4436
               0         1         1         8.0      2.0794
               0         1         1         8.8      2.1748
               1         0         0        30.2      3.4078
               1         0         0        29.2      3.3742
               1         0         1        18.3      2.9069
               1         0         1        24.7      3.2068
               1         1         0        32.3      3.4751
               1         1         0        31.1      3.4372
               1         1         1        27.6      3.3178
               1         1         1        20.1      3.0007
               0         1         0        22.4      3.1091
               1         1         1        25.3      3.2308
               1         0         1        24.8      3.2108
               1         1         1        27.1      3.2995

Mean        0.55      0.55      0.55       21.52      2.9737    Exp(2.9737) = 19.5642
Std       0.5104    0.5104    0.5104      8.3050      0.4851    Exp(0.4851) =  1.6243


      REGRESSION COEFFICIENTS FOR DEMONSTRATION DATA

                          Log       Antilog

      Constant         2.7624       17.6572
      X1               0.7430       12.7385
      X2               0.1803        3.6462
      X3              -0.5391       -9.3615

      R-square         0.7981        0.8203

                        Appendix B-1 (CONT.)

               CALCULATED RESPONSE AND RESIDUALS

                     Calculated Responses           Residuals
             Y       Y-space     Z-space*      Y-space     Z-space

       13.4000       17.6572     15.8381       -4.2572     -2.4381
       17.6000       17.6572     15.8381       -0.0572      1.7619
       10.5000        8.2957      9.2376        2.2043      1.2624
        7.1000        8.2957      9.2376       -1.1957     -2.1376
       20.6000       21.3034     18.9676       -0.7034      1.6324
       31.3000       21.3034     18.9676        9.9966     12.3324
        8.0000       11.9420     11.0628       -3.9420     -3.0628
        8.8000       11.9420     11.0628       -3.1420     -2.2628
       30.2000       30.3957     33.2960       -0.1957     -3.0960
       29.2000       30.3957     33.2960       -1.1957     -4.0960
       18.3000       21.0342     19.4199       -2.7342     -1.1199
       24.7000       21.0342     19.4199        3.6658      5.2801
       32.3000       34.0420     39.8749       -1.7420     -7.5749
       31.1000       34.0420     39.8749       -2.9420     -8.7749
       27.6000       24.6805     23.2571        2.9195      4.3429
       20.1000       24.6805     23.2571       -4.5805     -3.1571
       22.4000       21.3034     18.9676        1.0966      3.4324
       25.3000       24.6805     23.2571        0.6195      2.0429
       24.8000       21.0342     19.4199        3.7658      5.3801
       27.1000       24.6805     23.2571        2.4195      3.8429

       Sum                                      0.0000      3.5901
       Std.                                     3.5202      4.9356

-------
                           Appendix C

            Response to Written Comments from Stakeholders
     The following comments were submitted in response to EPA's
posting a draft of this report on the MOBILE6 website.  The full
text of each of these comments is posted on the MOBILE6 website.

     In responding to  (and incorporating) comments from the peer
     reviewer  (as well as from stakeholders), this final version
     of the report has changed substantially from the earlier
     draft version that was reviewed.  Some of those changes have
     resulted in many of the comments no longer being applicable.


Comment Number:        74

     Name / Affiliation:    James M.  Lyons /  Sierra Research

     Date:              May  28,  1999

     Comment:

     "Before addressing the problems with the EPA proposal, one
     point that needs to be made is that it is unclear whether
     the data that were used in the EPA analysis had been
     adjusted to correct for the elimination of resting losses.
     Since resting losses are treated separately from diurnal
     losses by both MOBILE5a and the proposed MOBILE6, EPA needs
     to assure that resting losses are properly dealt with."

     EPA's Response:

     No.  These estimates are for the second and third days of
     the RTD tests.  The diurnal emissions are obtained by
     subtracting the estimated resting loss emissions (see report
     M6.EVP.001) from these predicted RTD test results.


     Comment:

     "Returning to the EPA approach, the problems begin with the
     form EPA has postulated for Equation 2, which does not
     appear to be reasonable.   For example,  assuming that day i
     is day 1,  the term D = 0 and day 1 emissions as predicted by
     Equation 2 for all vehicle types, regardless of
     purge/pressure test status, is a function of only RVP and
     temperature.  In addition, day 1 emissions based on Equation
     2 are also not a function of vehicle age (e.g., model year).
     Clearly, this is not correct, nor is it the manner in which
     MOBILE5a or MOBILE6 treats day 1 diurnal emissions."

EPA's Response:

This report has been revised to incorporate the comments
received.  This revised approach has eliminated (avoided)
this problem in this final version of the report.


Comment:

"The second problem deals with EPA's differentiation of
Equation 2 with respect to D. Equation 2 is a discontinuous
function of D, since values of D are restricted to integers
(e.g., 1, 2, 3).  As a result, if one plots emissions
versus D using Equation 2 for three days, one would see
three discrete data points and not a continuous curve.
Since the derivative of discontinuous functions cannot be
taken,  there is no mathematical basis for Equation 4.  Since
EPA uses this equation as the underlying basis for the
multiple-day diurnal correction factors derived from the
agency's statistical analysis, there is no real basis
supporting the current EPA approach."

EPA's Response:

This report has been revised to incorporate the comments
received.   This revised approach has eliminated (avoided)
this problem in this final version of the report.


Comment:

"In the statistical analysis described in the draft report,
EPA indicates that the analysis was performed using the
"ESTIMATE" function in SAS.   While we are not directly
familiar with this function, the basic EPA approach was to
attempt to estimate the regression coefficients (i.e., the
b_i terms) in Equation 2 using linear regression techniques
applied to the entire multiple-day diurnal emissions
database  (e.g.,  all vehicles regardless of the fuel metering
system or purge/pressure status).   These constants would
then be inserted into Equation 4,  which EPA incorrectly
derived from Equation 2,  to yield estimates of the
percentage change in diurnal emissions on day 2 relative to
day 1 and on day 3 relative to day 2.  As a result, the
values of regression coefficients are different for the day
2 to day 1 and day 3 to day 2 comparisons.

"Even if one ignores the problems underlying EPA's basic
approach, additional problems can be seen in the results of
the statistical analysis documented in the draft report.
The first issue is that there are relatively few data for
carbureted vehicles.  Therefore, one would expect that it
will be difficult to develop statistically sound regression
coefficients for these vehicles.  Turning to the analysis
itself, an example of additional problems can be seen in the
results of EPA's first attempt to fit the multiple-day
diurnal database to Equation 2, which are shown in Tables
2(a) and 2(b) of the EPA draft report (attached).   Several
facts are apparent from a quick review of these tables.
First, the estimates of the intercept term b_0 are not
reported.  These are important because,  without them,  it is
impossible to compare the emission values predicted by
Equations 2 and 4 to the actual emission values.  Obviously,
such a comparison needs to be made in order to assess the
validity of the derived equations."

EPA's Response:

This report has been revised to incorporate the comments
received.  This revised approach has eliminated (avoided)
this problem in this final version of the report.


Comment:

"Next, as expected, the RVP and temperature terms have the
strongest correlation with the magnitude of diurnal
emissions (in terms of grams per day) on all days of a
multiple-day diurnal event.  Therefore, it is these
variables that account for most of the variation in the
daily diurnal emission rates explained by the EPA equations.
However,  since the purpose of the EPA analysis is to
determine the increase in emissions relative to day 1 on
subsequent days of a diurnal episode, and EPA concludes that
RVP and temperature should not be considered in evaluating
this increase (page 7),  it is not clear why EPA chose to
include these variables in the statistical analysis.
Instead,  it seems that EPA should have fit the percentage
differences in emissions from one day to the next using
Equation 4.   Based on the large variations in the percentage
changes in emissions from day to day for different vehicles
in the database,  we expect that if this had been done, very
poor regression results would have been obtained.  This in
turn would highlight the basic difficulties associated with
using the EPA approach to develop multiple-day diurnal
correction factors."

EPA's Response:

This report has been revised to incorporate the comments
received.  This revised approach has eliminated  (avoided)
this problem in this final version of the report.

Comment:

"Moving on, the remaining coefficients associated with terms
that are statistically significant at the 95% confidence
level  (as shown in the tables by Pr > T values of 0.05 or
less) are all negative, as shown in the attached Tables 2(a)
and  (b) of the draft EPA report. If one recalls that the
basic HC emission estimate derived from Equation 2 applies
to fuel-injected P/P vehicles, these results indicate, for
example, that diurnal emissions on the second day of a
multiple-day event from fuel-injected F/P, F/F, or P/F
vehicles are lower than those for P/P vehicles.  Clearly,
this is not what one would intuitively expect, nor is it
what the data themselves indicate, as shown in the table on
page 6 of the draft EPA report."

EPA's Response:

Section 4.2.4 has been added to this report to address this
problem.


Comment:

"The reasonableness of the multiple-day correction factors
EPA proposed can also be evaluated by simply comparing them
to the data from which they were derived.  This has been
done for fuel-injected P/P vehicles by plotting the ratio of
day 2 and day 3 emissions to day 1 emissions.  These plots
are shown in the attached Figures 1 and 2, respectively.
Also shown are the lines representing the EPA correction
factors (1.365 for day 2/day 1 and 1.791 for day 3/day 1) and
the lines representing the average values for the data sets.
As can be seen from the figures, there is a large degree of
variability in these ratios.  In many cases, the values are
less than one, indicating lower rather than higher diurnal
emissions on subsequent days of a multiple-day event.  This
high degree of variability, as discussed below, is not
surprising since EPA has assumed that RVP and temperature
are not important."

EPA's Response:

This report has been revised to incorporate the comments
received.   This revised approach has eliminated  (avoided)
this problem in this final version of the report.

Comment:

"Also, it is not clear from the EPA database that all
multiple-day diurnal testing was performed at the same fuel
tank fill level.   Since the amount of vapor generated during
a diurnal depends on the magnitude of vapor space in the
fuel tank, differences in fill level under different testing
programs could be making some contribution to the observed
variability."

EPA's Response:

The fill levels were the same for all of the multi-day
tests.


Comment:

"Other observations that can be made regarding the data in
Figures 1 and 2 include the fact that average values for the
data sets are much greater than one and are substantially
higher (1.3 times greater for day 2 and 2 times greater for
day 3) than correction factors derived by EPA.  Based on the
above, EPA's statistical analysis notwithstanding, it is not
at all clear to us that the correction factors EPA derived
are in any way superior to the average values obtained from
the data sets.  What is clear, given the scatter in the
data, is that both the averages and the EPA correction
factors are not very robust estimates of changes in
emissions that occur on the second and subsequent days of a
multiple-day diurnal.  Our overall conclusion is that there
are fundamental problems that suggest that the current EPA
approach needs to be abandoned in favor of an approach
similar to one of the alternatives described below.

"We believe that the current EPA approach needs to be
abandoned and that a complete reanalysis of the EPA
multiple-day diurnal database along the lines described
above is clearly warranted.  In any case, the treatment of
multiple-day emissions should at least be consistent with
the physical effects that are known to be controlling the
process.   For vehicles with high day 1 emissions and low
emission control system efficiencies, it may turn out to be
acceptable to simply use day 1 emission rate estimates to
represent emissions on subsequent days.  However, for
vehicles with low to moderate day 1 emissions and moderate
to high emission control system efficiencies, substantial
changes in emission rates that are related to a number of
factors can occur and these changes should be taken into
account in MOBILE6."

     EPA's Response:

     As this reviewer suggested, the approach used in the
     previous draft version of this report was "abandoned," and a
     "complete reanalysis" was performed.  This revised approach
     has eliminated (avoided) this problem in this final version
     of the report.



Comment Number:        75

     Name / Affiliation:   David Lax /  API

     Date:              June  8,  1999

     Comment:

     This submission is simply a cover letter for the previous
     item (Comment Number 74, from Sierra).

     EPA's Response:

     See comments on previous item.



Comment Number:        78

     Name / Affiliation:   Tom Darlington / Air Improvement Resource,
                       Inc.

     Date:              June  23,  1999

     Comment:

     The test program used to obtain the data, although not
     described in the report, is a highly  suspect source of good
     data due to numerous problems with the way the program was
     conducted.

     EPA appears to have ignored the Auto/Oil and CRC multiple-
     day test data, which was tested correctly and could be used
     for this purpose.

     EPA's Response:

     The testing was performed correctly.  In fact, the same
     contractor was used by both EPA and CRC for the RTD testing.
     As to "ignoring" the CRC test data, we added the last
     paragraph in Section 2.0 (page 3) to address this point.

Comment:

There are errors in the data sample Table 1.

EPA's Response:

The errors in Table 1 have been corrected.


Comment:

Industry agrees that passing vehicles should not be higher
than failing vehicles, but this is a function of the
relative emissions on the first day.

EPA's Response:

This assumption is applied (in MOBILE6) to each day of
diurnal emissions.


Comment:

The factors are also not appropriate for vehicles certified
to enhanced evaporative emission standards. EPA may not have
intended to use them for vehicles subject to the enhanced
evaporative standards, but the report is not clear on this
point.

EPA's Response:

Section 4.2.3 has been added to clarify this point.


Comment:

The report does not indicate how the analysis will fit in
with the rest of the diurnal emissions analysis, i.e., the
partial day diurnals, the full day diurnals,  the activity
data, etc.

EPA's Response:

The last paragraph of Section 5.0 has been added to address
this point.

-------